How SIA Works in JSON Schema
SIA uses the x- vendor extension mechanism defined by the OpenAPI Specification. This is the standard way to extend JSON Schema — every validator, linter, and code generator ignores x- properties by design.
LLMs have seen millions of OpenAPI specs with x- extensions. They read them as metadata without breaking.
| SIA term | JSON Schema property | Type | Applies to |
|---|---|---|---|
| role | x-sia-role | string | Property |
| priority | x-sia-priority | integer 1–5 | Property |
| instruction | x-sia-instruction | string | Property / Type |
| mapsTo | x-maps-to | object | Property / Type |
type, required, $ref, enum, allOf, minLength, pattern, format), use the native keyword. Never create an x-sia-* version of something that already exists.x-sia-role
Type: string • Applies to: property level
Tells the AI what semantic role this property plays. See SIA vocabulary for the complete list of standard role values.
"title": { "type": "string", "x-sia-role": "identifier" // AI knows: this names the entity } "publishedDate": { "type": "string", "format": "date", "x-sia-role": "temporal" // AI knows: this is a date/time } "author": { "$ref": "#/$defs/Author", "x-sia-role": "relationship" // AI knows: links to another entity }
x-sia-priority
Type: integer 1–5 • Applies to: property level
When context is limited, tells the AI which properties to keep (1) and drop first (5). See SIA vocabulary for the full priority scale.
"title": { "x-sia-priority": 1 } // essential: always include "body": { "x-sia-priority": 2 } // important: standard context "summary": { "x-sia-priority": 3 } // useful: drop under pressure "email": { "x-sia-priority": 4 } // supplementary: generous context only "internalId": { "x-sia-priority": 5 } // background: drop first
x-sia-instruction
Type: string (10–80 words) • Applies to: property or type level
Natural language instruction for the AI. The primary anti-hallucination mechanism.
Property-level
"author": { "$ref": "#/$defs/Author", "x-sia-instruction": "Link to Author type. In summaries, use the author's display name, never the raw ID or $ref path." }
Type-level
{
"title": "BlogPost",
"type": "object",
"x-sia-instruction": "A blog post. Title and author are always needed.
Summary can be dropped in constrained contexts.",
"properties": { ... }
}
x-sia-instruction vs description
description (native) | x-sia-instruction (SIA) | |
|---|---|---|
| Audience | Human developers | AI agents |
| Tone | Reference documentation | Operational instructions |
| Example | "The title of the blog post." | "Main heading. Always include. Max 200 chars." |
| Both? | AI reads x-sia-instruction first, falls back to description | |
description already reads like an AI instruction, skip x-sia-instruction. Use it only when human docs and AI instructions need to differ.x-maps-to
Type: object • Applies to: property and type level
Declares equivalent properties or types in external standards. Keys are human-readable standard names, values are URIs.
"title": { "type": "string", "x-maps-to": { "schema.org": "https://schema.org/headline", "Dublin Core": "http://purl.org/dc/elements/1.1/title", "Open Graph": "og:title", "FHIR": "http://hl7.org/fhir/StructureDefinition/title" } }
- Keys are human-readable: "schema.org", "Dublin Core", "FHIR", "Open Graph"
- Values are full URIs or prefixed names for well-known vocabularies
- Multiple mappings per property are supported
- Omit standards with no equivalent — never add placeholders
Platform Extensions
Any platform can define its own x-{name}-* namespace for round-trip metadata. AI agents ignore these.
"author": { "$ref": "#/$defs/Author", "x-sia-role": "relationship", // Platform-specific round-trip metadata "x-cm-relation": "domainIncludes", "x-cm-order": 4 }
YAML Serialisation
SIA-annotated JSON Schema can be serialised as YAML — more compact and higher LLM comprehension for nested data in benchmarks.
$defs:
BlogPost:
type: object
x-sia-instruction: >-
A blog post. Title and author always needed.
x-maps-to:
schema.org: https://schema.org/BlogPosting
properties:
title:
type: string
maxLength: 200
x-sia-role: identifier
x-sia-priority: 1
x-maps-to:
schema.org: https://schema.org/headline
Dublin Core: http://purl.org/dc/elements/1.1/title
author:
$ref: "#/$defs/Author"
x-sia-role: relationship
x-sia-priority: 2
x-sia-instruction: >-
Use author name in summaries, not ID.
required: [title, body, author]
Token Efficiency
x-sia-priority enables progressive truncation — strip lower-priority annotations to fit any token budget.
| Mode | maxPriority | What's included | Use case |
|---|---|---|---|
| Full | 5 | All properties, all annotations | Deep schema exploration |
| Normal | 3 | All properties, priority 1–3 | Standard AI context |
| Lean | 2 | All properties, priority 1–2 | Multiple schemas in context |
| Minimal | 1 | All properties, priority 1 only | Max schemas, min tokens |
| Structure | 0 | Types and properties only | Schema overview |
Minification rules
- Strip whitespace — saves ~15–20% tokens
- Remove
descriptionifx-sia-instructionis present - Remove
$schemaand$idif consumer knows identity - Remove
x-maps-toif consumer doesn't need mappings - Remove
x-{platform}-*unless round-trip import is the goal
What JSON Schema Already Gives You
Rule: if JSON Schema can express it natively, never use x-sia-* for it.
| Concept | JSON Schema native | SIA needed? |
|---|---|---|
| String property | "type": "string" | No |
| Required fields | "required": [...] | No |
| Relation to type | "$ref": "#/$defs/..." | No |
| Enum / taxonomy | "enum": [...] | No |
| Inheritance | "allOf": [{"$ref":"..."}] | No |
| Description | "description": "..." | No |
| Constraints | "minimum", "maxLength", "pattern" | No |
| Default values | "default": ... | No |
| Read-only | "readOnly": true | No |
| AI role hint | — | Yes: x-sia-role |
| AI priority | — | Yes: x-sia-priority |
| AI instruction | Partially (description) | Optional: x-sia-instruction |
| Standard mapping | — | Yes: x-maps-to |
Complete Example
Blog content model: two types, taxonomy, schema.org + Dublin Core mappings. More examples on GitHub ›
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Blog Content Model",
"x-sia-instruction": "Blog content model. Posts and authors.",
"x-maps-to": { "schema.org": "https://schema.org/Blog" },
"$defs": {
"BlogPost": {
"type": "object",
"x-sia-instruction": "A blog post. Title and author always needed.",
"x-maps-to": {
"schema.org": "https://schema.org/BlogPosting",
"Dublin Core": "http://purl.org/dc/dcmitype/Text"
},
"properties": {
"title": {
"type": "string", "maxLength": 200,
"x-sia-role": "identifier", "x-sia-priority": 1,
"x-maps-to": {
"schema.org": "https://schema.org/headline",
"Dublin Core": "http://purl.org/dc/elements/1.1/title"
}
},
"body": {
"type": "string",
"x-sia-role": "content", "x-sia-priority": 2,
"x-maps-to": { "schema.org": "https://schema.org/articleBody" }
},
"author": {
"$ref": "#/$defs/Author",
"x-sia-role": "relationship", "x-sia-priority": 2,
"x-sia-instruction": "Use author name in summaries, not ID.",
"x-maps-to": {
"schema.org": "https://schema.org/author",
"Dublin Core": "http://purl.org/dc/elements/1.1/creator"
}
},
"category": {
"enum": ["tech", "science", "culture"],
"x-sia-role": "classification", "x-sia-priority": 3,
"x-maps-to": { "schema.org": "https://schema.org/articleSection" }
}
},
"required": ["title", "body", "author"]
},
"Author": {
"type": "object",
"x-maps-to": { "schema.org": "https://schema.org/Person" },
"properties": {
"name": {
"type": "string",
"x-sia-role": "identifier", "x-sia-priority": 1,
"x-maps-to": { "schema.org": "https://schema.org/name" }
},
"email": {
"type": "string", "format": "email",
"x-sia-role": "contact", "x-sia-priority": 4,
"x-maps-to": { "schema.org": "https://schema.org/email" }
}
},
"required": ["name"]
}
}
}
Advanced Patterns
Polymorphism
"contactMethod": { "oneOf": [ { "type": "string", "format": "email" }, { "$ref": "#/$defs/PhoneNumber" } ], "x-sia-role": "contact", "x-sia-instruction": "Email or phone. Prefer email for display." }
Recursive schemas
"children": { "type": "array", "items": { "$ref": "#/$defs/Category" }, "x-sia-instruction": "Recursive subcategories. Render as tree." }
Conditional schemas
"if": { "properties": { "status": { "const": "published" } } }, "then": { "required": ["publishedDate", "author"] }, "x-sia-instruction": "Published posts require date and author."
Nullable
"middleName": { "type": ["string", "null"], "x-sia-priority": 4, "x-sia-instruction": "May be null. Never fabricate a middle name." }
x-sia-role. Placeholder values ("n/a", "TBD") waste tokens and confuse AI agents.