
SIA for ShEx

Express SIA annotations using ShEx's native // annotation syntax. The result uses roughly 35% fewer tokens than the JSON Schema form and interoperates natively with semantic web and linked data tooling.

blogpost.sia.shex
PREFIX :      <https://example.org/schema/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>
PREFIX sia:   <https://www.schematica.io/sia#>
PREFIX schema: <https://schema.org/>

:BlogPost {

  :title xsd:string // xsd:maxLength 200
    // sia:role        "identifier"
    // sia:priority    1
    // sia:instruction "Main heading. Always include."
    // sia:mapsTo      schema:headline ;

  :author @:Author
    // sia:role        "relationship"
    // sia:priority    2
    // sia:instruction "Use author name, not ID."
    // sia:mapsTo      schema:author
}

How SIA Works in ShEx

ShEx has a first-class annotation mechanism using the // syntax. Unlike JSON Schema's x- convention, ShEx annotations are typed RDF statements — they are part of the specification, preserved by every compliant parser, and can be queried with standard RDF tools.

SIA uses this native mechanism. Every SIA annotation is a // sia:* statement attached to a triple constraint (property) or shape declaration (type).

| SIA term | ShEx annotation | Value type | Applies to |
| --- | --- | --- | --- |
| role | // sia:role | string literal | Triple constraint |
| priority | // sia:priority | integer literal | Triple constraint |
| instruction | // sia:instruction | string literal | Triple constraint / shape |
| mapsTo | // sia:mapsTo | IRI | Triple constraint / shape |

Annotations are formal RDF. When you write // sia:role "identifier", this is an RDF statement: the subject is the triple constraint, the predicate is sia:role, and the object is "identifier". This means SIA metadata is queryable, indexable, and interoperable with any RDF toolchain.
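
To make the subject / predicate / object reading concrete, here is a minimal Python sketch that pulls `// sia:*` annotations out of a shape body and lists them as triples. It is a naive regex pass, not a ShEx parser: it assumes one triple constraint per `;`-separated chunk, single-line string values, and no `;` or `#` inside strings. The function name `extract_annotations` is hypothetical, and the property name stands in for the triple constraint node that would be the real RDF subject.

```python
import re

# Matches one annotation: "// sia:<term> <value>", where the value is a quoted
# string, an <IRI>, or a bare token (integer or prefixed name).
ANNOTATION = re.compile(r'//\s*(sia:\w+)\s+("[^"]*"|<[^>]*>|[\w:.-]+)')
# Matches the property name that opens a triple constraint, e.g. ":title".
CONSTRAINT = re.compile(r'^\s*(:\w+)\s', re.MULTILINE)

def extract_annotations(shape_body):
    """Return (property, predicate, value) tuples for every sia:* annotation."""
    # Drop '#' comments; a '#' inside an IRI is not preceded by whitespace.
    body = re.sub(r'(?m)(^|\s)#.*$', '', shape_body)
    triples = []
    for chunk in body.split(';'):
        match = CONSTRAINT.search(chunk)
        if not match:
            continue
        subject = match.group(1)  # stand-in label for the triple constraint
        for predicate, value in ANNOTATION.findall(chunk):
            triples.append((subject, predicate, value))
    return triples

body = '''
  :title xsd:string
    // sia:role "identifier"
    // sia:priority 1
    // sia:mapsTo schema:headline ;

  :author @:Author
    // sia:role "relationship"
    // sia:priority 2
'''
triples = extract_annotations(body)
```

A real pipeline would more likely parse the schema with a ShEx library and query the resulting RDF; the sketch only shows the shape of the data.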

PREFIX Declarations

Every ShEx + SIA schema begins with PREFIX declarations that define the namespaces used in the schema.

# Required: SIA vocabulary
PREFIX sia: <https://www.schematica.io/sia#>

# Required: XML Schema data types
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Your schema namespace
PREFIX : <https://example.org/schema/>

# Standards you map to (as needed)
PREFIX schema: <https://schema.org/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX fhir: <http://hl7.org/fhir/>
Convention. Use sia: for the SIA vocabulary, : (default prefix) for your own schema properties, and named prefixes for any standards you map to. This keeps the schema compact and readable.
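
As an illustration of what the PREFIX declarations buy you, the following Python sketch expands a prefixed name into its full IRI using the namespaces listed above. The helper name `expand` and the `PREFIXES` dictionary are assumptions made for this example, not part of SIA or of any ShEx library.

```python
# Prefix map mirroring the PREFIX declarations above.
# The empty key "" models the default prefix ":".
PREFIXES = {
    "sia": "https://www.schematica.io/sia#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "": "https://example.org/schema/",
    "schema": "https://schema.org/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def expand(pname, prefixes=PREFIXES):
    """Expand a prefixed name such as 'sia:role' or ':title' to a full IRI."""
    prefix, sep, local = pname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {pname!r}")
    return prefixes[prefix] + local

role_iri = expand("sia:role")
title_iri = expand(":title")
```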

// sia:role

Value: string literal  •  Applies to: triple constraint (property)

Tells the AI what semantic role this property plays. See SIA vocabulary for the complete list of standard role values.

:title xsd:string
  // sia:role "identifier" ;     # names the entity

:publishedDate xsd:date
  // sia:role "temporal" ;       # a date/time field

:author @:Author
  // sia:role "relationship" ;   # links to another entity

:category ["tech" "science"]
  // sia:role "classification" ; # taxonomy / enum

// sia:priority

Value: integer 1–5  •  Applies to: triple constraint (property)

Truncation priority. 1 = always keep, 5 = drop first. See SIA vocabulary for the full priority scale.

:title xsd:string
  // sia:priority 1 ;             # essential: always include

:body xsd:string
  // sia:priority 2 ;             # important: standard context

:summary xsd:string ?
  // sia:priority 3 ;             # useful: drop under pressure

:email xsd:string ?
  // sia:priority 4 ;             # supplementary

:internalId xsd:string ?
  // sia:priority 5 ;             # background: drop first

// sia:instruction

Value: string literal  •  Applies to: triple constraint or shape declaration

A natural-language instruction for the AI. It acts as an anti-hallucination mechanism by stating explicitly how the property should be used.

On a property

:author @:Author
  // sia:role "relationship"
  // sia:instruction "Link to Author type. In summaries, use the author's display name, never the raw ID." ;

On a shape (type-level)

# Type-level instruction as a comment-annotation pattern
:BlogPost {
  # A blog post. Title and author are always needed.
  # Summary can be dropped in constrained contexts.

  :title xsd:string
    // sia:role "identifier" ;
  ...
}
Note. ShEx annotations attach to triple constraints, not shapes directly. For type-level instructions, use a structured comment above the shape declaration, or define a dedicated sia:shapeInstruction annotation on the first triple constraint. The SIA vocabulary handles both patterns.

// sia:mapsTo

Value: IRI  •  Applies to: triple constraint or shape

Declares equivalent properties in external standards. Each mapping is a separate // sia:mapsTo annotation with a URI value.

:title xsd:string
  // sia:mapsTo schema:headline           # schema.org
  // sia:mapsTo dc:title                   # Dublin Core
  // sia:mapsTo <http://ogp.me/ns#title>   # Open Graph
  ;

How this differs from JSON Schema

In JSON Schema, x-maps-to is a single object with human-readable keys:

// JSON Schema: one object, named keys
"x-maps-to": {
  "schema.org": "https://schema.org/headline",
  "Dublin Core": "http://purl.org/dc/elements/1.1/title"
}

In ShEx, each mapping is a separate RDF annotation with a URI:

# ShEx: repeated annotations, URI values
// sia:mapsTo schema:headline
// sia:mapsTo dc:title
Both carry the same information. The JSON Schema form uses human-readable labels as keys. The ShEx form uses URIs directly — more machine-precise, and the standard name is inferred from the URI namespace. Use comments for readability when needed.
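
The conversion between the two forms is mechanical. The Python sketch below turns a JSON Schema `x-maps-to` object into repeated `// sia:mapsTo` lines, keeping the human-readable key as a trailing comment. The function name `maps_to_annotations` is hypothetical, and the sketch emits full IRIs in angle brackets rather than prefixed names to avoid needing a prefix table.

```python
def maps_to_annotations(x_maps_to, indent="  "):
    """Render an x-maps-to object as one '// sia:mapsTo' line per mapping."""
    lines = []
    for label, iri in x_maps_to.items():
        # The label survives only as a comment; the IRI carries the meaning.
        lines.append(f"{indent}// sia:mapsTo <{iri}>  # {label}")
    return lines

x_maps_to = {
    "schema.org": "https://schema.org/headline",
    "Dublin Core": "http://purl.org/dc/elements/1.1/title",
}
annotation_lines = maps_to_annotations(x_maps_to)
```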

Platform Extensions

Any platform can define its own annotation namespace. In ShEx, this uses the same // mechanism with a platform-specific prefix.

PREFIX cm: <https://coremodels.io/ns#>

:author @:Author
  // sia:role "relationship"
  // sia:priority 2
  // cm:relation "domainIncludes"    # platform-specific
  // cm:order 4                     # platform-specific
  ;
When to include. Export for AI / external sharing → omit platform annotations. Export for round-trip import → include them.
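
The export rule above can be sketched as a simple line filter. This Python example assumes one annotation per line and that the platform prefixes to drop (here `cm`) are known in advance; `strip_platform_annotations` is a hypothetical helper, not an existing tool.

```python
import re

def strip_platform_annotations(shex, platform_prefixes=("cm",)):
    """Drop annotation lines whose predicate uses a platform-specific prefix."""
    kept = []
    for line in shex.splitlines():
        match = re.match(r'\s*//\s*(\w+):', line)
        if match and match.group(1) in platform_prefixes:
            continue  # platform annotation: omit for AI / external export
        kept.append(line)
    return "\n".join(kept)

source = '''PREFIX cm: <https://coremodels.io/ns#>

:author @:Author
  // sia:role "relationship"
  // sia:priority 2
  // cm:relation "domainIncludes"    # platform-specific
  // cm:order 4                     # platform-specific
  ;'''
ai_export = strip_platform_annotations(source)
```

For a round-trip export, the source is passed through unchanged. The PREFIX line for the platform namespace is left in place here; removing unused prefixes would be a separate step.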

Structural Mapping: ShEx vs JSON Schema

Most schema concepts map directly between formats. Here's how the base layer (Layer 1) translates.

| Concept | JSON Schema | ShEx |
| --- | --- | --- |
| String | "type": "string" | xsd:string |
| Integer | "type": "integer" | xsd:integer |
| Boolean | "type": "boolean" | xsd:boolean |
| Date | "format": "date" | xsd:date |
| Date-time | "format": "date-time" | xsd:dateTime |
| Required | In required array | Default (no ?) |
| Optional | Not in required | Add ? after type |
| Relation | "$ref": "#/$defs/X" | @:X |
| Array | "type": "array" | xsd:string * or @:X * |
| Enum | "enum": ["a","b"] | ["a" "b"] |
| Inheritance | "allOf": [{"$ref":"..."}] | EXTENDS @:Parent |
| Closed | "additionalProperties": false | CLOSED |
| Pattern | "pattern": "^[A-Z]+" | xsd:string /^[A-Z]+/ |
| Max length | "maxLength": 200 | // xsd:maxLength 200 |
| Description | "description": "..." | # comment or // rdfs:comment "..." |

Cardinality note. ShEx defaults to required (exactly 1). Use ? for optional (0 or 1), * for array (0 or more), + for non-empty array (1 or more). This is the opposite of JSON Schema where properties are optional by default.
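
Because the defaults are inverted, a converter has to derive the ShEx cardinality marker from the JSON Schema `required` list. A minimal Python sketch follows; the function name `shex_cardinality` is hypothetical, and mapping a required array with `minItems` of at least 1 to `+` is one reasonable reading, not a rule stated by SIA.

```python
def shex_cardinality(name, prop_schema, required):
    """Return the ShEx cardinality suffix ('', '?', '*', '+') for a property."""
    if prop_schema.get("type") == "array":
        # Non-empty only if the property must be present AND have items.
        if name in required and prop_schema.get("minItems", 0) >= 1:
            return "+"
        return "*"
    # Scalars: ShEx is required by default, so only optional ones get a marker.
    return "" if name in required else "?"

schema = {
    "required": ["title", "tags"],
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
        "tags": {"type": "array", "minItems": 1},
        "links": {"type": "array"},
    },
}
markers = {
    name: shex_cardinality(name, prop, schema["required"])
    for name, prop in schema["properties"].items()
}
```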

Token Efficiency

ShEx + SIA achieves ~35% fewer tokens than JSON Schema + SIA for the same schema content. For bulk operations with many types, this compression is significant.

| Format | Tokens (BlogPost + Author) | Savings |
| --- | --- | --- |
| JSON Schema + SIA (JSON) | ~170 | baseline |
| JSON Schema + SIA (YAML) | ~140 | -18% |
| ShEx + SIA | ~110 | -35% |
| ShEx + SIA (minified) | ~90 | -47% |
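
The savings column follows directly from the approximate token counts. As a quick check, this Python sketch recomputes each percentage against the JSON baseline; the counts are the rounded figures from the table, so the results are equally approximate.

```python
# Approximate token counts taken from the table above.
TOKENS = {
    "JSON Schema + SIA (JSON)": 170,
    "JSON Schema + SIA (YAML)": 140,
    "ShEx + SIA": 110,
    "ShEx + SIA (minified)": 90,
}

def savings_percent(tokens, baseline):
    """Percentage of tokens saved relative to the baseline, rounded."""
    return round(100 * (baseline - tokens) / baseline)

baseline = TOKENS["JSON Schema + SIA (JSON)"]
savings = {name: savings_percent(count, baseline) for name, count in TOKENS.items()}
```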

Where the savings come from:

SIA's progressive truncation via sia:priority works identically in ShEx. Strip lower-priority annotations to fit any token budget.
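
One way to picture progressive truncation is as a loop that drops the highest priority number first until the schema fits. The Python sketch below works on whole properties rather than individual annotations, and counts tokens by splitting on whitespace. Both are simplifying assumptions, and `truncate` is a hypothetical helper rather than an SIA API.

```python
def truncate(properties, budget, count_tokens=lambda text: len(text.split())):
    """Drop properties by descending priority number until the budget fits.

    properties: list of (priority, shex_text) pairs, priority in 1..5.
    Priority 1 entries are always kept, even if they exceed the budget.
    """
    kept = list(properties)
    for level in (5, 4, 3, 2):  # 5 = drop first
        if sum(count_tokens(text) for _, text in kept) <= budget:
            break
        kept = [(p, text) for p, text in kept if p < level]
    return kept

props = [
    (1, ":title xsd:string"),
    (2, ":body xsd:string"),
    (3, ":summary xsd:string ?"),
    (4, ":email xsd:string ?"),
    (5, ":internalId xsd:string ?"),
]
fitted = truncate(props, budget=7)
```

A production version would use the target model's tokenizer in place of the whitespace counter.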

Complete Example

Blog content model: two shapes, a value set, and schema.org + Dublin Core mappings.

PREFIX :       <https://example.org/schema/blog/>
PREFIX xsd:    <http://www.w3.org/2001/XMLSchema#>
PREFIX sia:    <https://www.schematica.io/sia#>
PREFIX schema: <https://schema.org/>
PREFIX dc:     <http://purl.org/dc/elements/1.1/>

# BlogPost — A blog post. Title and author always needed.
:BlogPost {

  :title xsd:string // xsd:maxLength 200
    // sia:role        "identifier"
    // sia:priority    1
    // sia:mapsTo      schema:headline
    // sia:mapsTo      dc:title
  ;

  :body xsd:string
    // sia:role        "content"
    // sia:priority    2
    // sia:mapsTo      schema:articleBody
  ;

  :summary xsd:string ?
    // sia:role        "descriptive"
    // sia:priority    3
    // sia:instruction "Brief abstract. 1-2 sentences. Safe to drop in constrained contexts."
    // sia:mapsTo      schema:abstract
  ;

  :author @:Author
    // sia:role        "relationship"
    // sia:priority    2
    // sia:instruction "Use author display name in summaries, never the raw ID or shape reference."
    // sia:mapsTo      schema:author
    // sia:mapsTo      dc:creator
  ;

  :publishedDate xsd:date ?
    // sia:role        "temporal"
    // sia:priority    3
    // sia:mapsTo      schema:datePublished
    // sia:mapsTo      dc:date
  ;

  :category ["tech" "science" "culture"] ?
    // sia:role        "classification"
    // sia:priority    3
    // sia:mapsTo      schema:articleSection
  ;

  :tags xsd:string *
    // sia:role        "classification"
    // sia:priority    4
    // sia:mapsTo      schema:keywords

}

# Author
:Author {

  :name xsd:string
    // sia:role     "identifier"
    // sia:priority 1
    // sia:mapsTo   schema:name
  ;

  :email xsd:string ?
    // sia:role     "contact"
    // sia:priority 4
    // sia:mapsTo   schema:email
  ;

  :bio xsd:string ?
    // sia:role     "descriptive"
    // sia:priority 3
}

Differences from JSON Schema Expression

The SIA vocabulary translates losslessly between formats. But the base format layer (Layer 1) has some differences. These are documented honestly, not hidden.

| Feature | JSON Schema | ShEx | Impact |
| --- | --- | --- | --- |
| Conditional validation | if/then/else native | No equivalent | Document the rule in sia:instruction |
| Pattern properties | patternProperties | No equivalent | Document in sia:instruction |
| String format | "format": "email" | Regex pattern or NodeKind | Partial: use a /regex/ facet |
| Property names | Local strings | URI-based | ShEx names ARE the ontology |
| Mapping syntax | Single x-maps-to object | Repeated // sia:mapsTo | Same semantics, different syntax |
| Annotations | x- convention | // first-class | ShEx annotations are typed RDF |
| Default cardinality | Optional | Required | Remember to add ? for optional |

Structural schema and SIA annotations convert losslessly between formats. Only certain base-format validation features (if/then/else, patternProperties) don't have ShEx equivalents. When converting from JSON Schema, these are preserved as sia:instruction documentation rather than executable constraints.
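
As a rough sketch of that fallback, the Python function below collects the JSON Schema keywords that have no ShEx equivalent and serializes them into a single `// sia:instruction` line. The function name, the keyword list, and the wording of the generated instruction are all assumptions for illustration; an actual converter may phrase or place the annotation differently.

```python
import json

# Keywords the text above names as having no ShEx equivalent.
UNSUPPORTED = ("if", "then", "else", "patternProperties")

def unsupported_to_instruction(schema):
    """Return a sia:instruction line documenting unsupported keywords, or None."""
    found = {key: schema[key] for key in UNSUPPORTED if key in schema}
    if not found:
        return None
    text = ("Not enforced by ShEx; original JSON Schema rule: "
            + json.dumps(found, sort_keys=True))
    # Escape backslashes and quotes so the result is a valid string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'// sia:instruction "{escaped}"'

conditional = {
    "type": "object",
    "if": {"required": ["a"]},
    "then": {"required": ["b"]},
}
rule = unsupported_to_instruction(conditional)
```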

What ShEx adds that JSON Schema cannot express