The Shape Expressions (ShEx) language describes RDF graph structures. A shape describes the triples touching nodes in RDF [[!rdf11-concepts]] graphs. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.

This document defines the Shape Expressions JSON (ShExJ) format.

Introduction

Most applications that share data do so using prescribed data structures. While RDFS and OWL enable one to make assertions about the objects in some domain, ShEx describes data structures. This document defines a JSON [[!json]] structure for expressing ShEx schemas.

Notation

The ShExJ structure is a sublanguage of JSON. It consists of lists and maps of typed objects and values. These are represented as definitions in the following grammar. The content of some constructs involves many choices. These are represented as declarations.

For example, the first production:

Schema           : prefixes:[PREFIX->IRI] startActs:[SemAct] start:shapeLabel shapes:[shapeLabel->Shape]

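indicates that a Schema object will have these properties: prefixes, a map from PREFIX to IRI; startActs, a list of SemActs; start, a shapeLabel; and shapes, a map from shapeLabel to Shape.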


ShExJ Schema Grammar

ShExJ schema documents start with the Schema production. The root object will therefore have a "type":"Schema" property.

Schema:prefixes:[PREFIX->IRI]? valueExprDefns:[valueExprLabel->ValueExprDefn]? startActs:[SemAct]? start:shapeLabel? shapes:[shapeLabel->Shape]? ;
A Schema has a type: "Schema", prefixes: an optional map from PREFIX to IRI, valueExprDefns: an optional map from valueExprLabel to ValueExprDefn, startActs: an optional list of SemActs, start: an optional shapeLabel, and shapes: an optional map from shapeLabel to Shape.
{ "type":"Schema", "prefixes": { "foaf": "http://xmlns.com/foaf/0.1/", "dc": "http://purl.org/dc/elements/1.1/"}, "valueExprDefns": { "http://a.example/vs1": ValueExprDefn }, "startActs": [ SemAct* ], "start":"http://co.example/IssueShape", "shapes":{ "http://co.example/IssueShape": Shape } }
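As a rough illustration of the Schema production, the following Python sketch parses a ShExJ document with the standard json module and checks the root "type" along with the container types of the optional properties. The helper name check_schema_root and the sample document are assumptions made for this example; they are not defined by ShExJ.

```python
import json

# Property name -> expected JSON container type, per the Schema production.
# All of these properties are optional.
OPTIONAL_CONTAINERS = {
    "prefixes": dict,
    "valueExprDefns": dict,
    "startActs": list,
    "shapes": dict,
}

def check_schema_root(text):
    """Return a list of problems found in the root object (hypothetical helper)."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return ["root is not a JSON object"]
    problems = []
    if doc.get("type") != "Schema":
        problems.append('root object lacks "type":"Schema"')
    for name, kind in OPTIONAL_CONTAINERS.items():
        if name in doc and not isinstance(doc[name], kind):
            problems.append(f"{name} should be a JSON {kind.__name__}")
    if "start" in doc and not isinstance(doc["start"], str):
        problems.append("start should be a shapeLabel string")
    return problems

sample = """{ "type":"Schema",
  "prefixes": { "foaf": "http://xmlns.com/foaf/0.1/" },
  "start": "http://co.example/IssueShape",
  "shapes": { "http://co.example/IssueShape": { "type":"Shape" } } }"""
print(check_schema_root(sample))  # prints []
```

The check is deliberately shallow: it looks only at the root object and does not descend into Shape, SemAct or ValueExprDefn objects.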
Shape:virtual:BOOL? closed:BOOL? extra:[IRI]? expression:expr? inherit:[shapeLabel]? semActs:[SemAct]? ;
A Shape has a type: "Shape", virtual: an optional BOOL, closed: an optional BOOL, extra: an optional list of IRIs, expression: an optional expr, inherit: an optional list of shapeLabels, and semActs: an optional list of SemActs.
{ "type":"Shape", "virtual":true, "closed":false, "extra":["http://www.w3.org/2000/01/rdf-schema#label"], "expression":{ "type":"EachOf", "expressions":[expr], "max":"*" }, "inherit":[] }
SemAct:name:IRI code:STRING ;
A SemAct has a type: "SemAct", name: IRI, and code: STRING.
{ "type":"SemAct", "name":"http://a.example/extension1", "code":"val dt=o.datatype(); assert(dt.substr(20) == \"http://b.example/dt#\"); " }
expr=EachOf | OneOf | TripleConstraint | Inclusion ;
An expr is either:
EachOf
OneOf
TripleConstraint
Inclusion.
{ "type":"TripleConstraint", "predicate":"http://ex.example/#reportedBy", "valueExpr":{ "type": "ValueClass", "reference": "http://a.example/UserShape" }, "max":"*" }, { "type":"EachOf", "expressions":[expr], "max":"*" }
EachOf:expressions:[expr] min:INTEGER? max:(INTEGER | "*")? semActs:[SemAct]? annotations:[Annotation]? ;
An EachOf has a type: "EachOf", expressions: a list of exprs, min: an optional INTEGER, max: an optional (INTEGER | "*"), semActs: an optional list of SemActs, and annotations: an optional list of Annotations.
{ "type":"EachOf", "expressions":[expr], "max":"*" }
OneOf:expressions:[expr] min:INTEGER? max:(INTEGER | "*")? semActs:[SemAct]? annotations:[Annotation]? ;
A OneOf has a type: "OneOf", expressions: a list of exprs, min: an optional INTEGER, max: an optional (INTEGER | "*"), semActs: an optional list of SemActs, and annotations: an optional list of Annotations.
{ "type":"OneOf", "expressions":[expr], "min":0 }
Inclusion:include:shapeLabel ;
An Inclusion has a type: "Inclusion", include: shapeLabel.
{ "type":"Inclusion", "include":"http://co.example/IssueShape" }
TripleConstraint:inverse:BOOL? negated:BOOL? predicate:IRI valueExpr:valueExpr min:INTEGER? max:(INTEGER | "*")? semActs:[SemAct]? annotations:[Annotation]? ;
A TripleConstraint has a type: "TripleConstraint", inverse: an optional BOOL, negated: an optional BOOL, predicate: IRI, valueExpr: valueExpr, min: an optional INTEGER, max: an optional (INTEGER | "*"), semActs: an optional list of SemActs, and annotations: an optional list of Annotations.
{ "type":"TripleConstraint", "predicate":"http://ex.example/#reportedBy", "valueExpr":ValueClass, "min":1, "max":"*", "semActs":[ SemAct ] }
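The min and max properties shared by EachOf, OneOf and TripleConstraint can be read as bounds on a match count, where "*" means unbounded. The sketch below shows one way to test a count against them. It assumes that an absent min or max defaults to 1, which this document does not state, and the function name cardinality_ok is hypothetical.

```python
def cardinality_ok(count, expr):
    """Check a match count against the min/max properties of an expr object."""
    # Assumption: absent min and max both default to 1.
    lo = expr.get("min", 1)
    hi = expr.get("max", 1)
    if count < lo:
        return False
    # "*" is the only non-integer value the grammar allows for max.
    return hi == "*" or count <= hi

tc = {
    "type": "TripleConstraint",
    "predicate": "http://ex.example/#reportedBy",
    "valueExpr": {"type": "ValueClass"},
    "min": 1,
    "max": "*",
}
print(cardinality_ok(0, tc), cardinality_ok(5, tc))  # prints False True
```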
ValueExprDefn:valueExpr:valueExpr semActs:[SemAct]? annotations:[Annotation]? ;
A ValueExprDefn has a type: "ValueExprDefn", valueExpr: valueExpr, semActs: an optional list of SemActs, and annotations: an optional list of Annotations.
{ "type":"ValueExprDefn", "valueExpr": { "type":"ValueClass", "nodeKind":"iri", "reference":"http://co.example/UserShape", "pattern":"^http://a.example/users/" } }
valueClassOrRef=ValueClass | ValueRef ;
A valueClassOrRef is either:
ValueClass
ValueRef.
{ "type":"ValueClass", "reference":"http://co.example/IssueShape" }, { "type": "ValueRef", "valueExprRef": "http://co.example/valueDefn3" }
valueExpr=valueClassOrRef | ValueOr | ValueAnd ;
A valueExpr is either:
valueClassOrRef
ValueOr
ValueAnd.
{ "type": "ValueClass", "reference":"http://co.example/IssueShape" }, { "type": "ValueAnd", "valueExprs": [ { "type":"ValueClass", "reference":"http://co.example/IssueShape" }, { "type":"ValueClass", "reference":"http://co.example/RecordShape" } ] }
ValueClass:nodeKind:"literal" xsFacet*
|nodeKind:("iri" | "bnode" | "nonliteral") reference:shapeLabel? stringFacet*
|datatype:IRI xsFacet*
|reference:shapeLabel stringFacet*
|values:[valueSetValue]
|/* empty */ ;
A ValueClass has a type: "ValueClass", either:
nodeKind: "literal", and xsFacet*
nodeKind: ("iri" | "bnode" | "nonliteral"), reference: an optional shapeLabel, and stringFacet*
datatype: IRI, and xsFacet*
reference: shapeLabel, and stringFacet*
values: a list of valueSetValues
‣ no additional properties.
{ "type":"ValueClass", "nodeKind":"iri", "reference":"http://co.example/UserShape", "pattern":"^http://a.example/users/" }
ValueRef:valueExprRef:shapeLabel ;
A ValueRef has a type: "ValueRef", valueExprRef: shapeLabel.
{ "type": "ValueRef", "valueExprRef": "http://co.example/valueDefn3" }
ValueOr:valueExprs:[valueClassOrRef] ;
A ValueOr has a type: "ValueOr", valueExprs: a list of valueClassOrRefs.
{ "type": "ValueOr", "valueExprs": [ { "type":"ValueClass", "reference":"http://co.example/ResolvedIssueShape" }, { "type":"ValueClass", "reference":"http://co.example/RejectedIssueShape" } ] }
ValueAnd:valueExprs:[valueClassOrRef] ;
A ValueAnd has a type: "ValueAnd", valueExprs: a list of valueClassOrRefs.
{ "type": "ValueAnd", "valueExprs": [ { "type":"ValueClass", "reference":"http://co.example/IssueShape" }, { "type":"ValueClass", "reference":"http://co.example/RecordShape" } ] }
Annotation:predicate:IRI object:IRI ;
An Annotation has a type: "Annotation", predicate: IRI, and object: IRI.
xsFacet=stringFacet | numericFacet ;
An xsFacet is either:
stringFacet
numericFacet.
"length":20, "mininclusive":6.02e23
stringFacet=(length|minlength|maxlength):INTEGER | pattern:STRING ;
A stringFacet is either:
‣ one of (length, minlength, maxlength):INTEGER
pattern: STRING.
"length":20, "pattern":"^http://"
numericFacet=(mininclusive|minexclusive|maxinclusive|maxexclusive):numericLiteral
|(totaldigits|fractiondigits):INTEGER ;
A numericFacet is either:
‣ one of (mininclusive, minexclusive, maxinclusive, maxexclusive):numericLiteral
‣ one of (totaldigits, fractiondigits):INTEGER.
"mininclusive":3.14, "totaldigits":1024
valueExprLabel=IRI | BNODE ;
A valueExprLabel is either:
IRI
BNODE.
"http://a.example/vs1"
shapeLabel=IRI | BNODE ;
A shapeLabel is either:
IRI
BNODE.
"http://co.example/IssueShape"
numericLiteral=INTEGER | DECIMAL | DOUBLE ;
A numericLiteral is either:
INTEGER
DECIMAL
DOUBLE.
1, 2.2, 3.3e3
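The stringFacet and numericFacet productions above can be exercised with a small checker. The Python sketch below assumes that each facet carries the meaning of the XML Schema facet with the same name, and that pattern is an unanchored regular expression search; neither assumption is stated in this document. It ignores totaldigits and fractiondigits, and the function name check_facets is hypothetical.

```python
import re

# Range facets from the numericFacet production, as comparisons on a numericLiteral.
RANGE_CHECKS = {
    "mininclusive": lambda value, bound: value >= bound,
    "minexclusive": lambda value, bound: value > bound,
    "maxinclusive": lambda value, bound: value <= bound,
    "maxexclusive": lambda value, bound: value < bound,
}

def check_facets(lexical, facets):
    """Test a lexical form (a string) against a map of facet properties."""
    if "length" in facets and len(lexical) != facets["length"]:
        return False
    if "minlength" in facets and len(lexical) < facets["minlength"]:
        return False
    if "maxlength" in facets and len(lexical) > facets["maxlength"]:
        return False
    # Assumption: pattern matches anywhere in the string unless it is anchored.
    if "pattern" in facets and re.search(facets["pattern"], lexical) is None:
        return False
    bounds = {name: facets[name] for name in RANGE_CHECKS if name in facets}
    if bounds:
        try:
            value = float(lexical)
        except ValueError:
            return False  # a range facet cannot hold for a non-numeric form
        return all(RANGE_CHECKS[name](value, bound) for name, bound in bounds.items())
    return True

print(check_facets("http://a.example/users/bob", {"pattern": "^http://"}))  # prints True
print(check_facets("3.14", {"minexclusive": 3.14}))  # prints False
```

Converting to float keeps the sketch short; an implementation that must honor DECIMAL precision would likely use a decimal type instead.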
valueSetValue=IRI
|STRING
|DATATYPE_STRING
|LANG_STRING
|StemRange ;
A valueSetValue is either:
IRI
STRING
DATATYPE_STRING
LANG_STRING
StemRange.
"http://a.example/assigned", "\"repair\"", { "type":"StemRange", "stem":"http://a.example/" }
StemRange:stem:IRI exclusions:[StemRange] ;
A StemRange has a type: "StemRange", stem: IRI, and exclusions: a list of StemRanges.
{ "type":"StemRange", "stem":"http://a.example/", "exclusions":[ "http://a.example/a", { "type":"StemRange", "stem":"http://a.example/b" } ] }
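One plausible reading of StemRange is that an IRI matches when it begins with the stem and is not matched by any exclusion. The sketch below implements that reading as an assumption, since this document gives only the structure and not the matching rule. It treats a plain string as an exact IRI match, and the function name in_stem_range is hypothetical.

```python
def in_stem_range(iri, value):
    """Return True if iri is matched by an IRI string or a StemRange object."""
    if isinstance(value, str):
        return iri == value  # a plain IRI matches only itself
    # Assumption: a stem matches by simple string prefix.
    if not iri.startswith(value["stem"]):
        return False
    # Assumption: any matching exclusion removes the IRI from the range.
    return not any(in_stem_range(iri, ex) for ex in value.get("exclusions", []))

rng = {
    "type": "StemRange",
    "stem": "http://a.example/",
    "exclusions": [
        "http://a.example/a",
        {"type": "StemRange", "stem": "http://a.example/b"},
    ],
}
print(in_stem_range("http://a.example/c", rng))    # prints True
print(in_stem_range("http://a.example/bcd", rng))  # prints False
```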
Terminals
PREFIX=^.*$
"foaf", "dc", the empty string ""
IRI=^[^_].*$
"http://co.example/IssueShape", "http://xmlns.com/foaf/0.1/"
BNODE=^_.*$
"_:IssueShape", "_:b1"
BOOL=^(true|false)$
true, false
INTEGER=^.*$
1, 20
DECIMAL=^.*$
2.2
DOUBLE=^.*$
3.3e3
STRING=^.*$
"\"abc\"", "\"Bob said \"The moon is made of green cheese.\"\""
DATATYPE_STRING=^.*$
"\"2015-11-15\"^^<http://www.w3.org/2001/XMLSchema#date>", "\"255\"^^xsd:byte", "\"MCMLXVII\"^^my:romanNumeral"
LANG_STRING=^.*$
"\"chat\"@fr", "\"Cette Série des Années Septante\"@fr-BE"

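The IRI and BNODE terminals are distinguished only by whether the string begins with an underscore. As an illustration, the Python sketch below applies the two patterns from the terminal list to classify a shapeLabel or valueExprLabel. The function name label_kind is an assumption made for this example.

```python
import re

# Patterns copied from the terminal definitions above.
IRI = re.compile(r"^[^_].*$")
BNODE = re.compile(r"^_.*$")

def label_kind(label):
    """Classify a label string as "IRI" or "BNODE"; return None if neither matches."""
    if BNODE.match(label):
        return "BNODE"
    if IRI.match(label):
        return "IRI"
    return None  # for example, the empty string matches neither pattern

print(label_kind("http://co.example/IssueShape"))  # prints IRI
print(label_kind("_:b1"))                          # prints BNODE
```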
ShExJ Schema Graphical Representation (Informative)

Following is a graphical representation of the ShExJ structure. The interconnections in the ShExJ schema are between these structures:

[Figure: graphical representation of the ShExJ schema structures]

ShExJ Validation Results Grammar

status: experimental

The validation results grammar uses some objects from the schema grammar. ShExJ Validation Results documents start with the validation production. The root object will therefore have a "type":"validation" property.

validation       : node:IRI shape:IRI solutions:[soln]
A validation object has: type:"validation", node:IRI, shape:IRI, and solutions: an array of soln.
{
  "type": "validation",
  "node": "http://a.example/Issue1",
  "shape": "http://co.example/IssueShape",
  "solutions": [ soln* ]
}


soln             = groupSolutions
                 | someOfSolutions
                 | tripleConstraintSolutions
A soln is either a groupSolutions, a someOfSolutions or a tripleConstraintSolutions object.
{ "type":"groupSolutions", "solutions":[groupSolution], "max":"*" }


groupSolutions   : solutions:[groupSolution] min:INTEGER? max:(INTEGER|"*")? semActs:[SemAct]?
A groupSolutions object has: type:"groupSolutions", solutions: an array of groupSolution, (optional) min:INTEGER, (optional) max:(INTEGER|"*") and (optional) semActs: an array of SemAct.
{
  "type": "groupSolutions",
  "solutions": [ groupSolution* ],
  "min":1,
  "max":"*",
  "semActs":[ SemAct ]
}


groupSolution    : expressions:[soln]
A groupSolution object has: type:"groupSolution", expressions: an array of soln.
{
  "type": "groupSolution",
  "expressions": [ soln* ]
}


someOfSolutions  : solutions:[someOfSolution] min:INTEGER? max:(INTEGER|"*")? semActs:[SemAct]?
A someOfSolutions object has: type:"someOfSolutions", solutions: an array of someOfSolution, (optional) min:INTEGER, (optional) max:(INTEGER|"*") and (optional) semActs: an array of SemAct.
{
  "type": "someOfSolutions",
  "solutions": [ someOfSolution* ],
  "min":1,
  "max":"*",
  "semActs":[ SemAct ]
}


someOfSolution   : expressions:[soln]
A someOfSolution object has: type:"someOfSolution", expressions: an array of soln.
{
  "type": "someOfSolution",
  "expressions": [ soln* ]
}


tripleConstraintSolutions : predicate:IRI value:ValueClass solutions:[testedTriple] min:INTEGER? max:(INTEGER|"*")? semActs:[SemAct]?
A tripleConstraintSolutions object has: type:"tripleConstraintSolutions", predicate:IRI, value:ValueClass, solutions: an array of testedTriple, (optional) min:INTEGER, (optional) max:(INTEGER|"*") and (optional) semActs: an array of SemAct.
{
  "type": "tripleConstraintSolutions",
  "predicate": "http://ex.example/#reportedBy",
  "value": ValueClass,
  "solutions": [ testedTriple* ],
  "min":1,
  "max":"*",
  "semActs":[ SemAct ]
}


testedTriple     : subject:(IRI|BNODE) predicate:IRI object:(IRI|STRING|BNODE|DATATYPE_STRING|LANG_STRING) referenced:validation?
A testedTriple object has: type:"testedTriple", subject:(IRI or BNODE), predicate:IRI, object:(IRI, STRING, BNODE, DATATYPE_STRING or LANG_STRING), and (optional) referenced:validation.
{
  "type": "testedTriple",
  "subject": "http://inst.example/#Issue1",
  "predicate": "http://ex.example/#reportedBy",
  "object": "http://inst.example/#User3",
  "referenced":{
    "type": "validation",
    "node": "http://inst.example/#User3",
    "shape": "http://co.example/UserShape",
    "solutions": [ soln* ]
  }
}
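Because validation results nest through "solutions", "expressions" and "referenced", a consumer typically walks them recursively. The Python sketch below collects every tested triple from a result, using the "type" values given in the prose descriptions of this grammar. The function name tested_triples and the sample result are assumptions made for this example.

```python
def tested_triples(obj):
    """Yield (subject, predicate, object) for every testedTriple under obj."""
    if obj.get("type") == "testedTriple":
        yield (obj["subject"], obj["predicate"], obj["object"])
        # A referenced validation holds the results for the triple's object node.
        if "referenced" in obj:
            yield from tested_triples(obj["referenced"])
        return
    # All other result objects nest children under "solutions" or "expressions".
    for key in ("solutions", "expressions"):
        for child in obj.get(key, []):
            yield from tested_triples(child)

result = {
    "type": "validation",
    "node": "http://inst.example/#Issue1",
    "shape": "http://co.example/IssueShape",
    "solutions": [{
        "type": "tripleConstraintSolutions",
        "predicate": "http://ex.example/#reportedBy",
        "value": {"type": "ValueClass"},
        "solutions": [{
            "type": "testedTriple",
            "subject": "http://inst.example/#Issue1",
            "predicate": "http://ex.example/#reportedBy",
            "object": "http://inst.example/#User3",
            "referenced": {
                "type": "validation",
                "node": "http://inst.example/#User3",
                "shape": "http://co.example/UserShape",
                "solutions": [],
            },
        }],
    }],
}
for triple in tested_triples(result):
    print(triple)
```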

Internet Media Type, File Extension and Macintosh File Type for ShExJ Schemas

Contact:
Eric Prud'hommeaux
See also:
How to Register a Media Type for a W3C Specification
Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

The Internet Media Type / MIME Type for ShExJ Schemas is "application/json+shexschema".

It is recommended that ShExJ Schemas files have the extension ".shes" (all lowercase) on all platforms.

It is recommended that ShExJ Schemas files stored on Macintosh HFS file systems be given a file type of "TEXT".

The information that follows will be submitted to the IESG for review, approval, and registration with IANA.

Type name:
application
Subtype name:
json+shexschema
Required parameters:
None
Optional parameters:
charset — this parameter is required when transferring non-ASCII data. If present, the value of charset is always UTF-8.
Encoding considerations:
The encoding of ShExJ Schemas is prescribed by [[!json]] with the further restriction that the encoding is always UTF-8 [[!UTF-8]].
Security considerations:
ShExJ Schemas express rules to be applied to RDF triples visited during ShEx validation. This can include semantic actions which some party will consume and interpret.
ShExJ Schemas are applied to application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (e.g. PGP encryption, MD5 sum validation, password-protected compression) may also be used on ShExJ Schemas documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
ShExJ Schemas use IRIs as term identifiers. Applications interpreting data expressed in ShExJ Schemas should address the security issues of Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Any person or application that is writing or interpreting data in ShExJ Schemas must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching of similar characters can be found in Unicode Security Considerations [UNICODE-SECURITY] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
Interoperability considerations:
There are no known interoperability issues.
Published specification:
This specification.
Applications which use this media type:
No widely deployed applications are known to use this media type. It may be used by some web services and clients consuming their data.
Additional information:
Magic number(s):
ShExJ Schemas start with some form of '{"type":"Schema"...' with arbitrary whitespace between the tokens and quoted strings.
File extension(s):
".shes"
Base URI:
The ShExJ Schemas use absolute URIs.
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
Eric Prud'hommeaux <eric@w3.org>

Internet Media Type, File Extension and Macintosh File Type for ShExJ Validation Results

Contact:
Eric Prud'hommeaux
See also:
How to Register a Media Type for a W3C Specification
Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

The Internet Media Type / MIME Type for ShExJ Validation Results is "application/json+shexval".

It is recommended that ShExJ Validation Results files have the extension ".shev" (all lowercase) on all platforms.

It is recommended that ShExJ Validation Results files stored on Macintosh HFS file systems be given a file type of "TEXT".
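A server or client that handles both kinds of ShExJ document might select the media type from the recommended file extension. The following Python sketch shows one such lookup, pairing the result with the UTF-8 charset parameter. The function names shexj_media_type and content_type_header are hypothetical.

```python
import os

# Recommended extensions and media types from this document.
MEDIA_TYPES = {
    ".shes": "application/json+shexschema",
    ".shev": "application/json+shexval",
}

def shexj_media_type(filename):
    """Return the ShExJ media type for a file name, or None if unrecognized."""
    # The extensions are recommended to be all lowercase, so match exactly.
    extension = os.path.splitext(filename)[1]
    return MEDIA_TYPES.get(extension)

def content_type_header(filename):
    """Build a Content-Type value with the UTF-8 charset, or None if unrecognized."""
    media_type = shexj_media_type(filename)
    return None if media_type is None else f"{media_type}; charset=UTF-8"

print(content_type_header("issues.shes"))
print(content_type_header("issue1-result.shev"))
```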

The information that follows will be submitted to the IESG for review, approval, and registration with IANA.

Type name:
application
Subtype name:
json+shexval
Required parameters:
None
Optional parameters:
charset — this parameter is required when transferring non-ASCII data. If present, the value of charset is always UTF-8.
Encoding considerations:
The encoding of ShExJ Validation Results is prescribed by [[!json]] with the further restriction that the encoding is always UTF-8 [[!UTF-8]].
Security considerations:
ShExJ Validation Results express the ShEx rules and RDF triples visited during ShEx validation. This can include semantic actions which some party will consume and interpret.
ShExJ Validation Results include arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (e.g. PGP encryption, MD5 sum validation, password-protected compression) may also be used on ShExJ Validation Results documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
ShExJ Validation Results use IRIs as term identifiers. Applications interpreting data expressed in ShExJ Validation Results should address the security issues of Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Any person or application that is writing or interpreting data in ShExJ Validation Results must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching of similar characters can be found in Unicode Security Considerations [UNICODE-SECURITY] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
Interoperability considerations:
There are no known interoperability issues.
Published specification:
This specification.
Applications which use this media type:
No widely deployed applications are known to use this media type. It may be used by some web services and clients consuming their data.
Additional information:
Magic number(s):
ShExJ Validation Results start with some form of '{"type":"validation"...' with arbitrary whitespace between the tokens and quoted strings.
File extension(s):
".shev"
Base URI:
The ShExJ Validation Results use absolute URIs.
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
Eric Prud'hommeaux <eric@w3.org>