The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.

This document defines the ShEx language. See the Shape Expressions Primer for a non-normative description of ShEx.

This document has been developed by the Shape Expressions Community Group. The specification has undergone significant development since the 1.0 version.

This version of the document represents a Candidate Release, with stable features. Comments and implementations are solicited prior to an eventual Final 2.0 Release.

The Shape Expressions language is expected to remain stable with the exception of:

Introduction

The Shape Expressions (ShEx) language provides a structural schema for RDF data. This can be used to document APIs or datasets, aid in development of API-conformant messages, minimize defensive programming, guide user interfaces, or anything else that involves a machine-readable description of data organization and typing requirements.

ShEx describes RDF graph structures as sets of potentially connected Shapes. These constrain the triples involving nodes in an RDF graph. Node Constraints constrain RDF nodes by constraining their node kind (IRI, blank node or Literal), enumerating permissible values in value sets, specifying their datatype, and constraining value ranges of Literals. Additionally, they constrain lexical forms of Literals, IRIs and labeled blank nodes. Shape Expressions schemas share blank nodes with the constrained RDF graphs in the same way that graphs in RDF datasets [[!rdf11-concepts]] share blank nodes.

ShEx can be represented in JSON structures (ShExJ) or a compact syntax (ShExC). The compact syntax is intended for human consumption; the JSON structure for machine processing. This document defines ShEx in terms of ShExJ and includes a section on the ShEx Compact Syntax (ShExC).

To understand the programmatic interface described in this specification and how it is intended to operate in a programming environment, it is useful to have working knowledge of WebIDL [[WebIDL]]. To understand how ShEx relates to RDF, it is helpful to be familiar with the basic RDF concepts [[RDF11-CONCEPTS]].

Notation

The JSON [[!rfc4627]] Syntax serves as a serializable proxy for an abstract syntax. This specification uses a simple grammar to describe the set of JSON documents that can be interpreted as a ShEx schema.

Schemas with the marker "𝙅/c" in the corner can be converted between the JSON representation and the compact syntax by clicking the marker. Pressing "j" or "c" converts all such examples to the JSON or compact form.

The following examples are excerpts from the definitions below. In the JSON notation,

Schema { startActs:[SemAct]? start:shapeExpr? shapes:[shapeExpr+]? }

signifies that a Schema has three optional components called startActs, start and shapes:

shapeExpr = ShapeOr | ShapeAnd | ShapeNot | NodeConstraint | Shape | ShapeExternal | shapeExprRef ;

signifies that a shapeExpr is one of seven object types: ShapeOr | ShapeAnd | ….

NodeConstraint { nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? xsFacet* }
xsFacet = stringFacet | numericFacet ;

signifies that a NodeConstraint has a nodeKind of one of the four literals followed by any number of xsFacet and an xsFacet is either a stringFacet or a numericFacet.

References

ShExJ is a dialect of JSON-LD [[!JSON-LD]] and the member id is used as a node identifier. An object may be represented inline or referenced by its id which may be either a blank node or an IRI.

The JSON structure may include references to shapes and triples:

An object with a circular reference must be referenced by id. This example uses a nested shape reference on a value expression (defined below).

{ "type": "Schema", "shapes": [
    { "id": "http://schema.example/IssueShape",
      "type": "Shape", "expression": {
        "type": "TripleConstraint", "predicate": "http://schema.example/related",
        "valueExpr": "http://schema.example/IssueShape", "min": 0 } } ] }

Not captured in this JSON syntax definition is the rule that every shapeExpr referenced by a schema's shapes must have an id and no other shapeExpr may have an id. The JSON syntax definition simplifies this by adding id:shapeExprLabel? to every shapeExpr.
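
The id rule above can be checked mechanically. The following is a minimal sketch in Python, assuming the ShExJ document has already been parsed into plain dicts and lists. The function names (value_exprs, nested_shape_exprs, check_ids) are hypothetical and not part of ShEx. The sketch only looks at ids on shape expressions and deliberately ignores ids on triple expressions.

```python
def value_exprs(te):
    """Yield the valueExpr of every TripleConstraint inside triple expression te."""
    if not isinstance(te, dict):          # absent expression or a tripleExprRef
        return
    if te.get("type") == "TripleConstraint":
        if "valueExpr" in te:
            yield te["valueExpr"]
    else:                                 # EachOf / OneOf
        for sub in te.get("expressions", []):
            yield from value_exprs(sub)


def nested_shape_exprs(se):
    """Yield every shape expression nested inside se (but not se itself)."""
    if not isinstance(se, dict):          # a shapeExprRef is just a label
        return
    kind = se.get("type")
    if kind in ("ShapeAnd", "ShapeOr"):
        children = se.get("shapeExprs", [])
    elif kind == "ShapeNot":
        children = [se["shapeExpr"]]
    elif kind == "Shape":
        children = list(value_exprs(se.get("expression")))
    else:
        children = []
    for child in children:
        yield child
        yield from nested_shape_exprs(child)


def check_ids(schema):
    """Return a list of violations of the id rule (empty if there are none)."""
    errors = []
    for se in schema.get("shapes", []):
        if "id" not in se:
            errors.append("top-level shape expression without id")
        for inner in nested_shape_exprs(se):
            if isinstance(inner, dict) and "id" in inner:
                errors.append("nested shape expression with id " + inner["id"])
    return errors


# The circular-reference example above: the nested value expression is a reference.
issue_schema = {"type": "Schema", "shapes": [
    {"id": "http://schema.example/IssueShape",
     "type": "Shape", "expression": {
         "type": "TripleConstraint",
         "predicate": "http://schema.example/related",
         "valueExpr": "http://schema.example/IssueShape", "min": 0}}]}
print(check_ids(issue_schema))
```

References (plain label strings) are skipped because they carry no id of their own.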

Document style

JSON examples are rendered in a .json CSS style. Partial examples include ranges in a .comment CSS style to indicate text which would be substituted in a complete example. For example { "type": "ShapeAnd", "shapeExprs": [ SE1, … ] } indicates that both SE1 and … would be substituted in a complete example.

Graph access

The validation process defined in this document relies on matching triple patterns in the form (subject, predicate, object) where each position may be supplied by a constant, a previously defined term, or the underscore "_", which represents a previously undefined element or wildcard. This corresponds to a SPARQL Triple Pattern where each "_" is replaced by a unique blank node. Matching such a triple pattern against a graph is defined by SPARQL Basic Graph Pattern Matching (BGP) with a BGP containing only that triple pattern.
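
As an illustration of this kind of triple pattern, here is a minimal Python sketch. It assumes a graph held as a list of (subject, predicate, object) tuples of strings; the name match_pattern and the sample data are hypothetical. Each "_" acts as an independent wildcard, mirroring the replacement of each "_" by a unique blank node.

```python
WILDCARD = "_"


def match_pattern(graph, pattern):
    """Return the triples in graph matching a (subject, predicate, object) pattern.

    Each position of the pattern is either a constant term or the wildcard "_".
    """
    return [triple for triple in graph
            if all(p == WILDCARD or p == term for p, term in zip(pattern, triple))]


# Hypothetical sample graph, terms written as plain strings.
sample = [
    ("inst:Issue1", "ex:state", "ex:unassigned"),
    ("inst:Issue1", "ex:reportedBy", "_:User2"),
    ("_:User2", "foaf:name", '"Bob Smith"'),
]

print(match_pattern(sample, ("inst:Issue1", "_", "_")))
```

Note that a blank node label such as "_:User2" is an ordinary constant here; only the bare underscore is a wildcard.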

Results

ShEx does not prescribe a report format. For illustration purposes in this specification, results of validation are represented in a table associating node/shape pairs with a pass or fail and a reason for failure:

shape     node     result  reason
<Shape1>  <node1>  pass
<Shape1>  <node2>  fail    no ex:state supplied.

Terminology

Shape expressions are defined using terms from RDF semantics [[!rdf11-mt]]:

This specification makes use of the following namespaces:

foaf:
http://xmlns.com/foaf/0.1/
rdf:
http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:
http://www.w3.org/2000/01/rdf-schema#
shex:
http://www.w3.org/ns/shex#
xsd:
http://www.w3.org/2001/XMLSchema#

For the purposes of the ShExProcessor interface, a promise is an object that represents the eventual result of a single asynchronous operation. Promises are defined in [[ECMASCRIPT-6.0]].

The following functions access the elements of an RDF graph G containing a node n:

Consider the RDF graph G represented in Turtle:

PREFIX ex: <http://schema.example/>
PREFIX inst: <http://inst.example/#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

inst:Issue1 
    ex:state      ex:unassigned ;
    ex:reportedBy _:User2 .

_:User2
    foaf:name     "Bob Smith" ;
    foaf:mbox     <mailto:bob@example.org> .
        

There are two arcs out of _:User2; arcsOut(G, _:User2):

    _:User2  foaf:name  "Bob Smith" .
    _:User2  foaf:mbox  <mailto:bob@example.org> .
  

There is one arc into _:User2; arcsIn(G, _:User2):

    inst:Issue1 ex:reportedBy  _:User2 .
  

There are three arcs in the neighbourhood of _:User2, neigh(G, _:User2):

    _:User2  foaf:name  "Bob Smith" .
    _:User2  foaf:mbox  <mailto:bob@example.org> .
    inst:Issue1 ex:reportedBy  _:User2 .
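
The three listings above can be reproduced with a short Python sketch. It reads arcsOut as the triples with n as subject, arcsIn as the triples with n as object, and neigh as their union, which is what the listings show. The graph is modelled as a set of string tuples and the function names are hypothetical.

```python
G = {
    ("inst:Issue1", "ex:state", "ex:unassigned"),
    ("inst:Issue1", "ex:reportedBy", "_:User2"),
    ("_:User2", "foaf:name", '"Bob Smith"'),
    ("_:User2", "foaf:mbox", "<mailto:bob@example.org>"),
}


def arcs_out(graph, n):
    """Triples in graph whose subject is n."""
    return {t for t in graph if t[0] == n}


def arcs_in(graph, n):
    """Triples in graph whose object is n."""
    return {t for t in graph if t[2] == n}


def neigh(graph, n):
    """Neighbourhood of n: the union of its outgoing and incoming arcs."""
    return arcs_out(graph, n) | arcs_in(graph, n)


print(len(arcs_out(G, "_:User2")), len(arcs_in(G, "_:User2")), len(neigh(G, "_:User2")))
```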
  

Conformance criteria are relevant to authors and authoring tool implementers. As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The Shape Expressions Language

A Shape Expressions (ShEx) schema is a collection of labeled Shapes and Node Constraints. These can be used to describe or test nodes in RDF graphs. ShEx does not prescribe a language for associating nodes with shapes but several approaches are described in the ShEx Primer.

Shapes Schema

A shapes schema is captured in a Schema object:

Schema { startActs:[SemAct]? start:shapeExpr? shapes:[shapeExpr+]? }

where shapes is a mapping from shape label to shape expression.

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape", … },
  { "id": "_:UserShape", … },
  { "id": "http://schema.example/EmployeeShape", … } ] }
                                                    
    ex:IssueShape { … }
    _:UserShape { … }
    ex:EmployeeShape { … }

ShapeMap Definition

A ShapeMap is a mapping from node to a set of shapes (Node1: [Shape2, Shape3, …]) where each node is an RDF node and each shape is a shape label from a schema. A node-label association in a ShapeMap indicates that the node Node1 satisfies the constraints given by the definitions of the shape expressions labeled Shape2 and Shape3. We use the JSON list notation [Shape2, Shape3, …] for sets, with the additional understanding that neither order nor repetition in the list has any semantic impact.

Consider a shapes schema with shape labels "http://schema.example/IssueShape", "_:UserShape" and "http://schema.example/EmployeeShape":

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape", … },
  { "id": "_:UserShape", … },
  { "id": "http://schema.example/EmployeeShape", … } ] }
                                                        
    ex:IssueShape { … }
    _:UserShape { … }
    ex:EmployeeShape { … }

An example ShapeMap is:

{ "http://inst.example/#Issue1": ["http://schema.example/IssueShape"],
  "_:User2": ["_:UserShape", "http://schema.example/EmployeeShape"] }

isValid: The expression isValid(G, m) indicates that for every node/shape pair (n, s) in m, s has a corresponding shape expression se and satisfies(n, se, G, m). satisfies is defined below for each form of shape expression.
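
The definition of isValid can be read as a simple loop over the ShapeMap. The sketch below is illustrative only: it assumes the ShapeMap is a dict from node to a list of shape labels, that the schema's shapes are available as a dict from label to shape expression, and that a satisfies function is supplied by the caller. The names is_valid and shapes are hypothetical.

```python
def is_valid(graph, shape_map, shapes, satisfies):
    """Check every node/shape pair in shape_map.

    shape_map: dict mapping a node to a list of shape labels
    shapes:    dict mapping a shape label to its shape expression
    satisfies: callable (n, se, graph, shape_map) -> bool, supplied by the caller
    """
    for node, labels in shape_map.items():
        for label in labels:
            se = shapes.get(label)
            # The label must have a shape expression, and the node must satisfy it.
            if se is None or not satisfies(node, se, graph, shape_map):
                return False
    return True
```

Because repetition and order in the label lists carry no meaning, a repeated label is simply checked twice with the same outcome.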

Validation of a graph G does not necessarily require the construction of a ShapeMap with every node in G. The validation semantics defined below imply a set of dependencies for validating some node as a shape. A validation process that tests every node in a graph against every shape in a schema can be considered to have an input ShapeMap associating each node with the set of all shapes in the schema.

The semantics of ShEx are independent from the construction of an input ShapeMap. https://www.w3.org/2001/sw/wiki/ShEx/ShapeMap has a list of popular methods for constructing ShapeMaps.

Validation of a node against a shape in a schema requires that there is a consistent mapping from node to shape for every node and shape visited during a validation process. The remainder of this document uses m to indicate such a mapping.

Shape Expressions

A shape expression is composed of four kinds of objects combined with the algebraic operators And, Or and Not:

JSON Syntax

shapeExpr = ShapeOr | ShapeAnd | ShapeNot | NodeConstraint | Shape | ShapeExternal | shapeExprRef ;
ShapeOr { id:shapeExprLabel? shapeExprs:[shapeExpr] }
ShapeAnd { id:shapeExprLabel? shapeExprs:[shapeExpr] }
ShapeNot { id:shapeExprLabel? shapeExpr:shapeExpr }
ShapeExternal { id:shapeExprLabel? }
shapeExprRef = shapeExprLabel ;
shapeExprLabel = IRI | BNODE ;

Examples of shape expressions:

{ "type": "Shape", … }
{ … }
 
{ "type": "ShapeAnd", "shapeExprs": [
  { "type": "NodeConstraint", "nodeKind": "iri" },
  { "type": "ShapeOr", "shapeExprs": [
    "http://schema.example/IssueShape",
    { "type": "ShapeNot", "shapeExpr": { "type": "Shape",  } }
    ] } ] }
                                                    
  IRI AND
  (
    @<http://schema.example/IssueShape>
    OR NOT {  }
  )
In this ShapeOr's shapeExprs, "http://schema.example/IssueShape" is a reference to the shape expression with the id "http://schema.example/IssueShape".

Semantics

satisfies: The expression satisfies(n, se, G, m) indicates that a node n and graph G satisfy a shape expression se with shapeMap m.
notSatisfies: Conversely, notSatisfies(n, se, G, m) indicates that n and G do not satisfy se with the given shapeMap m. notSatisfies(n, se, G, _) indicates that no shapeMap would allow n and G to satisfy se.

satisfies(n, se, G, m) is true if and only if:

Given the three shape expressions SE1, SE2, SE3 in a Schema Sch, such that:

  • satisfies(n, SE1, G, m)
  • satisfies(n, SE2, G, m)
  • notSatisfies(n, SE3, G, m)

the following hold:

  • satisfies(
    n,
    { "type": "ShapeAnd", "shapeExprs": [ SE1, SE2 ] }
    ,
    G, m)
  • satisfies(
    n,
    { "type": "ShapeOr", "shapeExprs": [ SE1, SE2, SE3 ] }
    ,
    G, m)
  • notSatisfies(
    n,
    { "type": "ShapeNot", "shapeExpr":
        { "type": "ShapeOr", "shapeExprs": [
            SE1,
            { "type": "ShapeAnd", "shapeExprs": [ SE2, SE3 ] }
          ] }
      }
    ,
    G, m)

If Sch's shapes maps "http://schema.example/shape1" to SE1 then the following holds:

  • satisfies(
    n,
    "http://schema.example/shape1"
    ,
    G, m)
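
The behaviour of ShapeAnd, ShapeOr, ShapeNot and references in the examples above can be sketched in Python. This is not a complete implementation: NodeConstraint and Shape are delegated to a caller-supplied leaf function, the schema's shapes are assumed to be a dict from label to shape expression, and circular references are not handled. The extra parameters shapes and leaf are hypothetical additions to the four-argument satisfies of this document.

```python
def satisfies(n, se, graph, m, shapes, leaf):
    """Evaluate the algebraic operators; defer other expressions to leaf()."""
    if isinstance(se, str):                      # shapeExprRef: look up by label
        return satisfies(n, shapes[se], graph, m, shapes, leaf)
    kind = se["type"]
    if kind == "ShapeAnd":                       # every sub-expression must hold
        return all(satisfies(n, sub, graph, m, shapes, leaf)
                   for sub in se["shapeExprs"])
    if kind == "ShapeOr":                        # at least one sub-expression holds
        return any(satisfies(n, sub, graph, m, shapes, leaf)
                   for sub in se["shapeExprs"])
    if kind == "ShapeNot":                       # the sub-expression must not hold
        return not satisfies(n, se["shapeExpr"], graph, m, shapes, leaf)
    return leaf(n, se, graph, m)                 # NodeConstraint, Shape, ...


# Stand-ins for SE1, SE2, SE3: the stub leaf treats n as an IRI, so SE3 fails.
SE1 = {"type": "NodeConstraint", "nodeKind": "iri"}
SE2 = {"type": "NodeConstraint", "nodeKind": "nonliteral"}
SE3 = {"type": "NodeConstraint", "nodeKind": "literal"}


def leaf(n, se, graph, m):
    return se["nodeKind"] in ("iri", "nonliteral")


shapes = {"http://schema.example/shape1": SE1}
n = "http://inst.example/#Issue1"
print(satisfies(n, {"type": "ShapeAnd", "shapeExprs": [SE1, SE2]}, None, {}, shapes, leaf))
```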

Node Constraints

NodeConstraint { id:shapeExprLabel? nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRI? xsFacet* values:[valueSetValue]? }
xsFacet = stringFacet | numericFacet ;
stringFacet = ("length" | "minlength" | "maxlength"):INTEGER | pattern:STRING flags:STRING? ;
numericFacet = ("mininclusive" | "minexclusive" | "maxinclusive" | "maxexclusive"):numericLiteral
             | ("totaldigits" | "fractiondigits"):INTEGER ;
numericLiteral = INTEGER | DECIMAL | DOUBLE ;
valueSetValue = objectValue | IriStem | IriStemRange | LiteralStem | LiteralStemRange | LanguageStem | LanguageStemRange ;
objectValue = IRI | ObjectLiteral ;
IriStem { stem:IRI }
IriStemRange { stem:(IRI | Wildcard) exclusions:[objectValue|IriStem +]? }
LiteralStem { stem:ObjectLiteral }
LiteralStemRange { stem:(ObjectLiteral | Wildcard) exclusions:[objectValue|LiteralStem +]? }
LanguageStem { stem:ObjectLiteral }
LanguageStemRange { stem:(ObjectLiteral | Wildcard) exclusions:[objectValue|LanguageStem +]? }
Wildcard { /* empty */ }

Semantics

For a node n and constraint nc, satisfies2(n, nc) if and only if, for every nodeKind, datatype, xsFacet and values constraint value v present in nc, nodeSatisfies(n, v). The following sections define nodeSatisfies for each of these types of constraints:

Node Kind Constraints

For a node n and constraint value v, nodeSatisfies(n, v) if:

Node Kind example 1

The following examples use a TripleConstraint object described later in the document.

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint", "predicate": "http://schema.example/state",
      "valueExpr": { "type": "NodeConstraint", "nodeKind": "iri" } } } ] }
                                                     
    ex:IssueShape {                                                          
      ex:state IRI
    }
              
<issue1> ex:state ex:HunkyDory .
<issue2> ex:taste ex:GoodEnough .
<issue3> ex:state "just fine" .

shape         node      result  reason
<IssueShape>  <issue1>  pass
<IssueShape>  <issue2>  fail    expected 1 ex:state property.
<IssueShape>  <issue3>  fail    ex:state expected to be an IRI, literal found.

Note that <issue2> fails not because of a nodeKind violation but instead because of a Cardinality violation described below.
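
A node kind check can be sketched as follows, assuming the usual reading in which "iri", "bnode" and "literal" name the three kinds of RDF node and "nonliteral" accepts either an IRI or a blank node. The node representation (a tuple whose first element is the kind) and the function name are hypothetical.

```python
def node_kind_satisfies(node, kind):
    """node is a tuple whose first element is "iri", "bnode" or "literal"."""
    actual = node[0]
    if kind == "nonliteral":
        return actual in ("iri", "bnode")
    return actual == kind


# Objects of ex:state for <issue1> and <issue3> in the example above.
hunky_dory = ("iri", "http://schema.example/HunkyDory")
just_fine = ("literal", "just fine")
print(node_kind_satisfies(hunky_dory, "iri"), node_kind_satisfies(just_fine, "iri"))
```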

Datatype Constraints

For a node n and constraint value v, nodeSatisfies(n, v) if n is a Literal with the datatype v and, if v is in the set of SPARQL operand data types[[!sparql11-query]], an XML schema string with a value of the lexical form of n can be cast to the target type v per XPath Functions 3.1 section 19 Casting[[!xpath-functions]]. Only datatypes supported by SPARQL MUST be tested but ShEx extensions MAY add support for other datatypes.

Datatype example 1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint", "predicate": "http://schema.example/submittedOn",
      "valueExpr": {
        "type": "NodeConstraint",
        "datatype": "http://www.w3.org/2001/XMLSchema#date"
      } } } ] }
                                                                                   
    ex:IssueShape {                                                          
      ex:submittedOn xsd:date
    }

    
    
              
<issue1> ex:submittedOn "2016-07-08"^^xsd:date .
<issue2> ex:submittedOn "2016-07-08T01:23:45Z"^^xsd:dateTime .
<issue3> ex:submittedOn "2016-07"^^xsd:date .

shape         node      result  reason
<IssueShape>  <issue1>  pass
<IssueShape>  <issue2>  fail    ex:submittedOn expected to be an xsd:date, xsd:dateTime found.
<IssueShape>  <issue3>  fail    2016-07 is not a valid xsd:date.
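
The two failure modes in this example (wrong datatype, and right datatype with an invalid lexical form) can be sketched in Python. This is a deliberately narrow illustration: only xsd:date is checked lexically, only four-digit years are accepted, and the timezone offset is not range-checked. Literals are modelled as (lexical form, datatype IRI) pairs and all function names are hypothetical.

```python
import re
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"
# Simplified xsd:date lexical form: YYYY-MM-DD with an optional timezone.
DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})(Z|[+-][0-9]{2}:[0-9]{2})?")


def is_valid_xsd_date(lex):
    """True if lex looks like an xsd:date and names a real calendar day."""
    m = DATE_RE.fullmatch(lex)
    if not m:
        return False
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:                 # e.g. month 13 or 30 February
        return False
    return True


def datatype_satisfies(literal, datatype):
    """literal is a (lexical form, datatype IRI) pair."""
    lex, actual = literal
    if actual != datatype:             # the datatype must match exactly
        return False
    if datatype == XSD + "date":
        return is_valid_xsd_date(lex)
    return True                        # other datatypes are not checked in this sketch


print(datatype_satisfies(("2016-07", XSD + "date"), XSD + "date"))
```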

In RDF 1.1, language-tagged strings[[!rdf11-concepts]] have the datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#langString.

RDF 1.0 included RDF literals with no datatype or language tag. These are called "simple literals" in SPARQL11[[!sparql11-query]]. In RDF 1.1, these literals have the datatype http://www.w3.org/2001/XMLSchema#string.

Datatype example 2
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://www.w3.org/2000/01/rdf-schema#label",
      "valueExpr": {
        "type": "NodeConstraint",
        "datatype": "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
      } } } ] }
                                                                           
  ex:IssueShape {
    rdfs:label rdf:langString
  }




              
<issue3> rdfs:label "emits dense black smoke"@en .
<issue4> rdfs:label "unexpected odor" .

shape         node      result  reason
<IssueShape>  <issue3>  pass
<IssueShape>  <issue4>  fail    rdfs:label expected to be an rdf:langString, xsd:string found.

XML Schema String Facet Constraints

String facet constraints apply to the lexical form of the RDF Literals and IRIs and blank node identifiers (see note below regarding access to blank node identifiers).
Let lex =

Let len = the number of unicode codepoints in lex
For a node n and constraint value v, nodeSatisfies(n, v):

  • for "length" constraints, v = len,
  • for "minlength" constraints, v <= len,
  • for "maxlength" constraints, v >= len,
  • for "pattern" constraints, v is unescaped into a valid XPath 3.1 regular expression[[!xpath-functions-31]] re and invoking fn:matches(lex, re) returns fn:true. If the flags parameter is present, it is passed as a third argument to fn:matches. The pattern may have XPath 3.1 regular expression escape sequences per the modified production [10] in section 5.6.1.1 as well as numeric escape sequences of the form 'u' HEX HEX HEX HEX or 'U' HEX HEX HEX HEX HEX HEX HEX HEX. Unescaping replaces numeric escape sequences with the corresponding unicode codepoint.
String Facets example 1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://schema.example/submittedBy",
      "valueExpr": { "type": "NodeConstraint", "minlength": 10 } } } ] }
                                                                        
  ex:IssueShape {
    ex:submittedBy MINLENGTH 10
  }

            
<issue1> ex:submittedBy <http://a.example/bob> . # 20 characters
<issue2> ex:submittedBy "Bob" . # 3 characters

shape         node      result  reason
<IssueShape>  <issue1>  pass
<IssueShape>  <issue2>  fail    ex:submittedBy expected to be >= 10 characters, 3 characters found.
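
The length facets can be sketched in Python, where len() on a str already counts Unicode code points. The sketch treats minlength as a lower bound and maxlength as an upper bound on the length of the lexical form, which is what the results above require (MINLENGTH 10 accepts the 20-character IRI and rejects "Bob"). The function name and keyword parameters are hypothetical.

```python
def length_facets_satisfied(lex, length=None, minlength=None, maxlength=None):
    """Check the length-related string facets against a lexical form."""
    n = len(lex)                       # number of Unicode code points
    if length is not None and n != length:
        return False
    if minlength is not None and n < minlength:
        return False
    if maxlength is not None and n > maxlength:
        return False
    return True


print(length_facets_satisfied("http://a.example/bob", minlength=10))
print(length_facets_satisfied("Bob", minlength=10))
```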

Access to blank node identifiers may be impossible or unadvisable for many use cases. For instance, the SPARQL Query and SPARQL Update languages treat blank nodes in the query, labeled or otherwise, as variables. Lexical constraints on blank node identifiers can only be implemented in systems which preserve such labels on data import.

String Facets example 2
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://schema.example/submittedBy",
      "valueExpr": { "type": "NodeConstraint",
                     "pattern": "genuser[0-9]+", "flags": "i" }
} } ] }
                                                               
  ex:IssueShape {
    ex:submittedBy ~/genuser[0-9]+/i
  }



            
<issue6> ex:submittedBy _:genUser218 .
<issue7> ex:submittedBy _:genContact817 .

shape         node      result  reason
<IssueShape>  <issue6>  pass
<IssueShape>  <issue7>  fail    _:genContact817 expected to match genuser[0-9]+.

When expressed as JSON strings, regular expressions are subject to the JSON string escaping rules.

String Facets example 3
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/ProductShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://schema.example/trademark",
      "valueExpr": { "type": "NodeConstraint",
                     "pattern": "^/\\t\\\\\\U0001D4B8\\?$" }
} } ] }
                                                            
  ex:ProductShape {
    ex:trademark /^\/\t\\\U0001D4B8\?$/
  }



            
<product6> ex:trademark "/	\\𝒸?" .
<product7> ex:trademark "/\t\\\U0001D4B8?" . # Turtle literals have escape characters [tbnrf"'\].
<product8> ex:trademark "/\t\\\\U0001D4B8?" .

shape           node        result  reason
<ProductShape>  <product6>  pass
<ProductShape>  <product7>  pass
<ProductShape>  <product8>  fail    found "\U0001D4B8" instead of "𝒸" (codepoint U+1D4B8).
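
The two layers of escaping in this example (JSON string escapes, then the numeric escapes of the pattern) can be traced with a short Python sketch. The helper unescape_numeric is hypothetical; it replaces \uXXXX and \UXXXXXXXX with the corresponding character and leaves every other backslash escape, including an escaped backslash, for the regular expression engine. Treating an escaped backslash as a unit in this way is an assumption of the sketch. No matching is attempted, because Python's re module is not an XPath 3.1 regular expression engine.

```python
import json
import re

# A backslash followed by a numeric escape or by any other single character.
ESCAPE = re.compile(r"\\(?:U([0-9A-Fa-f]{8})|u([0-9A-Fa-f]{4})|(.))", re.DOTALL)


def unescape_numeric(pattern):
    """Replace numeric escapes with their code point; keep all other escapes."""
    def repl(m):
        hex_digits = m.group(1) or m.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        return m.group(0)              # e.g. \t, \\ or \? stay as written
    return ESCAPE.sub(repl, pattern)


# The "pattern" member exactly as written in the ShExJ above.
json_text = r'"^/\\t\\\\\\U0001D4B8\\?$"'
pattern = json.loads(json_text)        # step 1: JSON string unescaping
regex = unescape_numeric(pattern)      # step 2: numeric unescaping
print(pattern)
print(regex)
```

After the first step the pattern equals the text between the slashes of the ShExC form, apart from the ShExC-specific \/ escape.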

XML Schema Numeric Facet Constraints

Numeric facet constraints apply to the numeric value of RDF Literals with datatypes listed in SPARQL 1.1 Operand Data Types[[!sparql11-query]]. Numeric constraints on non-numeric values fail. totaldigits and fractiondigits constraints on values not derived from xsd:decimal fail.

Let num be the numeric value of n.
For a node n and constraint value v, nodeSatisfies(n, v):

  • for "mininclusive" constraints, v <= num,
  • for "minexclusive" constraints, v < num,
  • for "maxinclusive" constraints, v >= num,
  • for "maxexclusive" constraints, v > num,
  • for "totaldigits" constraints, v equals the number of digits in the XML Schema canonical form[[!xmlschema-2]] of the value of n,
  • for "fractiondigits" constraints, v is less than or equal to the number of digits to the right of the decimal place in the XML Schema canonical form[[!xmlschema-2]] of the value of n, ignoring trailing zeros.

The operators <=, <, >= and > are evaluated after performing numeric type promotion[[!xpath20]].

Numeric Facets example 1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://schema.example/confirmations",
      "valueExpr": { "type": "NodeConstraint", "mininclusive": 1 } } } ] }
                                                                          
  ex:IssueShape {
    ex:confirmations MININCLUSIVE 1
  }

            
<issue1> ex:confirmations 1 .
<issue2> ex:confirmations "2"^^xsd:byte .
<issue3> ex:confirmations 0 .
<issue4> ex:confirmations "ii"^^ex:romanNumeral .

shape         node      result  reason
<IssueShape>  <issue1>  pass
<IssueShape>  <issue2>  pass
<IssueShape>  <issue3>  fail    0 is less than 1.
<IssueShape>  <issue4>  fail    ex:romanNumeral is not a numeric datatype.
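
The range facets can be sketched in Python using Decimal for the numeric value. This is an approximation under stated assumptions: literals are (lexical form, datatype IRI) pairs, the set of numeric datatypes is the SPARQL operand list written out by hand, Decimal parsing is more lenient than the XML Schema lexical rules, NaN is not handled, and totaldigits and fractiondigits are left out. All names are hypothetical.

```python
from decimal import Decimal, InvalidOperation

XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC_TYPES = {XSD + name for name in (
    "integer", "decimal", "float", "double",
    "nonPositiveInteger", "negativeInteger", "long", "int", "short", "byte",
    "nonNegativeInteger", "unsignedLong", "unsignedInt", "unsignedShort",
    "unsignedByte", "positiveInteger")}


def numeric_facets_satisfied(literal, mininclusive=None, minexclusive=None,
                             maxinclusive=None, maxexclusive=None):
    """Check range facets; numeric constraints on non-numeric values fail."""
    lex, datatype = literal
    if datatype not in NUMERIC_TYPES:
        return False
    try:
        num = Decimal(lex)
    except InvalidOperation:
        return False
    if mininclusive is not None and not mininclusive <= num:
        return False
    if minexclusive is not None and not minexclusive < num:
        return False
    if maxinclusive is not None and not maxinclusive >= num:
        return False
    if maxexclusive is not None and not maxexclusive > num:
        return False
    return True


print(numeric_facets_satisfied(("0", XSD + "integer"), mininclusive=1))
```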

Values Constraint

The nodeSatisfies semantics for NodeConstraint values depends on a nodeIn function defined below.

For a node n and constraint value v, nodeSatisfies(n, v) if n matches some valueSetValue vsv in v. A term matches a valueSetValue if:

nodeIn: asserts that an RDF node n is equal to an RDF term s or is in a set defined by an IriStem, LiteralStem or LanguageStem.
The expression nodeIn(n, s) is satisfied if:

Values Constraint example 1

NoActionIssueShape requires a state of Resolved or Rejected:

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/NoActionIssueShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://schema.example/state",
      "valueExpr": {
        "type": "NodeConstraint", "values": [
          "http://schema.example/Resolved",
          "http://schema.example/Rejected" ] } } } ] }
                                                      
  ex:NoActionIssueShape {
    ex:state [ ex:Resolved ex:Rejected ]
  }




            
<issue1> ex:state ex:Resolved .
<issue2> ex:state ex:Unresolved .

shape                 node      result  reason
<NoActionIssueShape>  <issue1>  pass
<NoActionIssueShape>  <issue2>  fail    ex:state expected to be ex:Resolved or ex:Rejected, ex:Unresolved found.

Values Constraint example 2

An employee must have an email address that is the string "N/A" or starts with "engineering-" or "sales-" but not "sales-contacts" or "sales-interns":

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/EmployeeShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://xmlns.com/foaf/0.1/mbox",
      "valueExpr": {
        "type": "NodeConstraint", "values": [
          {"value": "N/A"},
          { "type": "IriStemRange", "stem": "mailto:engineering-" },
          { "type": "IriStemRange", "stem": "mailto:sales-", "exclusions": [
              { "type": "IriStem", "stem": "mailto:sales-contacts" },
              { "type": "IriStem", "stem": "mailto:sales-interns" }
            ] }
        ] } } } ] }
                                                                         
  ex:EmployeeShape {
    foaf:mbox [ "N/A"
                <mailto:engineering->~
                <mailto:sales->~
                    - <mailto:sales-contacts>~
                    - <mailto:sales-interns>~ ]
  }





            
<issue3> foaf:mbox "N/A" .
<issue4> foaf:mbox <mailto:engineering-2112@a.example> .
<issue5> foaf:mbox <mailto:sales-835@a.example> .
<issue6> foaf:mbox "missing" .
<issue7> foaf:mbox <mailto:sales-contacts-999@a.example> .

shape            node      result  reason
<EmployeeShape>  <issue3>  pass
<EmployeeShape>  <issue4>  pass
<EmployeeShape>  <issue5>  pass
<EmployeeShape>  <issue6>  fail    "missing" is not in value set.
<EmployeeShape>  <issue7>  fail    <mailto:sales-contacts-999@a.example> is excluded.

Values Constraint example 3

An employee must not have an email address that starts with "engineering-" or "sales-":

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/EmployeeShape",
    "type": "Shape", "expression": {
      "type": "TripleConstraint",
      "predicate": "http://xmlns.com/foaf/0.1/mbox",
      "valueExpr": {
        "type": "NodeConstraint", "values": [
          { "type": "IriStemRange", "stem": {"type": "Wildcard"},
            "exclusions": [
              { "type": "IriStem", "stem": "mailto:engineering-" },
              { "type": "IriStem", "stem": "mailto:sales-" }
            ] }
        ] } } } ] }
                                                                
  ex:EmployeeShape {
    foaf:mbox [ . - <mailto:engineering->~ - <mailto:sales->~ ]
  }








            
<issue8> foaf:mbox 123 .
<issue9> foaf:mbox <mailto:core-engineering-2112@a.example> .
<issue10> foaf:mbox <mailto:engineering-2112@a.example> .

shape            node       result  reason
<EmployeeShape>  <issue8>   pass
<EmployeeShape>  <issue9>   pass
<EmployeeShape>  <issue10>  fail    <mailto:engineering-2112@a.example> is excluded.
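
The value set examples 2 and 3 can be reproduced with a small Python sketch covering IRI values, plain ObjectLiterals, IriStem, IriStemRange and Wildcard only. It assumes nodes are (kind, value) pairs, compares literals by lexical form alone (datatype and language tag are ignored), and lets a Wildcard stem accept any node, which is what the pass for <issue8> implies. The function names are hypothetical; literal and language stems are not sketched.

```python
def node_in(node, vsv):
    """True if node matches one valueSetValue (restricted subset, see lead-in)."""
    kind, value = node
    if isinstance(vsv, str):                      # an IRI value
        return kind == "iri" and value == vsv
    if "value" in vsv:                            # an ObjectLiteral
        return kind == "literal" and value == vsv["value"]
    if vsv["type"] == "Wildcard":
        return True
    if vsv["type"] == "IriStem":
        return kind == "iri" and value.startswith(vsv["stem"])
    if vsv["type"] == "IriStemRange":
        stem = vsv["stem"]
        if not isinstance(stem, dict):            # a plain IRI stem
            stem = {"type": "IriStem", "stem": stem}
        if not node_in(node, stem):
            return False
        return not any(node_in(node, ex) for ex in vsv.get("exclusions", []))
    raise NotImplementedError(vsv["type"])


def values_satisfied(node, values):
    """True if node matches at least one member of the value set."""
    return any(node_in(node, vsv) for vsv in values)


example3 = [{"type": "IriStemRange", "stem": {"type": "Wildcard"}, "exclusions": [
    {"type": "IriStem", "stem": "mailto:engineering-"},
    {"type": "IriStem", "stem": "mailto:sales-"}]}]
print(values_satisfied(("iri", "mailto:engineering-2112@a.example"), example3))
```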

A value set can have a single value in it. This is used to indicate that a specific value is required, e.g. that an ex:state must be equal to <http://schema.example/Resolved> or the rdf:type of some node must be foaf:Person.

Shapes and Triple Expressions

Triple expressions are used for defining patterns composed of triple constraints. Shapes associate triple expressions with flags indicating whether triples match if they do not correspond to triple constraints in the triple expression. A triple expression is composed of TripleConstraint and tripleExprRef objects composed with grouping and choice operators.

JSON Syntax

Shape { id:shapeExprLabel? closed:BOOL? extra:[IRI]? expression:tripleExpr? semActs:[SemAct]? }
tripleExpr = EachOf | OneOf | TripleConstraint | tripleExprRef ;
EachOf { id:tripleExprLabel? expressions:[tripleExpr] min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
OneOf { id:tripleExprLabel? expressions:[tripleExpr] min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
TripleConstraint { id:tripleExprLabel? inverse:BOOL? predicate:IRI valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
tripleExprRef = tripleExprLabel ;
tripleExprLabel = IRI | BNODE ;

Semantics

The satisfies semantics for a Shape depend on a matches function defined below. For a node n, shape S, graph G, and shapeMap m, satisfies(n, S, G, m) if and only if:

  • neigh(G, n) can be partitioned into two sets matched and remainder such that matches(matched, expression, m). If expression is absent, remainder = neigh(G, n).
    Let outs be the arcsOut in remainder: outs = remainder ∩ arcsOut(G, n).
    Let matchables be the triples in outs whose predicate appears in a TripleConstraint in expression. If expression is absent, matchables = Ø (the empty set).
    The complexity of partitioning is described briefly in the ShEx2 Primer.
  • There is no triple in matchables which matches a TripleConstraint in expression.
    Let unmatchables be the triples in outs which are not in matchables. matchables ∪ unmatchables = outs.
  • There is no triple in matchables whose predicate does not appear in extra.
  • closed is false or unmatchables is empty.
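
Once a candidate partition into matched and remainder has been chosen, the conditions on the remainder can be sketched in Python. The sketch does not search for the partition and does not implement matches; it reads outs as the triples of remainder that are also in arcsOut(G, n), and it takes a caller-supplied function that says whether a triple matches some TripleConstraint in expression. Triples are string tuples, and the function and parameter names are hypothetical.

```python
def remainder_ok(remainder, arcs_out, predicates, extra, closed, matches_constraint):
    """Check the conditions on the remainder of a partition.

    remainder:          set of triples not consumed by the triple expression
    arcs_out:           arcsOut(G, n) as a set of triples
    predicates:         predicates appearing in TripleConstraints of expression
    extra:              the shape's extra predicates
    closed:             the shape's closed flag
    matches_constraint: callable(triple) -> bool, supplied by the caller
    """
    outs = remainder & arcs_out
    matchables = {t for t in outs if t[1] in predicates}
    unmatchables = outs - matchables
    if any(matches_constraint(t) for t in matchables):
        return False                   # a leftover triple should have been matched
    if any(t[1] not in extra for t in matchables):
        return False                   # leftover predicate not permitted by extra
    return (not closed) or not unmatchables


n = "inst:Issue1"
literal_state = (n, "ex:state", '"just fine"')
other = (n, "ex:priority", "ex:High")
outs_of_n = {literal_state, other}


def is_iri_state(t):
    # Stand-in for "matches the constraint ex:state IRI".
    return t[1] == "ex:state" and not t[2].startswith('"')


print(remainder_ok({literal_state}, outs_of_n, {"ex:state"}, {"ex:state"}, False, is_iri_state))
```

In this illustration the literal-valued ex:state triple is tolerated only when ex:state is listed in extra, and the ex:priority triple is tolerated only when the shape is not closed.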

matches: asserts that a triple expression is matched by a set of triples that come from the neighbourhood of a node in an RDF graph. The expression matches(T, expr, m) indicates that a set of triples T can satisfy these rules:

  • expr has semActs and matches(T, expr, m) by the remaining rules in this list and the evaluation of semActs succeeds according to the section below on Semantic Actions.

    matches(
    T,
    { "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3,
      "semActs": [SemAct1, SemAct2, …] }
    (te1 | te2) {2,3}                                                
            %<SemAct1>% %<SemAct2>% …
    
    ,
    m)
    evaluates as:
    matches(
    T,
    { "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3 }
    (te1 | te2) {2,3}                                                 
    ,
    m)
    and semActsSatisfied([SemAct1, SemAct2, …])
  • expr has a cardinality of min and/or max not equal to 1, where a max of -1 is treated as unbounded, and T can be partitioned into k subsets T1, T2,…Tk such that min ≤ k ≤ max and for each Tn, matches(Tn, expr, m) by the remaining rules in this list.

    matches(
    T,
    { "type": "OneOf", "expressions": [te1, te2, …], "min": 2, "max": 3 }
    (te1 | te2) {2,3}                                                 
    ,
    m)
    evaluates as:
    Let e = 
    { "type": "OneOf", "expressions": [te1, te2, …] }
    (te1 | te2)                                   
    (matches(T1, e, m) and matches(T2, e, m)
     and T = T1 ∪ T2)
    or
    (matches(T1, e, m) and matches(T2, e, m) and matches(T3, e, m)
     and T = T1 ∪ T2 ∪ T3)
  • expr is a OneOf and there is some triple expression te2 in expressions such that matches(T, te2, m).

    matches(
    T,
    { "type": "OneOf", "expressions": [
      { "type": "EachOf", "expressions": [te3, te4, …] },
      { "type": "TripleConstraint", "min": 1, "max": -1,
        "predicate": "http://xmlns.com/foaf/0.1/name" }
    ] }
                                                                 
      (te3 ; te4 ; …)
      | <http://xmlns.com/foaf/0.1/name> . +
    
     
    ,
    m)
    evaluates as:
    matches(
    T,
    { "type": "EachOf", "expressions": [te3, te4, …] }
    te3 ; te4 ; …                                    
    ,
    m)
    or matches(
    T,
    { "type": "TripleConstraint", "min": 1, "max": -1,
        "predicate": "http://xmlns.com/foaf/0.1/name" }
    <http://xmlns.com/foaf/0.1/name> . +                       
     
    ,
    m)
  • expr is an EachOf and there is some partition of T into T1, T2,… such that for every expression expr1, expr2,… in expressions, matches(Tn, exprn, m).

    matches(
    T,
    { "type": "EachOf", "expressions": [
      { "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/givenName" },
      { "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/familyName" }
    ] }
                                                             
    <http://xmlns.com/foaf/0.1/givenName> . ;
    
    <http://xmlns.com/foaf/0.1/familyName> .
    
     
    ,
    m)
    evaluates as:
    matches(
    T1,
    { "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/givenName" }
    <http://xmlns.com/foaf/0.1/givenName> .                 
     
    ,
    m)
    and matches(
    T2,
    { "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/familyName" }
    <http://xmlns.com/foaf/0.1/familyName> .                 
     
    ,
    m)
    and T = T1 ∪ T2
  • expr is a TripleConstraint and:

    • T is a set of one triple.
      Let t be the sole triple in T.
    • t's predicate equals expr's predicate.
      Let value be t's subject if inverse is true, else t's object.
    • if inverse is true, t is in arcsIn, else t is in arcsOut.
    • either
      • expr has no valueExpr

        matches(
        T,
        { "type": "TripleConstraint",
          "predicate": "http://xmlns.com/foaf/0.1/givenName" }
        <http://xmlns.com/foaf/0.1/givenName> . ;             
         
        ,
        m)
        holds if
        • T has exactly one triple t.
        • t has the predicate "http://xmlns.com/foaf/0.1/givenName"
      • or satisfies(value, valueExpr, G, m).

        matches(
        T,
        { "type": "TripleConstraint", "inverse": true,
          "predicate": "http://purl.org/dc/elements/1.1/author",
          "valueExpr": "http://schema.example/IssueShape" }
                                                                
        ^<http://purl.org/dc/elements/1.1/author>
                @<http://schema.example/IssueShape>
        ,
        m)
        holds if
        • T has exactly one triple t.
        • t has the predicate "http://purl.org/dc/elements/1.1/author"
        • t has a subject n2
        • The schema's shapes maps "http://schema.example/IssueShape" to se2
        • satisfies(n2, se2, G, m)
  • expr is a tripleExprRef and satisfies(value, tripleExprWithId(tripleExprRef), G, m).
    The tripleExprWithId function is defined in Triple Expression Reference Requirement below.

    For the schema
    { "type": "Schema", "shapes": [
      { "id": "http://schema.example/EmployeeShape",
        "type": "Shape", "expression": {
          "type": "EachOf", "expressions": [
            "http://schema.example/nameExpr",
            { "type": "TripleConstraint",
              "predicate": "http://schema.example/empID",
              "valueExpr": { "type": "NodeConstraint",
                "datatype": "http://www.w3.org/2001/XMLSchema#integer" } } ] } },
      { "id": "http://schema.example/PersonShape",
        "type": "Shape", "expression": {
          "id": "http://schema.example/nameExpr",
          "type": "TripleConstraint",
          "predicate": "http://xmlns.com/foaf/0.1/name" } } ] }
                                                                                 
    <http://schema.example/EmployeeShape> {
        &<http://schema.example/nameExpr> ;
        <http://schema.example/empID>
              <http://www.w3.org/2001/XMLSchema#integer>
    
    
    
    }
    <http://schema.example/PersonShape> {
    
      $<http://schema.example/nameExpr>
        <http://xmlns.com/foaf/0.1/name> ;
    }
    matches(
    T,
    "http://schema.example/PersonShape"
    ,
    m)
    holds if
    • The schema has a shape se2 with the id "http://schema.example/PersonShape"
    • satisfies(n, se2, G, m)
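
The matching rules above can be sketched in a few lines of Python. This is an illustrative toy, not a conformant implementation: it assumes triples are (subject, predicate, object) tuples, that EachOf and OneOf list their children under "expressions", and it ignores inverse, semantic actions and triple expression references. The names partitions, matches and value_ok are hypothetical; value_ok stands in for satisfies(value, valueExpr, G, m).

```python
from itertools import product

def partitions(triples, k):
    """Yield every split of `triples` into k ordered, possibly empty, disjoint lists."""
    for labels in product(range(k), repeat=len(triples)):
        parts = [[] for _ in range(k)]
        for triple, index in zip(triples, labels):
            parts[index].append(triple)
        yield parts

def matches(T, expr, value_ok=lambda value, value_expr: True):
    """Toy matches(T, expr, m): outgoing arcs only, no semantic actions."""
    lo, hi = expr.get("min", 1), expr.get("max", 1)
    if (lo, hi) != (1, 1):
        # Cardinality rule: split T into k subsets, min <= k <= max (-1 is unbounded).
        base = {key: val for key, val in expr.items() if key not in ("min", "max")}
        bound = max(lo, len(T))  # more subsets than this would only add empty ones
        upper = bound if hi == -1 else min(hi, bound)
        for k in range(lo, upper + 1):
            if k == 0:
                if not T:
                    return True  # zero repetitions consume nothing
                continue
            if any(all(matches(part, base, value_ok) for part in parts)
                   for parts in partitions(T, k)):
                return True
        return False
    if expr["type"] == "TripleConstraint":
        if len(T) != 1:
            return False
        _subject, predicate, obj = T[0]
        if predicate != expr["predicate"]:
            return False
        return "valueExpr" not in expr or value_ok(obj, expr["valueExpr"])
    if expr["type"] == "OneOf":
        return any(matches(T, sub, value_ok) for sub in expr["expressions"])
    if expr["type"] == "EachOf":
        subs = expr["expressions"]
        return any(all(matches(part, sub, value_ok) for part, sub in zip(parts, subs))
                   for parts in partitions(T, len(subs)))
    raise ValueError("unsupported expression type: " + str(expr.get("type")))
```

The search is exhaustive and therefore exponential in the size of T, which is acceptable only for small illustrative neighborhoods.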

Schema Requirements

The semantics defined above assume two structural requirements beyond those imposed by the grammar of the abstract syntax. These ensure referential integrity and eliminate logical paradoxes such as those that arise through the use of negation. These are not constraints expressed by the schema but instead those imposed on the schema.

Schema Validation Requirement

A graph G is said to conform with a schema S with a ShapeMap m when:

  1. Every SemAct in the startActs of S has a successful evaluation of semActsSatisfied.
  2. Every node n in m conforms to its associated shapeExprRefs sen where for each shapeExprRef sei in sen:
    1. sei references a ShapeExpr in shapes, and
    2. satisfies(n, sei, G, m) for each shape sei in sen.
  3. Each optional focus node fi in focusNodes, m has a start shapeExpr se, and satisfies(fi, se, G, m).

Shape Expression Reference Requirement

A shapeExprRef MUST appear in the schema's shapes map and the corresponding triple expression MUST be a Shape with a shapeExpr. The function shapeExprWithId(shapeExprRef) returns the shape's shapeExpr.

Following are two valid shapeExprRefs:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/PersonShape",
      "type":"Shape", "expression": {
        "id": "http://schema.example/nameExpr",
        "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/name"
      } },
    { "id": "http://schema.example/EmployeeShape", "type":"ShapeAnd", "shapeExprs": [
        "http://schema.example/PersonShape",
        {
      "type":"Shape", "expression": { "type": "TripleConstraint",
          "predicate": "http://schema.example/employeeNumber" }
} ] } ] }
                                                                             
    ex:PersonShape {
      foaf:name .
    }

    ex:EmployeeShape @ex:PersonShape AND {
      ex:employeeNumber .
    }




          
{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/PersonShape",
      "type":"Shape", "expression": {
        "id": "http://schema.example/nameExpr",
        "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/name"
      } },
    { "id": "http://schema.example/EmployeeShape",
      "type":"Shape", "expression": { "type": "TripleConstraint",
          "predicate": "http://schema.example/dependent",
          "valueExpr": "http://schema.example/PersonShape" } } ] }
                                                                  
    ex:PersonShape {
      foaf:name .
    }



    ex:EmployeeShape {
      ex:dependent @ex:PersonShape
    }
          

This shapeExprRef is invalid because there is no corresponding shape expression:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/S1",
      "type":"Shape", "expression":
        "http://schema.example/MissingShapeExpr"
} ] }
                                                
    S1 {
      &MissingShapeExpr
    }
          

This shapeExprRef is invalid because the referenced object is a triple expression instead of a shape expression:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/CustomerShape",
      "type":"Shape", "expression": { "type": "TripleConstraint",
          "id": "http://schema.example/discountExpr",
          "predicate": "http://schema.example/discount" } },
    { "id": "http://schema.example/EmployeeShape",
      "type":"Shape", "expression": { "type": "TripleConstraint",
        "predicate": "http://schema.example/contactFor",
        "valueExpr": "http://schema.example/CustomerShape"
} } ] }
                                                                 
    ex:CustomerShape {
      $ex:discountExpr ex:discount .
    }

    ex:EmployeeShape {
      ex:contactFor @ex:CustomerShape
    }
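
A check for dangling shape expression references might look like the following Python sketch. It is an illustration under assumptions, not part of the specification: the schema is taken to be a ShExJ document already parsed into Python dicts, EachOf and OneOf are assumed to list their children under "expressions", and the function names are hypothetical.

```python
def shape_expr_refs(shape_expr):
    """Yield every shapeExprRef (a bare string) reachable from a shape expression."""
    if isinstance(shape_expr, str):
        yield shape_expr
    elif shape_expr.get("type") in ("ShapeAnd", "ShapeOr"):
        for sub in shape_expr["shapeExprs"]:
            yield from shape_expr_refs(sub)
    elif shape_expr.get("type") == "ShapeNot":
        yield from shape_expr_refs(shape_expr["shapeExpr"])
    elif shape_expr.get("type") == "Shape" and "expression" in shape_expr:
        yield from _refs_in_triple_expr(shape_expr["expression"])

def _refs_in_triple_expr(triple_expr):
    if isinstance(triple_expr, str):
        return  # a tripleExprRef, which is not a shape reference
    if triple_expr["type"] in ("EachOf", "OneOf"):
        for sub in triple_expr["expressions"]:
            yield from _refs_in_triple_expr(sub)
    elif triple_expr["type"] == "TripleConstraint" and "valueExpr" in triple_expr:
        yield from shape_expr_refs(triple_expr["valueExpr"])

def dangling_shape_refs(schema):
    """Return the sorted shapeExprRefs that name no entry in the schema's shapes."""
    shapes = schema.get("shapes", [])
    ids = {shape["id"] for shape in shapes if "id" in shape}
    return sorted({ref for shape in shapes
                   for ref in shape_expr_refs(shape) if ref not in ids})
```

An empty result means every shapeExprRef resolves; the sketch does not verify what kind of object a matching id identifies.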

          

Triple Expression Reference Requirement

A tripleExprRef MUST identify a triple expression in the schema. The function tripleExprWithId(tripleExprRef) returns the triple expression with the id tripleExprRef.

Following is a valid triple expression reference:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/PersonShape",
      "type":"Shape", "expression": {
        "id": "http://schema.example/nameExpr",
        "type": "TripleConstraint",
        "predicate": "http://xmlns.com/foaf/0.1/name"
      } },
    { "id": "http://schema.example/EmployeeShape",
      "type":"Shape", "expression": { "type":"EachOf", "expressions": [
        "http://schema.example/nameExpr",
        { "type": "TripleConstraint",
          "predicate": "http://schema.example/employeeNumber" }
] } } ] }
                                                                      
    ex:PersonShape {
      $ex:nameExpr foaf:name .
    }



    ex:EmployeeShape {
      &ex:nameExpr ;
      ex:employeeNumber .
    }

          

This triple expression reference is invalid because there is no corresponding triple expression:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/S1",
      "type":"Shape", "expression":
        "http://schema.example/missingTripleExpr"
} ] }
                                                 
    S1 {
      &missingTripleExpr
    }
          

This triple expression reference is invalid because the referenced object is a shape expression instead of a triple expression:

{ "type":"Schema", "shapes": [
    { "id": "http://schema.example/CustomerShape",
      "type":"ShapeAnd", "shapeExprs": [  ]
    },
    { "id": "http://schema.example/PreferredCustomerShape",
      "type":"Shape", "expression": { "type":"EachOf", "expressions": [
        "http://schema.example/CustomerShape",
        { "type": "TripleConstraint",
          "predicate": "http://schema.example/discount" }
] } } ] }
                                                                       
    ex:CustomerShape {
      ; 
    }
    ex:PreferredCustomerShape {
      &ex:CustomerShape ;
      ex:discount .
    }
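
The triple expression reference requirement can be checked in a similar mechanical way. The Python sketch below is illustrative and rests on assumptions: the schema is parsed ShExJ held in dicts, EachOf and OneOf use "expressions", and all function names are hypothetical. It collects the ids of triple expression objects and reports any reference string that matches none of them.

```python
def triple_exprs(shape_expr):
    """Yield triple expressions (objects or reference strings) under a shape expression."""
    if not isinstance(shape_expr, dict):
        return  # shapeExprRef strings and absent valueExprs contain nothing
    kind = shape_expr.get("type")
    if kind in ("ShapeAnd", "ShapeOr"):
        for sub in shape_expr["shapeExprs"]:
            yield from triple_exprs(sub)
    elif kind == "ShapeNot":
        yield from triple_exprs(shape_expr["shapeExpr"])
    elif kind == "Shape" and "expression" in shape_expr:
        yield from _walk(shape_expr["expression"])

def _walk(triple_expr):
    yield triple_expr
    if isinstance(triple_expr, dict):
        if triple_expr["type"] in ("EachOf", "OneOf"):
            for sub in triple_expr["expressions"]:
                yield from _walk(sub)
        elif triple_expr["type"] == "TripleConstraint":
            # A valueExpr may itself contain nested Shapes.
            yield from triple_exprs(triple_expr.get("valueExpr"))

def dangling_triple_expr_refs(schema):
    """Return the sorted tripleExprRefs that match no triple expression id."""
    found = [te for shape in schema.get("shapes", []) for te in triple_exprs(shape)]
    ids = {te["id"] for te in found if isinstance(te, dict) and "id" in te}
    return sorted({te for te in found if isinstance(te, str) and te not in ids})
```

Because only ids of triple expressions are collected, a reference to a shape expression label is reported as dangling, which mirrors the invalid example above.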

          

Negation Requirement

A schema MUST NOT contain any shape expression S with negated references, either directly or transitively, to S. A reference to S is negated if it contains an odd number of nested negations. A nested negation is any of:

  • a shapeExpr in a ShapeNot

    This negated self-reference violates the negation requirement:

    { "type": "Schema", "shapes": [
      { "id": "http://schema.example/#S",
        "type": "ShapeNot", "shapeExpr":
          "http://schema.example/#S"
    } ] }
                                           
        ex:S NOT @ex:S
    
    
                    

    This negated, indirect self-reference violates the negation requirement:

    { "type": "Schema", "shapes": [
      { "id": "http://schema.example/#S",
        "type": "ShapeNot", "shapeExpr":
          "http://schema.example/#T" },
        "http://schema.example/#T": "http://schema.example/#S"
    ] }
                                                              
        ex:S NOT @ex:T
    
    
        ex:T @ex:S
                    

    This doubly-negated self-reference does not violate the negation requirement:

    { "type": "Schema", "shapes": [
        { "id": "http://schema.example/#S",
          "type": "ShapeNot", "shapeExpr": {
            "type": "ShapeOr", "shapeExprs": [
              { "type": "NodeConstraint", "nodeKind": "iri" },
              { "type": "ShapeNot",
                "shapeExpr": "http://schema.example/#S" }
    ] } } ] }
                                                              
        ex:S NOT (IRI OR NOT @ex:S)
    
    
    
    
    
                    
  • a valueExpr in a TripleConstraint with predicate p inside a Shape with extra p

    This self-reference on a predicate designated as extra violates the negation requirement:

    { "type": "Schema", "shapes": [
        { "id": "http://schema.example/#S",
          "type": "Shape",
          "extra": [ "http://schema.example/#p" ], "expression":
          { "type": "TripleConstraint",
            "predicate": "http://schema.example/#p",
            "valueExpr": "http://schema.example/#S"
    } } ] }
                                                                
      ex:S EXTRA ex:p {
        ex:p @ex:S
      }
    
    
    
                  

    The same shape with a negated self-reference still violates the negation requirement because the reference occurs with a ShapeNot:

    { "type": "Schema", "shapes": [
        { "id": "http://schema.example/#S",
          "type": "Shape",
          "extra": [ "http://schema.example/#p" ],
          "expression": {
            "type": "TripleConstraint",
            "predicate": "http://schema.example/#p",
            "valueExpr": {
              "type": "ShapeNot", "shapeExpr": "http://schema.example/#S"
    } } } ] }
                                                                         
        ex:S EXTRA ex:p {
          ex:p NOT @ex:S
        }
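
The ShapeNot part of this requirement can be sketched in Python as a reachability search over (label, parity) states. This is a partial illustration under stated assumptions: shapes are given as a hypothetical dict from label to shape expression (a bare string stands for a reference), EachOf and OneOf use "expressions", triple expression references are not followed, and the rule for predicates listed in extra is deliberately left out. All function names are hypothetical.

```python
from collections import deque

def references(shape_expr, parity=0):
    """Yield (label, parity) pairs; parity counts enclosing ShapeNot objects mod 2."""
    if isinstance(shape_expr, str):
        yield shape_expr, parity
    elif shape_expr["type"] in ("ShapeAnd", "ShapeOr"):
        for sub in shape_expr["shapeExprs"]:
            yield from references(sub, parity)
    elif shape_expr["type"] == "ShapeNot":
        yield from references(shape_expr["shapeExpr"], parity ^ 1)
    elif shape_expr["type"] == "Shape" and "expression" in shape_expr:
        yield from _triple_expr_references(shape_expr["expression"], parity)

def _triple_expr_references(triple_expr, parity):
    if isinstance(triple_expr, str):
        return  # tripleExprRef: not followed in this sketch
    if triple_expr["type"] in ("EachOf", "OneOf"):
        for sub in triple_expr["expressions"]:
            yield from _triple_expr_references(sub, parity)
    elif triple_expr["type"] == "TripleConstraint" and "valueExpr" in triple_expr:
        yield from references(triple_expr["valueExpr"], parity)

def has_negated_self_reference(shapes):
    """True if some label reaches itself through an odd number of ShapeNot nestings."""
    edges = {label: list(references(se)) for label, se in shapes.items()}
    for start in edges:
        seen = {(start, 0)}
        queue = deque(seen)
        while queue:
            label, parity = queue.popleft()
            for target, flip in edges.get(label, []):
                state = (target, parity ^ flip)
                if state == (start, 1):
                    return True
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False
```

Tracking parity alongside the label lets a single breadth-first search distinguish the doubly-negated self-reference, which is allowed, from the singly-negated ones, which are not.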
    
    
    
    
    
                  

Semantic Actions

Semantic actions serve as an extension point for Shape Expressions. They appear in lists in Schema's startActs and Shape, OneOf, EachOf and TripleConstraint's semActs.

A semantic action is a tuple of an identifier and some optional code:

SemAct{name:IRI code:STRING? }

Semantics

The evaluation semActsSatisfied on a list of SemActs returns success or failure. The evaluation of an individual SemAct is implementation-dependent.

Use - informative

A practical evaluation of a SemAct will provide access to some context. For instance, the http://shex.io/extensions/Test/ extension requires access to the subject, predicate and object of a triple matching a TripleConstraint. These are used in a print function.

Semantic Actions example 1
{ "type": "Schema", "shapes": [
    { "id": "http://a.example/S1",
      "type": "Shape", "expression": {
        "type": "TripleConstraint", "predicate": "http://a.example/p1",
        "min": 1, "max": -1,
        "semActs": [
          { "type": "SemAct", "code": " print(s) ",
            "name": "http://shex.io/extensions/Test/" },
          { "type": "SemAct", "code": " print(o) ",
            "name": "http://shex.io/extensions/Test/" } ] } } ] }
                                                                       
    ex:S1 {
      ex:p1 .+ %Test:{ print(s) %} %Test:{ print(o) %}
    }





              
<http://a.example/n1> <http://a.example/p1> <http://a.example/o1> .
<http://a.example/n2> <http://a.example/p1> "a", "b" .
<http://a.example/n3> <http://a.example/p2> <http://a.example/o2> .
shape   node   result   print arguments
<S1>    <n1>   pass     http://a.example/s1
                        http://a.example/o1
<S1>    <n2>   pass     http://a.example/s1
                        "a"
                        http://a.example/s1
                        "b"
<S1>    <n3>   fail
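
Since the evaluation of an individual SemAct is implementation-dependent, the following Python sketch shows only one possible arrangement. Everything in it is an assumption made for illustration: the handler registry, the context dict with keys s, p and o, the treatment of unregistered extensions as successful, and the regular-expression reading of the Test extension's print calls.

```python
import re

TEST = "http://shex.io/extensions/Test/"

def run_test_extension(code, context, output):
    """Hypothetical Test handler: append the value named by each print(s|p|o) call."""
    for variable in re.findall(r"print\((s|p|o)\)", code):
        output.append(context[variable])
    return True

def sem_acts_satisfied(sem_acts, context, handlers):
    """Return False as soon as a registered handler reports failure, else True."""
    for act in sem_acts:
        handler = handlers.get(act["name"])
        # Assumption: extensions without a registered handler are skipped.
        if handler is not None and not handler(act.get("code", ""), context):
            return False
    return True

printed = []
handlers = {TEST: lambda code, context: run_test_extension(code, context, printed)}
sem_acts = [
    {"type": "SemAct", "name": TEST, "code": " print(s) "},
    {"type": "SemAct", "name": TEST, "code": " print(o) "},
]
# Hypothetical matched triple supplying the context for the TripleConstraint.
context = {"s": "http://a.example/s1",
           "p": "http://a.example/p1",
           "o": "http://a.example/o1"}
ok = sem_acts_satisfied(sem_acts, context, handlers)
```

After this runs, ok holds the overall result and printed holds the values the two actions emitted, in order.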

Annotations

Annotations provide a format-independent way to provide additional information about elements in a schema. They appear in lists in Shape, OneOf, EachOf and TripleConstraint's annotations.

Annotation{predicate:IRI object:objectValue }

Semantics - informative

Annotations do not affect whether a node conforms to some shape. Because they are part of the structure of the schema, they can be parsed in one ShEx format and emitted in that format or another.

Annotations example 1
{ "type": "Schema", "shapes": [
    { "id": "http://schema.example/IssueShape",
      "type": "Shape", "expression": {
        "type": "TripleConstraint",
        "predicate": "http://schema.example/status",
        "annotations": [
           { "type": "Annotation",
             "predicate": "http://www.w3.org/2000/01/rdf-schema#comment",
             "object": {"value": "Represents reported software issues."} },
           { "type": "Annotation",
             "predicate": "http://www.w3.org/2000/01/rdf-schema#label",
             "object": {"value": "software issue"} } ] } } ] }
                                                                         
    ex:IssueShape {
      ex:status .
          // rdfs:comment "Represents reported software issues."
          // rdfs:label "software issue"
    }
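
Because annotations do not affect conformance, a processor could discard them before validating. The Python sketch below illustrates that idea on parsed ShExJ; the function name strip_annotations is hypothetical and nothing here is mandated by this document.

```python
def strip_annotations(node):
    """Return a copy of a parsed ShExJ structure without any "annotations" members."""
    if isinstance(node, dict):
        return {key: strip_annotations(value)
                for key, value in node.items() if key != "annotations"}
    if isinstance(node, list):
        return [strip_annotations(item) for item in node]
    return node  # strings, numbers and booleans are returned unchanged

annotated = {
    "type": "TripleConstraint",
    "predicate": "http://schema.example/status",
    "annotations": [
        {"type": "Annotation",
         "predicate": "http://www.w3.org/2000/01/rdf-schema#label",
         "object": {"value": "software issue"}}],
}
bare = strip_annotations(annotated)
```

The input structure is left untouched, so the annotations remain available for re-serialization in another ShEx format.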




              

Validation Examples

The following examples demonstrate proofs for validations in the form of a nested list of invocations of the evaluation functions defined above.

Simple Examples

Schema:


S1

{ "type": "Schema", "shapes": [
    { "id": "http://schema.example/IntConstraint",
      "type": "NodeConstraint",
      "datatype": "http://www.w3.org/2001/XMLSchema#integer"
    } ] }
                                                            

  ex:IntConstraint
    xsd:integer
              

Here the shape identified by http://schema.example/IntConstraint is a shape expression consisting of a single NodeConstraint. Per Shape Expression Semantics, "30"^^<http://www.w3.org/2001/XMLSchema#integer> satisfies IntConstraint.

This document uses this nested tree convention to indicate the dependency of an evaluation on those nested inside it. Nesting is expressed as indentation. Here, the evaluation of satisfies NodeConstraint ("30"^^xsd:integer, S1, G, m) depends on satisfies2 NodeConstraint ("30"^^xsd:integer, S1).

Validate "30"^^<http://www.w3.org/2001/XMLSchema#integer> as IntConstraint:

Validating a shape requires evaluating its triple expression as well as the variables and functions neigh(G, n), matched, remainder, outs, matchables and unmatchables:

Schema:
 

S1
tc1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/UserShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint",
      "predicate": "http://schema.example/shoeSize"
      } } ] }
                                                   

  ex:UserShape {
    ex:shoeSize .
  }
              
Data:
 

t1
BASE <http://a.example/>
PREFIX ex: <http://schema.example/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
<Alice> ex:shoeSize "30"^^xsd:integer .
Validate <Alice> as http://schema.example/UserShape:
  • G = [t1] The graph G consists of one triple.
  • satisfies Shape (<Alice>, S1, G, m)
    • neigh(G, <Alice>) = [t1] /* The neighborhood around <Alice> consists of one triple. */
    • matched = [t1] /* That triple is matched in the nested evaluation. */
    • remainder = Ø /* The remainder is the empty set. */
    • matches TripleConstraint ([t1], tc1, m)
    • outs = [t1] /* There is one arc out. */
    • matchables = Ø /* There are no remaining arcs out of <Alice> with predicates appearing in tc1. */
    • unmatchables = Ø /* There are no other arcs out of <Alice>. */
    • closed is false /* The Shape's closed parameter has a value of false. */

It is quite common that Shapes will constrain their nested TripleConstraints with NodeConstraints. Here is an example including that, extra triples and a closed shape:

Schema:
 

S1
tc1



nc1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/UserShape",
    "type": "Shape",
    "extra": ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"], "expression":
    { "type": "TripleConstraint",
      "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
      "valueExpr":
      { "type": "NodeConstraint",
        "values": ["http://schema.example/Teacher"]
      } } } ] }
                                                                     

  ex:UserShape EXTRA a {
    a



      [ex:Teacher]
  }
              
Data:
 

t1
t2
t3
t4
t5
BASE <http://a.example/>
PREFIX ex: <http://schema.example/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
<Alice> ex:shoeSize "30"^^xsd:integer .
<Alice> a ex:Teacher .
<Alice> a ex:Person .
<SomeHat> ex:owner <Alice> .
<TheMoon> ex:madeOf <GreenCheese> .
Validate <Alice> as http://schema.example/UserShape:

The non-empty matchables is permitted because the triple t3 has a predicate which appears in the "extra" list: ["http://www.w3.org/1999/02/22-rdf-syntax-ns#type"].
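
The sets used in this walk-through can be computed with a short Python sketch. It is illustrative only and makes several assumptions: triples are (subject, predicate, object) tuples, the ex: prefix in the data expands to http://schema.example/, matched is supplied by the caller rather than derived, and matchables and unmatchables are taken from the unmatched arcs out as described in the comments of the first example. The function names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://schema.example/"   # assumed expansion of ex:
A = "http://a.example/"

def neighborhood_sets(G, n, matched, predicates):
    """Compute neigh, remainder, matchables and unmatchables for focus node n."""
    arcs_out = [t for t in G if t[0] == n]
    arcs_in = [t for t in G if t[2] == n and t[0] != n]
    neigh = arcs_out + arcs_in
    remainder = [t for t in neigh if t not in matched]
    matchables = [t for t in remainder if t in arcs_out and t[1] in predicates]
    unmatchables = [t for t in remainder if t in arcs_out and t[1] not in predicates]
    return {"neigh": neigh, "remainder": remainder,
            "matchables": matchables, "unmatchables": unmatchables}

def shape_conditions_hold(sets, extra=(), closed=False):
    """Check the extra and closed conditions once matched has been chosen."""
    extra_ok = all(t[1] in extra for t in sets["matchables"])
    closed_ok = not closed or not sets["unmatchables"]
    return extra_ok and closed_ok

t1 = (A + "Alice", EX + "shoeSize", "30")
t2 = (A + "Alice", RDF_TYPE, EX + "Teacher")
t3 = (A + "Alice", RDF_TYPE, EX + "Person")
t4 = (A + "SomeHat", EX + "owner", A + "Alice")
t5 = (A + "TheMoon", EX + "madeOf", A + "GreenCheese")
sets = neighborhood_sets([t1, t2, t3, t4, t5], A + "Alice",
                         matched=[t2], predicates={RDF_TYPE})
```

With t2 matched by the TripleConstraint, t3 is the only matchable and t1 the only unmatchable, so the shape holds when rdf:type is listed in extra and the shape is not closed.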

Disjunction Example

Schema:

 

S1
te1
tc1


nc1
te2
tc2


nc2
tc3


nc3
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/UserShape",
    "type": "Shape", "expression":
     {"type": "OneOf", "expressions": [
        { "type": "TripleConstraint",
          "predicate": "http://xmlns.com/foaf/0.1/name",
          "valueExpr":
            { "type": "NodeConstraint", "nodeKind": "literal" } },
        { "type": "EachOf", "expressions": [
            { "type": "TripleConstraint", "min": 1, "max": -1 ,
              "predicate": "http://xmlns.com/foaf/0.1/givenName",
              "valueExpr":
                { "type": "NodeConstraint", "nodeKind": "literal" } },
            { "type": "TripleConstraint",
              "predicate": "http://xmlns.com/foaf/0.1/familyName",
              "valueExpr":
                { "type": "NodeConstraint", "nodeKind": "literal" } }
        ] }
     ] }
 } ] }
                                                                      
  ex:UserShape
    {
     (              # extra ()s to clarify alignment with ShExJ
      foaf:name


              LITERAL |
      (             # extra ()s to clarify alignment with ShExJ
       foaf:givenName


               LITERAL+ ;
       foaf:familyName


               LITERAL
      )
     )
    }
Data:


t1
t2
t3
t4
t5
t6
BASE <http://a.example/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
<Alice> foaf:givenName "Alice" .
<Alice> foaf:givenName "Malsenior" .
<Alice> foaf:familyName "Walker" .
<Alice> foaf:mbox <mailto:alice@example.com> .
<Bob> foaf:knows <Alice> .
<Bob> foaf:mbox <mailto:bob@example.com> .

Per Shape Expression Semantics, <Alice> satisfies S1 with the simple ShapeMap

m:
{ "http://a.example/Alice": "http://schema.example/UserShape" }

as seen in this validation.

Validate <Alice> as http://schema.example/UserShape:

Replacing triples 1-3 with a single foaf:name property will also satisfy the schema.

Data:


t4
t5
t6
t7
BASE <http://a.example/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
<Alice> foaf:mbox <mailto:alice@example.com> .
<Bob> foaf:knows <Alice> .
<Bob> foaf:mbox <mailto:bob@example.com> .
<Alice> foaf:name "Alice Malsenior Walker" .
Validate <Alice> as http://schema.example/UserShape:

Any mixture of foaf:name with foaf:givenName or foaf:familyName will fail to satisfy the schema as there will be a matchable triple t3 that's not used in the triple expression te1.

Data:


t3
t4
t5
t6
t7
BASE <http://a.example/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
<Alice> foaf:familyName "Walker" .
<Alice> foaf:mbox <mailto:alice@example.com> .
<Bob> foaf:knows <Alice> .
<Bob> foaf:mbox <mailto:bob@example.com> .
<Alice> foaf:name "Alice Malsenior Walker" .
Validate <Alice> as http://schema.example/UserShape:

Adding foaf:familyName to S1's extra would allow this graph to satisfy the schema.

 

S1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/UserShape",
    "type": "Shape", "extra": ["http://xmlns.com/foaf/0.1/familyName"] …
   } ] }

Closing S1 would also cause a validation failure if unmatchables were not empty:

 

S1
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/UserShape",
    "type": "Shape", "closed": true …
   } ] }
  • G = [t4,t5,t6,t7]
  • satisfies Shape (<Alice>, S1, G, m)
    • unmatchables = [t5], closed is true

Dependent Shape Example

Schema:
 

S1
tc1


nc1


S2
tc2


nc2


{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint",
      "predicate": "http://schema.example/reproducedBy",
      "valueExpr":
      "http://schema.example/TesterShape" } },
  { "id": "http://schema.example/TesterShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint",
      "predicate": "http://schema.example/role",
      "valueExpr":
      { "type": "NodeConstraint",
        "values": [ "http://schema.example/testingRole" ] } } }
  ] }
ex:IssueShape {
  ex:reproducedBy @ex:TesterShape
}
ex:TesterShape {
  ex:role [ex:testingRole]
}
              
Data:

t1
t2
PREFIX ex: <http://schema.example/>
PREFIX inst: <http://inst.example/>
inst:Issue1 ex:reproducedBy inst:Tester2 .
inst:Tester2 ex:role ex:testingRole .

inst:Issue1 satisfies S1 with the ShapeMap

m:
{ "http://inst.example/Issue1": "http://schema.example/IssueShape",
  "http://inst.example/Tester2": "http://schema.example/TesterShape",
  "http://inst.example/Testgrammer23": "http://schema.example/ProgrammerShape" }
Validate inst:Issue1 as http://schema.example/IssueShape:

as seen in this evaluation:

Recursion Example

Schema:
 

S1
tc1


nc1

{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint", "min": 0, "max": -1,
      "predicate": "http://schema.example/related",
      "valueExpr":
      "http://schema.example/IssueShape"
    } } ] }
                                                               
  ex:IssueShape {


    ex:related

      @ex:IssueShape*
}
Data:
 

t1
t2
t3
PREFIX ex: <http://schema.example/>
PREFIX inst: <http://inst.example/>
inst:Issue1 ex:related inst:Issue2 .
inst:Issue2 ex:related inst:Issue3 .
inst:Issue3 ex:related inst:Issue1 .

inst:Issue1 satisfies S1 with the ShapeMap

m:
{ "http://inst.example/Issue1": "http://schema.example/IssueShape",
  "http://inst.example/Issue2": "http://schema.example/IssueShape",
  "http://inst.example/Issue3": "http://schema.example/IssueShape" }
Validate inst:Issue1 as http://schema.example/IssueShape:

as seen in this evaluation:

Simple Repeated Property Examples

Schema:
 

S1
te1
tc1


nc1

tc2


nc2

{ "type": "Schema", "shapes": [
    { "id": "http://schema.example/TestResultsShape",
      "type": "Shape", "expression": {
        "type": "EachOf", "expressions": [
          { "type": "TripleConstraint", "min": 1, "max": -1,
            "predicate": "http://schema.example/val",
            "valueExpr":
            { "type": "NodeConstraint",
              "values": [ {"value": "a"}, {"value": "b"}, {"value": "c"} ] } },
          { "type": "TripleConstraint", "min": 1, "max": -1,
            "predicate": "http://schema.example/val",
            "valueExpr":
            { "type": "NodeConstraint",
              "values": [ {"value": "b"}, {"value": "c"}, {"value": "d"} ] } }
        ] } } ] }
                                                                               
            <http://schema.example/TestResultsShape>
{


                         <http://schema.example/val>


                         ["a" "b" "c"]+ ;

                         <http://schema.example/val>


                         ["b" "c" "d"]+
}              
Data:


t1
t2
t3
t4
BASE <http://a.example/>
PREFIX ex: <http://schema.example/>
<s> ex:val "a" .
<s> ex:val "b" .
<s> ex:val "c" .
<s> ex:val "d" .

<s> satisfies S1 with:

m:
{ "http://a.example/s": "http://a.example/S1" }
Validate <s> as http://schema.example/TestResultsShape:

If tc1 consumes as many triples as it can, it consumes three and tc2 consumes one:

If we eliminate t4, either t2 or t3 must be allocated to tc2:
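
The possible allocations can be enumerated by brute force. The Python sketch below is an illustration, not the algorithm an implementation must use: it assumes each TripleConstraint is reduced to its value set, that both constraints have a minimum of 1 and no maximum, and the function name allocations is hypothetical.

```python
from itertools import product

def allocations(values, value_sets):
    """Yield tuples giving, for each value, the index of the constraint that consumes it.

    Every constraint must accept the values it receives and receive at least one.
    """
    indexes = range(len(value_sets))
    for choice in product(indexes, repeat=len(values)):
        accepted = all(value in value_sets[i] for value, i in zip(values, choice))
        if accepted and set(choice) == set(indexes):
            yield choice

tc1 = {"a", "b", "c"}
tc2 = {"b", "c", "d"}
with_t4 = list(allocations(["a", "b", "c", "d"], [tc1, tc2]))
without_t4 = list(allocations(["a", "b", "c"], [tc1, tc2]))
```

Index 0 stands for tc1 and index 1 for tc2. In every entry of without_t4 at least one of "b" or "c" carries index 1, which is the situation described above.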

Repeated Property With Dependent Shapes Example

Schema:
 

S1
te1

tc1

nc1

tc2

nc2

S2

tc3


nc3

S3

tc4


nc4
{ "type": "Schema", "shapes": [
  { "id": "http://schema.example/IssueShape",
    "type": "Shape", "expression":
    { "type": "EachOf", "expressions": [
        { "type": "TripleConstraint",
          "predicate": "http://schema.example/reproducedBy",
          "valueExpr":
          "http://schema.example/TesterShape" },
        { "type": "TripleConstraint",
          "predicate": "http://schema.example/reproducedBy",
          "valueExpr":
          "http://schema.example/ProgrammerShape" }
      ] } },
  { "id": "http://schema.example/TesterShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint",
      "predicate": "http://schema.example/role",
      "valueExpr":
      { "type": "NodeConstraint",
        "values": [ "http://schema.example/testingRole" ] } } },
  { "id": "http://schema.example/ProgrammerShape",
    "type": "Shape", "expression":
    { "type": "TripleConstraint",
      "predicate": "http://schema.example/department",
      "valueExpr":
      { "type": "NodeConstraint",
        "values": [ "http://schema.example/ProgrammingDepartment" ] } } }
  ] }
                                                                         
ex:IssueShape {



  ex:reproducedBy

    @ex:TesterShape;

  ex:reproducedBy

    @ex:ProgrammerShape
}
ex:TesterShape {


  ex:role

    [ex:testingRole]
}
ex:ProgrammerShape {


  ex:department

    [ex:ProgrammingDepartment]
}

Data:
 


t1
t2


t3


t4
t5
PREFIX ex: <http://schema.example/>
PREFIX inst: <http://inst.example/>
inst:Issue1
  ex:reproducedBy inst:Tester2 ;
  ex:reproducedBy inst:Testgrammer23 .

inst:Tester2              
  ex:role ex:testingRole .

inst:Testgrammer23                        
  ex:role ex:testingRole ;                
  ex:department ex:ProgrammingDepartment .

inst:Issue1 satisfies S1 with the ShapeMap

m:
{ "http://inst.example/Issue1": "http://schema.example/IssueShape",
  "http://inst.example/Tester2": "http://schema.example/TesterShape",
  "http://inst.example/Testgrammer23": "http://schema.example/ProgrammerShape" }
Validate inst:Issue1 as http://schema.example/IssueShape:

as seen in this evaluation:

Negation Example

Setting the maximum cardinality of a TripleConstraint with predicate p to zero (i.e. "max": 0 in ShExJ or {0} or {0, 0} in ShExC) asserts that matching nodes must have no triples with predicate p.

Schema:
 

S1
te1
tc1


nc1

tc2


{ "type": "Schema", "shapes": [
    { "id": "http://schema.example/TestResultsShape",
      "type": "Shape", "expression": {
        "type": "EachOf", "expressions": [
          { "type": "TripleConstraint", "min": 1, "max": -1,
            "predicate": "http://schema.example/p1",
            "valueExpr":
            { "type": "NodeConstraint",
              "values": [ {"value": "a"}, {"value": "b"} ] } },
          { "type": "TripleConstraint",
            "predicate": "http://schema.example/p2", "min": 0, "max": 0 }
        ] } } ] }
                                                                         
            <http://schema.example/TestResultsShape>
{


                         <http://schema.example/p1>


                         ["a" "b"] ;

                         <http://schema.example/p2> . {0}
}              
Data:


t1
BASE <http://a.example/>
PREFIX ex: <http://schema.example/>
<s> ex:p1 "a" .

<s> satisfies S1 with:

m:
{ "http://a.example/s": "http://a.example/S1" }
Validate <s> as http://schema.example/TestResultsShape:

Here tc1 consumes t1 and tc2, whose maximum cardinality is 0, consumes no triples:

If we add a t2 which matches tc2:

Data:


t1
t2
BASE <http://a.example/>
PREFIX ex: <http://schema.example/>
<s> ex:p1 "a" .
<s> ex:p2 5 .

every partition fails, either because matchables is non-empty or because the maximum cardinality on tc2 is exceeded:
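
A brute-force Python sketch can list why each candidate set of matched triples fails. It is illustrative and hard-codes assumptions about this one schema: tc1 takes ex:p1 triples with values "a" or "b" and needs at least one, tc2 takes ex:p2 triples with a maximum of 0, and no predicate is listed in extra. The function names are hypothetical.

```python
from itertools import combinations

P1 = "http://schema.example/p1"
P2 = "http://schema.example/p2"

def failure_reason(matched, triples):
    """Return None if this choice of matched triples satisfies the shape, else a reason."""
    to_tc1 = [t for t in matched if t[1] == P1]
    to_tc2 = [t for t in matched if t[1] == P2]
    if to_tc2:
        return "maximum cardinality on tc2 exceeded"
    leftover = [t for t in triples if t not in matched and t[1] in (P1, P2)]
    if leftover:
        return "matchables is non-empty"
    if not to_tc1 or any(t[2] not in ("a", "b") for t in to_tc1):
        return "tc1 not matched"
    return None

def outcomes(triples):
    """Map every subset of triples, tried as matched, to its failure reason."""
    return {matched: failure_reason(matched, triples)
            for size in range(len(triples) + 1)
            for matched in combinations(triples, size)}

t1 = ("http://a.example/s", P1, "a")
t2 = ("http://a.example/s", P2, 5)
one_triple = outcomes([t1])
two_triples = outcomes([t1, t2])
```

For the single-triple graph, matching t1 leaves no failure reason; once t2 is added, every candidate maps to one of the two reasons named above.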

ShEx Compact syntax (ShExC)

The ShEx Compact Syntax expresses ShEx schemas in a compact, human-friendly form. Parsing ShExC transforms a ShExC document into an equivalent ShExJ structure. This is defined as a BNF which accepts ShExC followed by instructions for translating the rules in the BNF production into their corresponding ShExJ objects. For example, "shapeExprDecl returns shapeExpression" indicates that the result of matching the shapeExprDecl production is the object produced by parsing the shapeExpression production.

Semantic actions before the first shape expression declaration are startActs. After the first shape expression declaration, semantic actions are associated with the previous declaration.

Below is the ShExC grammar following the notation in the XML specification[[!XML]]:
[1]    shexDoc    ::=    directive* ((notStartAction | startActions) statement*)?
followed by the associated ShExJ object(s):
Schema{ startActs:[SemAct]? start:shapeExpr? shapes:[shapeExpr+]? }
SemAct{name:IRI code:STRING? }
and a description of the mapping of rules in the production to elements of the ShExJ object:
[2]    directive    ::=    baseDecl | prefixDecl
[3]    baseDecl    ::=    "BASE" IRIREF
[4]    prefixDecl    ::=    "PREFIX" PNAME_NS IRIREF
[5]    notStartAction    ::=    start | shapeExprDecl
[6]    start    ::=    "start" '=' inlineShapeExpression
[7]    startActions    ::=    codeDecl+
[8]    statement    ::=    directive | notStartAction
[9]    shapeExprDecl    ::=    shapeExprLabel (shapeExpression | "EXTERNAL")
If the "EXTERNAL" keyword is present, shapeExprDecl returns a ShapeExternal object:
ShapeExternal{id:shapeExprLabel? }
otherwise shapeExprDecl returns shapeExpression.

Shape expressions are logical combinations of shape atoms. Inline variants of shape expressions are used in tripleConstraints and are not permitted to have annotations or semantic actions.

[10]    shapeExpression    ::=      "NOT"? shapeAtomNoRef shapeOr?
| "NOT" shapeRef shapeOr?
| shapeRef shapeOr
[11]    inlineShapeExpression    ::=    inlineShapeOr
[12]    shapeOr    ::=      ("OR" shapeAnd)+
| ("AND" shapeNot)+ ("OR" shapeAnd)*
[13]    inlineShapeOr    ::=    inlineShapeAnd ("OR" inlineShapeAnd)*
If the right shapeAnd matches one or more times, the result is a ShapeOr object with shapeExprs containing the first shapeAnd followed by the ordered list from the second shapeAnd:
ShapeOr{id:shapeExprLabel? shapeExprs:[shapeExpr] }
otherwise the result is the left shapeAnd.
[14]    shapeAnd    ::=    shapeNot ("AND" shapeNot)*
[15]    inlineShapeAnd    ::=    inlineShapeNot ("AND" inlineShapeNot)*
If the right shapeNot matches one or more times, the result is a ShapeAnd object with shapeExprs containing the first shapeNot followed by the ordered list from the second shapeNot:
ShapeAnd{id:shapeExprLabel? shapeExprs:[shapeExpr] }
otherwise the result is the left shapeNot.
[16]    shapeNot    ::=    "NOT"? shapeAtom
[17]    inlineShapeNot    ::=    "NOT"? inlineShapeAtom
If the left "NOT" matches, the result is a ShapeNot object with shapeExpr containing the shapeAtom:
ShapeNot{id:shapeExprLabel? shapeExpr:shapeExpr }
otherwise the result is the shapeAtom.
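
The result rules for shapeOr, shapeAnd and shapeNot above can be sketched as follows. This is a minimal, non-normative Python sketch: it assumes ShExJ objects are modelled as plain dictionaries carrying a "type" key, and the helper names make_nary and make_not are hypothetical.

```python
def make_nary(type_name, first, rest):
    """Return `first` unchanged when nothing follows it; otherwise build a
    ShapeOr or ShapeAnd object whose shapeExprs keep the parsed order."""
    if not rest:
        return first
    return {"type": type_name, "shapeExprs": [first, *rest]}


def make_not(negated, atom):
    """Wrap `atom` in a ShapeNot only when the leading "NOT" matched."""
    return {"type": "ShapeNot", "shapeExpr": atom} if negated else atom


# A negated node constraint OR-ed with a shape reference.
# The reference is represented by its label (shapeExprRef = shapeExprLabel).
iri_only = {"type": "NodeConstraint", "nodeKind": "iri"}
expr = make_nary("ShapeOr", make_not(True, iri_only), ["http://a.example/T"])
```

A single operand passes through untouched, so a schema never contains a ShapeOr or ShapeAnd with fewer than two members.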

Shape atoms are shape references (indicated by "@"), definitions, or nested expressions.

[18]    shapeAtom    ::=       nonLitNodeConstraint shapeOrRef?
| litNodeConstraint
| shapeOrRef nonLitNodeConstraint? | '(' shapeExpression ')'
| '.'
[19]    shapeAtomNoRef    ::=       nonLitNodeConstraint shapeOrRef?
| litNodeConstraint
| shapeDefinition nonLitNodeConstraint?
| '(' shapeExpression ')'
| '.'
[20]    inlineShapeAtom    ::=       nonLitNodeConstraint inlineShapeOrRef?
| litNodeConstraint
| inlineShapeOrRef nonLitNodeConstraint?
| '(' shapeExpression ')'
| '.'
[21]    shapeOrRef    ::=       shapeDefinition | shapeRef
[22]    inlineShapeOrRef    ::=       inlineShapeDefinition | shapeRef
[23]    shapeRef    ::=       ATPNAME_LN | ATPNAME_NS | '@' shapeExprLabel
shapeExprRef = shapeExprLabel ;

Node constraints identify a (possibly infinite) set of matching RDF nodes.

[24]    litNodeConstraint    ::=       "LITERAL" xsFacet*
| nonLiteralKind stringFacet*
| datatype xsFacet*
| valueSet xsFacet*
| numericFacet+
[25]    nonLitNodeConstraint    ::=       nonLiteralKind stringFacet*
| stringFacet+
NodeConstraint{nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRI? xsFacet* values:[valueSetValue]? }
[26]    nonLiteralKind    ::=    "IRI" | "BNODE" | "NONLITERAL"
[27]    xsFacet    ::=    stringFacet | numericFacet
xsFacet=stringFacet | numericFacet ;
[28]    stringFacet    ::=       stringLength INTEGER
| REGEXP
[29]    stringLength    ::=    "LENGTH" | "MINLENGTH" | "MAXLENGTH"
stringFacet=("length"|"minlength"|"maxlength"):INTEGER | pattern:STRING flags:STRING? ;
[30]    numericFacet    ::=       numericRange numericLiteral
| numericLength INTEGER
[31]    numericRange    ::=    "MININCLUSIVE" | "MINEXCLUSIVE" | "MAXINCLUSIVE" | "MAXEXCLUSIVE"
[32]    numericLength    ::=    "TOTALDIGITS" | "FRACTIONDIGITS"
numericFacet=("mininclusive"|"minexclusive"|"maxinclusive"|"maxexclusive"):numericLiteral | ("totaldigits"|"fractiondigits"):INTEGER ;
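
To illustrate how the stringFacet properties of a NodeConstraint might be applied to a lexical form, here is a hedged Python sketch. It assumes lengths are counted in Unicode code points and uses Python's re module as a stand-in for the regular expression language that ShEx actually specifies, so results can differ for patterns outside the common subset. The function name string_facets_ok is hypothetical.

```python
import re

# Map ShExC flag characters onto Python's closest equivalents (an approximation).
_FLAGS = {"s": re.S, "m": re.M, "i": re.I, "x": re.X}


def string_facets_ok(lexical, facets):
    """Check a lexical form against length/minlength/maxlength/pattern facets."""
    size = len(lexical)  # len() on a Python str counts code points
    if "length" in facets and size != facets["length"]:
        return False
    if "minlength" in facets and size < facets["minlength"]:
        return False
    if "maxlength" in facets and size > facets["maxlength"]:
        return False
    if "pattern" in facets:
        flags = 0
        for ch in facets.get("flags", ""):
            flags |= _FLAGS[ch]
        if re.search(facets["pattern"], lexical, flags) is None:
            return False
    return True
```

For example, string_facets_ok("ab12", {"minlength": 2, "pattern": "^[a-z]+[0-9]+$"}) holds, while the same pattern rejects "AB12" unless the "i" flag is supplied.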

Shape definitions associate a triple expression with a closed flag and a list of partially constrained (extra) predicates. Any predicate appearing in a triple expression is fully constrained unless it appears in the list of extras.

[33]    shapeDefinition    ::=    (extraPropertySet | "CLOSED")* '{' tripleExpression? '}' annotation* semanticActions
[34]    inlineShapeDefinition    ::=    (extraPropertySet | "CLOSED")* '{' tripleExpression? '}'
Shape{ closed:BOOL? extra:[IRI]? expression:tripleExpr? annotations:[Annotation]? semActs:[SemAct]? }
Annotation{predicate:IRI object:objectValue }
  • closed is true if the "CLOSED" choice was matched one or more times.
  • extra is the set of IRIs matching the extraPropertySet production.
  • expression comes from the tripleExpression production.
  • annotations is the set of Annotations matching the annotation production.
  • semActs is the set of semantic actions matching the semanticActions production.
[35]    extraPropertySet    ::=    "EXTRA" predicate+

Triple expressions are arrangements of triple constraints.

[36]    tripleExpression    ::=    oneOfTripleExpr
[37]    oneOfTripleExpr    ::=    groupTripleExpr | multiElementOneOf
[38]    multiElementOneOf    ::=    groupTripleExpr ('|' groupTripleExpr)+
If the right groupTripleExpr matches one or more times, the result is a OneOf object with expressions containing the first groupTripleExpr followed by the ordered list from the second groupTripleExpr:
OneOf {id:tripleExprLabel? expressions:[tripleExpr] min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
otherwise the result is the left groupTripleExpr.
[39]    innerTripleExpr    ::=    multiElementGroup | multiElementOneOf
[40]    groupTripleExpr    ::=    singleElementGroup | multiElementGroup
[41]    singleElementGroup    ::=    unaryTripleExpr ';'?
[42]    multiElementGroup    ::=    unaryTripleExpr (';' unaryTripleExpr)+ ';'?
If the right unaryTripleExpr matches one or more times, the result is an EachOf object with expressions containing the first unaryTripleExpr followed by the ordered list from the second unaryTripleExpr:
EachOf {id:tripleExprLabel? expressions:[tripleExpr] min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
otherwise the result is the left unaryTripleExpr.
[43]    unaryTripleExpr    ::=       ('$' tripleExprLabel)? (tripleConstraint | bracketedTripleExpr)
| include
[44]    bracketedTripleExpr    ::=    '(' innerTripleExpr ')' cardinality? annotation* semanticActions

Triple constraints are matched against RDF triples.

[45]    tripleConstraint    ::=    senseFlags? predicate inlineShapeExpression cardinality? annotation* semanticActions
TripleConstraint{inverse:BOOL? predicate:IRI valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct]? annotations:[Annotation]? }
[46]    cardinality    ::=    '*' | '+' | '?' | REPEAT_RANGE
In ShExJ, "*" is represented as -1, standing for the unbounded cardinality.
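
The mapping from a cardinality token to ShExJ min and max values can be sketched as below. This Python sketch is illustrative only: the function name cardinality is hypothetical, signed integers are not handled, and treating "{m,}" the same as "{m,*}" (unbounded maximum) is an assumption based on the REPEAT_RANGE terminal rather than something stated in this section.

```python
import re

UNBOUNDED = -1  # ShExJ representation of "*"

# "{" INTEGER ( "," (INTEGER | "*")? )? "}" without sign handling
_REPEAT_RANGE = re.compile(r"\{([0-9]+)(?:(,)([0-9]+|\*)?)?\}")


def cardinality(token):
    """Return (min, max) for '*', '+', '?' or a REPEAT_RANGE token."""
    if token == "*":
        return (0, UNBOUNDED)
    if token == "+":
        return (1, UNBOUNDED)
    if token == "?":
        return (0, 1)
    match = _REPEAT_RANGE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a cardinality token: {token!r}")
    minimum = int(match.group(1))
    if match.group(2) is None:          # "{m}": exactly m
        return (minimum, minimum)
    upper = match.group(3)
    if upper is None or upper == "*":   # "{m,}" or "{m,*}": no upper bound
        return (minimum, UNBOUNDED)
    return (minimum, int(upper))        # "{m,n}"
```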
[47]    senseFlags    ::=    '^'

Value sets identify ranges of RDF nodes by explicit inclusion or by range (indicated by "~"). Ranges may include exclusions, which may also be ranges but must not in turn contain exclusions. A valueSetValue may be an objectValue or one of IriStem, IriStemRange, LiteralStem, LiteralStemRange, LanguageStem, or LanguageStemRange.

[48]    valueSet    ::=    '[' valueSetValue* ']'
[49]    valueSetValue    ::=    iriRange | literalRange | languageRange
| exclusion+
If "." matches and exclusion matches one or more times, all matched items must be consistently iri, literal, or language. valueSetValue returns either an IriStemRange, LiteralStemRange, or LanguageStemRange object with exclusions equal to the set of results of exclusion:
IriStemRange{stem:(IRI | Wildcard) exclusions:[valueSetValue]? }
LiteralStemRange{stem:(LITERAL | Wildcard) exclusions:[valueSetValue]? }
LanguageStemRange{stem:(language tag | Wildcard) exclusions:[valueSetValue]? }
If "~" matches with no exclusion, valueSetValue returns a Wildcard object:
Wildcard{/* contains no properties */ }
[50]    exclusion    ::=    '-' (iri | literal | LANGTAG) '~'?
[51]    iriRange    ::=       iri ('~' exclusion*)?
If iri matches with no "~", iriRange returns iri.
If iri and "~" match with no iriExclusion, iriRange returns an IriStem object:
IriStem{stem:IRI }
If iri and "~" match and iriExclusion matches one or more times, iriRange returns an IriStemRange object with exclusions equal to the set of results of iriExclusion:
[52]    iriExclusion    ::=    '-' iri '~'?
[53]    literalRange    ::=       literal ('~' literalExclusion*)?
If literal matches with no "~", literalRange returns literal.
If literal and "~" match with no literalExclusion, literalRange returns a LiteralStem object:
LiteralStem{stem:objectLiteral }
If literal and "~" match and literalExclusion matches one or more times, literalRange returns a LiteralStemRange object with exclusions equal to the set of results of literalExclusion:
[54]    literalExclusion    ::=    '-' literal '~'?
[55]    languageRange    ::=       LANGTAG ('~' languageExclusion*)?
If LANGTAG matches with no "~", languageRange returns LANGTAG.
If LANGTAG and "~" match with no languageExclusion, languageRange returns a LanguageStem object:
LanguageStem{stem:objectLiteral }
If LANGTAG and "~" match and languageExclusion matches one or more times, languageRange returns a LanguageStemRange object with exclusions equal to the set of results of languageExclusion:
[56]    languageExclusion    ::=    '-' LANGTAG '~'?
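
As an illustration of how an IriStemRange with exclusions might be tested against an IRI, consider the sketch below. It is not normative: it assumes that a stem matches when the IRI starts with the stem string, that a Wildcard stem (modelled here as a dictionary) matches every IRI, and that an exclusion is either a plain IRI (exact match) or an IriStem (prefix match). The function name iri_in_stem_range is hypothetical.

```python
def iri_in_stem_range(iri, stem_range):
    """Return True if `iri` falls inside the range and is not excluded."""
    stem = stem_range["stem"]
    # A string stem must be a prefix of the IRI; a Wildcard accepts anything.
    if isinstance(stem, str) and not iri.startswith(stem):
        return False
    for exclusion in stem_range.get("exclusions", []):
        if isinstance(exclusion, str):
            if iri == exclusion:                 # excluded individual IRI
                return False
        elif iri.startswith(exclusion["stem"]):  # excluded IriStem
            return False
    return True


codes = {
    "type": "IriStemRange",
    "stem": "http://a.example/codes/",
    "exclusions": [
        "http://a.example/codes/bad",
        {"type": "IriStem", "stem": "http://a.example/codes/tmp-"},
    ],
}
```

Because exclusions may themselves be stems but never carry further exclusions, a single pass over the list is enough.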

Triple expressions can include the shapeExpression in a shapeExprDecl.

[57]    include    ::=    '&' tripleExprLabel
Per the triple expression reference requirement, tripleExprLabel property MUST appear in the schema's shapes map and the corresponding triple expression MUST be a Shape with a tripleExpr.
tripleExprRef=include:tripleExprLabel ;

Triple expressions can include annotations in the form of a tuple of a predicate and an iri or literal.

[58]    annotation    ::=    "//" predicate (iri | literal)
Annotation{predicate:IRI object:objectValue }

Triple expressions can include semantic actions consisting of an iri and an optional code string.

[59]    semanticActions    ::=    codeDecl*
[60]    codeDecl    ::=    '%' iri (CODE | '%')
SemAct{name:IRI code:STRING? }

The remaining productions come from the specifications for SPARQL and Turtle.

[13t]    literal    ::=    rdfLiteral | numericLiteral | booleanLiteral
[61]    predicate    ::=    iri | RDF_TYPE
[62]    datatype    ::=    iri
[63]    shapeExprLabel    ::=    iri | blankNode
[64]    tripleExprLabel    ::=    iri | blankNode
[16t]    numericLiteral    ::=    INTEGER | DECIMAL | DOUBLE
[65]    rdfLiteral    ::=    langString | string ("^^" datatype)?
returns: literal The literal has a lexical form of the first rule argument, String. If the '^^' iri rule matched, the datatype is iri and the literal has no language tag. If the langString rule matched, the datatype is rdf:langString and the language tag is extracted from langTag. If neither matched, the datatype is xsd:string and the literal has no language tag.
[134s]    booleanLiteral    ::=    "true" | "false"
returns: literal The literal has a lexical form of true or false, depending on which matched the input, and a datatype of xsd:boolean.
[135s]    string    ::=       STRING_LITERAL1 | STRING_LITERAL_LONG1
| STRING_LITERAL2 | STRING_LITERAL_LONG2
[66]    langString    ::=       LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1
| LANG_STRING_LITERAL2 | LANG_STRING_LITERAL_LONG2
[136s]    iri    ::=    IRIREF | prefixedName
[137s]    prefixedName    ::=    PNAME_LN | PNAME_NS
[138s]    blankNode    ::=    BLANK_NODE_LABEL

Terminals

Terminals return:

  • the RDF abstract types IRI, lexical form, literal, language tag.
  • a string of unicode codepoints for CODE.
  • a repeat range for REPEAT_RANGE. A repeat range is a tuple of non-negative integers or a non-negative integer and a token for *.
[67]    <CODE>    ::=    "{" ([^%\\] | "\\" [%\\] | UCHAR)* "%" "}"
returns: a string of unicode codepoints The characters between "{" and "%}" are taken, with the numeric escape sequences unescaped, to form the unicode string of the code.
[68]    <REPEAT_RANGE>    ::=    "{" INTEGER ( "," (INTEGER | "*")? )? "}"
returns: repeat range The base-10 numeric values of each INTEGER are taken; if "*" was matched, the result is a non-negative integer and an * token.
[69]    <RDF_TYPE>    ::=    "a"
returns: IRI The iri http://www.w3.org/1999/02/22-rdf-syntax-ns#type is returned.
[18t]    <IRIREF>    ::=    "<" ([^#0000- <>\"{}|^`\\] | UCHAR)* ">"
returns: IRI The characters between "<" and ">" are taken, with the numeric escape sequences unescaped, to form the unicode string of the IRI. Relative IRI resolution is performed per Turtle Section 6.3.
[140s]    <PNAME_NS>    ::=    PN_PREFIX? ":"
returns: PREFIX When used in a prefixDecl production, the prefix is a potentially empty unicode string matching the first argument of the rule and serves as a key into the prefixes map.
returns: IRI When used elsewhere, the iri is the value in the prefixes map corresponding to the first argument of the rule.
[141s]    <PNAME_LN>    ::=    PNAME_NS PN_LOCAL
returns: IRI A potentially empty prefix is identified by the first token, PNAME_NS. The prefixes map MUST have a corresponding namespace. The unicode string of the IRI is formed by unescaping the reserved characters in the second argument, PN_LOCAL, and concatenating this onto the namespace.
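
The PNAME_LN resolution described above can be sketched in Python as follows. The sketch assumes the prefixes map is a dictionary from prefix strings (possibly empty) to namespace IRIs; the function name resolve_pname_ln is hypothetical. Only the reserved-character escapes of PN_LOCAL_ESC are unescaped; PERCENT sequences are left as written.

```python
import re

# Backslash followed by one of the PN_LOCAL_ESC characters.
_PN_LOCAL_ESC = re.compile(r"\\([_~.\-!$&'()*+,;=/?#@%])")


def resolve_pname_ln(pname, prefixes):
    """Expand a prefixed name such as 'ex:item' into a full IRI string."""
    # PN_PREFIX cannot contain ":", so the first colon ends PNAME_NS.
    prefix, local = pname.split(":", 1)
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + _PN_LOCAL_ESC.sub(r"\1", local)
```

With prefixes {"ex": "http://a.example/"}, the name ex:a\~b resolves to http://a.example/a~b.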
[70]    <ATPNAME_NS>    ::=    "@" PN_PREFIX? ":"
returns: IRI The iri is the value in the prefixes map corresponding to the second token of the rule.
[71]    <ATPNAME_LN>    ::=    "@" PNAME_NS PN_LOCAL
returns: IRI A potentially empty prefix is identified by the second token, PNAME_NS. The prefixes map MUST have a corresponding namespace. The unicode string of the IRI is formed by unescaping the reserved characters in the third token, PN_LOCAL, and concatenating this onto the namespace.
[72]    <REGEXP>    ::=    '/' ([^/\\\n\r] | '\\' [nrt\\|.?*+(){}$-\[\]^/] | UCHAR)+ '/' [smix]*
{pattern:STRING flags:STRING? }
returns: JSON object pattern is a unicode string formed from the characters between the outermost '/'s by unescaping matches of '\\' '/' in the terminal pattern as well as the numeric escape sequences matched by UCHAR. The remaining escape sequences are included verbatim in pattern, e.g.
^\/\t\\\U0001D4B8$
would become
^/\t\\𝒸$
.
flags is a sequence of the characters [smix] if any were matched. Otherwise no flags attribute is returned.
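
The unescaping rule for REGEXP bodies can be sketched as below. This Python sketch is an illustration under the reading given above: "\/" becomes "/", UCHAR escapes become the code point they name, and every other backslash escape is copied through unchanged. The function name unescape_regexp is hypothetical. Escapes are scanned left to right so that an escaped backslash is consumed as a unit before any following UCHAR is considered.

```python
import re

# One backslash escape: a 4-digit or 8-digit UCHAR, or any single escaped character.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))", re.DOTALL)


def unescape_regexp(body):
    """Turn the text between the outermost '/'s into the ShExJ pattern string."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        if hex_digits:                    # UCHAR: emit the code point itself
            return chr(int(hex_digits, 16))
        if match.group(3) == "/":         # "\/" only existed to protect the delimiter
            return "/"
        return match.group(0)             # everything else stays verbatim
    return _ESCAPE.sub(replace, body)
```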
[142s]    <BLANK_NODE_LABEL>    ::=    "_:" (PN_CHARS_U | [0-9]) ((PN_CHARS | ".")* PN_CHARS)?
returns: BNode The characters following the "_:" form a blank node identifier. This corresponds to any blank node in the input dataset that had the same label.
[145s]    <LANGTAG>    ::=    "@" ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)*
returns: language tag The characters following the @ form the unicode string of the language tag.
[19t]    <INTEGER>    ::=    [+-]? [0-9]+
returns: literal The literal has a lexical form of the input string, and a datatype of xsd:integer.
[20t]    <DECIMAL>    ::=    [+-]? [0-9]* "." [0-9]+
returns: literal The literal has a lexical form of the input string, and a datatype of xsd:decimal.
[21t]    <DOUBLE>    ::=    [+-]? ([0-9]+ "." [0-9]* EXPONENT | "."? [0-9]+ EXPONENT)
returns: literal The literal has a lexical form of the input string, and a datatype of xsd:double.
[155s]    <EXPONENT>    ::=    [eE] [+-]? [0-9]+
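
The three numeric terminals can be told apart with the patterns from their productions. The Python sketch below is illustrative: it returns a dictionary shaped like an ObjectLiteral (value and type), maps DECIMAL to xsd:decimal as in Turtle, and uses the hypothetical function name numeric_literal.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Patterns transcribed from the INTEGER, DECIMAL, DOUBLE and EXPONENT productions.
_NUMERIC = (
    (re.compile(r"[+-]?[0-9]+"), "integer"),
    (re.compile(r"[+-]?[0-9]*\.[0-9]+"), "decimal"),
    (re.compile(r"[+-]?(?:[0-9]+\.[0-9]*[eE][+-]?[0-9]+|\.?[0-9]+[eE][+-]?[0-9]+)"), "double"),
)


def numeric_literal(token):
    """Return the literal for a numeric token, keeping its lexical form as written."""
    for pattern, local_name in _NUMERIC:
        if pattern.fullmatch(token):
            return {"value": token, "type": XSD + local_name}
    raise ValueError(f"not a numeric literal: {token!r}")
```

The lexical form is kept exactly as it appeared in the input, so "01" and "1" yield distinct literals with the same datatype.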
[156s]    <STRING_LITERAL1>    ::=    "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'"
returns: lexical form The characters between the outermost "'"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.
[157s]    <STRING_LITERAL2>    ::=    '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"'
returns: lexical form The characters between the outermost '"'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.
[158s]    <STRING_LITERAL_LONG1>    ::=    "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''"
returns: lexical form The characters between the outermost "'''"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.
[159s]    <STRING_LITERAL_LONG2>    ::=    '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""'
returns: lexical form The characters between the outermost '"""'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.
[73]    <LANG_STRING_LITERAL1>    ::=    "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" LANGTAG
returns: lexical form The characters between the outermost "'"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string.
[74]    <LANG_STRING_LITERAL2>    ::=    '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' LANGTAG
returns: lexical form The characters between the outermost '"'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string.
[75]    <LANG_STRING_LITERAL_LONG1>    ::=    "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" LANGTAG
returns: lexical form The characters between the outermost "'''"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string.
[76]    <LANG_STRING_LITERAL_LONG2>    ::=    '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' LANGTAG
returns: lexical form The characters between the outermost '"""'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form. The trailing LANGTAG is used to create a language-tagged string.
[26t]    <UCHAR>    ::=       "\\u" HEX HEX HEX HEX
| "\\U" HEX HEX HEX HEX HEX HEX HEX HEX
[160s]    <ECHAR>    ::=    "\\" [tbnrf\\\"\\']
[164s]    <PN_CHARS_BASE>    ::=       [A-Z] | [a-z]
| [#00C0-#00D6] | [#00D8-#00F6] | [#00F8-#02FF]
| [#0370-#037D] | [#037F-#1FFF]
| [#200C-#200D] | [#2070-#218F] | [#2C00-#2FEF]
| [#3001-#D7FF] | [#F900-#FDCF] | [#FDF0-#FFFD]
| [#10000-#EFFFF]
[165s]    <PN_CHARS_U>    ::=    PN_CHARS_BASE | "_"
[167s]    <PN_CHARS>    ::=       PN_CHARS_U | "-" | [0-9]
| [#00B7] | [#0300-#036F] | [#203F-#2040]
[168s]    <PN_PREFIX>    ::=    PN_CHARS_BASE ( (PN_CHARS | ".")* PN_CHARS )?
[169s]    <PN_LOCAL>    ::=    (PN_CHARS_U | ":" | [0-9] | PLX) ( (PN_CHARS | "." | ":" | PLX)* (PN_CHARS | ":" | PLX) )?
[170s]    <PLX>    ::=    PERCENT | PN_LOCAL_ESC
[171s]    <PERCENT>    ::=    "%" HEX HEX
[172s]    <HEX>    ::=    [0-9] | [A-F] | [a-f]
[173s]    <PN_LOCAL_ESC>    ::=    "\\" ( "_" | "~" | "." | "-" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | "/" | "?" | "#" | "@" | "%" )
[98]    PASSED TOKENS    ::=       [ \t\r\n]+
| "#" [^\r\n]*

The Application Programming Interface

This API provides a clean mechanism that enables developers to determine conformance of RDF Graphs to a ShEx Schema.

The ShEx API uses Promises to represent the result of the various asynchronous operations. Promises are defined in [[ECMASCRIPT-6.0]]. General use within specifications can be found in [[promises-guide]].

The ShExProcessor Interface

The ShExProcessor interface is the high-level programming structure that developers use to determine conformance of RDF Graphs to a ShEx Schema.

It is important to highlight that implementations do not modify the input parameters. If an error is detected, the Promise is rejected passing a ShExError with the corresponding error code.

validate
Checks conformance of the given validate.graph with validate.schema using validate.shapeMap according to .
  1. Create a new Promise promise and return it. The following steps are then executed asynchronously.
  2. If validate.schema has the form of an IRI, set schema to the result of dereferencing the IRI, treating the result as ShExC, ShExJ, or some other RDF serialization, depending on the content-type of the result, and transforming into the ShExJ abstract syntax for processing.
  3. Otherwise, if a USVString, validate.schema is treated as a string, and parsed into the ShExJ abstract syntax, using format as a hint to determine the content-type.
  4. Otherwise, if an RDF Graph, schema is transformed into the ShExJ abstract syntax for processing.
  5. If validate.schema is parsed as ShExC, and a syntax error is found, reject promise passing syntax error.
  6. If a transformation to the ShExJ abstract syntax does not result in valid ShExJ, reject promise, passing invalid schema.
  7. If validate.graph has the form of an IRI, set graph to the result of dereferencing the IRI, treating the result as an RDF Graph serialization, depending on the content-type of the result. Otherwise, validate.graph either is an RDF Graph, or a processor should use other means to transform it into an RDF Graph.
  8. Process schema and graph according to the requirements in , creating shapeResults, a ShapeResults dictionary based on a copy of validate.shapeMap with entries added, as necessary, for each focus node fi in focusNodes associated with http://www.w3.org/ns/shex#Start as a shapeExprLabel, a BOOL result value, depending on whether the node conforms with the referenced shape(s), and an optional message describing details of conformance. The promise is then fulfilled using shapeResults.
  9. If graph is found not to conform using schema and validate.shapeMap, reject promise, passing non-conforming graph along with shapeResults.
schema
The ShEx Schema, either as a dereferenceable IRI (URL), an inline data string, or an RDF Graph which is transformed into the ShExJ abstract syntax.
graph
The graph to check conformance with, either as a dereferenceable IRI to some RDF serialization, or as an accessible RDF Graph.
shapeMap
A dictionary mapping a node in validate.graph to one or more shapeExprLabel.
options
A set of options to configure the algorithms.
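
Steps 8 and 9 of validate can be sketched as follows. This Python sketch is a simplification under stated assumptions: it is synchronous (a returned value stands in for a fulfilled Promise and a raised ShExError for a rejected one), the schema and graph are assumed to be already loaded, and the actual conformance test is delegated to a caller-supplied function named conforms, which is hypothetical and not defined by this specification.

```python
START = "http://www.w3.org/ns/shex#Start"


class ShExError(Exception):
    """Carries an error code, an optional message and optional shape results."""

    def __init__(self, code, message=None, results=None):
        super().__init__(message or code)
        self.code = code
        self.message = message
        self.results = results


def validate(schema, graph, shape_map, conforms, focus_nodes=()):
    """Build ShapeResults from a copy of shape_map plus start entries for focus nodes."""
    pending = {node: list(labels) for node, labels in shape_map.items()}  # copy, never mutate input
    for node in focus_nodes:
        labels = pending.setdefault(node, [])
        if START not in labels:
            labels.append(START)
    results = {
        node: [
            {"shape": label, "result": bool(conforms(schema, graph, node, label))}
            for label in labels
        ]
        for node, labels in pending.items()
    }
    if not all(entry["result"] for entries in results.values() for entry in entries):
        raise ShExError("non-conforming graph", results=results)
    return results
```

Passing the results along with the error lets a caller see which node and shape pairs failed, mirroring the rejection described in step 9.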

The RDFGraph Type

The RDFGraph type describes an RDF Graph accessed as described in Graph access.

bgp
Performs a SPARQL Basic Graph Pattern query on the graph.

The ShExOptions Type

The ShExOptions type is used to pass various options to the ShExProcessor methods.

base
The base IRI to use when parsing validate.schema. If set, this overrides the validate.schema's IRI.
focusNodes
One or more nodes in validate.graph which are checked for conformance against start.
format
One of ShExC or ShExJ.
processingMode
If set to shex-2.0, the implementation has to produce exactly the same results as the algorithms defined in this specification. If set to another value, the ShEx processor is allowed to extend or modify the algorithms defined in this specification to enable application-specific optimizations. The definition of such optimizations is beyond the scope of this specification and thus not defined. Consequently, different implementations may implement different optimizations. Developers must not define modes beginning with shex as they are reserved for future versions of this specification.

The ShapeResults Type

The ShapeResults type is used for reporting the result of conformance checking of an RDF Graph against a schema.

ShapeResults is a dictionary mapping a node in validate.graph to a sequence of ShapeResult entries.

shape
A shapeExprLabel in validate.schema against which node conformance was checked.
result
The result of the conformance check.
reason
An optional message describing the result of the conformance check.

Error Handling

This section describes the datatype definitions used within the ShEx API for error handling.

ShExError

The ShExError type is used to report processing errors.

code
a string representing the particular error type, as described in the various algorithms in this document.
message
an optional error message containing additional debugging information. The specific contents of error messages are outside the scope of this specification.

ShExErrorCode

The ShExErrorCode represents the collection of valid ShEx error codes.

invalid schema
The validate.schema argument to validate is invalid.
non-conforming graph
The validate.graph is found not to conform with validate.schema.
syntax error
Parsing validate.schema as ShExC failed with a syntax error.

ShEx JSON Syntax (ShExJ)

This section defines a set of properties and permitted values for describing Shape Expressions in JSON [[!RFC7159]], and how these properties should be interpreted by applications.

A ShExJ document is a JSON-LD [[!JSON-LD]] document which uses a prescribed structure holding an object at the top level. This object describes a schema containing shape expressions and triple expressions. A ShExJ document MUST include a @context property referencing http://www.w3.org/ns/shex.jsonld, and MAY include additional contexts and term definitions. In the absence of a top-level @context, ShEx Processors MUST act as if a @context property is present with the value http://www.w3.org/ns/shex.jsonld.

A ShExJ document can also be thought of as the serialization of an RDF Graph using the Shape Expression Vocabulary [[shex-vocab]] which conforms to the shape defined in . Processors MAY choose to interpret a ShExJ document as an RDF Graph and process using RDF Graph semantics. Processors may also transform arbitrary RDF Graphs conforming to into ShExJ using a mechanism not described within this specification.

In ShExJ, the unbounded cardinality constraint is -1, rather than "*".

This is the complete grammar for ShExJ.

Schema{"@context": "http://www.w3.org/ns/shex.jsonld"
startActs:[SemAct+]? start:shapeExpr? shapes:[shapeExpr+]? }
shapeExpr=ShapeOr | ShapeAnd | ShapeNot | NodeConstraint | Shape | shapeExprRef | ShapeExternal ;
ShapeOr{id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] }
ShapeAnd{id:shapeExprLabel? shapeExprs:[shapeExpr{2,}] }
ShapeNot{id:shapeExprLabel? shapeExpr:shapeExpr }
NodeConstraint{id:shapeExprLabel? nodeKind:("iri" | "bnode" | "nonliteral" | "literal")? datatype:IRI? xsFacet* values:[valueSetValue+]? }
Shape{id:shapeExprLabel? closed:BOOL? extra:[IRI+]? expression:tripleExpr? semActs:[SemAct+]? annotations:[Annotation+]? }
shapeExprRef=shapeExprLabel ;
ShapeExternal{/* empty */ }
tripleExpr=EachOf | OneOf | TripleConstraint | tripleExprRef ;
EachOf{id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? }
OneOf{id:tripleExprLabel? expressions:[tripleExpr{2,}] min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? }
tripleExprRef{include:shapeExprLabel }
TripleConstraint{id:tripleExprLabel? inverse:BOOL? predicate:IRI valueExpr:shapeExpr? min:INTEGER? max:INTEGER? semActs:[SemAct+]? annotations:[Annotation+]? }
xsFacet=stringFacet | numericFacet ;
stringFacet=("length"|"minlength"|"maxlength"):INTEGER | pattern:STRING flags:STRING? ;
numericFacet=("mininclusive"|"minexclusive"|"maxinclusive"|"maxexclusive"):numericLiteral
|("totaldigits"|"fractiondigits"):INTEGER ;
valueSetValue=objectValue | IriStem | IriStemRange | LiteralStem | LiteralStemRange | LanguageStem | LanguageStemRange ;
objectValue=IRI | ObjectLiteral ;
IriStem{stem:IRI }
IriStemRange{stem:(IRI | Wildcard) exclusions:[objectValue|IriStem +]? }
LiteralStem{stem:ObjectLiteral }
LiteralStemRange{stem:(ObjectLiteral | Wildcard) exclusions:[objectValue|LiteralStem +]? }
LanguageStem{stem:ObjectLiteral }
LanguageStemRange{stem:(ObjectLiteral | Wildcard) exclusions:[objectValue|LanguageStem +]? }
Wildcard{/* empty */ }
SemAct{name:IRI code:STRING? }
Annotation{predicate:IRI object:objectValue }
shapeExprLabel=IRI | BNODE ;
tripleExprLabel=IRI | BNODE ;
numericLiteral=INTEGER | DECIMAL | DOUBLE ;
ObjectLiteral{value:STRING language:STRING? type:STRING? }
Terminals constrained by the terminal rules in the compact syntax
PREFIX:PN_PREFIX
IRI:( [^#0000- <>\"{}|^`\\] | UCHAR )* # (IRIREF without the enclosing "<>"s)
BNODE:BLANK_NODE_LABEL
BOOL:"true" | "false" # a JSON boolean value
INTEGER:INTEGER # a JSON string matching the lexical form of an integer
DECIMAL:DECIMAL # a JSON string matching the lexical form of a decimal
DOUBLE:DOUBLE # a JSON string matching the lexical form of a double
STRING:'"' .* '"' # a JSON string starting and ending with the U+0022 (") character
DATATYPE_STRING:'"' .* '"^^' ( [^#0000- <>\"{}|^`\\] | UCHAR )* # a JSON string starting with U+0022 ("), followed by the lexical form of the string, then U+0022 U+005E U+005E ("^^) and the IRI of the datatype
LANG_STRING:'"' .* '"@' LANGTAG # a JSON string starting with U+0022 ("), followed by the lexical form of the string, then U+0022 U+0040 ("@) and the LANGTAG
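
To make the grammar above more concrete, the following Python sketch builds a small schema and serializes it as JSON. It is illustrative only: it assumes each ShExJ object carries a "type" property naming its grammar rule, writes min and max as JSON numbers, and uses made-up IRIs under http://a.example/. The helper effective_context is hypothetical and mirrors the rule that a missing top-level @context is treated as the ShExJ context.

```python
import json

SHEXJ_CONTEXT = "http://www.w3.org/ns/shex.jsonld"

schema = {
    "@context": SHEXJ_CONTEXT,
    "type": "Schema",
    "shapes": [
        {
            "id": "http://a.example/IssueShape",
            "type": "Shape",
            "expression": {
                "type": "TripleConstraint",
                "predicate": "http://a.example/status",
                "valueExpr": {"type": "NodeConstraint", "nodeKind": "iri"},
                "min": 1,
                "max": -1,  # unbounded, written "+" in ShExC
            },
        }
    ],
}


def effective_context(document):
    """Return the declared @context, or the ShExJ default when it is absent."""
    return document.get("@context", SHEXJ_CONTEXT)


text = json.dumps(schema, indent=2)
```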

ShEx Shape

The following schema describes the form of an RDF Graph conforming to a Shape Expression schema, consistent with the description of ShExJ.


IANA Considerations

This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.

application/shex

Type name:
application
Subtype name:
shex
Required parameters:
None
Encoding considerations:
See RFC 6839, section 3.1.
Security considerations:
Given that ShExC allows the substitution of long IRIs with short terms, ShExC documents may expand considerably when processed and, in the worst case, the resulting data might consume all of the recipient's resources. Applications should treat any data with due skepticism.
Interoperability considerations:
Not Applicable
Published specification:
http://www.w3.org/TR/json-ld
Applications that use this media type:
Any programming environment that requires the exchange of directed graphs. Implementations of JSON-LD have been created for JavaScript, Python, Ruby, PHP, and C++.
Additional information:
Magic number(s):
Not Applicable
File extension(s):
.shex
Macintosh file type code(s):
TEXT
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
Common
Restrictions on usage:
None
Author(s):
Eric Prud'hommeaux, Iovka Boneva, Jose Labra Gayo, Gregg Kellogg, Niklas Lindström
Change controller:
W3C

Fragment identifiers used with application/shex are treated as in RDF syntaxes, as per RDF 1.1 Concepts and Abstract Syntax [[RDF11-CONCEPTS]].

Security Considerations

Consider requirements from Self-Review Questionnaire: Security and Privacy.

Since ShEx is intended to be a pure data exchange format for validating RDF graphs, the ShExJ serialization SHOULD NOT be passed through a code execution mechanism such as JavaScript's eval() function to be parsed. An (invalid) document may contain code that, when executed, could lead to unexpected side effects compromising the security of a system.
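
A minimal sketch of the safe approach, in Python: the document is handed to a data-only JSON parser, never to an evaluator. The function name load_shexj is hypothetical, and the top-level check reflects only the requirement that a ShExJ document holds an object at the top level.

```python
import json


def load_shexj(text):
    """Parse ShExJ text as data; input that is not JSON raises an error instead of running."""
    document = json.loads(text)  # a JSON parser only builds values, it never executes them
    if not isinstance(document, dict):
        raise ValueError("a ShExJ document holds an object at the top level")
    return document
```

Input such as __import__("os").getcwd() is rejected with a parse error here, whereas an eval-based loader would execute it.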

See also,