ShEx is a schema language for RDF graphs. It provides structural constraints on graphs and on the lexical forms of literals. ShExPath addresses elements in a ShEx schema. This can be used to anchor validation results, identify regions of RDF graphs, or tie external annotations to elements in a schema.
ShExPath identifies Shape Expressions and Triple Expressions. It does not, on its own, identify portions of Node Constraints.
This specification describes the structure of ShEx schemas in terms of ShExJ.
Elements in the ShExJ are addressed in JavaScript notation, e.g. S.shapes[N].
A ShExPath is a Unicode string which defines a traversal of a Shape Expressions schema.
An item is one of these elements from a Shape Expressions schema:
A value is a sequence of items.
This specification uses XPath's notion of item, value, step and singleton to leverage shared understanding. Would other terms be more useful?
Is a value a set?
A singleton is a sequence of exactly one item.
Do we need some way to say a singleton is expected? This would be useful at the end, to say a ShExPath should specify a single item, but also useful earlier in the path for debugging.
A context label identifies the expected element type in the ShExJ where each item should be.
The context labels are: ShapeAnd | ShapeOr | ShapeNot | EachOf | OneOf | NodeConstraint | TripleConstraint.
An index may be an integer i, or an RDF node (URL or blank node label) N with an optional integer Ni.
Integer indexes are 1-based accessors to array elements in a schema.
Out of range indexes are ignored (they do not contribute any new items to a result value).
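The two rules above (1-based access, out-of-range indexes contribute nothing) can be sketched as follows. The helper name `pick_by_index` and the list-of-labels stand-in for schema elements are assumptions made for illustration; they are not part of this specification.

```python
def pick_by_index(elements, i):
    """Return a value (a list of items) holding the i-th element, counting from 1.

    Hypothetical helper: out-of-range indexes are ignored, so they yield
    an empty value instead of raising an error.
    """
    if 1 <= i <= len(elements):
        return [elements[i - 1]]
    return []  # out of range: contributes no items to the result


# Stand-in for the array of shapes in a schema (labels only, for brevity).
shapes = ["<#IssueShape>", "<#UserShape>"]

first = pick_by_index(shapes, 1)    # selects <#IssueShape>
missing = pick_by_index(shapes, 3)  # out of range, so the value is empty
```

Note that index 0 is also out of range under a 1-based scheme, so it yields an empty value here as well.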
The function evaluateIndex maps a value to a new value:
For each item I in value:
- If there is an E in I.shapes with E.id = N and there is no Ni or Ni = 1, add E to the new value.
- Let TCs be the (possibly empty) set of triple constraints in the shape expression I.expressions with a predicate of N.
- … Ni or Ni = 1, add I to the new value.
ShExPath indexes are 1-based, following XPath's precedent. Change to 0-based?
A ShExPath is divided into steps (see StepExpr in the grammar).
The initial step may be the character /.
All following steps are separated by / and include an optional context label followed by a required index.
evaluateShExPath takes as arguments a ShExPath P, a schema S and an initial value V.
It iteratively calls evaluateStep with each step and either the initial value or the results of the last invocation of evaluateStep.
evaluateStep(P, S, V) is a function that takes as arguments a ShExPath P, a schema S and a value V and produces a new value.
The operators are evaluated as follows:
step | action |
---|---|
/ | value = the list of shapes in the schema |
context label | the items in value are tested for alignment with the context label. It is a fatal error if an item's type is not the same as the context label. This does not change the value. Is this an error or just a filter? |
index | value = evaluateIndex(index, value) |
empty string | the evaluation is terminated, the result is value. |
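The step table above can be read as a small interpreter loop. The following is a minimal sketch under several assumptions that are not stated in this specification: steps arrive already parsed as `(kind, argument)` tuples, the leading `/` resets the value to the schema itself so that indexes can read its `shapes` array, a context-label mismatch is treated as an error and not as a filter, and the hypothetical `children` helper decides which ShExJ array an index addresses. The optional integer Ni is left out.

```python
def children(item):
    """Array that an index addresses for an item (an assumption of this sketch)."""
    kind = item.get("type")
    if kind == "Schema":
        return item.get("shapes", [])
    if kind in ("ShapeAnd", "ShapeOr"):
        return item.get("shapeExprs", [])
    if kind in ("EachOf", "OneOf"):
        return item.get("expressions", [])
    if kind == "Shape":
        expr = item.get("expression")
        if expr is None:
            return []
        # A shape holding a single triple constraint has no EachOf wrapper.
        return children(expr) if expr.get("type") in ("EachOf", "OneOf") else [expr]
    return []


def evaluate_index(index, value):
    """Map a value to a new value; `index` is an int or a label string."""
    result = []
    for item in value:
        kids = children(item)
        if isinstance(index, int):
            if 1 <= index <= len(kids):  # out-of-range indexes add nothing
                result.append(kids[index - 1])
        else:
            result.extend(k for k in kids
                          if k.get("id") == index or k.get("predicate") == index)
    return result


def evaluate_step(step, schema, value):
    """Apply one pre-parsed step to a value and return the new value."""
    kind, arg = step
    if kind == "root":     # the leading "/"
        return [schema]
    if kind == "context":  # test only; the value is not changed
        for item in value:
            if item.get("type") != arg:
                raise ValueError(f"expected {arg}, found {item.get('type')}")
        return value
    if kind == "index":
        return evaluate_index(arg, value)
    raise ValueError(f"unknown step kind: {kind}")


def evaluate_shexpath(steps, schema, value):
    """Feed the result of each step into the next one."""
    for step in steps:
        value = evaluate_step(step, schema, value)
    return value


# A reduced, ShExJ-like schema used only for this illustration.
schema = {"type": "Schema", "shapes": [
    {"type": "Shape", "id": "#IssueShape", "expression": {
        "type": "EachOf", "expressions": [
            {"type": "TripleConstraint", "predicate": ":name"},
            {"type": "TripleConstraint", "predicate": ":category"},
        ]}},
]}

steps = [("root", None), ("index", "#IssueShape"), ("context", "Shape"), ("index", 2)]
result = evaluate_shexpath(steps, schema, [])
```

With this reduced schema, `result` holds the single `:category` triple constraint. Treating a mismatch as a filter instead would mean dropping non-matching items in the `context` branch in place of raising.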
Should we advance through ShapeAnd automagically? This requires more aggressive searching but maybe that's worth it. This would be useful when shape expr is a ShapeAnd of node constraint and a shape:
<#UserShape> IRI /User\?id=[0-9]+/ { foaf:mbox IRI }
Compare @<#UserShape>/2/foaf:mbox with @<#UserShape>/foaf:mbox.
This schema excerpt describes an issue that might appear in an issue tracking system.
<#IssueShape> {
  :name STRING MinLength 4;
  :category ["bug" "feature request"];
  :postedBy @<#UserShape>;
  :processing {
    :reproduced [true false];
    :priority xsd:integer
  }?
}
<#UserShape> IRI /User\?id=[0-9]+/ {
  ( foaf:name xsd:string |
    foaf:givenName +;
    foaf:familyName );
  foaf:mbox IRI
}
Elements can be addressed either by label or index.
For shape expressions, the label is the name of the shape expression.
Shape expression labels or indexes are prefixed by "@".
For triple constraints, the label is the name of the predicate for that triple constraint.
Elements of triple expressions may be selected by index within the triple expression.
ShExPath | value |
---|---|
/@<#IssueShape>/:category | the :category constraint in the #IssueShape shape, addressed by the name of the shape expression followed by the name of the :category property. |
/@<#IssueShape>/2 | the :category constraint in the #IssueShape shape, addressed by the name of the shape expression followed by the index of the :category property. |
/@1/2 | the :category constraint in the #IssueShape shape, addressed by the index of the shape expression followed by the index of the :category property. |
A traversal through a schema, for instance, the results of a validation, can be expressed as a ShExPath.
Such a ShExPath can include shape references in value expressions.
These can be appended to the triple expression path with a / separator.
ShExPath | value |
---|---|
/@<#IssueShape>/:postedBy/@<#UserShape>/foaf:mbox | the :postedBy constraint in the #IssueShape shape, which then references the foaf:mbox property in the #UserShape shape. |
/@1/3/@2/2 | the same path, but with ordinals. |
For added clarity and confidence, the ShExJ type of any addressable shape expression or triple expression can be tested. If the expression type is specified and does not match the corresponding type in the schema, the path is invalid.
The axes ShapeAnd, ShapeOr, ShapeNot, EachOf, OneOf, NodeConstraint and TripleConstraint may be used to specify the expected expression type.
ShExPath | value |
---|---|
/@<#UserShape>/ShapeAnd 2/foaf:mbox | the foaf:mbox constraint in #UserShape's shape. Note that IRI /User\?id=[0-9]+/ {...} compiles to a ShapeAnd with the first component a NodeConstraint and the second a shape. |
The axes EachOf, OneOf and TripleConstraint may be used to specify the expected triple expression type.
ShExPath | value |
---|---|
/1/ShapeAnd 2/EachOf 2 | the :category constraint in the #IssueShape shape, explicitly labeling the index axes. |
/<#UserShape>/2/EachOf 1/OneOf 2 | the EachOf containing foaf:givenName and foaf:familyName in the #UserShape shape. |
/<#UserShape>/2/EachOf 1/EachOf 2 | invalid path for the given schema. |
Invocation of evaluateShExPath includes a value.
ShExPaths starting with a / force the context to be the schema.
If the value is a schema, indexes access entries in the ShExJ .shapes property.
The application may provide a different value.
A context path is a ShExPath which, when evaluated against the schema, produces an equivalent value.
For instance, validation results MAY be reported using relative ShExPaths.
Context | ShExPath | value |
---|---|---|
/@<#IssueShape> | :category | the :category constraint in the #IssueShape shape. |
Relative ShExPaths can be concatenated to their context with a / separator.
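As a small illustration of that concatenation, the sketch below joins a context path and a relative ShExPath. The function name `join_shexpath` is hypothetical, and the handling of a context that is just `/` is an assumption of this sketch.

```python
def join_shexpath(context, relative):
    """Join a context path and a relative ShExPath with a single "/" separator.

    Hypothetical helper: a trailing "/" on the context (including a context
    that is just "/") is dropped first so the separator is never doubled.
    """
    return context.rstrip("/") + "/" + relative


absolute = join_shexpath("/@<#IssueShape>", ":category")
from_root = join_shexpath("/", "@<#IssueShape>")
```

Here `absolute` is the path from the first table in this document that addresses the :category constraint by label.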
If more than one triple constraint has the same predicate, they can be indexed by the order they would be encountered in a depth-first search. If the index is omitted, it is assumed to be 1.
<BPObs> {
  :component { code: "systolic"; value: xsd:double };
  :component { code: "diastolic"; value: xsd:double };
  :component { code: "posture"; value: @<Postures> }?;
}
ShExPath | value |
---|---|
/<BPObs>/:component 3 | the third :component triple expression (the one expecting a code of "posture"). |
/<BPObs>/:component | the first :component triple expression (the one expecting a code of "systolic"). |
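Selecting among triple constraints that share a predicate might be sketched as below. This is an illustration under assumptions, not a normative algorithm: the search descends only through `expressions` arrays (whether it should also enter nested value expressions is not settled here), the function names are hypothetical, and the `note` key exists only to tell the three constraints apart in this example.

```python
def triple_constraints(expr):
    """Yield the triple constraints under a triple expression, depth-first."""
    if expr.get("type") == "TripleConstraint":
        yield expr
    for child in expr.get("expressions", []):
        yield from triple_constraints(child)


def select_by_predicate(expr, predicate, ni=1):
    """Return a value holding the ni-th constraint on `predicate` (1-based).

    An omitted index defaults to 1; an out-of-range index yields an empty value.
    """
    tcs = [tc for tc in triple_constraints(expr) if tc["predicate"] == predicate]
    return [tcs[ni - 1]] if 1 <= ni <= len(tcs) else []


# Reduced stand-in for the triple expression of <BPObs>.
bp_obs = {"type": "EachOf", "expressions": [
    {"type": "TripleConstraint", "predicate": ":component", "note": "systolic"},
    {"type": "TripleConstraint", "predicate": ":component", "note": "diastolic"},
    {"type": "TripleConstraint", "predicate": ":component", "note": "posture"},
]}

third = select_by_predicate(bp_obs, ":component", 3)  # like /<BPObs>/:component 3
default = select_by_predicate(bp_obs, ":component")   # like /<BPObs>/:component
```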
ShExPathExpr ::= AbsolutePathExpr | RelativePathExpr
AbsolutePathExpr ::= "/" RelativePathExpr
RelativePathExpr ::= StepExpr ("/" StepExpr)*
StepExpr ::= ContextTest? ExprIndex | ContextTest
ContextTest ::= ShapeExprContext | TripleExprContext
ShapeExprContext ::= "ShapeAnd" | "ShapeOr" | "ShapeNot" | "NodeConstraint" | "Shape"
TripleExprContext ::= "EachOf" | "OneOf" | "TripleConstraint"
ExprIndex ::= ShapeExprIndex | TripleExprIndex
ShapeExprIndex ::= "@" (INTEGER | ShapeExprLabel)
ShapeExprLabel ::= iri | BLANK_NODE_LABEL
TripleExprIndex ::= INTEGER | TripleExprLabel
TripleExprLabel ::= (iri | BLANK_NODE_LABEL) INTEGER?
[136s] iri ::= IRIREF | prefixedName
[137s] prefixedName ::= PNAME_LN | PNAME_NS
@terminals
[18t] IRIREF ::= '<' ([^#x00-#x20<>\"{}|^`\\] | UCHAR)* '>'
[140s] PNAME_NS ::= PN_PREFIX? ':'
[141s] PNAME_LN ::= PNAME_NS PN_LOCAL
[142s] BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
[19t] INTEGER ::= [+-]? [0-9]+
[26t] UCHAR ::= '\\u' HEX HEX HEX HEX | '\\U' HEX HEX HEX HEX HEX HEX HEX HEX
[160s] ECHAR ::= '\\' [tbnrf\\\"\']
[164s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[165s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
[167s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | [#x00B7] | [#x0300-#x036F] | [#x203F-#x2040]
[168s] PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS | '.')* PN_CHARS)?
[169s] PN_LOCAL ::= (PN_CHARS_U | ':' | [0-9] | PLX) ((PN_CHARS | '.' | ':' | PLX)* (PN_CHARS | ':' | PLX))?
[170s] PLX ::= PERCENT | PN_LOCAL_ESC
[171s] PERCENT ::= '%' HEX HEX
[172s] HEX ::= [0-9] | [A-F] | [a-f]
[173s] PN_LOCAL_ESC ::= '\\' ('_' | '~' | '.' | '-' | '!' | '$' | '&' | "'" | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%')
This grammar does not ensure that an absolute path starts with a ShapeExprLabel and ShapeExprContext, which would require duplication of the RelativePathExpr and StepExpr productions.
Nor does it ensure a striping between shape expression labels, triple expression labels, and any nested shape expression labels in triple expression value expressions.
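Because an IRIREF may itself contain "/", splitting a ShExPath into StepExpr strings takes slightly more than a plain split on the separator. The following is a rough sketch of one way to do it, not a full parser for the grammar. The name `split_steps` is hypothetical, and the sketch assumes that "/" only occurs as a separator or inside angle brackets, so it does not handle the escaped "\/" that PN_LOCAL_ESC permits in prefixed names.

```python
import re

# One step: a run of bracketed IRIREFs and characters other than "/" or "<".
_STEP = re.compile(r"(?:<[^>]*>|[^/<])+")


def split_steps(path):
    """Split a ShExPath into (is_absolute, list of step strings)."""
    absolute = path.startswith("/")
    body = path[1:] if absolute else path
    return absolute, _STEP.findall(body)


by_label = split_steps("/@<#IssueShape>/:category")
with_iri = split_steps("@<http://a.example/Issue>/2")
with_ni = split_steps("/<BPObs>/:component 3")
```

Each returned step still carries its optional context label and index (for example `:component 3`), which a fuller parser would separate according to the StepExpr production.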