SHACL (Shapes Constraint Language) validates that an RDF graph meets your rules and reports exactly which nodes violate which constraint, making data quality consistent and defensible across a whole dataset. You write shapes, declarative descriptions of what a valid object, person or place looks like, then run a validator that produces a machine-readable report of every violation with its focus node and failing constraint. The discipline that makes this reliable is targeting: a shape with no target silently validates nothing and gives a false pass, which is the single most common reason SHACL "approves" broken data.
What is SHACL for, and how is it different from OWL?
OWL describes meaning under an open-world assumption and infers facts; it will not tell you a record is missing a creator, because under the open world that fact might simply be stated elsewhere. SHACL validates the graph as given, in effect closing the world, and asks a different question: is the required data actually present and well-formed here? For data-quality work such as gatekeeping a publication pipeline or catching empty dates, you want SHACL.
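To make the contrast concrete, here is a minimal sketch that states the same intent ("an object has a creator") once as an OWL restriction and once as a SHACL shape. The ex: namespace, ex:CreatorShape and ex:vase are placeholders invented for illustration, and the three parts are shown in one file only for brevity.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm:     <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace (assumption)

# OWL: every object has at least one creator. A reasoner concludes that
# some creator exists for ex:vase; it does not flag the record as incomplete.
crm:E22_Human-Made_Object rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty dcterms:creator ;
    owl:minCardinality "1"^^xsd:nonNegativeInteger ;
] .

# SHACL: the same intent as a constraint. A validator reports ex:vase
# because no dcterms:creator triple is present in the data graph.
ex:CreatorShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    sh:property [
        sh:path dcterms:creator ;
        sh:minCount 1 ;
    ] .

# Data: an object with no creator stated.
ex:vase a crm:E22_Human-Made_Object .
```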
How do node shapes and property shapes fit together?
A sh:NodeShape says which nodes to check; nested sh:PropertyShapes say what must hold for each property.
```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# ex: is a placeholder namespace; substitute your own.
@prefix ex: <http://example.org/> .

ex:ObjectShape a sh:NodeShape ;
    # The target: every instance of this class is checked.
    sh:targetClass crm:E22_Human-Made_Object ;
    # Required: at least one plain string title.
    sh:property [
        sh:path dcterms:title ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
        sh:severity sh:Violation ;
    ] ;
    # Recommended: at least one creator, given as an IRI.
    sh:property [
        sh:path dcterms:creator ;
        sh:minCount 1 ;
        sh:nodeKind sh:IRI ;
        sh:severity sh:Warning ;
    ] .
```

Here every human-made object must have a string title (a hard violation) and should have an IRI creator (a warning). The sh:targetClass is what gives the shape teeth.
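sh:targetClass is only one of the target declarations in SHACL Core. The sketch below shows the others on hypothetical shapes; every ex: name (ex:object42, ex:Person, ex:name and the shape names) is a placeholder invented for illustration.

```turtle
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace (assumption)

# Target one named node, whatever its class.
ex:SingleRecordShape a sh:NodeShape ;
    sh:targetNode ex:object42 ;
    sh:property [ sh:path dcterms:title ; sh:minCount 1 ] .

# Target every node that is the subject of a dcterms:creator triple.
ex:HasCreatorShape a sh:NodeShape ;
    sh:targetSubjectsOf dcterms:creator ;
    sh:property [ sh:path dcterms:title ; sh:minCount 1 ] .

# Target every value of dcterms:creator and require it to be an IRI.
ex:CreatorValueShape a sh:NodeShape ;
    sh:targetObjectsOf dcterms:creator ;
    sh:nodeKind sh:IRI .

# Implicit class target: a node that is both a class and a node shape
# targets its own instances without an explicit sh:targetClass.
ex:Person a rdfs:Class, sh:NodeShape ;
    sh:property [ sh:path ex:name ; sh:minCount 1 ] .
```

Whichever form you choose, the target has to match nodes that actually occur in the data graph, otherwise the shape checks nothing.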
How do I constrain values to a controlled vocabulary?
Heritage data lives or dies by controlled terms. Two clauses cover most cases:
- sh:class requires the value to be an instance of a class.
- sh:in restricts to an explicit allowed list of URIs.
```turtle
# Fragment: this property shape belongs inside a node shape such as ex:ObjectShape.
sh:property [
    sh:path crm:P2_has_type ;
    sh:in ( ex:painting ex:drawing ex:print ) ;
    sh:message "type must be one of the approved object types" ;
] .
```

For a SKOS scheme, point sh:class at skos:Concept and optionally check scheme membership with a SPARQL-based constraint.
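The SKOS case can be sketched as follows. This is one possible arrangement, not the only one: ex:TypeInSchemeShape and the scheme IRI ex:objectTypes are hypothetical, the query uses full IRIs to avoid declaring SPARQL prefixes, and SHACL-SPARQL support depends on your validator.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix crm:  <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix ex:   <http://example.org/> .   # placeholder namespace (assumption)

ex:TypeInSchemeShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    # Core check: every type value must be typed as a skos:Concept.
    sh:property [
        sh:path crm:P2_has_type ;
        sh:class skos:Concept ;
        sh:message "type must be a skos:Concept" ;
    ] ;
    # SPARQL check: each row returned by the query is reported as a result.
    # ex:objectTypes is a hypothetical concept scheme IRI.
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "type must belong to the approved concept scheme" ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this <http://www.cidoc-crm.org/cidoc-crm/P2_has_type> ?value .
                FILTER NOT EXISTS {
                    ?value <http://www.w3.org/2004/02/skos/core#inScheme>
                           <http://example.org/objectTypes> .
                }
            }
        """ ;
    ] .
```

Both checks read the data graph, so the vocabulary triples (the rdf:type skos:Concept and skos:inScheme statements) must be loaded alongside the records being validated.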
How do I run validation and read the report?
pySHACL and Apache Jena are the workhorses.
```bash
# pySHACL
pyshacl -s shapes.ttl -f human data.ttl
# Jena
shacl validate --shapes shapes.ttl --data data.ttl
```

The output is itself an RDF graph. The fields you read first:
| Report field | Meaning |
|---|---|
| sh:focusNode | The node that failed |
| sh:resultPath | Which property |
| sh:sourceConstraintComponent | Which rule (e.g. sh:MinCountConstraintComponent) |
| sh:resultSeverity | Violation, Warning or Info |
| sh:resultMessage | Your custom message |
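For orientation, a single result in the report graph looks roughly like this. It is a hand-written illustration rather than captured tool output: ex:object42 and the message text are invented, and real reports carry additional fields such as sh:sourceShape.

```turtle
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace (assumption)

[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:focusNode ex:object42 ;                # the node that failed
        sh:resultPath dcterms:title ;             # the property involved
        sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
        sh:resultSeverity sh:Violation ;
        sh:resultMessage "title is required" ;    # illustrative message
    ] .
```

Under the SHACL specification, sh:conforms is true only when no results at all are produced, so a report containing only warnings still has sh:conforms false unless the validator is configured otherwise.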
Why does a shape pass when data is obviously wrong?
Three usual culprits, in order:
- No target. The shape lacks sh:targetClass/sh:targetNode, so it inspects nothing.
- Wrong target class. The data uses a different class URI than the shape expects.
- Constraint never set. You meant sh:minCount 1 but left it off, so absence is legal.
Add a deliberately broken record to your test data; if SHACL does not flag it, your shape is not targeting what you think.
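A known-bad fixture can be as small as one record. The sketch below is written against the ex:ObjectShape shown earlier; ex:brokenObject is a hypothetical test record, and the ex: namespace is a placeholder.

```turtle
@prefix crm:     <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace (assumption)

# Deliberately broken on two counts:
#   1. no dcterms:title, so sh:minCount 1 on the title fails (Violation)
#   2. the creator is a literal, so sh:nodeKind sh:IRI fails (Warning)
ex:brokenObject a crm:E22_Human-Made_Object ;
    dcterms:creator "Unknown maker" .
```

If validation of this file reports nothing, the first thing to check is whether the class IRI in the fixture matches the one named in sh:targetClass.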
A working checklist before you trust the report
- Every shape has an explicit, correct target.
- Required fields use sh:minCount and the right sh:datatype or sh:nodeKind.
- Controlled values use sh:in or sh:class.
- Severities are deliberate: hard structure as sh:Violation, niceties as sh:Warning.
- Each constraint has an sh:message so reports are self-explanatory.
- A known-bad fixture exists and does fail, proving the shapes bite.
- Validation runs in CI so regressions are caught before publication.
Key Takeaways
- SHACL checks that required data is present and well-formed; OWL infers meaning, it does not gatekeep.
- sh:NodeShape selects nodes via a target; sh:PropertyShape constrains individual properties.
- Constrain controlled values with sh:in (explicit list) or sh:class (instance of).
- Use sh:severity to separate blocking violations from advisory warnings.
- Read sh:focusNode, sh:resultPath and sh:sourceConstraintComponent in the report.
- A passing shape on broken data almost always means a missing or wrong target.
- Keep a known-bad fixture and run validation in CI so quality stays defensible.
Frequently Asked Questions
What is SHACL and how does it differ from OWL?
SHACL (Shapes Constraint Language) validates RDF graphs against constraints and reports violations. OWL describes meaning and infers new facts under an open-world assumption, whereas SHACL closes the world to check that required data is actually present and well-formed.
What is the difference between sh:NodeShape and sh:PropertyShape?
A sh:NodeShape declares which nodes to validate, usually via sh:targetClass, and groups property constraints. A sh:PropertyShape, attached through sh:property, constrains one path, such as setting cardinality, datatype or value range for a single property.
Does SHACL validation report warnings or only errors?
Both. Each shape, including each property shape, can declare a sh:severity of sh:Violation, sh:Warning or sh:Info (the default is sh:Violation), so you can treat missing recommended fields as warnings while hard structural problems remain blocking violations.
Can SHACL check that a value is a valid URI from a controlled vocabulary?
Yes. Use sh:class to require the value be an instance of a class, or sh:in to restrict to an enumerated list of allowed URIs, which is ideal for constraining values to a SKOS concept scheme.
How do I run SHACL validation from the command line?
Tools like pySHACL (pyshacl -s shapes.ttl data.ttl) or Apache Jena's shacl validate take a shapes file and a data file and emit a validation report graph listing each violation, its focus node and the failing constraint.
Why does my SHACL shape pass when the data is clearly missing fields?
Usually the shape has no target, so it validates nothing. Check that sh:targetClass, sh:targetNode or another target points at real nodes, and that your minimum cardinality (sh:minCount) is actually set on the property.