
SHACL (Shapes Constraint Language) validates that an RDF graph meets your rules and reports exactly which nodes violate which constraint, making data quality consistent and defensible across a whole dataset. You write shapes, declarative descriptions of what a valid object, person or place looks like, then run a validator that produces a machine-readable report of every violation with its focus node and failing constraint. The discipline that makes this reliable is targeting: a shape with no target silently validates nothing and gives a false pass, which is the single most common reason SHACL "approves" broken data.

What is SHACL for, and how is it different from OWL?

OWL describes meaning under an open-world assumption and infers facts; it will not tell you a record is missing a creator, because the open world assumes the fact might exist elsewhere. SHACL closes the world and asks a different question: is the required data actually present and well-formed here? For data-quality work, such as gatekeeping a publication pipeline or catching empty dates, you want SHACL.
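
To make the contrast concrete, here is a minimal sketch that states the same intent, "every object has a creator", once as an OWL axiom and once as a SHACL shape. The ex: namespace and the shape name are placeholders, and the OWL axiom is written purely for illustration; it is not part of CIDOC-CRM.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm:     <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace

# OWL (illustrative axiom): every object has some creator.
# Given an object with no dcterms:creator triple, a reasoner concludes that
# an unnamed creator exists; it does not report the record as incomplete.
crm:E22_Human-Made_Object rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty dcterms:creator ;
    owl:someValuesFrom owl:Thing
] .

# SHACL: the creator triple must be present in the graph being validated.
# The same object now produces a validation result.
ex:CreatorRequiredShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    sh:property [
        sh:path dcterms:creator ;
        sh:minCount 1 ;
    ] .
```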

How do node shapes and property shapes fit together?

A sh:NodeShape says which nodes to check; nested sh:PropertyShapes say what must hold for each property.

turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm:  <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # placeholder namespace for the example shapes

ex:ObjectShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    sh:property [
        sh:path dcterms:title ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
        sh:severity sh:Violation ;
    ] ;
    sh:property [
        sh:path dcterms:creator ;
        sh:minCount 1 ;
        sh:nodeKind sh:IRI ;
        sh:severity sh:Warning ;
    ] .

Here every human-made object must have a string title (a hard violation) and should have an IRI creator (a warning). The sh:targetClass is what gives the shape teeth.
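
A small data graph shows how the shape above would be expected to behave. This is a hand-written sketch: the ex: namespace, the node names and the literal values are all invented for illustration.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm:     <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace

# Conforms: a plain string title and an IRI creator.
ex:object1 a crm:E22_Human-Made_Object ;
    dcterms:title "Study of a Hand" ;
    dcterms:creator ex:artist7 .

# Violation: no dcterms:title (sh:minCount 1).
# Warning: the creator is a literal, not an IRI (sh:nodeKind sh:IRI).
ex:object2 a crm:E22_Human-Made_Object ;
    dcterms:creator "Unknown workshop" .

# Violation: a language-tagged literal has datatype rdf:langString,
# so it does not satisfy sh:datatype xsd:string.
ex:object3 a crm:E22_Human-Made_Object ;
    dcterms:title "Étude de main"@fr ;
    dcterms:creator ex:artist7 .

# Never checked: no rdf:type matching the target class,
# so this node is not a focus node of the shape.
ex:object4 dcterms:creator "Unknown workshop" .
```

If your titles carry language tags, sh:datatype rdf:langString or sh:or over both datatypes is the usual adjustment.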

How do I constrain values to a controlled vocabulary?

Heritage data lives or dies by controlled terms. Two clauses cover most cases:

  • sh:class requires the value to be an instance of a class.
  • sh:in restricts the value to an explicit list of allowed terms, typically IRIs.

turtle
# Added to the node shape from the earlier example
ex:ObjectShape sh:property [
    sh:path crm:P2_has_type ;
    sh:in ( ex:painting ex:drawing ex:print ) ;
    sh:message "type must be one of the approved object types" ;
] .

For a SKOS scheme, point sh:class at skos:Concept and optionally check membership with a SPARQL-based constraint.
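
One way this could look is sketched below. The shape name and the scheme IRI ex:objectTypes are hypothetical, and the query uses full IRIs so that no sh:prefixes declaration is needed. Note that sh:class can only succeed if the rdf:type skos:Concept triples for the vocabulary are loaded into the data graph alongside the records.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix crm:  <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix ex:   <http://example.org/> .   # placeholder namespace

ex:TypedObjectShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;

    # Every type value must be typed as a skos:Concept.
    sh:property [
        sh:path crm:P2_has_type ;
        sh:class skos:Concept ;
        sh:message "type must be a skos:Concept" ;
    ] ;

    # SPARQL-based constraint: each row returned is reported as a result.
    # It selects type values that are not in the (hypothetical) approved scheme.
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "type must belong to the approved concept scheme" ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this <http://www.cidoc-crm.org/cidoc-crm/P2_has_type> ?value .
                FILTER NOT EXISTS {
                    ?value <http://www.w3.org/2004/02/skos/core#inScheme>
                           <http://example.org/objectTypes> .
                }
            }
        """ ;
    ] .
```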

How do I run validation and read the report?

pySHACL and Apache Jena are the workhorses.

bash
# pySHACL
pyshacl -s shapes.ttl -f human data.ttl

# Jena
shacl validate --shapes shapes.ttl --data data.ttl

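The validation report is itself an RDF graph; -f human prints a readable rendering of it, and pySHACL's -f turtle emits the graph as Turtle. The fields you read first: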

Report field                    Meaning
sh:focusNode                    The node that failed
sh:resultPath                   Which property
sh:sourceConstraintComponent    Which rule (e.g. MinCount)
sh:resultSeverity               Violation, Warning or Info
sh:resultMessage                Your custom message
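
For orientation, a report graph has roughly the following form. This is a hand-written illustration, not captured tool output: the focus node ex:object2 and both messages are invented, and it assumes the shape declares those sh:message values. Real reports also carry sh:sourceShape and may order or label blank nodes differently.

```turtle
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .   # placeholder namespace

[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:focusNode ex:object2 ;                   # the node that failed
        sh:resultPath dcterms:title ;               # which property
        sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
        sh:resultSeverity sh:Violation ;
        sh:resultMessage "object is missing a title" ;
    ] ,
    [
        a sh:ValidationResult ;
        sh:focusNode ex:object2 ;
        sh:resultPath dcterms:creator ;
        sh:value "Unknown workshop" ;               # the offending value
        sh:sourceConstraintComponent sh:NodeKindConstraintComponent ;
        sh:resultSeverity sh:Warning ;
        sh:resultMessage "creator should be an IRI" ;
    ] .
```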

Why does a shape pass when data is obviously wrong?

Three usual culprits, in order:

  1. No target. The shape declares no sh:targetClass, sh:targetNode or other target, so it selects no focus nodes and inspects nothing.
  2. Wrong target class. The data uses a different class URI than the shape expects.
  3. Constraint never set. You meant sh:minCount 1 but left it off, so absence is legal.

Add a deliberately broken record to your test data; if SHACL does not flag it, your shape is not targeting what you think.
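
The three culprits can be sketched side by side. The shape names and the fixture node are hypothetical; in practice the shapes and the fixture would live in separate files. Each of the first three shapes would be expected to pass on the fixture, and only the last one flags it.

```turtle
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix crm:     <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://example.org/> .   # placeholder namespace

# Known-bad fixture (data graph): an object with no title.
ex:knownBadObject a crm:E22_Human-Made_Object .

# Culprit 1: no target, so no focus nodes are selected.
ex:NoTargetShape a sh:NodeShape ;
    sh:property [ sh:path dcterms:title ; sh:minCount 1 ] .

# Culprit 2: the target class IRI differs from the one used in the data
# (here the E22 label from older CIDOC-CRM releases).
ex:WrongClassShape a sh:NodeShape ;
    sh:targetClass crm:E22_Man-Made_Object ;
    sh:property [ sh:path dcterms:title ; sh:minCount 1 ] .

# Culprit 3: targeted, but sh:minCount is missing; sh:datatype only
# constrains values that exist, so an absent title is legal.
ex:NoMinCountShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    sh:property [ sh:path dcterms:title ; sh:datatype xsd:string ] .

# Fixed: correct target plus an explicit minimum cardinality.
ex:FixedShape a sh:NodeShape ;
    sh:targetClass crm:E22_Human-Made_Object ;
    sh:property [
        sh:path dcterms:title ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] .
```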

A working checklist before you trust the report

  1. Every shape has an explicit, correct target.
  2. Required fields use sh:minCount and the right sh:datatype or sh:nodeKind.
  3. Controlled values use sh:in or sh:class.
  4. Severities are deliberate: hard structure as sh:Violation, niceties as sh:Warning.
  5. Each constraint has an sh:message so reports are self-explanatory.
  6. A known-bad fixture exists and does fail, proving the shapes bite.
  7. Validation runs in CI so regressions are caught before publication.

Key Takeaways

  • SHACL checks that required data is present and well-formed; OWL infers meaning but does not gatekeep.
  • sh:NodeShape selects nodes via a target; sh:PropertyShape constrains individual properties.
  • Constrain controlled values with sh:in (explicit list) or sh:class (instance of).
  • Use sh:severity to separate blocking violations from advisory warnings.
  • Read sh:focusNode, sh:resultPath and sh:sourceConstraintComponent in the report.
  • A passing shape on broken data almost always means a missing or wrong target.
  • Keep a known-bad fixture and run validation in CI so quality stays defensible.

Frequently Asked Questions

What is SHACL and how does it differ from OWL?

SHACL (Shapes Constraint Language) validates RDF graphs against constraints and reports violations. OWL describes meaning and infers new facts under an open-world assumption, whereas SHACL closes the world to check that required data is actually present and well-formed.

What is the difference between sh:NodeShape and sh:PropertyShape?

A sh:NodeShape declares which nodes to validate, usually via sh:targetClass, and groups property constraints. A sh:PropertyShape, attached through sh:property, constrains one path, such as setting cardinality, datatype or value range for a single property.

Does SHACL validation report warnings or only errors?

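Both. Each shape, including a property shape, can declare a sh:severity of sh:Violation, sh:Warning or sh:Info, so you can treat missing recommended fields as warnings while hard structural problems remain blocking violations. Note that sh:conforms is false whenever the report contains any result, warnings included, so a pipeline that should block only on violations needs to filter results by sh:resultSeverity.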

Can SHACL check that a value is a valid URI from a controlled vocabulary?

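Yes. Use sh:class to require the value be an instance of a class, or sh:in to restrict it to an enumerated list of allowed URIs. sh:in suits a short, fixed list; for a full SKOS concept scheme, sh:class skos:Concept combined with a scheme-membership check scales better than enumerating every concept.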

How do I run SHACL validation from the command line?

Tools like pySHACL (pyshacl -s shapes.ttl data.ttl) or Apache Jena's shacl validate take a shapes file and a data file and emit a validation report graph listing each violation, its focus node and the failing constraint.

Why does my SHACL shape pass when the data is clearly missing fields?

Usually the shape has no target, so it validates nothing. Check that sh:targetClass, sh:targetNode or another target points at real nodes, and that your minimum cardinality (sh:minCount) is actually set on the property.