When to model rights in metadata

Model rights in metadata whenever your items will be published, aggregated, or reused by anyone outside the project, which covers almost every digitisation effort. The decisive reason is machine-actionability: a rightsstatements.org URI or a Creative Commons licence URL is what lets aggregators, search engines and reusers automatically tell a reusable item from a restricted one. Skip per-item rights only when the collection is purely internal and a single collection-level statement genuinely covers everything. This article covers the timing question: the signals that say to model rights, the cases where doing so is overkill, and the trade-offs in between.

What signals say "model rights now"?

Reach for structured rights fields when any of these are true:

Items will appear on a public site, in an IIIF manifest, or in an aggregator (Europeana, the DPLA, Wikimedia Commons).
The collection mixes copyright statuses — some public domain, some in-copyright, some orphan.
Donor or depositor agreements impose item-specific restrictions.
Third-party material (quoted images, embedded artworks) sits inside otherwise-clear items.

Any one of these means a single project licence cannot tell the truth, so per-item rights stop being optional.
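The checklist above amounts to an any-of rule. A minimal Python sketch of that rule follows; the `CollectionProfile` class and its field names are hypothetical labels for the four signals, not part of any metadata standard.

```python
from dataclasses import dataclass


@dataclass
class CollectionProfile:
    """Hypothetical summary of a collection; field names are illustrative."""
    published_or_aggregated: bool        # public site, IIIF manifest, aggregator
    mixed_copyright_statuses: bool       # public domain, in-copyright, orphan together
    item_specific_restrictions: bool     # donor or depositor agreements
    embedded_third_party_material: bool  # quoted images, embedded artworks


def needs_per_item_rights(profile: CollectionProfile) -> bool:
    """Return True as soon as any single signal is present."""
    return any((
        profile.published_or_aggregated,
        profile.mixed_copyright_statuses,
        profile.item_specific_restrictions,
        profile.embedded_third_party_material,
    ))


internal_only = CollectionProfile(False, False, False, False)
donor_restricted = CollectionProfile(False, False, True, False)
print(needs_per_item_rights(internal_only))
print(needs_per_item_rights(donor_restricted))
```

The second profile is unpublished but still needs per-item rights, because one depositor restriction is enough to make a blanket statement untrue.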

Licence or rights statement — which field do I need?

These are different tools and confusing them is the classic error:

	Licence	Rights statement
Example	`CC BY 4.0`, `CC0`	`rightsstatements.org/vocab/InC/1.0/`
Who issues it	The rightsholder	A describer reporting status
Use it when	You hold rights and grant reuse	Rights are reserved, unknown or orphan
Public-domain case	`CC0` / Public Domain Mark	`NoC-US`, `NKC`

You will often need both: a licence URI where you can grant permission, and a rights-statement URI where you can only describe the situation.
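Because the two kinds of URI live on different hosts, software can usually tell them apart from the URI alone. The following is a rough sketch, assuming full URIs with a scheme; the function name `classify_rights_uri` is illustrative, and grouping CC0 and the Public Domain Mark with licences simply mirrors the table above.

```python
from urllib.parse import urlparse


def classify_rights_uri(uri: str) -> str:
    """Return 'licence', 'rights statement' or 'unknown' for a rights URI."""
    parsed = urlparse(uri)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "rightsstatements.org" and parsed.path.startswith("/vocab/"):
        return "rights statement"
    # CC licences live under /licenses/, CC0 and the Public Domain Mark
    # under /publicdomain/; both are treated as the "licence" column here.
    if host == "creativecommons.org" and parsed.path.startswith(
        ("/licenses/", "/publicdomain/")
    ):
        return "licence"
    return "unknown"


print(classify_rights_uri("https://creativecommons.org/licenses/by/4.0/"))
print(classify_rights_uri("http://rightsstatements.org/vocab/InC/1.0/"))
print(classify_rights_uri("In copyright"))
```

Anything that comes back as `unknown`, including free text placed in the URI field, is a candidate for manual review.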

How many rights fields should a record carry?

At minimum two, because no single field does both jobs:

xml

<!-- machine-readable, drives filtering and aggregation -->
<dc:rights rdf:resource="http://rightsstatements.org/vocab/InC/1.0/"/>
<!-- human-readable nuance the URI cannot express -->
<dcterms:accessRights>
  In copyright. Reproduction by permission of the depositor;
  figure 3 contains a third-party photograph, rights not cleared.
</dcterms:accessRights>

The URI is for software; the note is for the cataloguer and the next researcher.
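If records are generated by script rather than typed by hand, the same pair of fields can be emitted programmatically. This sketch uses Python's standard `xml.etree.ElementTree`; the helper name `rights_elements` and the wrapping `rdf:Description` element are assumptions made for illustration, not a required serialisation.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Register prefixes so the output uses dc:, dcterms: and rdf: instead of ns0:.
for prefix, namespace in (("dc", DC), ("dcterms", DCTERMS), ("rdf", RDF)):
    ET.register_namespace(prefix, namespace)


def rights_elements(rights_uri, note):
    """Build the machine-readable URI element and the human-readable note."""
    uri_el = ET.Element(f"{{{DC}}}rights", {f"{{{RDF}}}resource": rights_uri})
    note_el = ET.Element(f"{{{DCTERMS}}}accessRights")
    note_el.text = note
    return [uri_el, note_el]


record = ET.Element(f"{{{RDF}}}Description")
record.extend(rights_elements(
    "http://rightsstatements.org/vocab/InC/1.0/",
    "In copyright. Reproduction by permission of the depositor.",
))
print(ET.tostring(record, encoding="unicode"))
```

Keeping both elements in one helper makes it harder to emit the URI without its note, or the note without its URI.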

When is modelling rights overkill?

For an internal working dataset that will never be published, and where one licence (say, "internal use only, © County Archive") blankets every record, per-item rights add cost without benefit. Record the single statement once at collection level. The moment any item leaves that boundary, though, the blanket assumption breaks and you are back to modelling per item.

What does getting it wrong cost?

Rights metadata is one of the few fields where a confident error is worse than a blank. Tagging an in-copyright photograph with the Public Domain Mark invites downstream reusers to act on it, and the liability propagates to them. The safe default for uncertainty is an explicit "undetermined" statement:

text

rightsstatements.org/vocab/UND/1.0/   (Copyright Undetermined)

This is honest, machine-readable, and signals "do your own clearance" rather than a false green light. Pair it with an orphan-works diligent-search note where relevant.
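In an ingest script, that default can be made explicit so that a missing assessment never turns into a blank or a guess. A minimal sketch, in which the function name `rights_uri_for` is hypothetical:

```python
from typing import Optional

UNDETERMINED = "http://rightsstatements.org/vocab/UND/1.0/"


def rights_uri_for(assessed_uri: Optional[str]) -> str:
    """Return the assessed URI, or Copyright Undetermined when none exists."""
    if assessed_uri and assessed_uri.strip():
        return assessed_uri.strip()
    # No assessment recorded: say so explicitly instead of guessing.
    return UNDETERMINED


print(rights_uri_for("http://rightsstatements.org/vocab/NoC-US/1.0/"))
print(rights_uri_for(None))
print(rights_uri_for("   "))
```

The function never upgrades an item to a more permissive status; it only fills the gap left by a missing assessment.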

How do I keep rights values consistent at scale?

Constrain the URI field to a closed vocabulary — the twelve rightsstatements.org URIs plus your chosen CC licences — and validate on ingest:

bash

# flag any rights value that is not an exact match for the approved list
# (approved_rights_uris.txt is a hypothetical file: one approved URI per line)
csvgrep -c rights_uri -f approved_rights_uris.txt -i records.csv \
  | csvcut -c identifier,rights_uri

Free-text rights notes are fine for nuance, but the URI that drives aggregation must come from a fixed list, or filtering downstream silently fails.
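The closed-vocabulary check can also be sketched in Python with exact matching against the full list. The twelve rightsstatements.org URIs are spelled out below; the two Creative Commons entries are an assumed project choice and should be replaced with your own, and the column names `identifier` and `rights_uri` follow the CSV layout used above.

```python
import csv
import io

# The twelve rightsstatements.org statements, all at version 1.0.
RIGHTS_STATEMENTS = {
    f"http://rightsstatements.org/vocab/{code}/1.0/"
    for code in (
        "InC", "InC-OW-EU", "InC-EDU", "InC-NC", "InC-RUU",
        "NoC-CR", "NoC-NC", "NoC-OKLR", "NoC-US",
        "CNE", "UND", "NKC",
    )
}
# Assumption: this project has chosen CC BY 4.0 and CC0; substitute your own.
CHOSEN_CC = {
    "http://creativecommons.org/licenses/by/4.0/",
    "http://creativecommons.org/publicdomain/zero/1.0/",
}
APPROVED = RIGHTS_STATEMENTS | CHOSEN_CC


def invalid_rows(csv_file):
    """Yield (identifier, rights_uri) for rows whose URI is not approved."""
    for row in csv.DictReader(csv_file):
        uri = (row.get("rights_uri") or "").strip()
        if uri not in APPROVED:
            yield row.get("identifier", ""), uri


sample = io.StringIO(
    "identifier,rights_uri\n"
    "a1,http://rightsstatements.org/vocab/InC/1.0/\n"
    "a2,https://rightsstatements.org/vocab/InC/1.0/\n"
    "a3,CC BY\n"
)
problems = list(invalid_rows(sample))
print(problems)
```

Exact matching is deliberately strict: row `a2` is rejected because it uses `https` where the approved list uses `http`, which is the kind of near-miss that a looser pattern match would let through.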

Key Takeaways

Model rights per item whenever content is published, aggregated or reused — the common case.
A machine-readable rights URI is essential; a free-text note carries the nuance it cannot.
Distinguish licences (you grant) from rights statements (you describe), and use both as needed.
Per-item rights are overkill only for never-published datasets under one blanket licence.
An honest "undetermined" statement is safer than a confident wrong one.
Constrain rights URIs to a closed vocabulary and validate on ingest.

Frequently Asked Questions

When should I model rights in metadata?

Whenever items will be published, aggregated or reused — which is almost always. A machine-readable rights URI is what lets Europeana, the DPLA and search engines know an item is reusable; without it the default reader assumption is "all rights reserved".

What is the difference between a licence and a rights statement?

A licence (e.g. CC BY 4.0) grants permissions the rightsholder controls. A rights statement (e.g. rightsstatements.org InC) describes the copyright status when you cannot or will not grant a licence, including for orphan and unknown-rights works.

Should I use one rights field or several?

Use at least two: a machine-readable URI for filtering and aggregation, and a free-text note for the human nuance the URI cannot capture, such as donor restrictions or third-party content.

When is modelling rights overkill?

For purely internal, never-published working datasets where one project-level licence covers everything. Even then, record that single statement once at the collection level rather than per item.

Can rights metadata be wrong and cause harm?

Yes. Labelling an in-copyright work as public domain can expose your institution and downstream reusers to infringement claims, so an honest "undetermined" statement is safer than a confident wrong one.
