To assign persistent identifiers, pick a scheme (ARK, DOI or Handle), register for a namespace, then mint an opaque, never-reused identifier for each object and register it with a resolver that points to the object's current URL. Persistence comes from that resolver plus your organisation's commitment to maintain it — not from the string itself. Below is the full workflow with practical defaults.
Which scheme should I choose?
Match the scheme to the job rather than picking one for everything.
| Scheme | Best for | Cost | Resolver |
|---|---|---|---|
| ARK | High-volume institutional objects | Free (need a NAAN) | n2t.net or local |
| DOI | Citable, published datasets/objects | Membership + per-DOI fee | doi.org via DataCite |
| Handle | Underlying PID infrastructure | Prefix fee | hdl.handle.net |
A common pattern: mint ARKs for everything internally because they are free and scale, then mint DOIs only for the subset that needs formal citation and DataCite metadata. ARKs and DOIs are not mutually exclusive.
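That split is easier to apply uniformly if it is written down as a rule rather than decided per item. The following is a minimal sketch; the `citable` flag and the `schemes_for` helper are assumptions made for illustration, not fields from any particular catalogue system.

```python
def schemes_for(record: dict) -> list[str]:
    """Return the PID schemes a record should receive.

    Assumed rule: every object gets an ARK; only records flagged
    `citable` (a hypothetical catalogue field) also get a DOI.
    """
    schemes = ["ark"]
    if record.get("citable", False):
        schemes.append("doi")
    return schemes


print(schemes_for({"title": "Glass negative"}))                    # ['ark']
print(schemes_for({"title": "Survey dataset", "citable": True}))   # ['ark', 'doi']
```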
Step 1: Register a namespace
You cannot mint into thin air. Acquire the prefix your scheme requires:
- ARK: request a NAAN (Name Assigning Authority Number) from the ARK Alliance (free), e.g. 12345.
- DOI: join DataCite (often via a national consortium) to get a prefix like 10.55555.
- Handle: register a prefix such as 20.500.12345.
Record the prefix in your configuration so minting is consistent across the team.
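One way to do that is to keep the prefixes in a single immutable object that every minting script imports. This is only a sketch: the `PidConfig` class and its method names are invented here, and the prefix values are the placeholder examples used above, not real registrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PidConfig:
    """Registered prefixes in one place (placeholder values from this guide)."""
    naan: str = "12345"
    doi_prefix: str = "10.55555"
    handle_prefix: str = "20.500.12345"

    def ark(self, local: str) -> str:
        return f"ark:/{self.naan}/{local}"

    def doi(self, suffix: str) -> str:
        return f"{self.doi_prefix}/{suffix}"

    def handle(self, suffix: str) -> str:
        return f"{self.handle_prefix}/{suffix}"


CONFIG = PidConfig()
print(CONFIG.ark("bv9x7q2k"))  # ark:/12345/bv9x7q2k
```

Because the dataclass is frozen, a script cannot accidentally overwrite a prefix at runtime; changing one means editing the shared configuration.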
Step 2: Design the identifier string
The most important rule: make it opaque. Do not encode titles, dates, departments or formats — all of which change. A good ARK looks like:
```text
ark:/12345/bv9x7q2k   # opaque, random-ish, no semantics
```
Avoid this:
```text
ark:/12345/photos-london-1923-tif   # breaks on reclassification/migration
```
Use a short random or sequential local part. Many institutions add a check digit (NOID generates these) to catch transcription errors.
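To make the idea concrete, here is a small sketch that generates a random local part and appends a check character. The check character follows the position-weighted scheme described in the NOID documentation (a sum over a 29-character, vowel-free alphabet); the function names are invented for this example, and in production you would normally let the NOID tool or your ARK service mint the string.

```python
import secrets

# Digits plus consonants, omitting vowels, 'l' and 'y' (29 characters).
XDIGITS = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(naan_and_local: str) -> str:
    """Position-weighted check character over 'NAAN/local' (NOID-style)."""
    total = 0
    for pos, ch in enumerate(naan_and_local, start=1):
        value = XDIGITS.find(ch)           # -1 for '/' and other non-alphabet chars
        total += pos * max(value, 0)       # such characters count as zero
    return XDIGITS[total % len(XDIGITS)]


def new_local_part(naan: str, length: int = 7) -> str:
    """Random opaque body plus a trailing check character."""
    body = "".join(secrets.choice(XDIGITS) for _ in range(length))
    return body + check_char(f"{naan}/{body}")


def is_well_formed(naan: str, local: str) -> bool:
    """True if the last character matches the check character of the rest."""
    return len(local) > 1 and check_char(f"{naan}/{local[:-1]}") == local[-1]


print(check_char("13030/xf93gt2"))           # q  (worked example from the NOID docs)
print(is_well_formed("13030", "xf93gt2q"))   # True
print(is_well_formed("13030", "xf93gt3q"))   # False: one mistyped character
```

The alphabet avoids vowels so that random strings cannot accidentally spell words, which is another way of keeping semantics out of the identifier.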
Step 3: Mint and register the target
Minting creates the identifier and tells the resolver where the object lives. With an ARK service you send the local identifier and its target URL (an HTTP PUT in the example below, against a placeholder ark.example.org endpoint); the resolver then redirects n2t.net/ark:/12345/bv9x7q2k to your live page. A bulk example:
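```python
import os

import requests

TOKEN = os.environ["ARK_API_TOKEN"]  # placeholder name: use however your service issues credentials
# placeholder catalogue export: dicts with "local" and "url" keys
records = [{"local": "bv9x7q2k", "url": "https://example.org/objects/1"}]


def mint_ark(naan, local, target_url, token):
    """Register ark:/{naan}/{local} with the resolver and return the PID."""
    r = requests.put(
        f"https://ark.example.org/ark:/{naan}/{local}",  # placeholder endpoint
        headers={"Authorization": f"Bearer {token}"},
        json={"target": target_url},
        timeout=10,
    )
    r.raise_for_status()  # fail loudly rather than store an unregistered PID
    return f"ark:/{naan}/{local}"


# mint and immediately store back on the record
for rec in records:
    pid = mint_ark("12345", rec["local"], rec["url"], TOKEN)
    rec["identifier"] = pid  # write PID onto the record now
```
Store the PID back on the record in the same transaction, so an object never exists without its identifier.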
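What "same transaction" can look like in practice is sketched below with SQLite from the Python standard library. The `records` table layout, the `mint_and_store` helper, and the `register` callable (standing in for a call to your resolver) are all assumptions for illustration; your catalogue will have its own schema and API.

```python
import sqlite3


def mint_and_store(conn, record_id, pid, target_url, register):
    """Write the PID onto the catalogue row and register it, all or nothing.

    `register` is a hypothetical callable wrapping your resolver's API;
    it must raise an exception if registration fails.
    """
    with conn:  # sqlite3 commits on success and rolls back if anything raises
        conn.execute(
            "UPDATE records SET identifier = ? WHERE id = ?", (pid, record_id)
        )
        register(pid, target_url)
    return pid


# Demo with an in-memory catalogue and stand-in resolver functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, url TEXT, identifier TEXT)")
conn.executemany(
    "INSERT INTO records (id, url) VALUES (?, ?)",
    [(1, "https://example.org/objects/1"), (2, "https://example.org/objects/2")],
)
conn.commit()


def failing_register(pid, target_url):
    raise RuntimeError("resolver unavailable")


mint_and_store(conn, 1, "ark:/12345/bv9x7q2k",
               "https://example.org/objects/1", lambda pid, url: None)
try:
    mint_and_store(conn, 2, "ark:/12345/c4d8k2m9",
                   "https://example.org/objects/2", failing_register)
except RuntimeError:
    pass  # the UPDATE for record 2 was rolled back

print(conn.execute("SELECT id, identifier FROM records ORDER BY id").fetchall())
# [(1, 'ark:/12345/bv9x7q2k'), (2, None)]
```

A database transaction cannot span the resolver as well, so there is still a small window where registration succeeds but the commit fails. If the registration call is idempotent, re-running the job for that record is a safe way to recover.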
Step 4: Keep targets resolving
Persistence is ongoing maintenance. When an object moves, update the target in the resolver — never mint a new identifier. A quarterly job that checks every PID still resolves catches rot early:
```bash
# verify a sample of ARKs return a 200/redirect
while IFS= read -r pid; do
  code=$(curl -o /dev/null -s -w "%{http_code}" "https://n2t.net/$pid")
  echo "$pid -> $code"
done < pids.txt
```
Pitfalls to avoid
- Reusing a retired identifier — once assigned, a PID must never point to a different object. Tombstone deleted objects instead.
- Baking semantics into the string — reclassification then makes the identifier lie.
- Treating the string as the guarantee — without a maintained resolver and an institutional commitment, even a DOI rots.
- Minting without storing — always write the PID back to the catalogue immediately.
- Mixing schemes per item inconsistently — decide the rule (e.g. ARK for all, DOI for citable) and apply it uniformly.
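The first pitfall is the easiest to enforce in code. The sketch below is an in-memory stand-in for a resolver's identifier table, written only to illustrate the rules: the `PidRegistry` class and its method names are invented for this example. It refuses to reassign any identifier it has ever seen, lets a live identifier move to a new URL, and keeps tombstoned identifiers reserved.

```python
class PidRegistry:
    """In-memory stand-in for a resolver's identifier table (illustration only)."""

    def __init__(self):
        self._targets = {}  # pid -> target URL, or None once tombstoned

    def mint(self, pid, target_url):
        if pid in self._targets:  # covers live and tombstoned identifiers alike
            raise ValueError(f"{pid} was already assigned and cannot be reused")
        self._targets[pid] = target_url

    def move(self, pid, new_target_url):
        if self._targets.get(pid) is None:
            raise KeyError(f"{pid} is unknown or tombstoned")
        self._targets[pid] = new_target_url  # same identifier, new location

    def tombstone(self, pid):
        if pid not in self._targets:
            raise KeyError(pid)
        self._targets[pid] = None  # keep the entry so the PID stays reserved

    def resolve(self, pid):
        """Return (status, target) where status is 'ok', 'gone' or 'unknown'."""
        if pid not in self._targets:
            return ("unknown", None)
        target = self._targets[pid]
        return ("ok", target) if target is not None else ("gone", None)


registry = PidRegistry()
registry.mint("ark:/12345/bv9x7q2k", "https://example.org/old/1")
registry.move("ark:/12345/bv9x7q2k", "https://example.org/new/1")
print(registry.resolve("ark:/12345/bv9x7q2k"))  # ('ok', 'https://example.org/new/1')
registry.tombstone("ark:/12345/bv9x7q2k")
print(registry.resolve("ark:/12345/bv9x7q2k"))  # ('gone', None)
```

Distinguishing "gone" from "unknown" lets the resolver show a tombstone page for a withdrawn object instead of a generic not-found error.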
Key Takeaways
- A persistent identifier resolves to an object's current location even when URLs change.
- Choose ARK for volume and cost, DOI for citation, Handle for underlying infrastructure — and combine them where sensible.
- Persistence is an organisational commitment plus a maintained resolver, not a property of the string.
- Keep identifiers opaque: never encode titles, dates, folders or formats.
- Mint and store the PID on the record in the same transaction so objects are never identifier-less.
- Never reuse a retired identifier; tombstone instead, and check resolution on a schedule.
Frequently Asked Questions
What is a persistent identifier?
A persistent identifier (PID) is a long-lasting reference to a digital object that resolves to its current location even if the underlying URL changes. ARKs, DOIs and Handles are the common schemes in cultural heritage.
Should I use ARK, DOI or Handle?
Use ARKs for high-volume, low-cost institutional collections, DOIs when citation and DataCite metadata matter, and Handles when you need the underlying infrastructure DOIs are built on. Many institutions run ARKs internally and mint DOIs only for citable objects.
How much do persistent identifiers cost?
ARKs are free to mint once you have a registered NAAN. DOIs require a DataCite or Crossref membership, typically an annual fee plus a small per-DOI cost. Handles need a prefix from the Handle registry with a modest annual fee.
What makes an identifier persistent?
Persistence is a commitment, not a technology. The identifier must be opaque, never reused, and backed by a resolver and an organisation that promises to keep it resolving. The string alone guarantees nothing.
Should identifiers contain meaningful information?
No. Avoid encoding titles, dates, folders or formats into the identifier. Such semantics break when items are reclassified or migrated. Use opaque, randomly assigned strings instead.
Can I assign persistent identifiers in bulk?
Yes. ARK and DOI APIs let you mint thousands at once. Generate the local part, register each identifier with its target URL, and store the PID back on the record in the same transaction.