Logging preservation events means recording every action that touches a digital object (ingest, fixity check, format migration, replication, deletion) with five facts: what happened, when, to which object, by which agent, and with what outcome. Capture those consistently and you have machine-readable provenance that can answer most audit questions about an object's history. The recognised standard for this is PREMIS, but the discipline matters more than the format.
What counts as a preservation event?
Anything that changes an object's bitstream, its form, or its integrity status. Common types you should always log:
- Ingestion — the object entered the repository.
- Message digest calculation — a checksum was generated.
- Fixity check — the checksum was re-verified.
- Virus check — scanned for malware.
- Format identification / validation — what it is, and whether it is well-formed.
- Migration / normalisation — converted to a preservation format.
- Replication — a copy was made to another location.
- Deletion — an object or copy was removed (yes, log these too).
If an action could later make you ask "what happened to this file?", it is an event worth recording.
What are the minimum fields for an event?
Five fields make an event auditable. Drop any one and the record stops answering questions.
| Field | Example | Why it matters |
|---|---|---|
| Event type | fixity check | What was done |
| Timestamp | 2024-09-12T08:31:04Z | When, precisely, in UTC |
| Object ID | uuid:9f2c-... | Which object |
| Agent | sha256sum 9.0; J. Okafor | Who/what did it, with version |
| Outcome | success / fail: mismatch | The result and detail |
Always record the agent's software version — a migration done with one tool version may behave differently from another, and that detail resolves future disputes.
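As a rough illustration, the five fields can be enforced in code so that an incomplete record is rejected at write time. This is a minimal sketch, not a standard implementation: the class name `PreservationEvent`, the helper `utc_now`, and the rule that timestamps must end in `Z` are assumptions made for the example.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class PreservationEvent:
    """One auditable event holding the five core fields (hypothetical class)."""
    event_type: str   # what was done, e.g. "fixity check"
    timestamp: str    # when, as a UTC string such as 2024-09-12T08:31:04Z
    object_id: str    # which object
    agent: str        # who/what did it, including software version
    outcome: str      # result and detail, e.g. "success" or "fail: mismatch"

    def __post_init__(self):
        # Reject records with an empty field: a missing fact breaks the audit trail.
        for name, value in asdict(self).items():
            if not value or not value.strip():
                raise ValueError(f"missing core field: {name}")
        # Assumed convention: second-precision UTC with a trailing "Z".
        # strptime raises ValueError if the string does not match.
        datetime.strptime(self.timestamp, "%Y-%m-%dT%H:%M:%SZ")


def utc_now() -> str:
    """Current time as a second-precision UTC timestamp."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


event = PreservationEvent(
    event_type="fixity check",
    timestamp=utc_now(),
    object_id="uuid:9f2c",
    agent="sha256sum 9.0; J. Okafor",
    outcome="success",
)
```

Making the record frozen mirrors the "immutable once written" rule: a correction becomes a new event rather than an edit to an old one.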
How do I log events in PREMIS?
PREMIS models an Event as its own entity: it links to the Object it acted on and the Agent that performed it, and it carries the outcome. In XML it looks like this:
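```xml
<premis:event>
  <premis:eventIdentifier>
    <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
    <premis:eventIdentifierValue>3b1e-7a44</premis:eventIdentifierValue>
  </premis:eventIdentifier>
  <premis:eventType>fixity check</premis:eventType>
  <premis:eventDateTime>2024-09-12T08:31:04Z</premis:eventDateTime>
  <premis:eventOutcomeInformation>
    <premis:eventOutcome>success</premis:eventOutcome>
  </premis:eventOutcomeInformation>
  <premis:linkingAgentIdentifier>
    <!-- an identifier type must accompany the value; "local" is an illustrative choice -->
    <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
    <premis:linkingAgentIdentifierValue>sha256sum-9.0</premis:linkingAgentIdentifierValue>
  </premis:linkingAgentIdentifier>
  <!-- the link to the object the event acted on; the value is illustrative -->
  <premis:linkingObjectIdentifier>
    <premis:linkingObjectIdentifierType>UUID</premis:linkingObjectIdentifierType>
    <premis:linkingObjectIdentifierValue>9f2c</premis:linkingObjectIdentifierValue>
  </premis:linkingObjectIdentifier>
</premis:event>
```

Tools like Archivematica emit PREMIS events automatically as their micro-services run. The structure is what lets a machine answer "show me every failed event for this object."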
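To make that query concrete, here is a hedged sketch using only Python's standard library. It assumes the events sit in a document that declares the PREMIS 3 namespace, that every event names its object in a `premis:linkingObjectIdentifier` element, and that any outcome other than exactly `success` counts as a failure. The function name `failed_events` is hypothetical.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "http://www.loc.gov/premis/v3"  # PREMIS 3 namespace; change for older versions
NS = {"premis": PREMIS_NS}


def failed_events(premis_xml, object_id):
    """Return (event type, timestamp, outcome) for each non-success event on one object."""
    root = ET.fromstring(premis_xml)
    failures = []
    for event in root.iter(f"{{{PREMIS_NS}}}event"):
        # Collect every object identifier this event links to.
        linked_ids = [
            el.text
            for el in event.findall(
                "premis:linkingObjectIdentifier/premis:linkingObjectIdentifierValue", NS
            )
        ]
        outcome = event.findtext(
            "premis:eventOutcomeInformation/premis:eventOutcome",
            default="",
            namespaces=NS,
        )
        # Assumption: anything other than exactly "success" is treated as a failure.
        if object_id in linked_ids and outcome != "success":
            failures.append((
                event.findtext("premis:eventType", default="", namespaces=NS),
                event.findtext("premis:eventDateTime", default="", namespaces=NS),
                outcome,
            ))
    return failures
```

Calling `failed_events(xml_text, "9f2c")` on a document of events returns only the failures for that one object, in document order.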
Can I log events without heavyweight software?
Yes. For a small archive a disciplined append-only log in CSV or JSON Lines is perfectly valid, provided it carries the five core fields. A pragmatic JSON Lines entry:
```json
{"event":"fixity_check","time":"2024-09-12T08:31:04Z","object":"uuid:9f2c","agent":"sha256sum-9.0","outcome":"success"}
```

Append, never edit. The integrity of the log itself matters, so store it write-once or under version control and, ideally, hash the log periodically.
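One way to follow the append-and-hash advice is sketched below with the Python standard library. The function names `append_event` and `log_digest` are hypothetical, and the field names are taken from the JSON Lines entry above. Opening the file in append mode avoids rewriting earlier lines, but it does not make the file tamper-proof on its own, which is why the digest is worth recording somewhere else.

```python
import hashlib
import json

CORE_FIELDS = ("event", "time", "object", "agent", "outcome")


def append_event(log_path, event):
    """Append one event as a JSON line; refuse records missing a core field."""
    missing = [name for name in CORE_FIELDS if not event.get(name)]
    if missing:
        raise ValueError(f"missing core fields: {missing}")
    # Mode "a" only adds to the end of the file; existing lines are not rewritten.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")


def log_digest(log_path):
    """SHA-256 of the whole log, to be stored separately and compared later."""
    digest = hashlib.sha256()
    with open(log_path, "rb") as log:
        for chunk in iter(lambda: log.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the output of `log_digest` on a schedule (for example, in a separate system or a printed register) gives you something to compare against if the log is ever questioned.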
Why log fixity checks that always pass?
Because preservation evidence is cumulative. A two-year run of passing fixity checks is precisely what proves an object stayed intact; the eventual failure is meaningful only against that unbroken record. Skipping the "boring" successes destroys the baseline that gives a failure its meaning. Automate the logging so passing checks cost you nothing.
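A sketch of what "automate the logging" can look like: the check below writes an event whether the digest matches or not, so passes accumulate without extra effort. It assumes a JSON Lines log with the field names used earlier. The function names and the agent string format are illustrative choices, not a convention from any tool.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone

# Agent string with version, so the record says what computed the digest (assumed format).
AGENT = f"python-hashlib-sha256/{platform.python_version()}"


def sha256_of(path):
    """Stream the file through SHA-256 so large objects do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(path, object_id, expected_sha256, log_path):
    """Verify a file against its stored checksum and log the result either way."""
    ok = sha256_of(path) == expected_sha256
    event = {
        "event": "fixity_check",
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "object": object_id,
        "agent": AGENT,
        "outcome": "success" if ok else "fail: mismatch",
    }
    # The success branch is logged too: it is the baseline a later failure is read against.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
    return ok
```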
How do events become provenance I can trust?
A complete, time-ordered event history lets anyone reconstruct an object's whole life: received here, migrated then, copied there, verified ever since. That reconstructability is central to a trustworthy digital repository and to certification audits such as CoreTrustSeal. Practically, make sure events are:
- Linked to a stable object identifier (so they survive renames).
- Ordered by precise UTC timestamps.
- Attributable to a named agent with version.
- Immutable once written.
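The reconstruction itself can be very small once those properties hold. The following sketch assumes a JSON Lines log with the fields `event`, `time`, `object`, `agent` and `outcome`, and second-precision UTC timestamps ending in `Z`. The names `object_history` and `describe` are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a timestamp such as 2024-09-12T08:31:04Z into an aware datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def object_history(jsonl_lines, object_id):
    """Return one object's events, oldest first, from an iterable of JSON lines."""
    events = [json.loads(line) for line in jsonl_lines if line.strip()]
    own = [e for e in events if e["object"] == object_id]
    return sorted(own, key=lambda e: parse_utc(e["time"]))


def describe(history):
    """Render a history as readable lines: when, what, outcome, and by which agent."""
    return [
        f'{e["time"]}  {e["event"]}  {e["outcome"]}  ({e["agent"]})'
        for e in history
    ]
```

Because the lookup keys on the stable object identifier rather than a filename, the history stays intact when the file is renamed or moved.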
A reusable workflow
- Define your event vocabulary up front (reuse PREMIS event types — don't invent your own).
- Capture an event at every pipeline step automatically.
- Write the five core fields, always with UTC time and agent version.
- Store the log append-only and hash it periodically.
- Review failure events on a schedule and act on them.
Key Takeaways
- A preservation event records what, when, which object, which agent, and the outcome.
- PREMIS is the standard model linking Object, Event, Agent and Outcome.
- Five fields are mandatory; record the agent's software version too.
- Log "boring" successes — they form the baseline that makes a failure meaningful.
- You don't need heavyweight software; a disciplined append-only CSV/JSON log works.
- Keep events linked, ordered, attributable and immutable for trustworthy provenance.
- Reuse PREMIS event vocabularies rather than inventing local terms.
Frequently Asked Questions
What is a preservation event?
A preservation event is any action that affects a digital object's integrity or form — ingest, fixity check, format identification, virus scan, migration, replication or deletion — recorded with what happened, when, by which agent, and the outcome.
What metadata standard is used for preservation events?
PREMIS (Preservation Metadata: Implementation Strategies) is the de facto standard. Its Event entity links an Object, an Agent (the actor) and an Outcome, giving you machine-readable provenance.
What are the minimum fields for a preservation event?
At minimum: event type, a precise timestamp, the object identifier, the agent (person or software with version), and the outcome (success/failure plus detail). Without all five the record cannot answer an audit question.
Why log fixity checks if they usually pass?
Because the value is the continuous record. A logged history of passing checks is your evidence the object was intact the whole time; the one failure stands out precisely because everything before it passed.
Do I need software like Archivematica to log events?
No. Archivematica generates PREMIS automatically, but a disciplined CSV or JSON log with the five core fields is a valid, auditable event record for a small archive.
How do events relate to provenance?
Events are the building blocks of digital provenance. A complete, ordered event history lets anyone reconstruct everything that happened to an object from receipt to today, which is the heart of a trustworthy repository.