To plan born-digital storage well, design for preservation, not just capacity: keep at least three copies on two media types with one off-site (the 3-2-1 rule), verify fixity on a schedule, choose storage that detects and repairs corruption, and write the whole thing down against a framework like the NDSA Levels so it is consistent and defensible. Storage that merely holds bits without verification, redundancy and a migration path is a liability, not a preservation system. This guide gives you the decisions and a checklist.
Storage versus preservation — know the difference
Storage keeps the bits; preservation keeps the bits usable. A single hard drive "stores" your records right up until it fails silently. Preservation adds redundancy, integrity checking, format management and the capacity to act when media or formats age out. Plan for preservation and storage falls out of it; plan for storage alone and you will discover the gap at the worst moment.
How many copies, and where?
The baseline is the 3-2-1 rule: three copies, on two different media types, with one geographically separate. The point is independence — copies that cannot fail for the same reason at the same time. A laptop and its Time Machine drive in the same room are two copies but one fire.
| Copy | Medium | Location | Role |
|---|---|---|---|
| 1 | Spinning disk / SSD | Working store | Daily access |
| 2 | Object storage (cloud) | Provider region | Durable preservation |
| 3 | Tape (LTO) or second cloud | Off-site | Independent backstop |
High-value collections justify more copies and more locations.
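The independence test can be made mechanical. The sketch below assumes a hypothetical inventory with one copy per line in the form `medium,location,offsite`; that format and the function name `check_321` are invented for illustration and are not a standard.

```bash
# check_321: read a copy inventory on stdin (medium,location,offsite) and
# report whether it meets 3-2-1. The inventory format is an assumption
# made for this sketch.
check_321() {
  awk -F',' '
    NF >= 3 {
      copies++
      media[$1] = 1                  # remember each distinct media type
      if ($3 == "yes") offsite++     # count copies held at a separate site
    }
    END {
      n = 0
      for (m in media) n++
      if (copies >= 3 && n >= 2 && offsite >= 1) {
        printf "OK: %d copies, %d media types, %d off-site\n", copies, n, offsite
      } else {
        printf "FAIL: %d copies, %d media types, %d off-site\n", copies, n, offsite
        exit 1
      }
    }
  '
}

# Example inventory matching the table above
printf '%s\n' \
  'disk,office,no' \
  'object-storage,cloud-region,yes' \
  'tape,vault,yes' | check_321
```

A count like this cannot tell whether two "different" locations share a power feed or a provider, so treat a passing result as a minimum rather than as proof of independence.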
How do you stop silent corruption?
Bits rot quietly: a flipped bit on a drive raises no error, and you may not notice until you open the file, if then. Defend with two layers. First, scheduled fixity checks: re-hash stored objects and compare the results to the manifest, so any change is caught and the damaged copy can be replaced from a verified good one. Second, self-healing storage: a checksumming filesystem like ZFS, or object storage with built-in integrity checking, detects corruption automatically and repairs it when a redundant copy of the data is available.
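```bash
# Scheduled fixity verification against a stored manifest.
# Relative paths in the manifest resolve against the current directory.
sha256sum -c /archive/col-301/manifest.sha256 --quiet || \
  echo "FIXITY FAILURE in col-301: investigate and re-replicate"

# A ZFS scrub detects bit rot across the pool and repairs it where redundancy exists.
# The scrub runs in the background; zpool status reports its progress and results.
zpool scrub tank && zpool status tank
```

A copy you never verify is not really a backup; it is an untested assumption.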
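The fixity check above depends on a manifest that was written while the files were known to be good. A minimal sketch of creating and then verifying one follows, assuming GNU coreutils (`sha256sum`) and `find`; the function names `write_manifest` and `verify_manifest` are illustrative, and the demo runs in a throwaway directory.

```bash
# write_manifest DIR: hash every file under DIR into DIR/manifest.sha256.
# Paths are recorded relative to DIR so the manifest travels with the collection.
write_manifest() {
  ( cd "$1" &&
    find . -type f ! -name manifest.sha256 -exec sha256sum {} + > manifest.sha256 )
}

# verify_manifest DIR: print one line per file that fails and return
# non-zero if any file is missing or altered.
verify_manifest() {
  ( cd "$1" && sha256sum --quiet -c manifest.sha256 )
}

# Demo in a temporary directory
demo=$(mktemp -d)
printf 'minutes of the board meeting\n' > "$demo/minutes.txt"
write_manifest "$demo"
verify_manifest "$demo" && echo "fixity OK"

printf 'x' >> "$demo/minutes.txt"   # simulate silent corruption
verify_manifest "$demo" || echo "fixity FAILURE: restore from a verified copy"
rm -rf "$demo"
```

Keep a copy of the manifest somewhere other than the storage it describes; a manifest that sits only beside the files can be damaged along with them.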
Cloud, local, or both?
Most mature programmes land on hybrid. Cloud gives durable, geographically redundant storage without running your own data centre; a local or second-provider copy gives you control and an exit if a provider raises prices, changes terms, or fails. Decide your exit strategy before you upload terabytes you cannot affordably retrieve — egress fees and vendor lock-in are real planning constraints.
How do you estimate cost over time?
Cost the full picture, not just per-gigabyte today. Take the size of masters plus derivatives, multiply by your copy count, add expected annual growth, and project across your planning horizon. Then add the costs that dwarf raw storage: verification compute, periodic format migration, and staff time.
```text
year_1_TB = (masters_TB + derivatives_TB) * copies
projected_TB(n) = year_1_TB * (1 + growth_rate)^n
total_cost(n) = sum over years of (TB * $/TB/yr) + migration + staff
```

The ongoing operational cost usually exceeds the storage bill, so a plan that budgets only for disk is incomplete.
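The formulas above can be turned into a small calculator. This is a sketch under stated assumptions: the function name `project_cost` is hypothetical, the first year is taken as n = 0 so that it equals `year_1_TB`, and migration plus staff are collapsed into one flat annual figure. The numbers in the example are made up purely to show the arithmetic.

```bash
# project_cost MASTERS_TB DERIV_TB COPIES GROWTH PRICE_PER_TB_YR FIXED_PER_YR YEARS
# Prints stored TB and cost for each year, then the total.
# FIXED_PER_YR stands in for migration + staff (an assumption of this sketch).
project_cost() {
  LC_ALL=C awk -v m="$1" -v d="$2" -v c="$3" -v g="$4" -v p="$5" -v f="$6" -v y="$7" '
    BEGIN {
      base = (m + d) * c                 # year_1_TB
      total = 0
      for (n = 0; n < y; n++) {
        tb = base * (1 + g) ^ n          # projected_TB(n)
        cost = tb * p + f                # storage bill plus flat operating cost
        total += cost
        printf "year %d: %.2f TB, cost %.2f\n", n + 1, tb, cost
      }
      printf "total over %d years: %.2f\n", y, total
    }
  '
}

# 4 TB masters, 1 TB derivatives, 3 copies, 10% growth,
# 100 per TB per year, 5000 per year for migration and staff, 3 years
project_cost 4 1 3 0.10 100 5000 3
```

With these illustrative inputs the first year comes to 15.00 TB at a cost of 6500.00, and the three-year total is 19965.00; the flat operating figure is larger than the storage charge in every year, which is the pattern described above.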
How do you make the plan defensible?
Defensible means documented and benchmarked. Write the storage plan down and map it to the NDSA Levels of Preservation, which scale from Level 1 (have a second copy) to Level 4 (multiple verified copies, format monitoring, repair). State your copy counts, media, locations, verification schedule, and who is responsible, then review on a fixed cycle. Improvised storage decisions are indefensible; a written, framework-aligned plan is one you can stand behind in an audit.
Key Takeaways
- Plan for preservation, not just capacity — storage without verification and redundancy is a liability.
- Apply 3-2-1: three copies, two media types, one off-site, with copies that fail independently.
- Defend against bit rot with scheduled fixity checks and self-healing storage like ZFS or integrity-checked object stores.
- Use hybrid cloud-plus-local for durability with control, and decide your exit strategy before you commit data.
- Cost the masters, derivatives, copies and growth, and remember that verification, migration and staff usually outweigh raw storage.
- Write the plan down, map it to the NDSA Levels, and review it on a fixed cycle to keep it defensible.
Frequently Asked Questions
How many copies of born-digital records should I keep?
At least three: the working copy plus two preservation copies, on at least two different media types, with one geographically separate. This is the 3-2-1 rule, and many archives extend it to more copies for high-value material.
How do I stop silent data corruption (bit rot)?
Run scheduled fixity checks against stored checksums, and prefer storage that detects and repairs corruption automatically — a checksumming filesystem like ZFS, or object storage with built-in integrity. A copy you never verify is not really a backup.
Should born-digital archives go to the cloud or stay local?
Most mature programmes use a hybrid: cloud for geographic redundancy and durability, plus a local or independent copy so you are not locked to one provider. Match the choice to your budget, control needs and exit strategy.
How do I estimate storage costs over time?
Cost the master plus derivatives, multiply by your number of copies, add growth per year, and project over your planning horizon — and remember ongoing costs (verification, migration, staff) usually exceed the raw storage bill.
What is the difference between storage and preservation?
Storage keeps the bits; preservation keeps the bits usable. Storage is necessary but not sufficient — you also need fixity, format management, metadata and the ability to act when formats or media age out.
How do I make my storage plan defensible?
Write it down, map it to a recognised framework such as the NDSA Levels of Preservation, define copy counts, locations, verification schedules and responsibilities, and review it on a fixed cycle so decisions are documented, not improvised.