Scan text-bearing documents at 300–400 ppi measured at the physical size of the original, and step up to 400–600 ppi for manuscripts, small print, fine engravings or anything headed for optical character recognition (OCR) or handwritten text recognition (HTR). Resolution should be driven by the smallest meaningful detail you must capture, not by the largest number your scanner advertises. Once two pixels comfortably span the finest stroke or hairline you care about, more ppi only inflates storage and noise.
What does DPI actually mean for a scan?
DPI (dots per inch) and PPI (pixels per inch) are used interchangeably in capture, but the unit that matters is samples per inch of the original at its real size. A scanner set to 600 ppi sampling a 10 cm wide photograph produces an image about 2362 pixels wide. The same 600 ppi figure tells you nothing useful unless you also know the physical dimensions, which is why archival specs always pair resolution with object size.
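The arithmetic behind that 2362-pixel figure can be sketched in a few lines of Python. The function names are made up for illustration; the only fact relied on is that one inch is 25.4 mm.

```python
MM_PER_INCH = 25.4


def pixels_for(size_mm: float, ppi: float) -> int:
    """Pixel count along one dimension of an original sampled at `ppi`."""
    return round(size_mm / MM_PER_INCH * ppi)


def ppi_from(pixels: int, size_mm: float) -> float:
    """Recover the sampling rate when both pixel count and physical size are known."""
    return pixels / (size_mm / MM_PER_INCH)


if __name__ == "__main__":
    width_px = pixels_for(100, 600)          # the 10 cm photograph at 600 ppi
    print(width_px)                          # 2362
    print(round(ppi_from(width_px, 100)))    # 600
```

The second function shows why the physical size matters: the pixel count alone cannot be turned back into a resolution without it.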
How do I calculate the resolution I actually need?
Work backwards from the smallest feature you must resolve. The rule of thumb is the Nyquist principle: you need at least two pixels across the finest detail.
```text
required_ppi = 2 * features_per_inch_you_must_resolve

# Example: smallest serif stroke ~ 0.1 mm wide
# 25.4 mm / 0.1 mm = 254 features per inch, worst case
# required_ppi ~= 2 * 254 = 508 ppi -> round up to 600
```

For text legibility, FADGI and the Dutch Metamorfoze guidelines express a Quality Index (QI) based on the smallest character height. A QI of 8 is excellent for OCR; QI 5 is the legibility floor.
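As a rough sketch of how a QI target translates into a scanning resolution, the snippet below uses the digital Quality Index approximation popularised by Cornell's benchmarking work (Kenney and Chapman): QI = ppi × 0.039 × h / 3 for bitonal capture, and / 2 for greyscale or colour, where h is the height of the smallest character in millimetres. Treat the formula and its constants as an assumption here, not as a quotation from the FADGI or Metamorfoze documents, and check the guideline you are working to before relying on the numbers. The function names are illustrative.

```python
def quality_index(ppi: float, char_height_mm: float, bitonal: bool = False) -> float:
    """Assumed QI formula: ppi * 0.039 * h / k, with k = 3 (bitonal) or 2 (grey/colour).

    0.039 approximates the number of inches in one millimetre.
    """
    k = 3 if bitonal else 2
    return ppi * 0.039 * char_height_mm / k


def required_ppi(qi: float, char_height_mm: float, bitonal: bool = False) -> float:
    """Invert the same formula: resolution needed to reach a target QI."""
    if char_height_mm <= 0:
        raise ValueError("character height must be positive")
    k = 3 if bitonal else 2
    return k * qi / (0.039 * char_height_mm)


if __name__ == "__main__":
    # Smallest character 1 mm high, greyscale capture, target QI 8
    print(round(required_ppi(8, 1.0)))          # 410
    # What a 400 ppi greyscale scan delivers for the same character
    print(round(quality_index(400, 1.0), 1))    # 7.8
```

Under these assumptions, a 1 mm character needs roughly 400 ppi in greyscale to approach QI 8, which sits inside the 400–600 ppi band suggested for small print.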
What DPI should I use for different materials?
| Material | Recommended ppi (at original size) | Notes |
|---|---|---|
| Modern printed text | 300–400 | 300 fine for clean type; 400 if OCR-bound |
| Manuscripts / handwriting | 400–600 | Headroom for HTR and faint ink |
| Photographic prints | 400–600 | Capture grain and tonal subtlety |
| 35 mm film / negatives | 2400–4000 | Small original, big enlargement factor |
| Maps and engravings | 400–600 | Hairlines and stippling demand detail |
| Glass plate negatives | 1200–2400 | Treat like film: small carrier, fine grain |
Does higher resolution always help?
No, and this is the most common mistake. Past the point where the optics or the original's own detail run out, you are sampling empty magnification: bigger files, more sensor noise, longer transfer times, and no extra information. A 1200 ppi scan of a coarse newspaper halftone is wasteful; the halftone screen is the limit, not your scanner. Test by zooming to 100% and confirming you can see real structure, not just enlarged fuzz.
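To put numbers on "bigger files", the sketch below estimates the uncompressed size of a scan from its physical dimensions, resolution and bit depth. The function name and the A4 example are illustrative assumptions, and the small header and metadata overhead of a real TIFF is ignored.

```python
MM_PER_INCH = 25.4


def uncompressed_bytes(width_mm: float, height_mm: float, ppi: float,
                       channels: int = 3, bits_per_channel: int = 8) -> int:
    """Approximate raw pixel-data size of a scan, ignoring file overhead."""
    width_px = round(width_mm / MM_PER_INCH * ppi)
    height_px = round(height_mm / MM_PER_INCH * ppi)
    return width_px * height_px * channels * bits_per_channel // 8


if __name__ == "__main__":
    # A4 page (210 x 297 mm), 24-bit colour, at three resolutions
    for ppi in (300, 600, 1200):
        size = uncompressed_bytes(210, 297, ppi)
        print(f"{ppi:>5} ppi: {size / 1e6:8.1f} MB")
```

Because pixel count grows with the square of the resolution, each doubling of ppi roughly quadruples the file, whether or not the original contains detail at that scale.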
How does resolution interact with bit depth and file format?
Resolution sets spatial detail; bit depth sets tonal detail. For archival masters, capture 16-bit per channel for greyscale and colour where the scanner supports it, save as uncompressed TIFF (or JPEG 2000 lossless), and derive smaller access copies later. A typical master spec reads like this:
```text
# Archival master profile (example metadata you'd record)
Resolution:   600 ppi
Bit depth:    24-bit colour (8/channel) or 48-bit (16/channel)
Format:       TIFF, uncompressed
Colour space: Adobe RGB (1998) or eciRGB v2 (embed ICC profile)
```

Record the resolution in the file's metadata so downstream tools can reconstruct physical size. With ImageMagick you can confirm what was actually written:
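```bash
# %x and %y are reported in the resolution unit stored in the file (per inch or per centimetre)
identify -format "%w x %h px, %x x %y ppi, %z-bit\n" master.tif
```

How do I verify I hit the target after scanning?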
Include a resolution target in a calibration capture: a USAF 1951 chart, or the slanted-edge patch on an object-level target such as Golden Thread, analysed along the lines of ISO 16067-1. Measure the limiting resolution and the modulation transfer function (MTF). If the chart shows you resolve 6 lp/mm but your spec calls for 8 lp/mm, your effective resolution is below spec even if the file header reads 600 ppi. Always trust the measured number over the nominal one.
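The comparison between measured and nominal resolution can be sketched as follows. It assumes the Nyquist minimum of two pixels per line pair; the function names, and the "sampling efficiency" ratio as computed here, are illustrative and not taken from any particular standard's analysis software.

```python
MM_PER_INCH = 25.4


def lp_mm_to_ppi(lp_per_mm: float) -> float:
    """Minimum sampling rate (ppi) able to represent `lp_per_mm` line pairs per mm."""
    return lp_per_mm * 2 * MM_PER_INCH


def nyquist_lp_mm(ppi: float) -> float:
    """Highest spatial frequency (lp/mm) a given sampling rate could carry at best."""
    return ppi / MM_PER_INCH / 2


def sampling_efficiency(measured_lp_mm: float, nominal_ppi: float) -> float:
    """Measured limiting resolution as a fraction of the nominal Nyquist limit."""
    return measured_lp_mm / nyquist_lp_mm(nominal_ppi)


if __name__ == "__main__":
    nominal_ppi = 600
    measured, specified = 6.0, 8.0   # lp/mm, the figures from the example above
    print(f"effective:  {lp_mm_to_ppi(measured):.0f} ppi")
    print(f"needed:     {lp_mm_to_ppi(specified):.0f} ppi")
    print(f"efficiency: {sampling_efficiency(measured, nominal_ppi):.0%}")
    print("meets spec" if measured >= specified else "below spec")
```

With these inputs, a nominal 600 ppi scan that only resolves 6 lp/mm behaves like a scan of about 305 ppi, short of the roughly 406 ppi that 8 lp/mm requires.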
Key Takeaways
- Set resolution from the smallest meaningful detail, using the two-pixels-across rule, not from the scanner's maximum.
- 300–400 ppi for printed text, 400–600 ppi for manuscripts and OCR/HTR, much higher for small originals like film and glass plates.
- Resolution is meaningless without the original's physical size: always record both.
- Higher ppi beyond the optical limit adds file size and noise, not information.
- Pair resolution with 16-bit depth and an uncompressed/lossless master format.
- Verify with a resolution target and MTF measurement, not the header value.
- Use FADGI/Metamorfoze Quality Index figures to justify the number you chose.
Frequently Asked Questions
What DPI should I scan archival documents at?
For typed or printed text-bearing documents, 300–400 ppi at the physical size of the original is the archival baseline. Manuscripts, fine detail and material destined for OCR/HTR benefit from 400–600 ppi.
Is DPI the same as PPI?
In capture they are used interchangeably, but strictly DPI describes printer dots and PPI describes image samples per inch. For scanning you almost always mean pixels per inch (ppi) measured against the original's real size.
Does higher resolution always mean a better archival master?
No. Past the point where you resolve the smallest meaningful detail, extra ppi only adds file size and noise without adding information. Match resolution to the smallest feature you must capture, not to the highest number the scanner offers.
How do I calculate the resolution I actually need?
Express the smallest detail you must resolve in line pairs per millimetre, then set ppi so that at least two pixels span each line pair: ppi ≥ lp/mm × 2 × 25.4. For text legibility, FADGI and Metamorfoze publish target Quality Index (QI) figures you can work back from.
What resolution do I need for OCR or HTR?
Aim for 300 ppi minimum for clean modern print and 400–600 ppi for handwriting, degraded type or small fonts. Below roughly 150 ppi, recognition accuracy collapses for most engines.