
When DROID gives wrong, missing or tentative identifications, the cause is almost always one of four things: stale signature files, truncated or corrupt input, container formats DROID cannot open, or insufficient Java heap. Fix them in that order. DROID matches files against PRONOM byte signatures to assign a PUID (PRONOM Unique Identifier), so most "DROID is wrong" problems are really signature or input problems.

Why does DROID return no identification at all?

A blank result means no signature matched. Work through three checks. First, confirm the file is complete — a truncated download has no valid header:

bash
ls -l suspect.file        # zero or tiny size is a red flag
xxd suspect.file | head    # inspect the first bytes against the expected magic number

Second, confirm your PRONOM signature file actually contains a signature for that format; newer or niche formats may not be covered yet. Third, if it is a container (ZIP-based Office, EPUB), make sure you ran DROID with container signatures enabled. If the format genuinely has no signature, report it to PRONOM with a sample.
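The first check can be scripted when you have many suspect files. The following is a minimal sketch, not part of DROID: `first_bytes_hex` and `has_magic` are hypothetical helper names, and you supply the expected magic number as a hex string yourself.

```bash
# Hypothetical helpers (not part of DROID): compare a file's leading bytes
# with an expected magic number given as a hex string.

# Print the first COUNT bytes of FILE as lowercase hex with no separators.
first_bytes_hex() {
  local file=$1 count=$2
  od -An -tx1 -N "$count" "$file" | tr -d ' \n'
}

# Succeed if FILE is non-empty and starts with the bytes in HEX,
# e.g. 25504446 for "%PDF" or 504b0304 for a ZIP local file header.
has_magic() {
  local file=$1 expected
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  local count=$(( ${#expected} / 2 ))
  [ -s "$file" ] || return 1          # empty or missing file cannot match
  [ "$(first_bytes_hex "$file" "$count")" = "$expected" ]
}
```

For example, `has_magic suspect.file 504b0304 && echo "starts like a ZIP container"` tells you whether a supposed Office or EPUB file at least begins with a ZIP header.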

How do I fix stale or missing signatures?

Out-of-date signatures cause more bad identifications than anything else. Update both the standard and container signature files:

# In the DROID GUI
Tools  ->  Check for signature updates  ->  Download  ->  Set as default

For headless or scheduled use, point DROID at the current signature XML explicitly:

bash
droid -a ./collection \
  -Ns DROID_SignatureFile_VXX.xml \
  -Nc container-signature-XXXXXXXX.xml \
  -p collection.droid

Re-run the profile after updating; identification rates often jump materially once signatures are current.
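For scheduled runs you may not want to hard-code the signature version. The sketch below picks the highest-numbered signature file in a directory you pass in. `latest_signature` is a hypothetical helper, and it assumes the files follow the `DROID_SignatureFile_V<number>.xml` and `container-signature-<date>.xml` naming shown above; where your installation stores them is something to confirm locally.

```bash
# Hypothetical helper: print the path of the highest-numbered file matching
# PREFIX<digits>.xml in DIR. Returns non-zero if nothing matches.
latest_signature() {
  local dir=$1 prefix=$2 f num best= best_num=-1
  for f in "$dir"/"$prefix"*.xml; do
    [ -e "$f" ] || continue
    num=${f##*/"$prefix"}               # strip directory and prefix
    num=${num%.xml}                     # strip extension
    case $num in ''|*[!0-9]*) continue ;; esac   # skip non-numeric suffixes
    if [ "$num" -gt "$best_num" ]; then
      best_num=$num
      best=$f
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

The comparison is numeric on purpose: a plain alphabetical sort would rank V97 above V119.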

How do I read tentative versus definitive matches?

DROID reports a method for each hit. Know what each means before you trust it:

| Method | What it means | Trust level |
| --- | --- | --- |
| Signature | Internal byte signature matched | High |
| Container | Matched inside a ZIP/OLE2 container | High |
| Extension | Only the filename matched | Low; verify |
| (none) | No match at all | Investigate |

An Extension match is a prompt to inspect, not an answer. Open the header bytes and compare them to the expected magic number; a .doc file that is really HTML will show it immediately.
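To build the list of files that need this inspection, you can filter a CSV export on the method column. This is a sketch under assumptions: `rows_with_method` is a hypothetical helper, it expects every field to be double-quoted, and it looks up the `METHOD` and `FILE_PATH` columns by header name, so check that your export uses those names.

```bash
# Hypothetical helper: print FILE_PATH for every row whose METHOD column
# equals the requested value. Assumes all fields are double-quoted, so the
# field separator is the three-character sequence ",".
rows_with_method() {
  local method=$1 csv=$2
  awk -F'","' -v want="$method" '
    { sub(/^"/, ""); sub(/"\r?$/, "") }     # drop the outer quotes of the row
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }   # map names to positions
    $col["METHOD"] == want { print $col["FILE_PATH"] }
  ' "$csv"
}
```

Running `rows_with_method Extension collection.csv` then gives you one path per line to check by hand. Splitting on `","` rather than on a bare comma keeps paths that contain commas intact.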

Why is DROID crashing or running out of memory?

DROID holds profile state in memory, so large collections exhaust the default Java heap. Raise it in the startup script:

bash
# in droid.sh / droid.bat, adjust the JAVA_OPTS line
-Xmx4g        # 4 GB heap; raise further for very large profiles

If a single profile still struggles, split the collection into several profiles and merge the exported CSVs afterwards. Memory pressure also slows identification dramatically, so this fix often speeds things up too.
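Merging the per-profile exports is mostly a matter of keeping one header row. A minimal sketch, assuming every export was produced with the same column layout (`merge_csv_exports` is a hypothetical helper name):

```bash
# Hypothetical helper: concatenate CSV exports, keeping only the header row
# of the first file. Assumes all inputs share the same columns.
merge_csv_exports() {
  [ "$#" -gt 0 ] || return 1
  head -n 1 "$1"            # header from the first export
  local f
  for f in "$@"; do
    tail -n +2 "$f"         # data rows only
  done
}
```

Use it as `merge_csv_exports part1.csv part2.csv > collection.csv`. Row identifiers are likely to repeat across profiles, so treat the file path, not the ID column, as the key in the merged file.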

Why do DROID and Siegfried disagree?

Both tools read PRONOM, so disagreement usually traces to different signature versions or container handling. Pin both to the same PRONOM release first:

bash
sf -version           # shows Siegfried's signature build
# update DROID signatures to match the same PRONOM version

Remaining differences are typically genuinely ambiguous files where two signatures both match. Inspect those by hand; they are exactly the files worth a human decision before migration.
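Once both tools use the same PRONOM release, you can list the remaining disagreements mechanically. The sketch below assumes you have already reduced each tool's output to a tab-separated file of `path<TAB>PUID`; that preprocessing step and the helper name `puid_disagreements` are assumptions, not features of either tool.

```bash
# Hypothetical helper: given two tab-separated "path<TAB>puid" files
# (DROID first, Siegfried second), print path, DROID PUID and Siegfried PUID
# for every path present in both where the PUIDs differ.
puid_disagreements() {
  awk -F'\t' -v OFS='\t' '
    NR == FNR { droid[$1] = $2; next }      # first file: remember DROID results
    ($1 in droid) && droid[$1] != $2 { print $1, droid[$1], $2 }
  ' "$1" "$2"
}
```

The output is the hand-inspection list: each line is a file where a human decision is needed before migration. The first file must not be empty, because the `NR == FNR` test is what tells the two inputs apart.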

How do I turn results into an actionable report?

Export the profile to CSV and summarise by PUID. The CSV carries per-file PUID, format name, MIME type, method and warnings:

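bash
# count files per PUID to scope migration work
# PUID is field 15 in the default export layout; confirm against the header row,
# and note that cut mis-splits rows whose paths contain commas
cut -d, -f15 collection.csv | tail -n +2 | sort | uniq -c | sort -rn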

That count tells you which obsolete formats dominate the collection and therefore where migration effort should go first.
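Counting by column position is fragile, because quoted paths can contain commas and the column order may differ between exports. A sketch that finds the column by its header name instead, assuming every field is double-quoted (`count_by_column` is a hypothetical helper):

```bash
# Hypothetical helper: count rows per distinct value of a named column.
# Assumes all fields are double-quoted, so rows are split on ",".
count_by_column() {
  local column=$1 csv=$2
  awk -F'","' -v name="$column" '
    { sub(/^"/, ""); sub(/"\r?$/, "") }     # drop the outer quotes of the row
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) idx = i; next }
    idx { key = ($idx == "" ? "(none)" : $idx); count[key]++ }
    END { for (k in count) print count[k], k }
  ' "$csv" | LC_ALL=C sort -k1,1nr -k2
}
```

`count_by_column PUID collection.csv` prints the most common PUIDs first, with unidentified files grouped under `(none)` so they stay visible in the report.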

Key Takeaways

  • Most DROID problems are stale signatures, truncated input, unopened containers, or low heap — check in that order.
  • Update both standard and container PRONOM signature files before blaming DROID.
  • Treat Extension-method matches as flags to inspect header bytes, not as answers.
  • Raise the Java heap (-Xmx4g or higher) and split very large collections into multiple profiles.
  • Align DROID and Siegfried signature versions before reconciling disagreements.
  • Export to CSV and count by PUID to scope migration work.

Frequently Asked Questions

Why does DROID return no identification for some files?

Usually the file's format has no PRONOM signature yet, the file is truncated or corrupt, or it is a container DROID cannot open. Check whether the byte signature exists in your PRONOM version and confirm the file is complete before assuming DROID is at fault.

Why does DROID give an extension-only or tentative match?

A tentative (extension) match means no internal byte signature matched, so DROID fell back to the filename. This is unreliable; treat it as a flag to inspect the file's header bytes or report a new signature to PRONOM.

How do I update DROID's signature files?

In the DROID GUI, use Tools, then Check for signature updates. Alternatively, download the latest standard signature XML and container signature XML manually. Out-of-date signatures are the single most common cause of poor identification rates.

Why is DROID running out of memory on a large profile?

The Java heap is too small for the profile. Increase the heap in the DROID startup script (for example -Xmx4g) and consider splitting very large collections into multiple profiles, since DROID keeps profile data in memory.

Why do DROID and Siegfried disagree on a format?

Both read PRONOM signatures but may use different signature versions or container handling. Align their signature versions first; remaining differences usually come down to ambiguous files where more than one signature matches.

How do I get DROID results into a useful report?

Export the profile to CSV, then summarise or pivot on the PUID and format columns. The CSV gives you per-file PUID, format name, MIME type and any warnings, which you can summarise to plan migration.