Apply MIX technical image metadata when you are building a long-term preservation repository for still raster images and need a normalised, schema-validatable, machine-queryable record of each image's intrinsic technical properties — format, dimensions, bit depth, colour space, compression, capture device and fixity — held independently of the file itself. Do not apply it for access-only collections, thumbnails, or born-digital photographs where embedded EXIF already answers every question you will realistically ask. MIX earns its cost only when you must prove technical integrity across decades and migrations, usually inside a METS package alongside PREMIS.
What problem does MIX actually solve?
MIX — formally the NISO Metadata for Images in XML schema — exists because embedded technical metadata is fragile and inconsistent. EXIF, IPTC and XMP live inside the file, use different vocabularies, and are routinely stripped by editing software, format migration or careless re-saving. MIX pulls the technically meaningful facts out into a stable, external XML record that a repository can validate, index and reason over without ever opening the pixel data again.
The schema is organised into five top-level sections you will recognise once you have seen one record: BasicDigitalObjectInformation (format, byte size, compression, fixity), BasicImageInformation (dimensions, colour space, ICC profile reference), ImageCaptureMetadata (the source original, the scanner or camera, the image producer), ImageAssessmentMetadata (spatial sampling frequency, colour encoding such as bits per sample, and target data), and ChangeHistory (the processing already applied to the image). That structure is the tell: MIX is a reformatting schema at heart, designed around the digitisation of physical originals.
When is MIX the right call?
Reach for MIX when several of these signals are present at once:
- You have a genuine preservation mandate — the images must survive format migration over decades, not just be served to a website.
- You are already producing METS packages; MIX is the endorsed still-image technical-metadata schema that METS wraps in its <amdSec>.
- Your master files are archival TIFFs or JP2s from a digitisation programme, where capture conditions (resolution, target, ICC profile) are part of the evidentiary record.
- You will need to query technical properties across the corpus — "find every master under 400 ppi", "list every uncompressed 16-bit greyscale scan" — without touching the files.
- A funder, standard (e.g. FADGI) or trusted-repository audit requires documented, validatable technical metadata.
If three or more of those are true, MIX pays for itself.
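The cross-corpus query signal is the easiest one to make concrete. The sketch below assumes each master has a MIX 2.0 record whose sampling frequency is stored in inches; the `record` helper, the file names and the two-item corpus are hypothetical stand-ins for records loaded from a repository, and only Python's standard library is used.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

NS = {"mix": "http://www.loc.gov/mix/v20"}


def record(ppi: int) -> str:
    """Build a hypothetical, heavily trimmed MIX record for the demo."""
    return (
        '<mix:mix xmlns:mix="http://www.loc.gov/mix/v20">'
        "<mix:ImageAssessmentMetadata><mix:SpatialMetrics>"
        "<mix:samplingFrequencyUnit>in.</mix:samplingFrequencyUnit>"
        "<mix:xSamplingFrequency>"
        f"<mix:numerator>{ppi}</mix:numerator>"
        "<mix:denominator>1</mix:denominator>"
        "</mix:xSamplingFrequency>"
        "</mix:SpatialMetrics></mix:ImageAssessmentMetadata>"
        "</mix:mix>"
    )


def x_ppi(mix_xml: str) -> Optional[float]:
    """Return horizontal samples per inch, or None if absent or not in inches."""
    metrics = ET.fromstring(mix_xml).find(".//mix:SpatialMetrics", NS)
    if metrics is None:
        return None
    if metrics.findtext("mix:samplingFrequencyUnit", namespaces=NS) != "in.":
        return None
    numerator = metrics.findtext(
        "mix:xSamplingFrequency/mix:numerator", namespaces=NS
    )
    denominator = metrics.findtext(
        "mix:xSamplingFrequency/mix:denominator", default="1", namespaces=NS
    )
    if numerator is None:
        return None
    return float(numerator) / float(denominator)


def masters_under(records: Dict[str, str], threshold_ppi: float) -> List[str]:
    """List the files whose recorded resolution is below the threshold."""
    hits = []
    for name, xml in records.items():
        ppi = x_ppi(xml)
        if ppi is not None and ppi < threshold_ppi:
            hits.append(name)
    return sorted(hits)


# Hypothetical corpus: one 600 ppi master and one 300 ppi master.
CORPUS = {"master_0001.tif": record(600), "master_0002.tif": record(300)}
print(masters_under(CORPUS, 400))
```

No image file is opened at any point; the query runs entirely against the external records, which is the property that separates MIX from embedded EXIF.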
When should you not bother with MIX?
| Situation | Use this instead of MIX |
|---|---|
| Access derivatives, thumbnails, web JPEGs | No technical metadata, or minimal EXIF |
| Born-digital photographs | Retain original EXIF/XMP; keep the camera-written record |
| Small project, no preservation mandate | Embedded EXIF + a fixity manifest (BagIt) |
| You only need fixity and format | PREMIS object entity + DROID/PRONOM ID |
| Rich descriptive/structural needs only | MODS / METS without a MIX block |
The recurring mistake is generating MIX for derivatives. Your 1200-pixel access JPEG does not need a ReferenceBlackWhite value or a capture-device record; that information belongs only to the master. Producing MIX for every derivative multiplies your storage and validation cost with no preservation benefit.
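For the small-project row in the table above, the leaner alternative is a fixity manifest. A minimal sketch of BagIt-style manifest lines (checksum, whitespace, relative path) follows; the directory layout and file contents are hypothetical, and this produces only the manifest lines, not a complete bag.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def manifest_lines(root: Path, algorithm: str = "sha256") -> List[str]:
    """Return one 'checksum  relative/path' line per file under root."""
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.new(algorithm)
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large masters never sit fully in memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        relative = path.relative_to(root).as_posix()
        lines.append(f"{digest.hexdigest()}  {relative}")
    return lines


# Demo on a throwaway directory holding a placeholder file (not a real TIFF).
with tempfile.TemporaryDirectory() as tmp:
    payload = Path(tmp) / "data"
    payload.mkdir()
    (payload / "master_0001.tif").write_bytes(b"placeholder bytes")
    demo_lines = manifest_lines(Path(tmp))

print("\n".join(demo_lines))
```

Re-running the function later and comparing the lines is enough to detect silent corruption, which is often all a project without a preservation mandate needs.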
How do you generate MIX without hand-keying it?
Never type MIX by hand. The tooling is mature. JHOVE characterises a file and can emit MIX directly:
```bash
# Validate a TIFF and emit a MIX record (embedded in JHOVE's XML report)
jhove -m TIFF-hul -h XML -o master_0001.mix.xml master_0001.tif
```

For batch extraction of raw properties (here, the ICC profile tags) before mapping into MIX, ExifTool is the workhorse:

```bash
# Dump the ICC profile tags of every TIFF in the directory as JSON
exiftool -G -s -ICC_Profile:all -j *.tif > technical.json
```

A typical BasicImageInformation block, once generated, looks like this:
```xml
<mix:BasicImageInformation>
  <mix:BasicImageCharacteristics>
    <mix:imageWidth>7216</mix:imageWidth>
    <mix:imageHeight>5412</mix:imageHeight>
    <mix:PhotometricInterpretation>
      <mix:colorSpace>RGB</mix:colorSpace>
      <mix:ColorProfile>
        <mix:IccProfile>
          <mix:iccProfileName>Adobe RGB (1998)</mix:iccProfileName>
        </mix:IccProfile>
      </mix:ColorProfile>
    </mix:PhotometricInterpretation>
  </mix:BasicImageCharacteristics>
</mix:BasicImageInformation>
```

Validate the result against the official XSD (currently MIX 2.0) before you trust it: xmllint --schema mix.xsd record.mix.xml --noout will catch the structural slips that JHOVE's older profiles occasionally introduce.
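Before validating, it can help to isolate the MIX element from whatever wraps it. The sketch below assumes the characterisation tool's report nests a `mix:mix` element somewhere inside its own XML; the `REPORT` string is a simplified, hypothetical stand-in rather than real tool output, and `extract_mix` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MIX_URI = "http://www.loc.gov/mix/v20"
# Keep the conventional "mix:" prefix when the element is serialised again.
ET.register_namespace("mix", MIX_URI)

# Simplified, hypothetical wrapper; a real report carries many more properties.
REPORT = f"""<report>
  <properties>
    <value>
      <mix:mix xmlns:mix="{MIX_URI}">
        <mix:BasicImageInformation>
          <mix:BasicImageCharacteristics>
            <mix:imageWidth>7216</mix:imageWidth>
            <mix:imageHeight>5412</mix:imageHeight>
          </mix:BasicImageCharacteristics>
        </mix:BasicImageInformation>
      </mix:mix>
    </value>
  </properties>
</report>"""


def extract_mix(report_xml: str) -> Optional[str]:
    """Return the first mix:mix element as standalone XML, or None."""
    root = ET.fromstring(report_xml)
    for element in root.iter(f"{{{MIX_URI}}}mix"):
        # tostring() also emits trailing whitespace after the element.
        return ET.tostring(element, encoding="unicode").strip()
    return None


standalone = extract_mix(REPORT)
print(standalone)
```

The returned string can be written to its own file and passed to the schema validation step, so the validator sees only MIX and not the wrapper around it.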
How does MIX fit with PREMIS and METS?
These three are a team, not competitors. In a METS document the <amdSec> contains a <techMD> wrapping your MIX, a <digiprovMD> wrapping PREMIS events (the scan event, the migration event), and often a <rightsMD>. MIX answers "what is this image, technically?"; PREMIS answers "what has happened to it and who did it?". A migration from TIFF to JP2 generates a new MIX record for the migrated JP2 master and a PREMIS migration event linking it to the source TIFF. That pairing is exactly what an auditor wants to see.
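The nesting described above can be sketched with the standard library. The payloads are hypothetical and heavily trimmed, the IDs, identifier values and timestamp are invented for illustration, and `wrap` is an illustrative helper; the MDTYPE values NISOIMG and PREMIS:EVENT are the ones METS defines for these two payloads.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MIX = "http://www.loc.gov/mix/v20"
PREMIS = "http://www.loc.gov/premis/v3"
for prefix, uri in (("mets", METS), ("mix", MIX), ("premis", PREMIS)):
    ET.register_namespace(prefix, uri)

# Hypothetical, trimmed payloads; real records carry far more detail.
MIX_XML = (
    f'<mix:mix xmlns:mix="{MIX}"><mix:BasicImageInformation>'
    "<mix:BasicImageCharacteristics>"
    "<mix:imageWidth>7216</mix:imageWidth>"
    "<mix:imageHeight>5412</mix:imageHeight>"
    "</mix:BasicImageCharacteristics>"
    "</mix:BasicImageInformation></mix:mix>"
)
PREMIS_XML = (
    f'<premis:event xmlns:premis="{PREMIS}"><premis:eventIdentifier>'
    "<premis:eventIdentifierType>local</premis:eventIdentifierType>"
    "<premis:eventIdentifierValue>evt-0001</premis:eventIdentifierValue>"
    "</premis:eventIdentifier>"
    "<premis:eventType>migration</premis:eventType>"
    "<premis:eventDateTime>2024-01-01T00:00:00Z</premis:eventDateTime>"
    "</premis:event>"
)


def wrap(parent, section, section_id, mdtype, payload_xml):
    """Append <section><mdWrap><xmlData>payload</xmlData></mdWrap></section>."""
    section_el = ET.SubElement(parent, f"{{{METS}}}{section}", {"ID": section_id})
    md_wrap = ET.SubElement(section_el, f"{{{METS}}}mdWrap", {"MDTYPE": mdtype})
    xml_data = ET.SubElement(md_wrap, f"{{{METS}}}xmlData")
    xml_data.append(ET.fromstring(payload_xml))
    return section_el


amd_sec = ET.Element(f"{{{METS}}}amdSec")
wrap(amd_sec, "techMD", "TECH-0001", "NISOIMG", MIX_XML)         # technical facts
wrap(amd_sec, "digiprovMD", "PROV-0001", "PREMIS:EVENT", PREMIS_XML)  # events
print(ET.tostring(amd_sec, encoding="unicode"))
```

Keeping the technical record and the event record in separate wrappers means either can be regenerated or revalidated without touching the other.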
What does MIX cost you, realistically?
The honest trade-off: a MIX record adds roughly 2–6 KB per master and a validation step to your ingest pipeline. At the scale of a 200,000-image programme that is real storage and real CPU, but it is dwarfed by the master TIFFs themselves. The larger cost is governance — someone must own the schema version, the XSD, and the question of what happens when MIX 2.0 is superseded. If no one in your organisation will own that, the metadata will rot and you would have been better served by a leaner EXIF-plus-fixity approach.
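The storage side of that trade-off is simple arithmetic. The sketch below uses the figures quoted above (200,000 masters, 2 to 6 KB per record) and, as a loudly labelled assumption, treats every master as an uncompressed 8-bit RGB image of 7216 x 5412 pixels, the dimensions from the sample record; real corpora will vary.

```python
MASTERS = 200_000
KB = 1_000  # decimal kilobytes, matching how storage is usually quoted
GB = 1_000_000_000

# Metadata overhead at the low and high ends of the per-record estimate.
mix_low_bytes = MASTERS * 2 * KB
mix_high_bytes = MASTERS * 6 * KB

# Assumption: every master is uncompressed 8-bit RGB at 7216 x 5412 pixels.
master_bytes = 7216 * 5412 * 3
corpus_bytes = MASTERS * master_bytes

# Worst-case share of total storage taken by the MIX records.
overhead_ratio = mix_high_bytes / corpus_bytes

print(f"MIX records: {mix_low_bytes / GB} to {mix_high_bytes / GB} GB")
print(f"Masters:     {corpus_bytes / GB:.0f} GB")
print(f"Overhead:    {overhead_ratio:.4%} of master storage")
```

Under these assumptions the records total between 0.4 and 1.2 GB, a rounding error next to the masters, which is why the governance question dominates the decision.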
Key Takeaways
- MIX is a preservation-grade technical schema for still raster images: format, dimensions, colour, capture, fixity — never subject or rights.
- Apply it when you have a preservation mandate, METS packaging, archival masters, cross-corpus technical queries, or an audit requirement — ideally several at once.
- Do not generate MIX for derivatives, thumbnails or access JPEGs; that is the most common and most wasteful misuse.
- Generate it automatically with JHOVE (which emits MIX) and ExifTool; hand-keying is never justified at scale.
- MIX, PREMIS and METS are complementary: MIX = technical facts, PREMIS = events and agents, METS = the container.
- Each record costs only a few KB, but the real cost is long-term governance of the schema version — own that, or use a leaner approach.
- For born-digital photographs, retain original EXIF; add MIX only if your repository explicitly mandates it.
Frequently Asked Questions
What does MIX technical image metadata actually record?
MIX (the NISO Metadata for Images in XML schema) records preservation-grade technical facts about a raster image: format, dimensions, bit depth, colour space, compression, capture device, ICC profile, and a fixity checksum. It does not record subject, title or rights.
Do I need MIX if my files already have embedded EXIF?
Not always. EXIF lives inside the file and travels with it; MIX is an external, validatable, preservation-stable record. You apply MIX when you need a normalised, queryable, schema-validated technical record independent of the file — typically inside a METS package or a preservation repository.
Is MIX the same as PREMIS?
No. PREMIS describes preservation events, agents and rights at the object level; MIX describes the intrinsic technical properties of a still raster image. They are complementary and frequently sit in the same METS administrative metadata section.
Can I generate MIX automatically?
Yes. Tools such as JHOVE and ExifTool extract the underlying technical properties, and JHOVE can emit MIX directly. Hand-keying MIX is almost never justified at scale.
When should I skip MIX entirely?
Skip MIX for small access-only collections, born-digital photographs where EXIF suffices, or any project without a long-term preservation mandate. The cost of validating and storing MIX is only repaid when you must prove technical integrity over decades.
Does MIX work for born-digital photographs as well as scanned items?
Yes, MIX is format-agnostic for still raster images, but its capture-metadata vocabulary was designed around reformatting (scanning). For born-digital photographs much of the same information is better served by retaining original EXIF, with MIX added only if your repository mandates it.