To estimate IIIF hosting cost, size three line items and sum them: storage of your master/derivative images, compute for the image server or pre-tiling jobs, and egress bandwidth for serving tiles to users. For public collections, egress and compute usually dwarf storage, and a CDN in front of either architecture is the single biggest lever on the bill.
What are the three cost drivers?
| Driver | Dynamic server | Static (pre-tiled) |
|---|---|---|
| Storage | Master images only | Master + every tile (huge file count) |
| Compute | Per-request tile generation | One-off tiling job, then near-zero |
| Egress | Origin bandwidth (CDN reduces) | Origin bandwidth (CDN reduces) |
The headline trade-off: a dynamic server (Cantaloupe, IIPImage) keeps storage small but pays CPU on every uncached request; static IIIF pays compute once during tiling, stores far more files, then serves cheaply from object storage.
How do I estimate storage?
Start from your master format and multiply:
```text
pyramidal_tiff_size ~= flat_master_size * 1.5–3.0
jp2_size            ~= flat_master_size * 0.4–0.8   (more compute to decode)
static_tiles_count  ~= pages * sum(tiles_per_level)
```

A 24000×18000 image at 512-px tiles across six pyramid levels yields 1692 + 432 + 108 + 30 + 9 + 4 = 2275 tiles when each axis is rounded up to whole tiles. Across a 10,000-page collection that is over 22 million small objects. Object storage handles that comfortably on per-GB cost, but watch per-request and per-1000-operation charges on cloud buckets.
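The tile count for any image can be checked with a few lines of Python. This is a minimal sketch, assuming each pyramid level halves the previous one (rounded up) and that partial tiles at the edges count as whole tiles. The function name `tiles_per_image` and its parameters are illustrative and not part of any IIIF tool; real tilers may emit more or fewer levels.

```python
import math


def tiles_per_image(width: int, height: int, tile_size: int = 512, levels: int = 6) -> list[int]:
    """Return the tile count at each pyramid level, full resolution first.

    Assumes each level halves the previous one (rounded up) and that
    partial tiles on the right and bottom edges count as whole tiles.
    """
    counts = []
    w, h = width, height
    for _ in range(levels):
        counts.append(math.ceil(w / tile_size) * math.ceil(h / tile_size))
        w, h = math.ceil(w / 2), math.ceil(h / 2)
    return counts


per_level = tiles_per_image(24000, 18000)
per_image = sum(per_level)
print(per_level, per_image)   # [1692, 432, 108, 30, 9, 4] 2275
print(per_image * 10_000)     # 22750000 objects for a 10,000-page collection
```

Running this over the real dimensions of a sample of masters, rather than one representative size, gives a more defensible object count.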
How do I estimate egress, the part that bites?
Egress is the part most people under-budget. The formula:
```text
monthly_egress_GB =
    page_views
    * tiles_per_view      # typically 10–40 for deep zoom
    * avg_tile_size_KB    # typically 20–80 KB
    / 1_000_000           # KB -> GB
```

For 200,000 page views a month, 25 tiles per view, 40 KB per tile:

```text
200000 * 25 * 40 / 1000000 = 200 GB/month
```

At a representative cloud egress rate, that is a modest but non-trivial bill, and it scales linearly with popularity, which is exactly when you least want a surprise.
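Because the inputs are ranges rather than fixed values, it helps to compute a low and a high case next to the base case. The sketch below implements the egress formula as written, using decimal units (1 GB = 1,000,000 KB). The function name `monthly_egress_gb` is illustrative.

```python
def monthly_egress_gb(page_views: int, tiles_per_view: float, avg_tile_kb: float) -> float:
    """Estimate monthly egress in decimal GB (1 GB = 1,000,000 KB)."""
    return page_views * tiles_per_view * avg_tile_kb / 1_000_000


base = monthly_egress_gb(200_000, 25, 40)
low = monthly_egress_gb(200_000, 10, 20)    # bottom of both typical ranges
high = monthly_egress_gb(200_000, 40, 80)   # top of both typical ranges

print(base, low, high)  # 200.0 40.0 640.0
```

The same traffic spans a 16x range depending on tiles per view and tile size, which is why both are worth measuring on a sample rather than guessing.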
Why does a CDN change the maths?
Most tile requests are repeats: the same cover image, the same first opening, the same popular region. A CDN caches those, so the origin only serves cache misses. With a 90%+ hit ratio, your origin compute and origin egress both drop by an order of magnitude. Put differently: a CDN converts a per-request cost into a near-fixed one, and is almost always cheaper than scaling the origin to absorb peaks.
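```nginx
# Make tiles cacheable so the CDN can do its job
location /iiif/ {
    add_header Cache-Control "public, max-age=31536000, immutable" always;
    add_header Access-Control-Allow-Origin "*" always;
}
```

Immutable, long-lived caching is safe because an IIIF Image API URL fully encodes the request (identifier, region, size, rotation, quality, format), so the same URL returns the same bytes for as long as the image behind the identifier does not change. If a master may be replaced under the same identifier, use a shorter max-age or issue a new identifier.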
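The effect of the hit ratio on origin traffic can be sketched in Python. This assumes every cache miss is fetched once from the origin and ignores revalidation traffic. The helper `split_by_cdn` is a hypothetical name, not part of any CDN SDK.

```python
def split_by_cdn(total_gb: float, hit_ratio: float) -> tuple[float, float]:
    """Split delivered traffic into (origin_gb, served_from_cache_gb).

    Assumes every cache miss is fetched exactly once from the origin.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    origin_gb = total_gb * (1.0 - hit_ratio)
    return origin_gb, total_gb - origin_gb


for ratio in (0.0, 0.5, 0.9, 0.99):
    origin, cached = split_by_cdn(200.0, ratio)
    print(f"hit ratio {ratio:.0%}: origin {origin:6.1f} GB, from cache {cached:6.1f} GB")
```

Moving from a 90% to a 99% hit ratio cuts origin traffic by another factor of ten, so the hit-ratio assumption deserves as much scrutiny as the page-view forecast.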
A worked estimate you can adapt
For a mid-size public collection:
- Storage: 2 TB of pyramidal masters → predictable monthly storage charge.
- Compute: one small always-on Cantaloupe instance (2 vCPU) plus burst — or zero if fully static behind a CDN.
- Egress: 200 GB/month delivered to viewers, of which only ~20 GB is served from the origin after a 90% CDN hit ratio.
- CDN: delivered traffic is billed at its own per-GB rate, usually cheaper than origin egress.
Document each assumption (views, tiles-per-view, tile size, hit ratio) so the estimate is defensible and can be revisited when traffic grows.
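One way to keep those assumptions explicit is to hold them in a single record and derive every line item from it. The sketch below is illustrative only: all four rates are arbitrary placeholders, not real provider prices, and it assumes origin-to-CDN traffic is billed as origin egress, which some providers waive. The names `Assumptions` and `monthly_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumptions:
    # Traffic and storage figures from the worked estimate
    storage_tb: float = 2.0
    page_views: int = 200_000
    tiles_per_view: float = 25
    avg_tile_kb: float = 40
    cdn_hit_ratio: float = 0.90
    # PLACEHOLDER rates, not real prices: substitute your provider's figures
    storage_rate_per_gb: float = 0.02
    origin_egress_rate_per_gb: float = 0.09
    cdn_rate_per_gb: float = 0.05
    compute_per_month: float = 60.0


def monthly_cost(a: Assumptions) -> dict[str, float]:
    """Derive each monthly line item, plus a total, from the assumptions."""
    delivered_gb = a.page_views * a.tiles_per_view * a.avg_tile_kb / 1_000_000
    origin_gb = delivered_gb * (1 - a.cdn_hit_ratio)  # cache misses only
    items = {
        "storage": a.storage_tb * 1000 * a.storage_rate_per_gb,
        "compute": a.compute_per_month,
        "origin_egress": origin_gb * a.origin_egress_rate_per_gb,
        "cdn_delivery": delivered_gb * a.cdn_rate_per_gb,
    }
    items["total"] = sum(items.values())
    return items


for name, cost in monthly_cost(Assumptions()).items():
    print(f"{name:14s} {cost:8.2f}")
```

Because the record is frozen, a revised estimate means creating a new `Assumptions` instance, which leaves a clear trail of what changed between versions.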
A defensible costing checklist
- [ ] Master format and per-image size measured, not guessed.
- [ ] Tile count per image computed from real dimensions and levels.
- [ ] Page-view forecast with a source (analytics or comparable site).
- [ ] Tiles-per-view and average tile size measured on a sample.
- [ ] CDN hit-ratio assumption stated explicitly.
- [ ] Object-storage operation charges included, not just GB stored.
- [ ] A growth scenario (2–5x traffic) costed alongside the base case.
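The last two checklist items can be costed together by scaling the base case and counting origin requests as well as gigabytes. This is a rough sketch, assuming the hit ratio stays constant as traffic grows (in practice it often improves with popularity). The function `scenario` and its field names are illustrative.

```python
def scenario(page_views: int, tiles_per_view: float, avg_tile_kb: float,
             hit_ratio: float, growth: float) -> dict[str, float]:
    """Scale the base case by `growth` and report traffic and request volumes."""
    tile_requests = page_views * growth * tiles_per_view
    return {
        "delivered_gb": tile_requests * avg_tile_kb / 1_000_000,
        # Cache misses reach the bucket or image server and incur operation charges
        "origin_requests": tile_requests * (1 - hit_ratio),
    }


for growth in (1, 2, 5):
    s = scenario(200_000, 25, 40, 0.90, growth)
    print(f"{growth}x: {s['delivered_gb']:7.1f} GB delivered, "
          f"{s['origin_requests']:12,.0f} origin requests")
```

Multiplying `origin_requests` by the provider's per-operation rate gives the charge that a GB-only estimate leaves out.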
Key Takeaways
- Three drivers: storage, compute, egress — egress and compute usually dominate.
- Dynamic servers minimise storage but pay CPU per uncached request; static minimises compute but multiplies file count.
- Estimate egress as views × tiles-per-view × tile size; it scales with popularity.
- A CDN with a 90%+ hit ratio is the biggest single cost reduction.
- JP2 saves storage but costs more decode CPU; pyramidal TIFF is the reverse.
- Document every assumption so the estimate stays consistent and defensible across the collection.
Frequently Asked Questions
What are the main cost drivers for IIIF hosting?
Storage of master images, compute for the image server (or pre-tiling jobs), and egress bandwidth from serving tiles. For public collections, egress and compute usually dominate over raw storage.
Is static IIIF cheaper than running an image server?
Often yes for read-heavy public access, because pre-generated tiles on object storage with a CDN avoid per-request compute. The trade-off is large upfront tiling and far more stored files.
How much storage do pyramidal masters need?
Budget roughly 1.5 to 3x the size of a flat JPEG master for a pyramidal TIFF, or roughly 0.4 to 0.8x for JP2. Static pre-tiling multiplies file count enormously, but each tile is small.
Why does a CDN lower IIIF costs so much?
Most tile requests are repeats of popular regions. A CDN serves those from cache, cutting both origin compute and origin egress, which are the expensive parts. Cache hit ratios above 90% are common.
How do I estimate egress for a collection?
Multiply expected page views by tiles-per-view (often 10 to 40) by average tile size (20 to 80 KB). Add a margin for crawlers and embeds, then apply your provider's per-GB egress rate.
Does JPEG 2000 reduce hosting cost versus pyramidal TIFF?
It reduces storage cost because JP2 files are smaller, but decoding JP2 is more CPU-intensive, so a dynamic server may need more compute. Weigh storage savings against per-request CPU.