Self-describing file format for gigapixel images?
- by Adam Goode
In medical imaging, there appears to be two ways of storing huge gigapixel images:
Use lots of JPEG images (either packed into files or individually) and cook up some bizarre index format to describe what goes where. Tack on some metadata in some other format.
Use TIFF's tile and multi-image support to cleanly store the images as a single file, and provide downsampled versions for zooming speed. Then abuse various TIFF tags to store metadata in non-standard ways. Also, store tiles with overlapping boundaries that must be individually translated later.
In both cases, the reader must understand the format well enough to understand how to draw things and read the metadata.
Is there a better way to store these images? Is TIFF (or BigTIFF) still the right format for this? Does XMP solve the problem of metadata?
The main issues are:
Storing images in a way that allows for rapid random access (tiling)
Storing downsampled images for rapid zooming (pyramid)
Handling cases where tiles are overlapping or sparse (scanners often work by moving a camera over a slide in 2D and capturing only where there is something to image)
Storing important metadata, including associated images like a slide's label and thumbnail
Support for lossy storage
What kind of (hopefully non-proprietary) formats do people use to store large aerial photographs or maps? These images have similar properties.