NDL - Evaluation of various migration paths

User Scenario ID: 6.1
Author: Pekka Mustonen, CSC

The Finnish National Digital Library (NDL) project, launched by the Finnish Ministry of Education and Culture in 2008, brings the achievements of culture and science to the general public. The project aims to improve the availability and usability of the key national information resources of libraries, archives and museums in information networks, and to develop long-term preservation solutions for digital cultural heritage content. The long-term preservation section of the NDL project has prepared a plan describing a model for a centralized national long-term preservation solution for the digital objects of the memory organisations responsible for preserving cultural heritage.

In the National Digital Library project, file formats are divided into two categories: "acceptable for preservation" and "acceptable for transfer". For example, the file formats used in the MS Office suite are considered "acceptable for transfer", but such files are converted into a long-term preservation format before being archived.

We want to study various migration paths from "acceptable for transfer" formats to "acceptable for preservation" formats so that we can instruct depositors in preservation planning.

Type of digital information

According to a recent study, the number of digital objects currently due to be deposited in the NDL long-term preservation system is roughly 687 000 000 (2500 TB), and the collection is estimated to reach 1 458 000 000 objects (5700 TB) in 2015, when the system will be in production. Only a tiny share of this will be available for testing.
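To put these estimates in perspective, the short sketch below derives the mean object size and the growth factors implied by the figures above. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the study quoted here does not specify; the function and variable names are illustrative only.

```python
# Figures quoted in the text above. Decimal units are an assumption.
TB = 10**12  # bytes per terabyte (assumed decimal)
MB = 10**6   # bytes per megabyte (assumed decimal)

current = {"objects": 687_000_000, "terabytes": 2500}
projected_2015 = {"objects": 1_458_000_000, "terabytes": 5700}


def mean_object_size_mb(estimate):
    """Average size of one object in megabytes."""
    return estimate["terabytes"] * TB / estimate["objects"] / MB


def growth_factor(before, after, key):
    """Ratio of the later estimate to the earlier one for the given key."""
    return after[key] / before[key]


if __name__ == "__main__":
    print(f"Mean object size now:  {mean_object_size_mb(current):.2f} MB")
    print(f"Mean object size 2015: {mean_object_size_mb(projected_2015):.2f} MB")
    print(f"Object count growth:   {growth_factor(current, projected_2015, 'objects'):.2f}x")
    print(f"Volume growth:         {growth_factor(current, projected_2015, 'terabytes'):.2f}x")
```

Under these assumptions the mean object size is roughly 3.6 MB today and 3.9 MB in 2015, while both the object count and the data volume slightly more than double.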

Possible migration paths include any of the following (note: the scenarios by DNB are also very relevant to us):

  • "Text":
    • Acceptable for transfer: Microsoft Word for Windows Document
    • Acceptable for preservation: Open Document Format (ODF), PDF for long-term preservation (PDF/A)
  • Audio:
    • Acceptable for transfer: Audio Interchange File Format (AIFF), MPEG-1 Layer 3, MPEG-2 Layer 3 (MP3), MPEG-4 Advanced Audio Coding (AAC), Windows Media Audio (WMA)
    • Acceptable for preservation: Broadcast Wave Format (BWF), Waveform Audio Format (WAV), AIFF (PCM-coded), AAC
  • Video:
    • Acceptable for transfer: Audio Video Interleave (AVI), Moving Picture Experts Group (MPEG-2), Moving Picture Experts Group (MPEG-4), QuickTime (MOV), Windows Media Video (WMV)
    • Acceptable for preservation: JPEG 2000 MXF or Motion JPEG 2000
  • Still images:
    • Acceptable for transfer: Encapsulated PostScript (EPS), Graphics Interchange Format (GIF), Portable Network Graphics (PNG)
    • Acceptable for preservation: Joint Photographic Experts Group (JPEG), JPEG 2000 (JP2), Tagged Image File Format (TIFF)
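
The format lists above can be captured as a small lookup table from which candidate migration paths are enumerated. The following is only a sketch: the short format labels (for example "DOC" for Microsoft Word for Windows Document) are shorthand introduced here, and pairing every transfer format with every preservation format of the same content type is an assumption, not an NDL rule.

```python
from itertools import product

# Formats per content type, taken from the lists above.
# The short labels are illustrative shorthand, not official NDL identifiers.
MIGRATION_FORMATS = {
    "text": {
        "transfer": ["DOC"],
        "preservation": ["ODF", "PDF/A"],
    },
    "audio": {
        "transfer": ["AIFF", "MP3", "AAC", "WMA"],
        "preservation": ["BWF", "WAV", "AIFF (PCM)", "AAC"],
    },
    "video": {
        "transfer": ["AVI", "MPEG-2", "MPEG-4", "MOV", "WMV"],
        "preservation": ["JPEG 2000 MXF", "Motion JPEG 2000"],
    },
    "still image": {
        "transfer": ["EPS", "GIF", "PNG"],
        "preservation": ["JPEG", "JP2", "TIFF"],
    },
}


def migration_paths(content_type):
    """List (source, target) pairs for one content type.

    Assumption: any transfer format may be migrated to any preservation
    format of the same type. Pairs with identical labels are skipped,
    since no conversion is needed for them.
    """
    entry = MIGRATION_FORMATS[content_type]
    return [
        (source, target)
        for source, target in product(entry["transfer"], entry["preservation"])
        if source != target
    ]


if __name__ == "__main__":
    for content_type in MIGRATION_FORMATS:
        for source, target in migration_paths(content_type):
            print(f"{content_type}: {source} -> {target}")
```

A table like this gives a checklist of the paths that an evaluation would have to cover; in practice only a subset is likely to be worth testing.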

Link to sample data: Not yet available.
Threat(s) to the data: Information loss during the conversion
Usage: (Mainly) producers
Success Criteria: Object properties are preserved with satisfactory quality

-- HeikkiHelin - 2011-09-09

