List of users of Large Scale Digital Preservation Services

To understand what it means for a digital preservation system to "Operate at Scale", we captured detailed information about archives that are currently recognised as facing large-scale data challenges in the long-term preservation of their digital content. The aim is to gain further insight into which functionality is affected by operating with 'large data'.

The following organisations have been highlighted as demonstrating a business need that can only be addressed within an adequate time frame through scalable processing methods:

US-based Ancestry organisation

| Characteristic | User's Details |
| Total capability (in TB) | Has demonstrated ingest rates of 20 TB/day; now seeking to reach 70 TB/day |
| Number of digital objects and size of each object (e.g. video objects or small documents) | SIPs typically contain about 1000 JPEG2000 and XML files and range in size from several hundred MB to tens of GB |
| Distribution: how geographically dispersed is the system | SIPs are staged to tape at another site some 2000 miles away; the tapes are then flown to the datacentre for ingest by the preservation system |
| Degree of sharing, namely at what level it supports multiple curators and multiple users, and concurrency requirement | ? |
| Security in a multi-tenant environment which hosts data shared by different curators | Multi-tenant capability |
| Availability | ? |
| Architecture | 6 JobQueue servers to process the jobs, 1 application server to provide the workflow management and user interface, 1 database server, 1 registry server, and a GPFS filestore |
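
To put the ingest rates above into perspective, the following is a minimal sketch that converts a daily ingest volume into the sustained throughput it implies. It assumes decimal units (1 TB = 10^12 bytes), ingest spread evenly over 24 hours, and, purely for illustration, work divided equally across the six JobQueue servers. None of these assumptions is stated by the organisation, and the function names are hypothetical.

```python
# Rough throughput arithmetic for the ingest rates quoted above.
# Assumptions (not from the source): decimal units, ingest spread evenly
# over 24 hours, and load shared equally between the JobQueue servers.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def sustained_mb_per_s(tb_per_day: float) -> float:
    """Average throughput in MB/s needed to ingest tb_per_day in one day."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY / BYTES_PER_MB


def per_server_mb_per_s(tb_per_day: float, servers: int) -> float:
    """Average throughput per server if the load is split evenly (an assumption)."""
    return sustained_mb_per_s(tb_per_day) / servers


if __name__ == "__main__":
    for rate in (20, 70):  # demonstrated and target rates in TB/day
        total = sustained_mb_per_s(rate)
        each = per_server_mb_per_s(rate, servers=6)
        print(f"{rate} TB/day: {total:.1f} MB/s overall, {each:.1f} MB/s per server")
```

Under these assumptions, 20 TB/day corresponds to roughly 231 MB/s sustained, and the 70 TB/day target to roughly 810 MB/s. Real ingest is unlikely to be this uniform, so peak requirements would be higher.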

-- AshHunter - 2012-12-24

Topic revision: r1 - 2012-12-24 - AshHunter