WP27 Scalability: amending the DoW

The idea is to use this page to amend the text for the WP description in the DoW. Please edit this wiki page directly, and use the "insert" and "delete" mark-up to show changes, as illustrated:

Here is some unchanged text.
<ins>Here is some text to be inserted.</ins>
<del>Here is some text to be deleted.</del>
Here is some more unchanged text

Start month    End month    WP leader
20             31           IBM


There are many aspects to the scalability of preservation systems. Scalability needs to address:

  • Total capacity (in TB)
  • Number of digital objects and size of each object (e.g. video objects or small documents)
  • Distribution – how geographically dispersed the system is
  • Degree of sharing, namely the level at which the system supports multiple curators and multiple users, and its concurrency requirements
  • Security in a multi-tenant environment which hosts data shared by different curators
  • Availability – are objects expected to be available at any time from anywhere?
  • Number of versions of the same object
  • Connections between different objects (e.g., the connection between a publication and the underlying data it uses)
  • Amount of metadata and connections between metadata
  • Variety of data types
  • Substructure
  • Searchability

The objective of this workpackage is to understand what the important scalability parameters are in preservation systems. Of particular importance is developing preservation support services that can be shared by many data curators, which can lead to a reduced-cost infrastructure. A further objective is identifying gaps in technology that prevent us from reaching the right level of scalability.

Description of work and role of partners

Task 2710 Scalability of services

This task will review the scalability of storage and other techniques used and needed by partners. This will be contrasted with the extremely scalable solutions that exist today in the form of Cloud Storage providers. A starting point can be Tessella's work on measuring the scalability of the SDB solution and NARA’s ERA solutions. The scalability challenges identified in the Warwick Workshop report should also be addressed, for example dealing with hundreds of billions of objects and objects of many petabytes. The initial work will consist of obtaining answers from partners, from existing surveys and from our own questionnaires about scalability needs for long-term digital preservation. We will obtain information from other projects considering scalability, such as the SCIDIP-ES, SCAPE and ENSURE projects.

In addition, information will be collected about the scalability of existing technology such as Tessella SDB, cloud storage, DuraSpace, etc., as well as in-house systems used by APARSEN partners.

Task 2720 Recommendations about scalability

Evaluation and recommendations. The task will identify the most important scalability parameters and dimensions needed for the partners' preservation systems. It will analyze the implications for issues such as security and the cost required to attain such levels of scalability. Specific recommendations will be given, with clear guidelines for how tools and techniques can be incorporated into real environments and, in particular, the testing environments identified in workpackage 1400. Some of these tools can include external storage. The task will also identify gaps in technology that need to be addressed in order to attain the required levels of scalability. Recommendations will be given on tools and techniques. The recommendations report will be used by the VCoE to direct activities towards covering identified gaps and requirements.

List of deliverables

  • D27.1 Recommendations about scalability (M31)

Description of deliverables

D27.1) Recommendations about scalability: This report will summarise the scalability challenges expected over the next decade or more, and possible responses to them. This will include estimates from partners and associated stakeholders about their requirements. [month 31]

Further revisions in response to feedback from Project Officer 12th/14th December

-- SimonLambert - 2012-10-16

Topic revision: r7 - 2012-12-15 - SimonLambert