Proposal for Session (deadline April 30th 2012)

Digital Humanities Congress

http://www.shef.ac.uk/hri/dhc2012

Sheffield, 6 – 9 September 2012

The following APARSEN partners have expressed interest in contributing to the session:

  • Maurizio Lunghi
  • William Kilbride
  • Hervé L'Hours
  • ...

Title: Digital Preservation for Digital Humanities. Contributions by the APARSEN Network of Excellence.

Organiser: René van Horik, Data Archiving & Networked Services (DANS), the Netherlands

Abstract: The goal of the EU-funded APARSEN project, which runs from 2011 to 2014, is to establish a Network of Excellence (NoE) on digital preservation. The more than 30 partners in the project represent the cultural heritage sector, the corporate sector, and the research community, of which the digital humanities form an important group of stakeholders. The aim of this session is to present some of the current results of the project, grouped around the “trust” topic, and to get feedback on these outcomes from the digital humanities community. Trust is related to issues such as the reliability and quality of research data, the reputation of data repositories, and digital preservation standards. The session consists of the following five parts:

  1. Introduction (15 minutes)
  2. Certification of repositories (15 minutes)
  3. Reputation and trustability of datasets, publications and people (15 minutes)
  4. Authenticity (15 minutes)
  5. Questions and feedback (30 minutes)

1. Introduction (15 minutes) Speaker: ...

The APARSEN project (“Alliance Permanent Access to the Records of Science in Europe Network”) builds on the already established Alliance for Permanent Access (APA), a membership organisation of major European stakeholders in digital data and digital preservation. These stakeholders have come together to create a shared vision and framework for a sustainable digital information infrastructure providing permanent access. The three main streams of the APARSEN project are (1) integration of ideas and approaches on digital preservation, (2) research into the technical, economic and legal issues of digital preservation, and (3) the spreading of excellence through training events, awareness raising and liaison activities.

The work packages of the APARSEN project are grouped into topics. Besides the “trust” topic, which is the main subject of this session and was the main focus of the first year of the project, the following topics can be distinguished: “sustainability”, “usability” and “access”. These topics are covered by work packages that are currently active or will be active during the remainder of the project. The outcomes of the topic areas are integrated into the overall vision for realising a Network of Excellence on digital preservation. The current state of the art concerning this common vision on digital preservation will be presented. Within the APARSEN project the “Reference Model for an Open Archival Information System” (OAIS) is the basic point of departure for all activities.

2. Certification of repositories (15 minutes) Speaker: ...

Data repositories store, manage and disseminate research data sets. Within the APARSEN project, activities are undertaken to assess the quality of these repositories. A number of parties have taken the initiative to create a European framework for audit and certification of digital repositories (see: http://www.trusteddigitalrepository.eu). The framework will consist of a sequence of three levels of increasing trustworthiness. Basic Certification is granted to repositories that obtain “Data Seal of Approval” certification. Extended Certification is granted to Basic Certification repositories that in addition perform a structured, externally reviewed, and publicly available self-audit based on the ISO 16363 or DIN 31644 standards. Formal Certification is granted to repositories that, in addition to Basic Certification, obtain full external audit and certification based on ISO 16363 or the equivalent DIN 31644. Within the APARSEN project a number of test audits of data repositories are being carried out. This presentation will cover the three certification procedures and present the test-audit activities. The aim is to create an infrastructure of trusted digital repositories in which data objects relevant for digital humanities research and other sectors are curated.
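The three levels described above can be read as a simple decision rule. The following is a minimal sketch of that rule, not part of the framework itself: the class, field and function names are hypothetical, and the three boolean flags are an assumed simplification of what an actual audit would record.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Certification levels, ordered by increasing trustworthiness."""
    NONE = 0
    BASIC = 1
    EXTENDED = 2
    FORMAL = 3


@dataclass
class Repository:
    # All field names are hypothetical simplifications of audit outcomes.
    name: str
    has_data_seal: bool = False        # Data Seal of Approval obtained
    reviewed_self_audit: bool = False  # structured, externally reviewed, public self-audit
    full_external_audit: bool = False  # full external audit and certification


def certification_level(repo: Repository) -> Level:
    """Return the highest level the repository qualifies for."""
    if not repo.has_data_seal:
        # Basic Certification is the precondition for both higher levels.
        return Level.NONE
    if repo.full_external_audit:
        return Level.FORMAL
    if repo.reviewed_self_audit:
        return Level.EXTENDED
    return Level.BASIC
```

In this sketch a repository without the Data Seal of Approval stays at `Level.NONE` even if it has been externally audited, which reflects the statement that both higher levels build on Basic Certification.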

3. Reputation and trustability (15 minutes) Speaker: ...

A second group of “trust” activities carried out in the APARSEN project concerns the reputation and trustability of datasets, publications and people. In this presentation two types of activities are presented. The first is a technological approach to the reputation of digital preservation tools and techniques. To prevent the obsolescence of digital objects, such as databases and software, a number of tools and techniques have been developed. Within the APARSEN project a framework is being developed to assess the quality of these digital preservation tools and techniques. This framework also includes a classification of digital objects and a number of applicable test bed techniques. Secondly, attention is paid to annotations, that is, information added to data. The function of these annotations is to enable the evaluation of the quality of the data objects. Various annotation types can be distinguished, such as descriptive and provenance metadata or evaluation remarks in a peer-review system. Both the tools and techniques and the annotations help determine the trustworthiness of data objects.

4. Authenticity (15 minutes) Speaker: ...

A third cluster of work packages related to “trust” in the APARSEN project is grouped around authenticity. Authenticity can be defined as the degree to which a person or system regards an object as what it is purported to be. Authenticity is judged on the basis of evidence. Two authenticity topics are covered in the presentation. First, attention is paid to the role of persistent identifiers with respect to the authenticity of digital objects. The second topic concerns a “digital object lifecycle model”, aimed at managing the authenticity and provenance of digital objects.

The persistent identification of objects (both digital and non-digital) is becoming essential to allow their citation, retrieval and preservation. Persistent identifiers are crucial for preserving, managing, accessing and re-using huge amounts of data over time. A number of solutions for identifying objects have been proposed in different domains, and several standards are currently at a mature stage of development. The presentation investigates interoperability issues between several persistent identifier systems and proposes a general Interoperability Framework (IF) as a starting point for designing new solutions to support interoperability. The IF considers persistent identifiers as the combination of technology, policies and decisions implemented by a user community.

To identify the main events that affect authenticity and provenance, a model of the digital object lifecycle is presented. For each of these events, evidence has to be gathered in order to document the history of the digital object properly. This evidence consists of both technical and non-technical elements. The technical elements include controls on the integrity of the digital object (such as checking the bit sequences of the digital object over time). The non-technical elements range from the identity of the author and the set of custodians to elements that provide evidence of the reliability of the creation system and of the trustworthiness of the custodian. In short, “authenticity evidence” must make it possible to trace, across the whole lifecycle of the digital object since its creation, all the transformations that the object has undergone and that may have affected its authenticity and provenance. The model for managing authenticity and provenance through the digital resource lifecycle consists of two phases: the pre-ingest phase and the long-term digital preservation (LTDP) phase. The pre-ingest phase comprises the following core set of events: capture, integrate, aggregate, delete, migrate, transfer and submit. The LTDP phase consists of the following events: ingest, aggregate, extract, migrate, delete and transfer. The presentation will elaborate on the model, concentrating on its possible relevance for the digital humanities community.
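As an illustration of how evidence might be gathered per lifecycle event, the sketch below records each event together with a checksum of the object's bit sequence. It is a hedged example under stated assumptions, not the APARSEN model itself: the event names per phase come from the text above, while the class and function names, the `agent` field, and the choice of SHA-256 as the integrity control are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field

# Event sets per phase, as listed in the text above.
ALLOWED_EVENTS = {
    "pre-ingest": {"capture", "integrate", "aggregate", "delete",
                   "migrate", "transfer", "submit"},
    "LTDP": {"ingest", "aggregate", "extract", "migrate", "delete", "transfer"},
}


def fixity(data: bytes) -> str:
    """Checksum of the bit sequence (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class EvidenceRecord:
    phase: str     # "pre-ingest" or "LTDP"
    event: str     # one of the events allowed in that phase
    agent: str     # non-technical evidence: who acted (hypothetical field)
    checksum: str  # technical evidence: integrity control value


@dataclass
class ObjectHistory:
    """Hypothetical append-only history of one digital object."""
    records: list = field(default_factory=list)

    def record(self, phase: str, event: str, agent: str, data: bytes) -> EvidenceRecord:
        # Reject events that the model does not define for this phase.
        if event not in ALLOWED_EVENTS.get(phase, set()):
            raise ValueError(f"event {event!r} is not defined for phase {phase!r}")
        rec = EvidenceRecord(phase, event, agent, fixity(data))
        self.records.append(rec)
        return rec

    def unchanged_since_last_event(self, data: bytes) -> bool:
        """True if the bits match the checksum stored at the latest event."""
        return bool(self.records) and self.records[-1].checksum == fixity(data)
```

A custodian could call `record("pre-ingest", "capture", "author", data)` when the object is created and later use `unchanged_since_last_event(data)` as a periodic integrity check. Events such as migrate legitimately change the bit sequence, so a new record with a new checksum would be written at that point.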

5. Questions and feedback (30 minutes)

The ideas and requirements concerning digital preservation within the digital humanities are very valuable for the APARSEN project, as they will enable the project to adjust and improve its results. The question and feedback session will facilitate the exchange of ideas and questions concerning the role of digital preservation in digital humanities.

-- ReneVanHorik - 2012-04-27

 