WP24 Authenticity and provenance: amending the DoW

The idea is to use this page to amend the text for the WP description in the DoW. Please edit this wiki page directly, and use the "insert" and "delete" mark-up to show changes, as illustrated:

Here is some unchanged text.
<ins>Here is some text to be inserted.</ins>
<del>Here is some text to be deleted.</del>
Here is some more unchanged text

Start month    End month    WP leader
3              14           CINI

Objectives

To review and recommend authenticity systems.

Description of work and role of partners

The revised OAIS defines Authenticity as the degree to which a person (or system) regards an object as what it is purported to be. Authenticity is judged on the basis of evidence. Much of this evidence consists of Provenance Information, which is the information that documents the history of the Content Information. This information records the origin or source of the Content Information, any changes that may have taken place since it was originated, and who has had custody of it since it was originated. The archive is responsible for creating and preserving Provenance Information from the point of Ingest; however, earlier Provenance Information should be provided by the Producer. Provenance Information adds to the evidence to support Authenticity.

In addition, fixity is important to ensure that the bit sequences have not been changed where they should not have been. In some cases the data will have to be Transformed. OAIS points out that where the transformation is irreversible one can define a Transformational Information Property: an Information Property whose preservation is regarded as being necessary but not sufficient to verify that any Non-Reversible Transformation has adequately preserved information content. This could be important as a contribution to the evidence about Authenticity. Such Information Properties will need to be associated with specific Representation Information, including Semantic Information, to denote how they are encoded and what they mean. (The term ‘significant property’, which has various definitions in the literature, is sometimes used in a way that is consistent with its being a Transformational Information Property.)
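As an illustration of the fixity point above, the following is a minimal sketch of a digest-based check, assuming a SHA-256 digest is recorded when the bit sequence is received. The function names and the sample data are hypothetical and are not taken from OAIS or from any partner system.

```python
import hashlib


def fixity_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of a bit sequence (illustrative helper)."""
    return hashlib.new(algorithm, data).hexdigest()


def fixity_unchanged(data: bytes, recorded_digest: str,
                     algorithm: str = "sha256") -> bool:
    """Compare the current digest with the one recorded earlier."""
    return fixity_digest(data, algorithm) == recorded_digest


# Hypothetical example: record a digest at ingest, then re-check later.
original = b"Content Information as received at Ingest"
recorded = fixity_digest(original)

print(fixity_unchanged(original, recorded))         # same bits
print(fixity_unchanged(original + b"!", recorded))  # altered bits
```

A matching digest only shows that the bits are unchanged since the digest was recorded; it says nothing about who recorded it, which is why the text treats fixity as one kind of evidence alongside Provenance Information.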

Provenance may reasonably [16] be divided into two kinds. The first is what we might term Technical Provenance: things that, for example, are recorded fairly automatically by software. This must be supplemented by Non-technical Provenance, by which we mean, for example, the information about the people who are in charge of the Content Information, that is, the people who could perhaps fake the other PDI. In other words, in order to judge whether we can trust the multitude of information that surrounds the Content Information, we must be able to judge whether we trust the people who were responsible for collecting it, and who may perhaps have been able to fake it.

Task 2410 Review of Authenticity systems

This task will:

  • Review work on authenticity and ways in which provenance, fixity and context are recorded by partners. This will include reviews of theoretical work on Provenance [14] as well as attempts at integrated models [15]. In addition, techniques [11] for ensuring that the bits have not been tampered with will be investigated.
  • Evaluate the methodology for authenticity as delineated by CASPAR and InterPARES [17] as based on OAIS

Effort will be dedicated to a more complete definition of the interaction among the various components of preservation systems, with specific reference to their accuracy and reliability in relation to the procedures and processes of the authenticity protocols identified by the CASPAR project on the basis of InterPARES concepts and OAIS models. These concepts and methods will be investigated in the other domains present in the network to identify the basis on which they can be applied and developed (specifically the methodology for setting protocols for authenticity and defining specific steps).

The issue of Transformational Information Properties will also be investigated with reference to the level of granularity required to ensure integrity and identity (important for authenticity) in the transfer from the creators of digital objects to the repositories for preservation.

The main effort will be dedicated to the processes of collecting evidence about authenticity, the automation required and the ability to track information and components available in the lifecycle of digital resources. Further investigation, as part of the general methodology and conceptual framework for authenticity, will include:

1. creation of consistent terminology and definitions, generally accepted and well understood beyond the professional communities involved in the preservation environment (where even now a cross-domain perspective is lacking): the definitions related to the attributes of preservation are not clearly expressed and present dangerous ambiguities with respect to the authenticity goals [44]. New terms (e.g. significant properties) are often required and used but do not necessarily contribute to the solution.

2. development of interrelations and concrete and open cooperation among the results of relevant projects (PREMIS [18], InterPARES, CASPAR, DRAMBORA, RAC [4], CIDOC CRM) with the aim of building an interoperable framework: in many environments, specifically those involving dynamic use of resources, the complexity of the layering and aggregation of those resources means that preservation is not solved only, or even mainly, on the basis of a collection of metadata/information.

3. integration of conceptual models, schemas and business solutions to be developed in the application scenarios for handling relevant tasks such as:

  • presumption and verification of authenticity
  • integrated use of relevant and flexible representation information, possibly connected with approved schemas of descriptive systems (at least in the archival, library and museum domains, where existing standards and recommendations at international level should be able to provide the required knowledge for identification of digital resources): this effort should be developed both with reference to the technical details and by considering the semantic level required by the designated communities

Task 2420 Evaluation of authenticity evidence

Develop (or adopt) methods for creating evidence for the evaluation of authenticity which will be trusted by users.

This will be carried out by testing the various proposed systems against the data held by the project partners.

Task 2430 Provenance Interoperability and Reasoning

Of particular interest is interoperability of Provenance systems. We will extend the CRM Digital (a specialization of the ISO 21127:2006 standard CIDOC CRM) to the needs of digital preservation for various application domains. We would like to introduce the topic of "generalized reasoning on scientific Digital Provenance data for Digital Preservation". In the framework of the IP 3D-COFORM, we have considerably enhanced our Digital Provenance models from CASPAR to describe in any required detail the very complex data acquisition and data processing processes, both on an atomic level (processing step by processing step) and on an integrated level (from acquisition to data ready for publishing). A single acquisition process may create thousands of images and several terabytes of data. The complex processes yielding massive intermediate data and multiple versions of final products, and reprocessing with improved methods or corrected input, give rise to a need for complex generic reasoning over provenance data in order to solve digital preservation tasks, such as inheritance of properties from super- to subprocesses, inheritance along processing steps, merging metadata of intermediate steps, relevance, assessment, obsolescence control, "garbage collection" and appraisal, and others. In e-science, formats change rapidly, and frequently can only be identified via the software release that created the data. There is a need for an advanced schema (ontology) for registries of processing tools and their formats, as well as for some forms of reasoning.
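One of the reasoning tasks listed above, inheritance of properties from super- to subprocesses, can be sketched as follows. This is only an illustration under simple assumptions: each process holds a flat set of key/value properties, a subprocess inherits everything from its superprocesses, and a locally stated value overrides an inherited one. The Process class, the property names and the values are hypothetical and do not correspond to CRM Digital classes or to the 3D-COFORM models.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Process:
    """Hypothetical provenance record for one (sub)process."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Process"] = None


def effective_properties(process: Process) -> Dict[str, str]:
    """Merge properties from the outermost superprocess down to the
    given process, so that more specific values override inherited ones."""
    chain = []
    node: Optional[Process] = process
    while node is not None:
        chain.append(node)
        node = node.parent
    merged: Dict[str, str] = {}
    for node in reversed(chain):  # outermost superprocess first
        merged.update(node.properties)
    return merged


# Hypothetical acquisition campaign with one subprocess.
acquisition = Process("acquisition",
                      {"operator": "lab-A", "device": "scanner-1"})
capture = Process("image-capture", {"device": "camera-2"},
                  parent=acquisition)

print(effective_properties(capture))
```

The override rule is a design choice made for this sketch; a real schema would have to state, property by property, whether a value is inherited, overridden or merged.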

List of deliverables

  • D24.1 Report on authenticity and plan for interoperable authenticity evaluation system (M14)
  • D24.2 Implementation and testing of an authenticity protocol on a specific domain (M14)

Description of deliverables

D24.1) Report on authenticity and plan for interoperable authenticity evaluation system: This report will describe the common view about how best to capture evidence about authenticity and to evaluate authenticity in a common way that allows the interoperability required to support changes in data holders and processing workflows. [month 14]

D24.2) Implementation and testing of an authenticity protocol on a specific domain: On the basis of the results of the CASPAR project on authenticity protocols, we will define and test an operational version of the protocol for preservation processes in the e-gov domain, usable across APARSEN members. [month 14]


Further revisions in response to feedback from Project Officer 12th/14th December

-- SimonLambert - 2012-10-16

Topic revision: r2 - 2012-12-15 - SimonLambert
 