Post Year 1 Review Recommendations and Actions

1 Background

On 2-3 February 2012 (i.e. month 14 of the APARSEN project) the EU held the end of year 1 review of APARSEN. This review was focussed on the first year’s work, but it also looked at the drafts of the deliverables due in month 14, which included D14.1.

The reviewers have now produced their report on the review (Review_Report_P1_Aparsen_2011 - final.DOC). In this report they have stated that they are not happy with deliverable D14.1 and that it “would be rejected” in its current form. They will be carrying out an interim review in month 16 and expect to see an updated version of D14.1 then.

This document looks at the reviewers’ comments on D14.1 (which are reproduced in full in the appendix) and the actions that need to be taken to get deliverable D14.1 in a form likely to be acceptable to the reviewers.

2 Deliverable D14.1

Deliverable D14.1 is the deliverable of APARSEN’s work package 14, common testing environments.

2.1 Overall Content of the Document

The reviewers state that it should be “a report on testing environments”. Furthermore, they state that it “should be a methodologically sound compare and contrast of available environments with a view to providing a service to the community as part of the VCoE.”

They have also stated that D14.1 should not be “a list of currently available environments”. They have stated that they would like to see the “various types of testbeds ... described impartially and consistently within APARSEN, so that both internal and external stakeholders can have access to all relevant information and details to choose the one most suitable tool for their needs.”

In addition, they have stated that they expect to see “an overview and documentation of this work” at the M16 checkpoint review and that the “mapping of such testbeds is developed and incorporated within the Year 2 reporting period”. The reviewers have also commented on “the lack of rigour in assessing testing environments” demonstrated in D14.1.

2.1.1 Actions

Given these comments, we need to be very clear about our aims, our terminology and our methods to ensure that they are clear, rigorous and “methodologically sound”.

We also need to make “clear what the fundamental unit of preservation is in APARSEN terms”, to address another of the reviewers’ concerns.

2.2 Introduction (Section 1)

The reviewers made three comments which refer specifically to parts of the introduction:
  • “the approach taken is not clearly articulated (D14.1, page 6, para 5);”
  • “the description of user experiences is poor (D14.1, page 6, para 6);”
  • “the methodology for the gap analysis is not described (D14.1, page 6, para 7);”

These comments address the part of the introduction which summarises the test environments evaluation framework, which is described in more detail in section 3. Therefore, we need to address these comments in section 3, and then reflect those changes back into the introduction.

We also need to address the following question from the reviewers: “How and when is it [this work package] intended to be used and by whom?”

2.2.1 Actions

This is our opportunity to start the report with a clear statement of what we're trying to achieve (our aims and objectives) and the basic principles we'll use to get there. It should be based on the description of the objectives of the work package and the description of deliverable D14.1, as the current document is. We should also refer back to the description of work for work package 14, as this is what the EU will judge us on.

Better late than never, we should establish consensus on the meaning of the following terms, which are also used in the DoW:

* test environment
* testbed
* tools and techniques for digital preservation
* digital preservation system

From our point of view, a testbed is a software environment with an example set of data and a software configuration for the purpose of testing preservation tools and techniques. Here we should find consensual short definitions together. The scope of our WP should be delimited from digital preservation systems as a whole. Ashley has already begun this in ACTION 2.3.2.1, but we should emphasize it in the introduction of our deliverable. It could also be helpful to extend the deliverable with a glossary to meet this need.

-- StefanHein - 2012-04-02

I have created a wiki page to maintain a Glossary for the project. This can be found here: DigitalPreservationGlossary

-- AshHunter - 2012-04-18

2.3 Test Environments (Section 2)

2.3.1 Test Protocols used by Previous Projects (Section 2.1)

This section is a combination of a review of prior work on testbeds and a commentary on the different types of digital preservation techniques.

The reviewers have made the following comments which are relevant to this section:

  • “relevant citations on other works done in this area are absent from the draft."
  • "the lack of rigour in assessing testing environments"
  • "testbeds [should] be described impartially and consistently"
  • the report "should be a methodologically sound compare and contrast of available environments".

2.3.1.1 Actions
Separate the review of prior work on testbeds from the commentary on the different types of digital preservation techniques. The review of prior work should be based on section 2.3 of http://aparsen.digitalpreservation.eu/pub/Main/ApanWp14/APARSEN-proposal-for-WP14.doc.

Ensure that a description of the different types of digital preservation techniques is impartial, rigorous, and cites relevant works. At a minimum it should cover migration (lots of work done on this), emulation (KEEP) and extending the representation information and preservation description information held (CASPAR).

Citations of other works done in this area are included in the draft of D14.1; however, they are included in the text, rather than being listed in the references section at the end and cited in the text. So, move the inline citations into the references section (appendix B of D14.1) and then cite the appropriate references in the text.

2.3.2 APARSEN Test Environment Systems (Section 2.2)

Section 2.2 briefly describes each of the test environment systems available to APARSEN. The reviewers have made several comments on the work package’s testbeds:
  • “it seems that real testing on real testbeds have not been performed;”
  • “it is not clear that all available test beds have been collected together (Task 1410);”
  • "It is also important to ensure the independence of the specific scenario from the preservation solutions being used, i.e. the preservation testbeds and tests should be separable from the software being used to manage the objects whether that be a specialist digital preservation system or what might be considered a digital repository (commercial or open source)."

It is clear from the reviewers’ comments that we may need to rethink what testbeds we can use.

2.3.2.1 Actions
Throughout the discussions and work on this work package, the participants have been guilty of conflating various concepts and of using terminology loosely. We need to be clear that a testbed is not a digital preservation system.

A testbed is a testing environment (including testing procedures and test data).

A digital preservation system is a computer system which conforms to the OAIS reference model and includes specific functionality to preserve over the long term the content entrusted to it.

We also need to be clear that the 3 dimensions we need to explore when testing digital preservation systems, tools and techniques are:

  • digital object types
  • digital preservation strategies
  • threats

i.e. which digital preservation strategies (tools and techniques) protect which types of digital objects against which threats.
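
As a purely illustrative sketch (not an agreed APARSEN structure), the Python fragment below shows one way these three dimensions could be held as a simple test matrix; the object types, strategies and threats named in it are placeholders, not the project's taxonomy.

```python
# Minimal sketch (illustrative only): a test matrix recording which preservation
# strategies are tested against which object types and threats. All values are
# placeholders, not the agreed APARSEN taxonomy.

from itertools import product

object_types = ["document", "image", "database"]                  # hypothetical object types
strategies = ["migration", "emulation", "add representation information"]  # hypothetical strategies
threats = ["format obsolescence", "loss of rendering software"]   # hypothetical threats

# One test case per (strategy, object type, threat) combination; the result field
# would record whether the strategy protected that object type against that threat.
test_matrix = {
    (s, o, t): {"testbed": None, "result": "not yet run"}
    for s, o, t in product(strategies, object_types, threats)
}

# Example: record a single (hypothetical) test outcome.
test_matrix[("migration", "document", "format obsolescence")] = {
    "testbed": "example-testbed",
    "result": "pass",
}

if __name__ == "__main__":
    print(f"{len(test_matrix)} test cases across the three dimensions")
```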

Note that in the description of work (DoW), we also stated that: “Besides preservation efficacy one also needs to test against portability, interoperability, robustness and scalability.” However, doing this in the time available is likely to be too much of a stretch, so we may have to drop this aim (at least for the moment).

DNB:

To make our testbeds comparable and consistently described, we think that we need a template like the user scenarios template. The template should give a standardized set of key features. For example, key features could be:

* licence model (open source, commercial)
* runtime environment (Java, PHP?)
* documentation link
* configurability (via XML configuration files?)

If we are able to fill in this template for the provided testbeds, we are convinced this would be a further step forward in improving the section "test environments evaluation framework" (one possible form for such a template is sketched after this comment).

-- StefanHein - 2012-04-02
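
To illustrate the kind of template proposed above, here is a minimal Python sketch of how the suggested key features could be captured in a machine-readable form; the field names and the example entry are assumptions, not an agreed format.

```python
# Minimal sketch (assumptions only): one possible machine-readable form of the
# testbed description template suggested above. The example entry is hypothetical.

from dataclasses import dataclass, asdict

@dataclass
class TestbedDescription:
    name: str
    licence_model: str        # e.g. "open source" or "commercial"
    runtime_environment: str  # e.g. "Java", "PHP"
    documentation_link: str
    configurability: str      # e.g. "XML configuration files"

# Hypothetical example entry; real entries would be filled in from the
# information already collected for each APARSEN testbed.
example = TestbedDescription(
    name="Example testbed",
    licence_model="open source",
    runtime_environment="Java",
    documentation_link="http://example.org/docs",
    configurability="XML configuration files",
)

if __name__ == "__main__":
    print(asdict(example))
```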

2.3.3 Section 2.3: Threats to Digital Material

This section looks briefly at the threats identified by the PARSE.INSIGHT project.

2.3.3.1 Actions
The only problem with this section is that it is missing a reference to the PARSE.INSIGHT paper which lists the threats; someone needs to track this down.

2.4 Test Environments Evaluation Framework (Section 3)

The reviewers made the following comments on this section:
  • Section 3.1, "Common Approach": "the approach taken is not clearly articulated"
  • Section 3.2, "Implementation Strategy": "the methodology for the gap analysis is not described" and "the nature of the taxonomies to be used is not clearly stated"
  • Section 3.3, "Analysis of User Scenarios": "the description of user experiences is poor"
  • Section 3.4, "Classification Schemes": "the classification scheme explored in Section 3.4 is arbitrary and adds nothing to the process because of that."

They also commented that:

“This work package should be able to develop a ‘generic’ testing model and some proved test cases which have replicability and extensibility to other testing issues/problems. Or, if this is not possible, how a heterogeneous approach to testing would be managed.”

2.4.1 Actions

Given these damning comments, we need to rethink section 3. In particular, we need to come up with a rigorous, impartial, consistent (replicable) and methodologically sound approach to evaluating testbeds (i.e. test environments).

We should also drop the classification scheme described in section 3.4 in light of the reviewers’ comments.

DNB:

We agree to drop the classification scheme (section 3.4).

Furthermore, we should clarify the main aim of the evaluation framework and its target group. We think the framework should answer the question "Which is the best testbed for a specific user scenario?" (a sketch of this kind of scenario-to-testbed matching is given after this comment). We believe that the table on page 17 of the deliverable is a good basis, but it needs to be improved. Why should this table not be extended with the information from our spreadsheet (http://aparsen.digitalpreservation.eu/pub/Main/UserScenarios/CTE_Review_FEB2012.xlsx)? Perhaps it would also be helpful to enlarge the clusters here to make the sheet a little more manageable.

-- StefanHein - 2012-04-02
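
As a purely illustrative sketch of the scenario-to-testbed matching mentioned in the comment above, the following Python fragment scores hypothetical testbed descriptions against a scenario's requirements; the feature names, testbeds and requirement values are all assumptions, not content from the deliverable.

```python
# Minimal sketch (illustrative only): scoring testbeds against the requirements of a
# user scenario, as one possible way to answer "Which is the best testbed for a
# specific user scenario?". Feature names and values are placeholders.

def score(testbed_features: dict, scenario_requirements: dict) -> int:
    """Count how many of the scenario's required features the testbed satisfies."""
    return sum(
        1 for key, wanted in scenario_requirements.items()
        if testbed_features.get(key) == wanted
    )

# Hypothetical testbed descriptions (see the template sketch in section 2.3.2.1).
testbeds = {
    "Testbed A": {"licence_model": "open source", "runtime_environment": "Java"},
    "Testbed B": {"licence_model": "commercial", "runtime_environment": "PHP"},
}

# Hypothetical user scenario requirements.
scenario = {"licence_model": "open source", "runtime_environment": "Java"}

best = max(testbeds, key=lambda name: score(testbeds[name], scenario))
print(f"Best match for this scenario: {best}")
```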

David has provided us with the full text from his book here:

* Types_of_Digital_Objects.docx: Chapter 4 from David's Book describing the classification of digital objects

2.5 Future Direction (Section 4)

Section 4 is very short and the reviewers have not commented on it explicitly. However, they have stated that the “mapping [i.e. an impartial and consistent description of testbeds] should be completed by the end of Year 2, and updated on a rolling basis.”

Therefore, we should ensure that this section makes clear that we intend to do this.

ACTIONS

| Action | Who | Description | Status |
| 1 | TBC | Define APARSEN's fundamental unit of preservation | Open |
| 2 | TBC | Rewrite D14.1, page 6, para 5 to articulate the approach taken more clearly | Open |
| 3 | TBC | Rewrite D14.1, page 6, para 6 to improve the description of user experiences | Open |
| 4 | TBC | Rewrite D14.1, page 6, para 7 to describe the methodology for the gap analysis | Open |
| 5 | TBC | Address the question: “How and when is the WP14 work intended to be used and by whom?” | Open |
| 6 | TBC | Define terms: test environment; testbed; tools and techniques for digital preservation; digital preservation system | Open |
| 7 | AshHunter | Create a glossary for the project | Done |
| 8 | TBC | Separate the review of prior work on testbeds from the commentary on the different types of digital preservation techniques | Open |
| 9 | TBC | Move citations out of the main body text and into the references section of the document | Open |
| 10 | DNB? | Create a testbed template and populate it with examples of testbeds from the information we already have | Open |
| 11 | ALL | Clarify the main aim of the evaluation framework and its target group | Open |
| 12 | ALL | Look at Yannis's paper and decide on its inclusion within D14.1 | Open |
| 13 | ALL | Define how a decision tree approach could be used to assist with the analysis of a DP user's needs | Open |