This wiki page contains the key findings and underlying evidence (i.e. mappings to APARSEN topics/concepts as XLS) from D42.1. As indicated previously, we now encourage the participants of the WPs associated with the analyzed topics/concepts to examine the findings w.r.t. topical coverage of curricula/courses in Higher Education (Academia, Continuing Professional Education), assess the evidence, and provide us with possible interpretations in the comments field. Comments can be edited on the wiki page if necessary.


The key topic Trust is mainly represented by the themes "Provenance", "Audit and certification" and "Appraisal, selection criteria". Coverage of the themes within Trust varies widely. Data Quality reaches at most 11%. Authenticity and "Audit and certification" each show somewhat higher coverage in only one category, and "Appraisal, selection criteria" in at least two categories. Only Provenance reaches more than 20% in all three categories.
Overview Trust mapping (Link to XLS spreadsheet)


Authenticity

Authenticity shows notable coverage only in HE courses, with 19%. As a theme for CPE it is poorly represented at 5%, and it is also under-represented in the descriptions of our present CURR. It remains to be investigated whether Authenticity is too specific a concept, so that it is found only in detailed course descriptions and not at curriculum level.


Authenticity (and this is true for many other key terms, such as trust or provenance) cannot be considered on its own when evaluating or simply describing the courses. These terms are implied in the concept of digital preservation itself: you cannot teach digital preservation without concepts like trust, authenticity, provenance, interoperability, intelligibility. I think that it was a mistake not to use a standardized vocabulary for keywords. I do not know how to correct this problem at this point.

Provenance

The concept of provenance and its related terms appear most strongly in CURR, with almost 44%. In CPE and HE, with 32% and 22% respectively, it is among the most common themes in the Trust category. Given its strong significance for CURR and its high overall presence, a strong correlation between the conceptual context of provenance and the Trust topic can be observed.


Data Quality

The term Data Quality has rather minor topical coverage in our dataset. It occurs most often in HE courses (CPE 5%, CURR 6% and HE 11%). Thus, it does not appear to be a characterizing aspect of the core Trust topic.


Audit and certification

This theme has its highest coverage in CPE with 32%, so it seems to be especially relevant for professionals dealing with issues related to the Trust topic. Audit and certification was covered either as a direct term or indirectly, e.g., through "trusted repositories".


Appraisal, selection criteria

This theme is as well represented in CPE as "Audit and certification" (32%), but it is also present in HE with 22%. Selection and elimination processes dominate the theme in our dataset. The theme seems to be much more relevant for individual courses than for the depiction of the overall CURR. It is one of the three themes most often observed to correlate with the Trust topic.



Access is characterized by the three themes "Policies and governance", "IPR, access rights" and "Finding aids, metadata". "Persistent identifiers" is mentioned only marginally, and "Security" does not occur at all.
Overview Access mapping (Link to XLS spreadsheet)

Persistent identifiers

The theme is marginally represented, with 6% in CURR and 8% in HE courses. One possible interpretation of the low topical coverage is that issues related to the establishment/maintenance of persistent identifiers are of an organizational nature rather than technical/RTD issues. This and alternative interpretations will be discussed with WP22 Identifiers and citability.


Policies and governance

This theme is about equally represented in all three areas, at around 30% to 40%, and is covered by various terms, such as: preservation policies, principles and guidelines, legislation, corporate accountability. This leads to the assessment that the theme is relevant for CURR, HE, and CPE alike. This is also supported by its occurrence at course level in various offerings with different backgrounds.


IPR, access rights

Here we have high coverage of 50% and 43% in CURR and HE respectively. CPE, on the other hand, covers the theme less intensively, with 21%. The theme is likewise determined by various terms, such as: digital rights management, data protection, freedom of information, archive rules, information law, copyright law, practice for access, access models and licensing. Again, we assume this is a relatively general and important issue that is used in different topics and courses. However, it is more strongly represented in Academia than in CPE.


Finding aids, metadata

With an average of 56%, this theme is the most commonly mapped terminology in our analysis. This is due to the term "metadata", which is related to archiving and Digital Preservation in many different subjects. In two out of three CURR, contents are described using metadata. The theme is part of the knowledge base of the overall subject.



Usability is the main topic with the largest number of subsumed themes in use. All but one of these themes were reflected in the three categories of DP courses, mostly with high percentages.
Overview Usability mapping (Link to XLS spreadsheet)

File formats

There is a nearly uniform distribution of file formats across CURR, HE, and CPE. This indicates that the theme is generally perceived as relevant for all kinds of educational offerings. In every category, file formats reach a percentage of more than 40% (CPE 42%, CURR 44% and HE 51%). This could mean that file formats are an important issue for the Usability topic as a whole. Terms like migration, integration, standards and particularly metadata are the most frequently mentioned items in the different courses, probably because of their general importance for preservation. This could explain the similar distribution over all three groups of courses.


Context, semantics

Context and semantics are most important for the CPE sector, where we observe a coverage of 42%. In CURR and HE the figures are 33% and 38% respectively. Generally, context and semantics are observed to be strongly correlated with the Usability topic and can hence be considered highly relevant. Metadata and semantics are the terms occurring most often in this thematic field.



Interoperability

For the theme interoperability, three different patterns can be observed. The theme is not very relevant for CURR (only 17%), whereas it reaches 57% in HE and 37% in CPE. The correlation of interoperability with CURR is thus low, whereas it is high for individual courses. However, many of those individual courses are part of different CURR. The OAIS (information) model and standards in general are the course subjects most often observed.



Intelligibility

The theme intelligibility is not strongly represented in the different categories (5% in CPE courses, 11% in CURR and none in HE). Possible interpretations of the rare occurrence of intelligibility have to be discussed with WP25, e.g., coverage through the related theme of context and semantics, which is repeatedly found.


I agree with the interpretation. The reason why "intelligibility" does not occur very often in courses is that it is usually subsumed under the broader term "semantics", which in turn is related to "context" and "provenance".

Common tools

We see nearly equal percentages for the theme common tools in the categories CURR (44%) and HE (46%). In both categories the theme is represented through a number of different terms, all of which deal with various techniques and software tools. Examples of course and curriculum topics mentioned are: tools for managing records and information, software options, and developing new tools and products. In CPE this theme reaches only 21%.


Preservation planning

Preservation planning is very common in all three categories (CURR 50%, CPE 47% and HE 49%). This might be due to the fact that the theme is characterized by diverse terms and concepts, such as preservation analysis workflow, Digital Preservation preparation and requirements, significant properties, strategy, managing Digital Preservation, preservation strategies etc., which would explain the very high percentage in all three groups of courses. The theme preservation planning is hence also very relevant for the APARSEN topic Usability.



Infrastructure

For the theme infrastructure we see a relatively similar distribution across all three categories (CPE 32%, CURR 33% and HE 27%). Hence infrastructure appears to be relevant across all types of offerings and for the Usability topic in general. Terms like technical and organizational requirements, platforms, IT components and infrastructure itself are among the items mentioned in the different courses, probably because of their relevance for the planning, design and configuration of preservation concepts. This could also explain the relatively equal distribution of this theme across the different courses.



Sustainability includes seven themes, of which "Business cases" and "Processes" stand out. The theme "Brokerage services" does not occur at all.
Overview Sustainability mapping (Link to XLS spreadsheet)

Business cases

The theme "Business cases" is more strongly represented in academia than in CPE (CPE 21%, CURR 50% and HE 42%). This suggests that the theme provides a good characterization of the Sustainability topic in general. For example, the theme is reflected in business models, problems and action fields, legal and cost aspects, supply sources, and marketing. The low topical coverage in CPE needs to be further investigated with WP36, as it might indicate a deficiency in current offerings.


Cost/benefit analysis

The term cost/benefit analysis appears almost equally in all three segments (CPE 16%, CURR 17% and HE 22%). The dominant keyword is cost analysis. This theme, with its rather economic background, has a constant presence but is not a main concern.


Storage solutions

This theme is much less important in CPE than in the academic field (CPE 5%, CURR 17% and HE 16%). This is another indicator that DP courses for people in employment and for professionals need to be expanded.


Transfer of custody

The term is used rarely: 11% in CPE, 6% in CURR and 5% in the individual academic courses considered. It is mainly represented by aspects of rights management, which are found in the Access topic under "Policies and governance" or "IPR, access rights". The content is thus certainly relevant, but it is labelled differently and defines the Access area more than the Sustainability area.



Processes

The Processes theme seems to provide a good characterization of the Sustainability topic, as it is observed in an average of 44% of all offerings. It appeared about equally often in all three segments: 47% in CPE, 39% in CURR and 46% in HE. This might be due to the fact that the term Processes itself is quite broad, with its specific meaning diverging across sectors and domains. Specific occurrences of the Processes theme included, for example, acquisition processes, workflow, process analysis and process modeling. The lower percentage in CURR compared with the other sectors points to the varied use of the cross-cutting term "process" in different areas of specialization, and thus in individual courses in both the academic and the continuing education sector. Because of this broad spectrum, several "side issues" arise that can be assigned only to a limited extent, such as "migration, emulation, technology preservation" and "storing, encoding and transmitting data".


Risk management, threats

With 21% in CPE, 11% in HE and 0% in CURR, this theme is stronger in training than in the academic field, but it is thinly represented overall. Within our data set it is not used as a figurehead of a degree program; however, there are dedicated courses, especially those focusing on practical orientation and application, that deal with risk management in its entirety.



Taking the APARSEN core topics as the normative basis for what should be offered, eliciting and comprehensively discussing the actual reasons for and implications of the observed deficiencies in topical coverage was beyond the scope of this deliverable. Possible interpretations (e.g., "deemed to be not mature enough/irrelevant", "terminology mismatch", "conveyed need for offerings", ...) will be discussed in detail with the respective WP stakeholders and reflected in the design of the APARSEN curriculum.
In CPE, Trust is influenced by three themes ("Provenance", "Audit and certification" and "Appraisal, selection criteria") and reaches its highest coverage there, with 79%. In HE, Trust is determined by "Authenticity", "Provenance" and "Appraisal, selection criteria", and in CURR it is represented only by "Provenance".
"Policies and governance", "IPR, access rights" and "Finding aids, metadata" define the topic Access. Together with Usability, Access makes the largest contribution in the area of CURR.
With six out of seven themes, Usability provides the largest input and is the most frequent topic in all three segments (CPE, CURR, HE).
Sustainability is represented mainly by "Business cases" and "Processes". After Usability, Sustainability makes the next largest contribution to HE courses.
The four APARSEN core topics are covered in over 50% of the analyzed offerings across CURR, HE, and CPE, but in each case through a different number of sub-themes. Persistent identifiers, Security, Intelligibility and Brokerage services could rarely be mapped, and there was also little evidence of Transfer of custody and Data quality. As another example, topical coverage of Authenticity, "IPR, access rights", Interoperability, Common tools, Business cases, and Storage solutions was in general found to be higher in HE than in CPE.
Overall, topical coverage appeared to be relatively consistent across areas (CURR, HE, and CPE), countries, and languages. This supports the adequacy of the methodology established and gives some indication of the validity (i.e. critical mass reached through leading offerings) of the findings w.r.t. topical coverage.


