WP25 Interoperability and intelligibility: amending the DoW

The idea is to use this page to amend the text for the WP description in the DoW. Please edit this wiki page directly, and use the "insert" and "delete" mark-up to show changes, as illustrated:

Here is some unchanged text.
<ins>Here is some text to be inserted.</ins>
<del>Here is some text to be deleted.</del>
Here is some more unchanged text

Start month    End month    WP leader
20             33           FORTH

Objectives

Research and development of techniques to support interoperability of data between organisations and disciplines.

Background

According to the IEEE definition, interoperability refers to “the ability of two or more systems or components to exchange information and to use the information that has been exchanged”. Various aspects or layers of interoperability have been identified, mainly:

Syntactic interoperability. If two or more systems are capable of communicating and exchanging data, they exhibit syntactic interoperability, which is a prerequisite for any attempt at further interoperability. Specified data formats, communication protocols and the like are fundamental. For instance, the XML and SQL standards provide syntactic interoperability. This is also true for lower-level data formats, such as ensuring that alphabetical characters are stored in ASCII format in both of the communicating systems.

Semantic interoperability. Beyond the ability of two or more computer systems to exchange information, semantic interoperability is the ability to automatically interpret the information exchanged meaningfully and accurately in order to produce useful results as defined by the end users of both systems. To achieve semantic interoperability, both sides must defer to a common information exchange reference model. The content of the information exchange requests should be unambiguously defined: what is sent is the same as what is understood.

Focus of this WP

Digital preservation has been termed “interoperability with the future”. Regarding syntactic interoperability, special attention will be dedicated to the metadata and standard protocols in the sector, with specific reference to the analysis of significant properties according to the OAIS model. Case studies will be developed in specific domains (such as the Italian universities’ networks, where interoperability services based on a syntactic framework will be planned, also with reference to preservation issues). Regarding semantic interoperability, we will address techniques and issues related to the use of ontologies to identify and qualify information sources, including (a) character set or representation, (b) language interoperability, and the issues described within Task 2530.

Furthermore, we will also investigate collaborative methods for tackling such issues.

We should mention, however, that the crux of the interoperability problem is that digital objects and services have various dependencies (syntactic, semantic, etc.). We cannot achieve interoperability when the involved parties are not aware of the dependencies of the exchanged artefacts. One general approach to tackling this problem is standardization. From the dependency point of view, standardization essentially reduces the dependencies or makes them widely known (it does not eliminate them). Apart from developing standards, it is worth investigating more flexible methods for tackling the interoperability problem. An emerging question is whether we could tackle the interoperability problem without having to rely on several, possibly discrepant, standards. It is worth investigating whether we could establish a protocol for aiding interoperability on the basis of explicit dependency management.

To facilitate practical interoperability we also need to share ideas and reach common views on the virtualisation of different types of data, particularly those outlined in the Warwick workshop.

Description of work and role of partners

Task 2510 Research and development of common services and models to support interoperability.

In this task we will gather the conceptual models, services and formats that are used by the partners to address concrete interoperability challenges in digital preservation and try to structure the complex landscape of interoperability models, virtualisation of data, management, storage, etc., in order to facilitate practical interoperability and to develop services and formats that tackle the identified discrepancies. This includes conceptual models for exchanging provenance metadata (e.g. CRM Digital and OPM). We will establish collaborations with relevant standardization bodies and stakeholder communities on new standards.

Task 2520 Intelligibility Modelling and Reasoning

There is a need for services that help archivists in checking whether the archived digital artefacts remain intelligible and functional, in identifying hazards and obsolescence risks, and in investigating the consequences of probable losses. To tackle these requirements, [48] [T, DEXA’07] showed how such services can be reduced to dependency management services, while [47] [TF, ECDL’07] extended that model with disjunctive dependencies. The central notions of these works are module, dependency and profile. In a nutshell, a module can be a software/hardware component, or even a knowledge base expressed either formally or informally, explicitly or tacitly, that we want to preserve. A module may require the availability of other modules in order to function, be understood or be managed (e.g. OAIS RepInfo). A profile is the set of modules that are assumed to be known (available or intelligible) by a user (or community of users), so it is an explicit representation of the concept of the OAIS KB. Based on this model, a number of services have been defined for checking whether a module is intelligible by a community, or for computing the intelligibility gap of a module. GapMgr is a system which has been developed based on this model, and has been applied in the context of the EU project CASPAR.
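The notions of module, dependency, profile and intelligibility gap described above can be illustrated with a minimal Python sketch. It is not the GapMgr implementation: the module names, the `DEPENDS_ON` graph and the function names are hypothetical, and the sketch assumes purely conjunctive dependencies and a profile that lists every known module explicitly.

```python
from typing import Dict, Set

# Hypothetical dependency graph: module -> modules it directly requires.
DEPENDS_ON: Dict[str, Set[str]] = {
    "data.fits": {"FITS standard", "FITS viewer"},
    "FITS viewer": {"Java VM"},
    "FITS standard": {"English"},
    "Java VM": set(),
    "English": set(),
}


def required_modules(module: str, deps: Dict[str, Set[str]]) -> Set[str]:
    """Return every module that `module` needs, directly or transitively."""
    seen: Set[str] = set()
    stack = [module]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, set()):
            if dep not in seen:  # also protects against cyclic dependencies
                seen.add(dep)
                stack.append(dep)
    return seen


def intelligibility_gap(module: str, profile: Set[str],
                        deps: Dict[str, Set[str]]) -> Set[str]:
    """Required modules that are missing from the community profile."""
    return required_modules(module, deps) - profile


def is_intelligible(module: str, profile: Set[str],
                    deps: Dict[str, Set[str]]) -> bool:
    """A module is intelligible when its gap is empty."""
    return not intelligibility_gap(module, profile, deps)
```

For a community whose profile is `{"English", "Java VM"}`, the gap of `data.fits` under these assumptions consists of the FITS standard and the FITS viewer.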

In the context of this NoE we will attempt to extend the framework of task-based dependencies. One promising direction is to found the extended framework on Horn rules. The proposed framework and methodology, apart from simplifying the disjunctive dependencies of [47] [TF, ECDL’07], is expected to be more expressive and flexible, as it will allow the various properties of dependencies (e.g. transitivity, symmetry) to be expressed straightforwardly. Subsequently we plan to elaborate on the inference services required for task performability, risk detection and computing intelligibility gaps. In addition we will evaluate various implementation approaches, e.g. implementations over ORDBMS (Datalog queries through recursive SQL) and the Semantic Web (ontologies and rules, e.g. SWRL). It is worth noting that, owing to disjunction, there is no unique way to fill an intelligibility gap. To tackle this problem we will elaborate on abductive reasoning techniques for computing intelligibility gaps.
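To illustrate why disjunction leaves more than one way to fill a gap, the following Python sketch enumerates the minimal sets of missing modules in the spirit of abduction. It is only an illustration under stated assumptions, not the framework proposed here: each module has a list of alternative dependency sets (playing the role of several Horn rules with the same head), and the `RULES` data and function names are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set

# Hypothetical rules: module -> alternative sets of modules, any one of which suffices.
RULES: Dict[str, List[FrozenSet[str]]] = {
    "report.doc": [frozenset({"MS Word"}), frozenset({"OpenOffice"})],
    "MS Word": [frozenset({"Windows"})],
    "OpenOffice": [frozenset({"Windows"}), frozenset({"Linux"})],
}


def candidate_gaps(module: str, profile: Set[str],
                   rules: Dict[str, List[FrozenSet[str]]],
                   _path: FrozenSet[str] = frozenset()) -> Set[FrozenSet[str]]:
    """All sets of missing modules whose addition makes `module` usable."""
    if module in _path:
        return set()  # cyclic route: nothing can be derived this way
    path = _path | {module}
    results: Set[FrozenSet[str]] = set()
    # A module without rules has a single, empty alternative.
    for alternative in rules.get(module, [frozenset()]):
        per_dependency = []
        for dep in sorted(alternative):
            if dep in profile:
                per_dependency.append({frozenset()})
            else:
                # The dependency itself is missing, plus whatever it needs.
                per_dependency.append(
                    {frozenset({dep}) | gap
                     for gap in candidate_gaps(dep, profile, rules, path)})
        # Pick one way of satisfying each dependency and merge the choices.
        for combo in product(*per_dependency):
            results.add(frozenset().union(*combo))
    return results


def minimal_gaps(module: str, profile: Set[str],
                 rules: Dict[str, List[FrozenSet[str]]]) -> Set[FrozenSet[str]]:
    """Keep only candidate gaps that have no strictly smaller candidate."""
    candidates = candidate_gaps(module, profile, rules)
    return {g for g in candidates if not any(o < g for o in candidates)}
```

With a profile containing only `Linux`, this sketch yields two incomparable minimal gaps for `report.doc`: acquiring OpenOffice, or acquiring both MS Word and Windows. Choosing between such alternatives (e.g. by cost or risk) is left open here.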

Task 2530 Semantic Interoperability and Scientific Data

Activities related to semantic interoperability, ontologies and knowledge bases have been growing in relevance within Earth Observation (EO) and other disciplines. Within the EO domain there is a clear need to cope with requirements ranging from knowledge capture (e.g. for the description of Ground Segment components), to support for semantic access to EO resources (e.g. for the identification of relevant EO products), to the identification of preservation attributes. Different information organisation techniques are employed (such as thesauri, ontologies and topic maps), and various thesauri / dictionaries have been developed by a number of institutions: the General Multilingual Environmental Thesaurus (GEMET) by the EEA, Wiktionary by the Wikimedia Foundation, Eurovoc by the EC Publications Office, and the Semantic Web for Earth and Environmental Terminology (SWEET) by NASA are some of the highly relevant European and international initiatives. To support semantic access to EO resources relevant for a particular application domain, we can identify suitable tools and information organisation techniques, but there are often barriers, arising for various reasons, which prevent the reuse of existing thesauri / dictionaries; this issue is exacerbated when preservation needs to be taken into account.

Within this task we will address the limitations and barriers, establishing a networking capability with the objective of overcoming them, taking into account a set of common high-level objectives and requirements to be agreed upon. Such high-level semantic interoperability objectives should permit application experts to easily identify within the archive the EO missions, sensors or products useful for their activity, using familiar semantic terms pertaining to their application domain, and to follow up and identify relevant preservation attributes. The baseline objectives to be given as input to the task will be agreed upon with experts via workshops and networking events. As seed elements for discussion we will use the following objectives: to permit an easy, semantic identification from non-EO domains of relevant EO resources and of their preservation attributes; to keep ontology and architecture as simple as possible; and to support multiple application domains and limit dependencies on evolution / changes, taking into account the long-lasting objective of long-term data preservation.

List of deliverables

  • D25.1 Interoperability Objectives and Approaches (M26)
  • D25.2 Interoperability strategies (M33)

Description of deliverables

D25.1) Interoperability Objectives and Approaches: This document will gather the interoperability objectives and guidelines agreed by experts and stakeholders from the participating communities. Subsequently it will propose interoperability services, standard models, formats, virtualisation models and interfaces, together with a useful map/framework/matrix to structure the complex ecosystem of interoperability issues in digital preservation, helping users and key stakeholders to solve their practical interoperability issues (i.e. to find suitable solutions) in different areas of digital preservation and for different interoperability objects. This deliverable will be the result of Tasks 2510 and 2530. [month 26]

D25.2) Interoperability strategies: This document will propose a methodology for capturing, modelling, managing and exploiting various interoperability dependencies. First it will describe methods for modelling tasks and their dependencies, which can be conjunctive or disjunctive in nature. Then it will elaborate on the reasoning services required and investigate technologies that can be used for realizing them. Since there may exist several ways to fill an intelligibility gap (if there are disjunctive dependencies), we will investigate which reasoning techniques (e.g. abductive reasoning) can be exploited. The document will also consider the results reported in D25.1. [month 33]


Further revisions in response to feedback from Project Officer 12th/14th December

Objectives

Research and development of techniques to support interoperability of data between organisations and disciplines.

Background

According to the IEEE definition, interoperability refers to “the ability of two or more systems or components to exchange information and to use the information that has been exchanged”. Various aspects or layers of interoperability have been identified, mainly:

Syntactic interoperability. If two or more systems are capable of communicating and exchanging data, they exhibit syntactic interoperability, which is a prerequisite for any attempt at further interoperability. Specified data formats, communication protocols and the like are fundamental. For instance, the XML and SQL standards provide syntactic interoperability. This is also true for lower-level data formats, such as ensuring that alphabetical characters are stored in ASCII format in both of the communicating systems.

Semantic interoperability. Beyond the ability of two or more computer systems to exchange information, semantic interoperability is the ability to automatically interpret the information exchanged meaningfully and accurately in order to produce useful results as defined by the end users of both systems. To achieve semantic interoperability, both sides must defer to a common information exchange reference model. The content of the information exchange requests should be unambiguously defined: what is sent is the same as what is understood.

Focus of this WP

Digital preservation has been termed “interoperability with the future”. This WP will elaborate on the interoperability problem, considering challenges, accomplishments and remaining gaps. In particular it aims at providing:

  • An overview of ongoing and past projects and initiatives on interoperability in different areas of digital preservation.
  • A description of the main interoperability scenarios and challenges encountered by partners and other stakeholders in their day-to-day activities, which serve to drive the definition of the main common interoperability objectives and guidelines for digital preservation.
  • An analysis of the current solutions adopted to enable semantic interoperability in the domain of Earth Science, building on the experience of one of the partners of WP25, i.e. ESA, which is actively involved in the Earth Observation Long Term Data Preservation (LTDP) programme to favour the set-up of a European Framework for the long-term preservation of Earth Science (ES) data.
  • An analysis of the key questions about global semantic interoperability in digital preservation enabled by the Semantic Web initiative and Linked Data, including an overview of the main strengths and weaknesses of the approach.
  • A broad matrix of models, standards and services for interoperability that cross the main areas of digital preservation which can be used as a tool to navigate the complex ecosystem of the current interoperability solutions, helping users and key stakeholders to solve their practical interoperability issues in different areas of digital preservation and for different interoperability objects.
  • A set of common interoperability objectives and guidelines to address the main interoperability challenges in digital preservation.
  • An analysis of the common interoperability objectives in terms of their dependencies, in order to define a methodology for modeling these dependencies and to enable services such as task performability checking, which in turn could reduce the human effort required for periodically checking or monitoring whether a task on an archived digital object or collection is performable.

Regarding syntactic interoperability, special attention will be dedicated to the metadata and standard protocols in the sector, with specific reference to the analysis of significant properties according to the OAIS model. Case studies will be developed in specific domains (such as the Italian universities’ networks, where interoperability services based on a syntactic framework will be planned, also with reference to preservation issues). Regarding semantic interoperability, we will address techniques and issues related to the use of ontologies to identify and qualify information sources, including (a) character set or representation, (b) language interoperability, and the issues described within Task 2530. Furthermore, we will also investigate collaborative methods for tackling such issues.

We should mention, however, that the crux of the interoperability problem is that digital objects and services have various dependencies (syntactic, semantic, etc.). We cannot achieve interoperability when the involved parties are not aware of the dependencies of the exchanged artefacts. One general approach to tackling this problem is standardization. From the dependency point of view, standardization essentially reduces the dependencies or makes them widely known (it does not eliminate them). Apart from developing standards, it is worth investigating more flexible methods for tackling the interoperability problem. An emerging question is whether we could tackle the interoperability problem without having to rely on several, possibly discrepant, standards. It is worth investigating whether we could establish a protocol for aiding interoperability on the basis of explicit dependency management.

To facilitate practical interoperability we also need to share ideas and reach common views on the virtualisation of different types of data, particularly those outlined in the Warwick workshop.

Description of work and role of partners

Task 2510 Research and development of common services and models to support interoperability.

In this task we will gather the conceptual models, services and formats that are used by the partners to address concrete interoperability challenges in digital preservation and try to structure the complex landscape of interoperability initiatives, models, virtualisation of data, and solutions for management, storage, etc., in order to facilitate practical interoperability, to identify future objectives, and to propose recommendations for filling the identified discrepancies and gaps. This includes conceptual models for exchanging provenance metadata (e.g. CRM Digital and OPM). We will establish collaborations with relevant standardization bodies and stakeholder communities on new standards. The identified gaps could be used as input for activities that could lead to new standards and interoperability services.

The specific objectives of this task are: 1) to describe ongoing and past projects and initiatives on interoperability in different areas of digital preservation and for different stakeholders and domains; 2) to gather models, standards and services adopted to address different digital preservation interoperability issues in order to provide a concrete tool to classify these solutions and allow a better understanding of this complex ecosystem; 3) to analyze some example interoperability scenarios and challenges encountered by partners and other stakeholders in the domain of digital preservation that serve to drive the definition of the main common interoperability objectives and guidelines for digital preservation; 4) to provide a set of recommendations to fill the gaps and discrepancies between the current situation and the future goals and objectives.

Task 2520 Intelligibility Modelling and Reasoning

Each interoperability objective/challenge, like those that will be collected in T2510, is a kind of demand for the performability of a particular task (or tasks). In this task (T2520) we will identify such tasks and reflect on their dependencies and on how these can be modelled. The objective is to propose a modelling approach that enables the desired reasoning, e.g. task performability checking, which in turn could greatly reduce the human effort required for periodically checking or monitoring whether a task on an archived digital object or collection is performable, and consequently whether an interoperability objective is achievable. Such services could also assist preservation planning, especially if converters and emulators can be modeled and exploited by the dependency services. Finally, we will propose technologies for implementing the proposed modeling approach, and we will report results and recommendations.

There is a need for services that help archivists in checking whether the archived digital artefacts remain intelligible and functional, in identifying hazards and obsolescence risks, and in investigating the consequences of probable losses. To tackle these requirements, [48] [T, DEXA’07] showed how such services can be reduced to dependency management services, while [47] [TF, ECDL’07] extended that model with disjunctive dependencies. The central notions of these works are module, dependency and profile. In a nutshell, a module can be a software/hardware component, or even a knowledge base expressed either formally or informally, explicitly or tacitly, that we want to preserve. A module may require the availability of other modules in order to function, be understood or be managed (e.g. OAIS RepInfo). A profile is the set of modules that are assumed to be known (available or intelligible) by a user (or community of users), so it is an explicit representation of the concept of the OAIS KB. Based on this model, a number of services have been defined for checking whether a module is intelligible by a community, or for computing the intelligibility gap of a module. GapMgr is a system which has been developed based on this model, and has been applied in the context of the EU project CASPAR.

In the context of this NoE we will attempt to extend the framework of task-based dependencies. One promising direction is to found the extended framework on Horn rules. The proposed framework and methodology, apart from simplifying the disjunctive dependencies of [47] [TF, ECDL’07], is expected to be more expressive and flexible, as it will allow the various properties of dependencies (e.g. transitivity, symmetry) to be expressed straightforwardly. Subsequently we plan to elaborate on the inference services required for task performability, risk detection and computing intelligibility gaps. In addition we will evaluate various implementation approaches, e.g. implementations over ORDBMS (Datalog queries through recursive SQL) and the Semantic Web (ontologies and rules, e.g. SWRL). It is worth noting that, owing to disjunction, there is no unique way to fill an intelligibility gap. To tackle this problem we will elaborate on abductive reasoning for computing intelligibility gaps.
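As a rough indication of what "Datalog queries through recursive SQL" could look like, the sketch below evaluates the transitive dependency rule with a recursive common table expression. It uses Python's built-in `sqlite3` module purely as a convenient stand-in for an ORDBMS; the table names (`depends`, `profile`), the query and the helper functions are assumptions made for illustration, and only conjunctive dependencies are covered.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_db(edges: Iterable[Tuple[str, str]],
             profile: Iterable[str]) -> sqlite3.Connection:
    """Create an in-memory database with hypothetical depends/profile tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE depends (module TEXT, requires TEXT)")
    conn.execute("CREATE TABLE profile (module TEXT)")
    conn.executemany("INSERT INTO depends VALUES (?, ?)", list(edges))
    conn.executemany("INSERT INTO profile VALUES (?)",
                     [(m,) for m in profile])
    return conn


# Datalog reading of the query:
#   needed(X) :- depends(target, X).
#   needed(Y) :- needed(X), depends(X, Y).
#   gap(X)    :- needed(X), not profile(X).
GAP_QUERY = """
WITH RECURSIVE needed(module) AS (
    SELECT requires FROM depends WHERE module = :target
    UNION
    SELECT d.requires
    FROM depends AS d JOIN needed AS n ON d.module = n.module
)
SELECT module FROM needed
WHERE module NOT IN (SELECT module FROM profile)
ORDER BY module
"""


def gap(conn: sqlite3.Connection, target: str) -> List[str]:
    """Modules required by `target` (transitively) that the profile lacks."""
    return [row[0] for row in conn.execute(GAP_QUERY, {"target": target})]
```

Using `UNION` rather than `UNION ALL` removes duplicates, so the recursion also terminates on cyclic dependency data. Disjunctive dependencies would need a richer encoding than this single edge table.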

Task 2530 Semantic Interoperability and Scientific Data

This task will serve two purposes:

  • to highlight the barriers which currently hamper semantic exploitation of scientific data, with a focus on the Earth sciences domain
  • to identify a number of solutions enabling semantic interoperability and exploitation of those data, promoted by the European Space Agency (ESA)

The results of this investigation will be collected in the internal project deliverable D25.3 "Semantic interoperability and scientific data, in the domain of Earth sciences" (ESA, M22). The overarching goal of D25.3 is to allow recommendations to be derived on exploiting semantic aspects of scientific data in support of their long-term preservation. D25.3 is conceived to provide D25.1 with a comprehensive description of the main interoperability issues of one specific scientific domain, and so to be instrumental in deriving the conclusions and recommendations of D25.1.

Activities related to semantic interoperability, ontologies and knowledge bases have been growing in relevance within Earth Observation (EO) and other disciplines. Within the EO domain there is a clear need to cope with requirements ranging from knowledge capture (e.g. for the description of Ground Segment components), to support for semantic access to EO resources (e.g. for the identification of relevant EO products), to the identification of preservation attributes. Different information organisation techniques are employed (such as thesauri, ontologies and topic maps), and various thesauri / dictionaries have been developed by a number of institutions: the General Multilingual Environmental Thesaurus (GEMET) by the EEA, Wiktionary by the Wikimedia Foundation, Eurovoc by the EC Publications Office, and the Semantic Web for Earth and Environmental Terminology (SWEET) by NASA are some of the highly relevant European and international initiatives. To support semantic access to EO resources relevant for a particular application domain, we can identify suitable tools and information organisation techniques, but there are often barriers, arising for various reasons, which prevent the reuse of existing thesauri / dictionaries; this issue is exacerbated when preservation needs to be taken into account.

Within this task we will address the limitations and barriers, establishing a networking capability with the objective of overcoming them, taking into account a set of common high-level objectives and requirements to be agreed upon. Such high-level semantic interoperability objectives should permit application experts to easily identify within the archive the EO missions, sensors or products useful for their activity, using familiar semantic terms pertaining to their application domain, and to follow up and identify relevant preservation attributes. The baseline objectives to be given as input to the task will be agreed upon with experts via workshops and networking events. As seed elements for discussion we will use the following objectives: to permit an easy, semantic identification from non-EO domains of relevant EO resources and of their preservation attributes; to keep ontology and architecture as simple as possible; and to support multiple application domains and limit dependencies on evolution / changes, taking into account the long-lasting objective of long-term data preservation.

List of deliverables

  • D25.1 Interoperability Objectives and Approaches (M26)
  • D25.2 Interoperability strategies (M33)

Description of deliverables

D25.1) Interoperability Objectives and Approaches: This document will gather the interoperability objectives and guidelines agreed by experts and stakeholders from the participating communities. Subsequently it will propose interoperability services, standard models, formats, virtualisation models and interfaces, together with a useful map/framework/matrix to structure the complex ecosystem of interoperability issues in digital preservation, helping users and key stakeholders to solve their practical interoperability issues (i.e. to find suitable solutions) in different areas of digital preservation and for different interoperability objects. This deliverable will be the result of Tasks 2510 and 2530. [month 26]

D25.2) Interoperability strategies: This deliverable will analyze the main intelligibility objectives (identified in D25.1) from a dependency point of view, in order to propose a modeling approach that can automate task-performability checking and thus assist usability and preservation planning. Specifically, it will propose a methodology for capturing, modelling, managing and exploiting various interoperability dependencies. First it will describe methods for modelling tasks and their dependencies, which can be conjunctive or disjunctive in nature. Then it will elaborate on the reasoning services required and investigate technologies that can be used for realizing them. Since there may exist several ways to fill an intelligibility gap (if there are disjunctive dependencies), we will investigate which reasoning techniques (e.g. abductive reasoning) can be exploited. [month 33]

Topic revision: r18 - 2013-01-06 - DavidGiaretta
 