Monthly Archives: March 2013

Data Context – Towards Pragmatic Boundaries


On March 18, the 1st Plenary of the global Research Data Alliance (RDA) in Gothenburg was launched by Neelie Kroes, Vice-President of the European Commission. The three-day event brought together more than 200 ‘data advocates’ from around the world, representing researchers, institutions, governments and industry, to discuss possible next steps for collaboration and joint work via working groups or interest groups, in the spirit of openness, sharing and re-use at the intersection of culture and science.

The procedure for setting up RDA working groups and interest groups has been introduced, but is still considered a work in progress. At the launch event’s Agora session, group proposals were presented, and discussions continued within the so-called Birds of a Feather sessions. Besides two formal working groups on “Persistent Identifiers” and “Data Type Registries”, mature case statements were introduced for “Data Foundations and Terminology”, “Practical Policy”, “Legal Interoperability” and “Communities and Engagement”. Other presentations were still refining their objectives: “Metadata”, “Contextual Metadata”, “Repository Audit and Certification”, “Preservation e-Infrastructure”, “Marine Data Management”, “UPC Code” and “Data Citation”. New ideas also emerged: “Community Capability Model”, “Big Data Analysis Query”, “Worldwide PID”, “Data Publication”, “Architectural Data Interoperability” and “Industry and Health Informatics”.

While the different proposals indicate the wide range of topics, the implied challenges and the overlap between them, this post reports the outcome of discussions around “Contextual Metadata”. These started from four initial, rather underspecified use-cases (see also forum discussion), of which (1-3) were classified as managerial and (4) was meant to speak for the researcher:

  1. Output Reporting to Funders
  2. Exchange of Information on Research Activity between Universities
  3. Management of the Research Portfolio of a University by a Research Manager
  4. Discovery and Re-use of Datasets for other Purposes

The one-hour BoF discussion approached these use-cases with the aim of understanding a context by identifying its significant underlying entities and their relationships, working towards formal, standardised descriptions (i.e. implementations) and taking into account the material available and the expertise of people willing to engage. In other words, the aim was to put forward a formal Case Statement, grounded in use-cases, to establish an RDA Working Group.

Discussions revealed there is a lot of interest in the proposed group, which in the spirit of RDA will be renamed to “Data Context”. Furthermore, it was recommended that the proposed use-cases should be as specific as possible to ensure feasibility and delivery.

Pragmatic Boundaries


In the end, a new set of much more specific use-cases was agreed. These will be posted in the RDA forum for further public consultation and refinement. They are classified by anticipated perspective and listed in the order of the priorities discussed:

  • Researcher: Find data and supplementary information (e.g. services, reports, tools, news, photos, …) from different catalogues to support a case study around an event (e.g. Hurricane Katrina).
  • Managerial: Indicate to funders where currently funded research overlaps and where gaps exist. Funders want to learn from data centres and scientists whether programmes overlap, looking considerably wider than a single programme. Among other things, this implies understanding the semantics of geographic and temporal context.
  • Provenance: Allow segments to be taken from streamed data and workflows, e.g. for a social scientist reporting on the social aspects of a climatic event. For example, an agency publishes storm reports and maps, increasingly in public spaces such as Facebook or Twitter, where people wish to know where the data came from, who produced the image and who ran the processing job.
  • Interoperability: Exchange of contextual metadata between different systems.

Close collaboration is foreseen and has started with other working groups, especially the proposed RDA “Metadata” group, where interaction facilitators have been nominated to maintain the bridge. In addition, there was an agreement to exchange group members between the ICSU/World Data System’s group “Knowledge Network” and to explore potential collaboration opportunities with CODATA working groups. Available standardisation and harmonisation approaches such as those initiated by DCC, PROV, PREMIS, MARC, CKAN, DCAT, CERIF, CASRAI, VIVO, OAIS, APA, W3C, ISO, OMG, etc. will certainly guide development and implementation processes. A report is being prepared, slides will be uploaded and discussions will be continued in the RDA forum.

The RDA initiative was brought into existence by three initial research funding organisations:

  • The Australian Commonwealth Government through the Australian National Data Service supported by the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative
  • The European Commission through the iCordi project funded under the 7th Framework Program
  • The United States of America through the RDA/US activity funded by the National Science Foundation

Research Data Alliance on Twitter:  http://twitter.com/resdatall

CERIF 1.5 Reference Doc – XML API – GtR Hack Day

The Gateway to Research (GtR) project organised its first hack days to test its APIs. The two-day event was hosted by Aston University in Birmingham and convened about 20 invited people with backgrounds in system development. Documentation was published beforehand explaining in detail the GtR Application Programming Interface V1.0, which at the time of testing provided two APIs:

  • GtR API
  • CERIF API

Both produce output in XML and JSON formats. More APIs will be added, and existing APIs will change over time. Future updates will be announced on the GtR web portal. The hack event was considered a test drive, and further events are being considered for later this year. The GtR CERIF XML API received very positive feedback during the two-day event.

This post informs the wider community about the availability of a CERIF 1.5 Reference Document, the result of a few hours of collaborative work at the hack event by Chris Gutteridge and Brigitte Jörg. Existing CERIF model files were automatically transformed and merged into a more readable version of the CERIF 1.5 descriptions. The aim is to serve the developer community in particular, saving time when getting acquainted with the CERIF model, its structure and thus its mission.

The file transformation script developed by Chris Gutteridge is provided without restrictions and thus encourages re-use and adjustments for upcoming CERIF release updates.

Further event results will be reported at the GtR website.

The goal of GtR is to give the public better access to information on research funded by the Research Councils, such as who, what and where the Councils fund, and the outputs and outcomes, linking to available open access repositories and/or data catalogues. More information and discussion are available from the links below:

CERIF UK Coordination Meeting (preliminary summary)

On February 28th 2013 a CERIF UK Coordination Meeting was held at Prospero House in London, to identify the priorities for a feasible and sustainable CERIF coordination and implementation roadmap from ongoing activities in the UK Research Information Management (RIM) space. The current wider CERIF UK landscape is depicted as follows.

 

Wider CERIF UK Landscape

 

Before the meeting, an open spreadsheet was prepared listing the ongoing activities, the organisations engaged and the outputs available (reflected in the above image). The spreadsheet was meant as a starting point, is still open for extension and is not yet to be considered final: http://bit.ly/13PWJqq

Many UK HEIs now use CERIF-based systems and work continues to ensure that funders’ systems to collect information about research outputs can accept information from universities in this standard format. Emerging national infrastructures such as RCUK’s Gateway to Research are based on CERIF and a CERIF-XML interface for the REF submission system is being prepared for addition during the first half of 2013. These developments are increasingly complemented by international activities within CASRAI and VIVO.

However, consistent implementation and standards development require coordination. The meeting provided an overview of ongoing and past activities, the organisations engaged, and the resulting outputs and assets, to highlight the need for sustainability and to determine which of them require further development, maintenance and dissemination. Coordinated action is necessary to enable their consistent re-use and implementation. The meeting was an opportunity to identify the current priorities and issues, and the commitments that can be made for the next steps.

In the end, it was clear that coordination is work in its own right and requires human resources and organisations to support the effort, i.e. the work also needs to secure continued funding beyond June 2013.

A report and related material will be made available shortly.