Most nation states have publicly supported research programmes, in the recognition that public sponsorship of research and development leads to wealth creation and an improved quality of life. Because public funding is involved, appropriate governance is needed, and the resulting information must be available to the public. Broadly, each nation state follows a similar research process: strategic planning; programme announcement; call for proposals; proposal evaluation and awarding; project result monitoring; and project result exploitation.
However, research is international. A research project in country A is likely to build on previous research in several other countries. Many research projects are now transnational: well-known examples include the human genome and climate change, but there are many others, especially where expensive infrastructure is involved, as in particle physics or space science. Furthermore, knowledge of the research activity in country A may influence the research strategy, including the priorities set and resources provided, in country B. There is thus a need to share research information across countries, and even between different funding agencies within the same country.
The information is used by researchers (to find partners, track competitors, and form collaborations); research managers (to assess performance and research outputs, and to find reviewers for research proposals); research strategists (to decide on priorities and resourcing compared with other countries); publication editors (to find reviewers and potential authors); intermediaries and brokers (to find research products and ideas that can be carried forward, via knowledge and technology transfer, to wealth creation); the media (to communicate the results of R&D in a socio-economic context); and the general public (for interest).
Within Europe this requirement has long been recognised. In the early 1980s the heads of some national research funding organisations initiated a project named IDEAS to investigate linking databases of research information. The follow-on project EXIRPTS (1987-1989) extended this, based on the G7 countries, to include the USA and Japan. Both projects successfully demonstrated: (a) resolution of schema differences between heterogeneous distributed databases; (b) translation and distribution of queries as subqueries to the different target systems; and (c) integration of the results of the distributed subqueries and their presentation back to the querier. The technology used a specially developed protocol run over email and file transfer; notably, this was well before the emergence of the World Wide Web.
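The three demonstrated steps can be illustrated with a minimal sketch. The site names, field names, and record format below are hypothetical, and the original systems exchanged messages over email and file transfer rather than function calls; this only shows the pattern of schema mapping, subquery distribution, and result integration.

```python
# Two national databases, each exposing records under its own local schema.
# (Hypothetical data: German field names at site A, English at site B.)
SITE_A = [{"titel": "Gene mapping", "land": "DE"}]
SITE_B = [{"title": "Gene sequencing", "country": "UK"}]

# (a) Resolve schema differences: map each local schema onto a common one.
SCHEMA_MAPS = {
    "site_a": {"titel": "title", "land": "country"},
    "site_b": {"title": "title", "country": "country"},
}

def to_common(site, record):
    """Rename a record's local field names to the shared schema."""
    return {SCHEMA_MAPS[site][k]: v for k, v in record.items()}

# (b) Translate the query into a subquery run against one target system.
def subquery(site, records, keyword):
    common = [to_common(site, r) for r in records]
    return [r for r in common if keyword.lower() in r["title"].lower()]

# (c) Integrate the distributed subquery results for the querier.
def federated_query(keyword):
    results = []
    for site, records in [("site_a", SITE_A), ("site_b", SITE_B)]:
        results.extend(subquery(site, records, keyword))
    return results

print(federated_query("gene"))
# [{'title': 'Gene mapping', 'country': 'DE'},
#  {'title': 'Gene sequencing', 'country': 'UK'}]
```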
It is from this work that CERIF emerged, first as a simple standard, not unlike a library catalogue card or the present Dublin Core metadata standard, intended as a data exchange format. It was based on records describing projects, with persons and organisational units as attributes. It was soon realised, however, that in practice this CERIF91 standard was inadequate: it was too rigid in format, did not handle repeating groups of information, was not multilingual or multi-character-set, and did not represent the universe of interest in a sufficiently rich way. A new group of experts was convened, and CERIF2000 was produced.
Its essential features are: (a) it has the concept of objects or entities with attributes, such as project, person, and organisational unit; (b) it supports n:m relationships between them (and recursively on any of them) using 'linking relations', thus providing rich semantics including roles and time; (c) it is fully internationalised in language and character set; (d) it is extensible without prejudicing the core data model, thus guaranteeing interoperability at least at the core level while not precluding even richer intercommunication. It is designed both for data exchange (data file transfer) and for heterogeneous distributed query/result environments. CERIF2004 brought minor improvements in consistency. CERIF2006 introduced substantial improvements to the model, in particular a so-called Semantic Layer, which makes the model flexible and scalable for application in very heterogeneous environments.
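The linking-relation pattern in (a) and (b) can be sketched as follows. The class and field names are illustrative only (CERIF itself defines these structures as relational tables, e.g. a project-person link table); the sketch shows how a separate link record lets one entity pair carry several roles over different time spans.

```python
from dataclasses import dataclass
from datetime import date

# Core entities: each is an object with its own attributes.
@dataclass(frozen=True)
class Project:
    id: str
    title: str

@dataclass(frozen=True)
class Person:
    id: str
    name: str

# A linking relation carries the semantics of the n:m relationship:
# which role the person plays on the project, and over what period.
@dataclass(frozen=True)
class ProjectPersonLink:
    project_id: str
    person_id: str
    role: str          # e.g. "coordinator", "researcher"
    start: date
    end: date

# One person may hold different roles on several projects (n:m);
# the same project-person pair could even be linked twice with
# different roles or periods.
links = [
    ProjectPersonLink("P1", "A1", "coordinator", date(2005, 1, 1), date(2006, 12, 31)),
    ProjectPersonLink("P1", "A2", "researcher", date(2005, 3, 1), date(2006, 12, 31)),
    ProjectPersonLink("P2", "A1", "researcher", date(2006, 1, 1), date(2007, 6, 30)),
]

def roles_of(person_id):
    """All (project, role) pairs a person holds, read from the links."""
    return [(l.project_id, l.role) for l in links if l.person_id == person_id]

print(roles_of("A1"))  # [('P1', 'coordinator'), ('P2', 'researcher')]
```

Because role and time live on the link rather than on either entity, the core entities stay stable while new relationship semantics can be added without changing them, which is the basis of the extensibility claimed in (d).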