Research Excellence Framework

REF Reporting Profile in CERIF XML (1.6) and Examples

In previous posts we introduced the mapping work to transform the REF XML Reporting Profile into CERIF XML (and vice versa).

After quite a journey, and some months later, we now publish the current CERIF XML files to share them with the community for further discussion, even though they are not as polished as initially planned. It is important to note that the files have not yet undergone final testing or evaluation. However, they are syntactically valid CERIF 1.6 XML and have been prepared thoroughly. To avoid further delay, and the risk that the files are never published and thus never used at all, we provide them as they are for continued improvement and further elaboration; this matters especially with respect to semantics.

We consider the files a very valuable contribution for the guidance of future CERIF activities. They do demonstrate the complexity imposed by a multitude of applicable vocabularies and show the need for contextual clarity when defining boundaries, aggregation and governance levels.

It has to be mentioned here that the “REF Reporting Profile in CERIF” was not built as a profile according to the REF Guidelines. Instead, it is a profile aimed at transforming a REF2014 XML file (which follows the REF Guidelines) into a CERIF XML file, with an awareness of the substantial underlying structural differences at both ends. This includes the fact that the data will finally have to be validated by the REF XML mechanism according to the guidelines (e.g. the length of a string or the cardinality of values). For this reason, a decision was taken to use the REF XML element names as identifiers for the CERIF vocabulary terms whenever possible, to keep the automated transformation script as simple as possible and to ensure that the corresponding elements, and hence terms, can be recognised (see the XML examples below). This also supports human understanding when examining the files. People familiar with CERIF will know that CERIF entities require quite a number of identifiers (often not human readable) to enable the interlinkage or aggregation of objects, which may indeed be a challenge for the human reader (please have a look at the comment column of the Excel Sheet).

To provide better access to the files, again for the human reader, the bulk reporting profile has been split into separate files.

Within the reporting files, the applied vocabulary terms (cfClassId) and their corresponding namespaces (cfClassSchemeId) are indicated by identifier references, whereas the controlled vocabulary (cfClassId/cfClassSchemeId) itself is maintained in the vocabulary file.
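
As a rough illustration of how such identifier references could be checked, the following Python sketch collects every cfClassId/cfClassSchemeId pair from a reporting fragment and lists those that are not found in a vocabulary. The KNOWN_TERMS set is a hand-written stand-in for the vocabulary file (whose actual layout is not shown here), the function name is hypothetical, and the term "Z" is deliberately made up to trigger a finding.

```python
import xml.etree.ElementTree as ET

CERIF_NS = "urn:xmlns:org:eurocris:cerif-1.6-2"

# Stand-in for the vocabulary file (assumption): (scheme id, term id) pairs.
KNOWN_TERMS = {
    ("ref-organisation-types", "institution"),
    ("ref-multiple-submissions", "A"),
    ("ref-organisation-categories", "unitOfAssessment"),
}

# Small reporting fragment; "Z" is an invented term that should not resolve.
REPORT = """
<CERIF xmlns="urn:xmlns:org:eurocris:cerif-1.6-2">
  <cfOrgUnit>
    <cfOrgUnitId>10006840</cfOrgUnitId>
    <cfOrgUnit_Class>
      <cfClassId>institution</cfClassId>
      <cfClassSchemeId>ref-organisation-types</cfClassSchemeId>
    </cfOrgUnit_Class>
    <cfOrgUnit_Class>
      <cfClassId>Z</cfClassId>
      <cfClassSchemeId>ref-multiple-submissions</cfClassSchemeId>
    </cfOrgUnit_Class>
  </cfOrgUnit>
</CERIF>
"""

def unresolved_terms(xml_text, known_terms):
    """Return (scheme id, term id) pairs used in xml_text but absent from known_terms."""
    root = ET.fromstring(xml_text)
    class_tag = f"{{{CERIF_NS}}}cfClassId"
    scheme_tag = f"{{{CERIF_NS}}}cfClassSchemeId"
    missing = []
    for parent in root.iter():
        # A reference is any element with both a cfClassId and a cfClassSchemeId child.
        term = parent.find(class_tag)
        scheme = parent.find(scheme_tag)
        if term is not None and scheme is not None:
            pair = (scheme.text.strip(), term.text.strip())
            if pair not in known_terms:
                missing.append(pair)
    return missing

print(unresolved_terms(REPORT, KNOWN_TERMS))
```

A check of this kind only confirms that a reference resolves; it says nothing about whether the term is semantically appropriate in its context.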

For quick reference we also provide an Excel Sheet of the profile. Its xml2xml tab covers all the involved entities and fields and indicates the explained structure. Its vocabularies tab collects all controlled terms (and their identifiers) except for those which are expected to be provided by the submitting institutions themselves (hence an ‘institution’ prefix in the cfClassificationSchemeId column of the Excel Sheet). Examples of relevant institutional vocabulary terms are available in the vocabulary file and should be retrievable via the cfClassSchemeId field and the prefix ‘institution’ instead of ‘ref’.

If submitted in pieces and not in one bulk file, each object has to a) identify the reporting institution by providing the UK Provider Reference Number (UKPRN), b) indicate multiple submissions, and c) refer to the REF’s Units of Assessment.

The following snippets from REF XML and REF in CERIF XML provide insight into inherent structural differences. The complexity increases (not shown in the snippets) with CERIF relationships and furthermore with multiple vocabularies and definitions for possible aggregations and objects at a given time:

A REF2014 XML Snippet


<ref2014Data xmlns="http://www.ref.ac.uk/schemas/ref2014data">
  <institution>10006840</institution>
  <submissions>
    <submission>
      <unitOfAssessment>9</unitOfAssessment>
      <multipleSubmission>A</multipleSubmission>
    </submission>
  </submissions>
</ref2014Data>

The corresponding CERIF XML Snippet

<!-- REF 2014 XML in CERIF -->
<CERIF xmlns="urn:xmlns:org:eurocris:cerif-1.6-2" xsi:schemaLocation="urn:xmlns:org:eurocris:cerif-1.6-2 http://www.eurocris.org/Uploads/Web%20pages/CERIF-1.6/CERIF_1.6_2.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" date="2013-09-22" sourceDatabase="REF Common Fields">
<!-- -->
<cfOrgUnit>
<!-- for the identification of a submission -->
     <cfOrgUnitId>10006840</cfOrgUnitId>  <!-- UKPRN number *mandatory* -->
     <cfOrgUnit_Class>
          <cfClassId>institution</cfClassId>
          <cfClassSchemeId>ref-organisation-types</cfClassSchemeId>
     </cfOrgUnit_Class>
     <cfOrgUnit_Class>
          <cfClassId>A</cfClassId>
          <cfClassSchemeId>ref-multiple-submissions</cfClassSchemeId>
     </cfOrgUnit_Class>
     <cfOrgUnit_OrgUnit>
          <cfOrgUnitId2>9</cfOrgUnitId2>
          <cfClassId>unitOfAssessment</cfClassId>
          <cfClassSchemeId>ref-organisation-categories</cfClassSchemeId>
     </cfOrgUnit_OrgUnit>
</cfOrgUnit>
<!-- -->
</CERIF>
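
To make the correspondence between the two snippets above more tangible, here is a minimal Python sketch that reads the REF2014 fragment and builds the matching cfOrgUnit structure. It is not the XSLT script mentioned in this post; the helper names are hypothetical, and creating one cfOrgUnit per submission element is an assumption made for illustration only.

```python
import xml.etree.ElementTree as ET

REF_NS = "http://www.ref.ac.uk/schemas/ref2014data"
CERIF_NS = "urn:xmlns:org:eurocris:cerif-1.6-2"

REF_XML = """
<ref2014Data xmlns="http://www.ref.ac.uk/schemas/ref2014data">
  <institution>10006840</institution>
  <submissions>
    <submission>
      <unitOfAssessment>9</unitOfAssessment>
      <multipleSubmission>A</multipleSubmission>
    </submission>
  </submissions>
</ref2014Data>
"""

def add(parent, name, text=None):
    """Append a CERIF-namespaced child element and return it."""
    child = ET.SubElement(parent, f"{{{CERIF_NS}}}{name}")
    child.text = text
    return child

def add_class(parent, link_name, class_id, scheme_id):
    """Append a classification link (term id plus scheme id) to parent."""
    link = add(parent, link_name)
    add(link, "cfClassId", class_id)
    add(link, "cfClassSchemeId", scheme_id)
    return link

def ref_to_cerif(ref_xml):
    """Map institution, unitOfAssessment and multipleSubmission to a cfOrgUnit."""
    ns = {"ref": REF_NS}
    ref = ET.fromstring(ref_xml)
    cerif = ET.Element(f"{{{CERIF_NS}}}CERIF")
    ukprn = ref.findtext("ref:institution", namespaces=ns)
    # Assumption: one cfOrgUnit per <submission>.
    for sub in ref.iterfind("ref:submissions/ref:submission", ns):
        org = add(cerif, "cfOrgUnit")
        add(org, "cfOrgUnitId", ukprn)  # UKPRN identifies the institution
        add_class(org, "cfOrgUnit_Class", "institution", "ref-organisation-types")
        multiple = sub.findtext("ref:multipleSubmission", namespaces=ns)
        if multiple:
            add_class(org, "cfOrgUnit_Class", multiple, "ref-multiple-submissions")
        link = add(org, "cfOrgUnit_OrgUnit")
        add(link, "cfOrgUnitId2", sub.findtext("ref:unitOfAssessment", namespaces=ns))
        add(link, "cfClassId", "unitOfAssessment")
        add(link, "cfClassSchemeId", "ref-organisation-categories")
    return cerif

# Serialise with CERIF as the default namespace (no prefix on element names).
ET.register_namespace("", CERIF_NS)
cerif_root = ref_to_cerif(REF_XML)
print(ET.tostring(cerif_root, encoding="unicode"))
```

The sketch leaves out the root attributes (date, sourceDatabase, schemaLocation) and any REF-side validation such as string lengths or cardinalities.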

Many thanks again to Gareth Edwards (HEFCE) who was very supportive in explaining the meaning behind fields and structures which initially were not entirely clear.

The files are now available for further testing. An XSLT Transformation Script to generate REF XML from CERIF XML is available upon request (it still needs some bug fixing). Depending on feedback and responses, we shall see how to proceed with it.

REF Reporting Profile in CERIF (Terms)

This post continues from the REF Reporting Profile in CERIF (Vocab). It elaborates on the employed vocabulary terms and their application within a context (following the underlying CERIF model). To indicate the employed CERIF entities, formal path mappings are presented. All defined terms in CERIF are formally recognised and thus replicated by their identifier, i.e. cfClassId value:

  • REF Multiple Submission Categories
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”multipleSubmissionA”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”multipleSubmissionB”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”multipleSubmissionC”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”multipleSubmissionD”
  • REF Organisation Types
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”institution”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”unitOfAssessment”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”group”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”employingOrganisation”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”publisher”
  • REF Organisation Categories
    cfOrgUnit.cfOrgUnit_OrgUnit.cfClassId=”UoA”
  • REF Action Types
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”actionUpdate”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”actionOverwrite”
    cfOrgUnit.cfOrgUnit_Class.cfClassId=”actionDelete”
    cfPers.cfPers_Class.cfClassId=”actionUpdate”
    cfPers.cfPers_Class.cfClassId=”actionOverwrite”
    cfPers.cfPers_Class.cfClassId=”actionDelete”
    cfResPubl.cfResPubl_Class.cfClassId=”actionUpdate”
    cfResPubl.cfResPubl_Class.cfClassId=”actionOverwrite”
    cfResPubl.cfResPubl_Class.cfClassId=”actionDelete”
    cfResPat.cfResPat_Class.cfClassId=”actionUpdate”
    cfResPat.cfResPat_Class.cfClassId=”actionOverwrite”
    cfResPat.cfResPat_Class.cfClassId=”actionDelete”
    cfResProd.cfResProd_Class.cfClassId=”actionUpdate”
    cfResProd.cfResProd_Class.cfClassId=”actionOverwrite”
    cfResProd.cfResProd_Class.cfClassId=”actionDelete”
    cfEvent.cfEvent_Class.cfClassId=”actionUpdate”
    cfEvent.cfEvent_Class.cfClassId=”actionOverwrite”
    cfEvent.cfEvent_Class.cfClassId=”actionDelete”
    cfMeas.cfMeas_Class.cfClassId=”actionUpdate”
    cfMeas.cfMeas_Class.cfClassId=”actionOverwrite”
    cfMeas.cfMeas_Class.cfClassId=”actionDelete”
    cfSrv.cfSrv_Class.cfClassId=”actionUpdate”
    cfSrv.cfSrv_Class.cfClassId=”actionOverwrite”
    cfSrv.cfSrv_Class.cfClassId=”actionDelete”
  • REF Person Names Scheme
    cfPers.cfPersName_Pers.cfClassId=”surname”
    cfPers.cfPersName_Pers.cfClassId=”initials”
    cfPers.cfPersName_Pers.cfClassId=”name”
  • REF Identifier Types
    cfFedId.cfFedId_Class.cfClassId=”hesaStaffIdentifier”
    cfFedId.cfFedId_Class.cfClassId=”staffIdentifier”
    cfFedId.cfFedId_Class.cfClassId=”articleNumber”
    cfFedId.cfFedId_Class.cfClassId=”doi”
    cfFedId.cfFedId_Class.cfClassId=”patentNumber”
  • REF Staff Categories
    cfPers.cfPers_Class.cfClassId=”categoryA”
    cfPers.cfPers_Class.cfClassId=”categoryC”
  • REF Contract Types
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”contractedFte”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isResearchFellow”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isEarlyCareerResearcherFalse”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isEarlyCareerResearcherTrue”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isOnFixedTermContract”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isOnSecondment”
    cfOrgUnit.cfPers_OrgUnit.cfClassId=”isOnUnpaidLeave”
  • REF Staff Measurements
    cfPers.cfPers_Meas.cfClassId=”isNonUKBased”
    cfPers.cfPers_Meas.cfClassId=”earlyCareerStartDate”
    cfPers.cfPers_Meas.cfClassId=”totalPeriodOfAbsence”
    cfPers.cfPers_Meas.cfClassId=”numberOfQualifyingPeriods”
    cfPers.cfPers_Meas.cfClassId=”categoryCexplanatoryText”
    cfPers.cfPers_Meas.cfClassId=”complexOutputReduction”
    cfPers.cfPers_Meas.cfClassId=”circumstanceExplanation”
  • REF Staff Circumstance Identifiers
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier1”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier2”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier3”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier4”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier5”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier6”
    cfMeas.cfMeas_Class.cfClassId=”circumstanceIdentifier7”
    cfMeas.cfMeas_Class.cfClassId=”categoryCCircumstances”
  • REF Sensitivity States
    cfPers.cfPers_Class.cfClassId=”isSensitive”
    cfResPubl.cfResPubl_Class.cfClassId=”isSensitive”
    cfResProd.cfResProd_Class.cfClassId=”isSensitive”
    cfResPat.cfResPat_Class.cfClassId=”isSensitive”
    cfEvent.cfEvent_Class.cfClassId=”isSensitive”
  • REF Staff Research Groups
    cfPers.cfPers_OrgUnit.cfClassId=”group1”
    cfPers.cfPers_OrgUnit.cfClassId=”group2”
    cfPers.cfPers_OrgUnit.cfClassId=”group3”
    cfPers.cfPers_OrgUnit.cfClassId=”group4”
  • REF Output Categories
    cfResPubl.cfResPubl_Class.cfClassId=”isPendingPublication”
    cfResProd.cfResProd_Class.cfClassId=”isPendingPublication”
    cfResPat.cfResPat_Class.cfClassId=”isPendingPublication”
    cfEvent.cfEvent_Class.cfClassId=”isPendingPublication”
    cfResPubl.cfResPubl_Class.cfClassId=”isNonEnglishLanguage”
    cfResProd.cfResProd_Class.cfClassId=”isNonEnglishLanguage”
    cfResPat.cfResPat_Class.cfClassId=”isNonEnglishLanguage”
    cfEvent.cfEvent_Class.cfClassId=”isNonEnglishLanguage”
    cfResPubl.cfResPubl_Class.cfClassId=”isInterdisciplinary”
    cfResProd.cfResProd_Class.cfClassId=”isInterdisciplinary”
    cfResPat.cfResPat_Class.cfClassId=”isInterdisciplinary”
    cfEvent.cfEvent_Class.cfClassId=”isInterdisciplinary”
    cfResPubl.cfResPubl_Class.cfClassId=”isDuplicateOutput”
    cfResProd.cfResProd_Class.cfClassId=”isDuplicateOutput”
    cfResPat.cfResPat_Class.cfClassId=”isDuplicateOutput”
    cfEvent.cfEvent_Class.cfClassId=”isDuplicateOutput”
    cfResPubl.cfResPubl_Class.cfClassId=”isOutputCrossReferred”
    cfResProd.cfResProd_Class.cfClassId=”isOutputCrossReferred”
    cfResPat.cfResPat_Class.cfClassId=”isOutputCrossReferred”
    cfEvent.cfEvent_Class.cfClassId=”isOutputCrossReferred”
    cfResPubl.cfResPubl_Class.cfClassId=”proposedDoubleWeighting1”
    cfResProd.cfResProd_Class.cfClassId=”proposedDoubleWeighting1”
    cfResPat.cfResPat_Class.cfClassId=”proposedDoubleWeighting1”
    cfEvent.cfEvent_Class.cfClassId=”proposedDoubleWeighting1”
    cfResPubl.cfResPubl_Class.cfClassId=”proposedDoubleWeighting2”
    cfResProd.cfResProd_Class.cfClassId=”proposedDoubleWeighting2”
    cfResPat.cfResPat_Class.cfClassId=”proposedDoubleWeighting2”
    cfEvent.cfEvent_Class.cfClassId=”proposedDoubleWeighting2”
    cfResPubl.cfResPubl_Class.cfClassId=”reserveOutput1”
    cfResProd.cfResProd_Class.cfClassId=”reserveOutput1”
    cfResPat.cfResPat_Class.cfClassId=”reserveOutput1”
    cfEvent.cfEvent_Class.cfClassId=”reserveOutput1”
    cfResPubl.cfResPubl_Class.cfClassId=”reserveOutput2”
    cfResProd.cfResProd_Class.cfClassId=”reserveOutput2”
    cfResPat.cfResPat_Class.cfClassId=”reserveOutput2”
    cfEvent.cfEvent_Class.cfClassId=”reserveOutput2”
    cfResPubl.cfResPubl_Class.cfClassId=”reserveOutput3”
    cfResProd.cfResProd_Class.cfClassId=”reserveOutput3”
    cfResPat.cfResPat_Class.cfClassId=”reserveOutput3”
    cfEvent.cfEvent_Class.cfClassId=”reserveOutput3”
    cfResPubl.cfResPubl_Class.cfClassId=”reserveOutput4”
    cfResProd.cfResProd_Class.cfClassId=”reserveOutput4”
    cfResPat.cfResPat_Class.cfClassId=”reserveOutput4”
    cfEvent.cfEvent_Class.cfClassId=”reserveOutput4”
    cfResPubl.cfResPubl_Class.cfClassId=”hasConflictsOfInterests”
    cfResProd.cfResProd_Class.cfClassId=”hasConflictsOfInterests”
    cfResPat.cfResPat_Class.cfClassId=”hasConflictsOfInterests”
    cfEvent.cfEvent_Class.cfClassId=”hasConflictsOfInterests”
    cfMeas.cfMeas_Class.cfClassId=”outputNumber1”
    cfMeas.cfMeas_Class.cfClassId=”outputNumber2”
    cfMeas.cfMeas_Class.cfClassId=”outputNumber3”
    cfMeas.cfMeas_Class.cfClassId=”outputNumber4”
    cfResPubl.cfResPubl_OrgUnit.cfClassId=”crossReferToUoA”
    cfResProd.cfResProd_OrgUnit.cfClassId=”crossReferToUoA”
    cfResPat.cfResPat_OrgUnit.cfClassId=”crossReferToUoA”
    cfEvent.cfEvent_OrgUnit.cfClassId=”crossReferToUoA”
    cfResPubl.cfResPubl_OrgUnit.cfClassId=”publisher”
    cfResProd.cfResProd_OrgUnit.cfClassId=”publisher”
    cfResPat.cfResPat_OrgUnit.cfClassId=”publisher”
    cfEvent.cfEvent_OrgUnit.cfClassId=”publisher”
    cfResPubl.cfResPubl_ResPubl.cfClassId=”volume”
  • REF Output Types
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeA”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeB”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeC”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeR”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeD”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeE”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeU”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeL”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeP”
    cfEvent.cfEvent_Class.cfClassId=”outputTypeI”
    cfEvent.cfEvent_Class.cfClassId=”outputTypeM”
    cfResPat.cfResPat_Class.cfClassId=”outputTypeF”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeK”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeN”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeO”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeG”
    cfResPubl.cfResPubl_Class.cfClassId=”outputTypeH”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeQ”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeS”
    cfResProd.cfResProd_Class.cfClassId=”outputTypeT”
  • REF Output Measurements
    cfResPubl.cfResPubl_Meas.cfClassId=”numberOfAdditionalAuthors”
    cfResPubl.cfResPubl_Meas.cfClassId=”additionalInformation”
    cfResPubl.cfResPubl_Meas.cfClassId=”doubleWeightingStatement1”
    cfResPubl.cfResPubl_Meas.cfClassId=”doubleWeightingStatement2”
    cfResPubl.cfResPubl_Meas.cfClassId=”conflictedPanelMembers”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”researchGroup”
    cfPers.cfPers_ResPubl.cfClassId=”outputNumber1”
    cfPers.cfPers_ResPubl.cfClassId=”outputNumber2”
    cfPers.cfPers_ResPubl.cfClassId=”outputNumber3”
    cfPers.cfPers_ResPubl.cfClassId=”outputNumber4”
  • REF Organisation Measurements
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”degreesAwarded”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”researchDoctoralsAwarded”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”income2008”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”income2009”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”income2010”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”income2011”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”income2012”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”statement”
  • REF Income Source Identifiers
    cfMeas.cfMeas_Class.cfClassId=”income”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source1”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source2”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source3”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source4”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source5”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source6”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source7”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source8”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source9”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source10”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source11”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source12”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source13”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source14”
  • REF Income Kind Source Identifiers
    cfMeas.cfMeas_Class.cfClassId=”incomeKind”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source14”
    cfOrgUnit.cfOrgUnit_Meas.cfClassId=”source15”
  • REF Impact Statements
    cfSrv.cfSrv_Meas.cfClassId=”statement”
    cfSrv.cfSrv_Meas.cfClassId=”redactedStatement”
  • REF Environment Statements
    cfSrv.cfSrv_Meas.cfClassId=”statement”
    cfSrv.cfSrv_Meas.cfClassId=”redactedStatement”
  • REF Case Studies
    cfSrv.cfSrv_Meas.cfClassId=”caseStudy”
    cfSrv.cfSrv_Meas.cfClassId=”redactedCaseStudy”
  • REF Case Study Categories
    cfMeas.cfMeas_Class.cfClassId=”conflictedPanelMembers”
    cfMeas.cfMeas_Class.cfClassId=”isCaseStudyCrossReferred”
    cfMeas.cfOrgUnit_Meas.cfClassId=”crossReferToUoA”
  • REF Redaction Statuses
    cfMeas.cfMeas_Class.cfClassId=”notRedacted”
    cfMeas.cfMeas_Class.cfClassId=”requiresRedaction”
    cfMeas.cfMeas_Class.cfClassId=”notForPublication”
  • REF Case Study Contacts
    cfPers.cfPers_Meas.cfClassId=”contact1”
    cfPers.cfPers_Meas.cfClassId=”contact2”
    cfPers.cfPers_Meas.cfClassId=”contact3”
    cfPers.cfPers_Meas.cfClassId=”contact4”
    cfPers.cfPers_Meas.cfClassId=”contact5”
    cfPers.cfPers_EAddr.cfClassId=”emailAddress”
    cfPers.cfPers_EAddr.cfClassId=”alternateEmailAddress”
    cfPers.cfPers_EAddr.cfClassId=”phone”
    cfOrgUnit.cfOrgUnit_Pers.cfClassId=”organisation”
    cfOrgUnit.cfOrgUnit_PAddr.cfClassId=”address”
  • REF Units of Assessment 
  • Institution’s Research Group Codes
  • Institution’s Geographic Boundings
  • Institution’s Media Relations
  • Institution’s Job Titles
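
The path mappings in the list above follow a regular pattern (entity, link entity, term identifier), so they can be processed mechanically. The following Python sketch, which is only an illustration and not part of the profile, parses a handful of mappings copied from the list and groups them by term to show the contexts in which a term is applied. The function name is hypothetical, and the typographic quotes have been normalised to ASCII quotes in the sample data.

```python
import re
from collections import defaultdict

# A few path mappings copied from the list above (quotes normalised to ASCII).
MAPPINGS = """
cfOrgUnit.cfOrgUnit_Class.cfClassId="actionUpdate"
cfPers.cfPers_Class.cfClassId="actionUpdate"
cfResPubl.cfResPubl_Class.cfClassId="actionUpdate"
cfPers.cfPers_Class.cfClassId="isSensitive"
cfResPubl.cfResPubl_Class.cfClassId="isSensitive"
cfFedId.cfFedId_Class.cfClassId="doi"
"""

# entity . link entity . cfClassId="term"
PATH = re.compile(r'^(\w+)\.(\w+)\.cfClassId="([^"]+)"$')

def contexts_by_term(text):
    """Map each term id to the (entity, link entity) pairs it is applied in."""
    contexts = defaultdict(list)
    for line in text.splitlines():
        match = PATH.match(line.strip())
        if match:
            entity, link, term = match.groups()
            contexts[term].append((entity, link))
    return dict(contexts)

for term, where in contexts_by_term(MAPPINGS).items():
    print(term, "->", where)
```

Grouping by term makes it visible that one identifier such as actionUpdate recurs across several entities, which is why the context (the link entity) matters when interpreting a term.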

While the provided list of terms gives an idea of the range of coverage, it mainly demonstrates the formal CERIF application of a term via cfClassId values within particular entities; each term naturally comes with a human-readable name or label that is additionally stored within the CERIF Semantic Layer. For the REF Reporting Profile in CERIF, we decided not to use UUIDs for the vocabulary identifiers, i.e. the cfClassIds. This decision was taken for reasons of simplicity with respect to mappings and for better human readability (most vocabulary terms can be recognised from their Ids and, if needed, UUIDs can still be assigned at a later point). The REF Reporting Profile in CERIF is based on HEFCE’s REF XML requirements (see the import example XML file), guided by the data requirements for REF import files, where e.g. the income sources, income kind sources or circumstances are explained, where data types are defined, and where the meaning of each field is explained in some detail. Similarly, the Units of Assessment codes (they are an exception because they are considered organisational units) and the Output Type codes as defined by HEFCE are consistently replicated within cfClassIds. In addition, each submitting institution has to submit its research group codes, applied geographic boundings, media relations and job titles with staff circumstances.

HEFCE’s REF XML file has a different structure from a CERIF XML file. The Ids of the presented vocabulary terms have been kept (because they are in use and will be used) in order to enable a simple and consistent mapping from HEFCE’s REF XML file to the REF Reporting Profile in CERIF. That is, the element names of HEFCE’s REF XML file are mostly replicated through the vocabulary terms’ Ids as employed in the REF Reporting Profile in CERIF.

The transformation or mapping will be presented in a separate post – it requires the definition of rules. The next post will support understanding and present some XML snippets of the REF Reporting Profile in CERIF.

See: REF Reporting Profile in CERIF (XML snippets)

REF Reporting Profile in CERIF (Vocab)

This post continues from the REF Reporting Profile in CERIF (Entities) and elaborates on the employed vocabularies. With CERIF all vocabularies are maintained in the so-called CERIF Semantic Layer, which allows for the maintenance of multiple vocabularies. Within the Semantic Layer, each term belongs to at least one particular vocabulary or so-called classification scheme. A classification scheme defines the range of terms and could in fact – in more technical terms – also be seen as a term’s namespace.
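
The namespace view of a classification scheme can be sketched in a few lines of Python: a term identifier is only unambiguous together with its scheme. In the profile, the term identifier “statement” is used by more than one scheme, which the example below mimics. The scheme identifiers used here are hypothetical placeholders, and the lookup structure is an illustration rather than the layout of the CERIF Semantic Layer.

```python
from collections import namedtuple

# A term is identified by its scheme (namespace) plus its own id.
Term = namedtuple("Term", ["scheme_id", "class_id"])

# Scheme ids are hypothetical placeholders; the values are the scheme names.
VOCABULARY = {
    Term("ref-impact-statements", "statement"): "REF Impact Statements",
    Term("ref-environment-statements", "statement"): "REF Environment Statements",
    Term("ref-impact-statements", "redactedStatement"): "REF Impact Statements",
}

def schemes_for(class_id, vocabulary):
    """Return the sorted ids of all schemes that define the given term id."""
    return sorted(term.scheme_id for term in vocabulary if term.class_id == class_id)

# The bare term id is ambiguous: it is defined in two schemes.
print(schemes_for("statement", VOCABULARY))

# The (scheme, term) pair is unique and can be used as a lookup key.
print(Term("ref-impact-statements", "statement") in VOCABULARY)
```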

For the REF Reporting Profile, the following vocabularies (classification schemes) have been defined:

  • REF Multiple Submission Categories
  • REF Organisation Types
  • REF Organisation Categories
  • REF Action Types
  • REF Person Names Scheme
  • REF Identifier Types
  • REF Staff Categories
  • REF Contract Types
  • REF Staff Measurements
  • REF Staff Circumstance Identifiers
  • REF Sensitivity States
  • REF Staff Research Groups
  • REF Output Categories
  • REF Output Types
  • REF Output Measurements
  • REF Organisation Measurements
  • REF Income Source Identifiers
  • REF Income Kind Source Identifiers
  • REF Impact Statements
  • REF Environment Statements
  • REF Case Studies
  • REF Case Study Categories
  • REF Redaction Statuses
  • REF Case Study Contacts
  • REF Units of Assessment
  • Institution’s Research Group Codes
  • Institution’s Geographic Boundings
  • Institution’s Media Relations
  • Institution’s Job Titles

Thus, each classification scheme from this list contains multiple vocabulary terms, e.g. Update, Overwrite, Delete with REF Action Types, or e.g. HESA Staff Identifier, Staff Identifier, Article Number, DOI, Patent Number with REF Identifier Types. The terms and the schemes have been defined following the REF Submission requirements.

The subsequent post will elaborate on the applied REF terms and their application with particular CERIF entities, that is, the context in which they occur following the CERIF model. For example, Update may be required with a staff record, i.e. it concerns a person (cfPers.cfPers_Class), or with an institution (cfOrgUnit.cfOrgUnit_Class), or with an output (cfResPubl.cfResPubl_Class), etc. As another example, the HESA Staff Identifier (a federated identifier) belongs to a person (cfPers.cfFedId.cfFedId), whereas the DOI (a federated identifier) comes with a publication (cfResPubl.cfFedId.cfFedId).

See: REF Reporting Profile in CERIF (Terms)

REF Reporting Profile in CERIF (Entities)

This post continues from the REF Reporting Profile in CERIF (Intro) and elaborates on the employed CERIF entities for the REF reporting format. Building a profile in CERIF requires at first the identification of the relevant entities and their relationships.

[Figure: Selected CERIF entities for the REF Reporting Profile]

The green squares and lines represent the CERIF REF Reporting Profile. In the formal CERIF model, relationships are also entities. They are called link entities, carry a semantically agnostic name (syntax) and link entities such as person with organisation (cfPerson_OrganisationUnit) or project (cfPerson_Project). In formal CERIF terms, each entity and each relationship has a defined name, i.e. syntax.

The formal definition of the REF Reporting Profile in CERIF requires an aggregation of the relevant entities to describe the concepts – these are as follows:

  • Institution: cfOrgUnit, cfOrgUnit_OrgUnit, cfOrgUnit_Class
  • Staff: cfPers, cfPersName, cfPersName_Pers, cfPers_Class, cfPers_OrgUnit, cfPers_ResPubl, cfPers_ResPat, cfPers_ResProd, cfPers_Event
  • Outputs: cfResPubl, cfResPubl_Class, cfResPubl_ResPubl
  • Outputs: cfResProd, cfResProd_Class, cfResProd_ResProd, cfResPubl_ResProd
  • Outputs: cfResPat, cfResPat_Class, cfResPat_ResPat, cfResPubl_ResPat
  • Outputs: cfEvent, cfEvent_Class, cfResPubl_Event
  • Environments, Circumstances, Impact, Case Studies: cfMeas, cfMeas_Class, cfSrv_Meas, cfOrgUnit_Meas, cfPers_Meas, cfResPubl_Meas, cfResProd_Meas, cfResPat_Meas
  • Submission Contacts: cfEAddr, cfPers_EAddr, cfOrgUnit_PAddr
  • Submission Contacts: cfPAddr, cfPers_PAddr, cfOrgUnit_PAddr
  • Federated Identifiers: cfFedId
  • Employed Vocabularies: cfClass, cfClassScheme

Once the profile’s entities have been selected, the next step is to employ vocabularies. The vocabulary terms allow for a meaningful labelling of the link entities, e.g. Manager in cfProj_Pers or cfPers_OrgUnit, or Author in cfPers_ResPubl.

See: REF Reporting Profile in CERIF (Vocab)

 

REF Reporting Profile in CERIF (Intro)

A REF Reporting Profile is available in CERIF-XML and is being prepared for addition to the REF submission system during the first half of 2013. The REF submission system will be opened from late January 2013 for institutions to start preparing their submissions to the REF 2014.

A multitude of ongoing related activities in the UK are in need of clarity over concepts and definitions (see also the results from the CiA workshop) that allow for a meaningful and formal description of the relevant aspects of the Research ecosystem. These activities are often concerned with Research information and data exchange and with achieving interoperability. This contribution introduces the concepts applied in reporting to the REF, hence the REF Reporting Profile, and explains how these REF reporting concepts have been formally described in CERIF-XML. The idea is to introduce the different employed modules stepwise and in the end aggregate them within one new post.

The REF requires reporting of information on behalf of institutions. One institution may have multiple submissions to different REF panels or so-called Units of Assessment. The REF reporting concepts are:

  • institution
  • group
  • staff
  • circumstances
  • outputs
  • environments
  • income
  • impact
  • case studies
  • submission contacts

The Research Excellence Framework (REF) is the new system for assessing the quality of research in UK higher education institutions (HEIs). It will replace the Research Assessment Exercise (RAE) and will be completed in 2014. UK Universities are currently very busy preparing their data for the January submission in different formats (e.g. xml, xsd, accdb, mdb).

On behalf of Jisc, this contribution resulted from a collaboration between Gareth Edwards (HEFCE) and Brigitte Jörg (UKOLN, euroCRIS). The former is very familiar with the REF submission system, concepts and requirements, and has been involved in multiple previous RAE submissions. The latter has led the CERIF task group at euroCRIS for many years, recently joined UKOLN in the role of the CERIF National Coordinator with the Jisc Innovation Support Center, and has been involved in various related UK projects such as R4R and MICE.

See: REF Reporting Profile in CERIF (Entities)