Version 5 (modified by simonp, 12 years ago)

Improvements to the OAIPMH Harvester

Date 2012-07-16
Contact(s) Simon Pigot
Last edited 2012-07-16
Status draft, being discussed
Assigned to release 2.9
Resources Available
Ticket # #XYZ

Overview

The OAIPMH harvester in GeoNetwork needs to be enhanced to support the following:

  • Object harvesting: GeoNetwork has recently moved from supporting ISO19115/19139 metadata in the form of a 'record' to supporting a tree-based hierarchy of ISO19115/19139 metadata 'objects'. The diagram below shows a typical hierarchy:

The mechanisms for handling these relationships are part of the ISO standard. They can be explicit in the form of an xlink that refers directly to the related metadata object or implicit by including the UUID of a related metadata object as content in an element. Here is an example of an explicit relationship between a metadata record and a fragment of contact information that it includes:

  <mcp:metadataContactInfo>
    <mcp:CI_Responsibility>
      <mcp:role>
        <gmd:CI_RoleCode codeList="..." codeListValue="custodian"/>
      </mcp:role>
      <mcp:party xlink:href="http://mygeonetwork.com/xml.metadata.get?uuid=urn:marine.csiro.au:marlin:person:28_person_organisation"/>
    </mcp:CI_Responsibility>
  </mcp:metadataContactInfo>

Here is an example of an implicit relationship where the UUID of the parent record in a parent-child relationship is held in the content of the parent identifier element:

    <gmd:parentIdentifier>
        <gco:CharacterString>urn:marine.csiro.au:project:187</gco:CharacterString>
    </gmd:parentIdentifier>

Here is another example of an implicit relationship, this time a sibling relationship between a dataset metadata record and a project metadata record (uuid: urn:marine.csiro.au:marlin:project:187). The UUID of the sibling record is held as a code in an identifier element:

    <gmd:aggregateInformation>
        <gmd:MD_AggregateInformation>
            <gmd:aggregateDataSetIdentifier>
                ...
                <gmd:MD_Identifier>
                    <gmd:code>
                        <gco:CharacterString>urn:marine.csiro.au:marlin:project:187</gco:CharacterString>
                    </gmd:code>
                </gmd:MD_Identifier>
                ...
            </gmd:aggregateDataSetIdentifier>
            <gmd:associationType>
                <gmd:DS_AssociationTypeCode codeList="..." codeListValue="crossReference">crossReference</gmd:DS_AssociationTypeCode>
            </gmd:associationType>
            <gmd:initiativeType>
                <gmd:DS_InitiativeTypeCode codeList="..." codeListValue="project">project</gmd:DS_InitiativeTypeCode>
            </gmd:initiativeType>
        </gmd:MD_AggregateInformation>
    </gmd:aggregateInformation>
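
The three kinds of reference shown above can be collected from a record with a few lines of code. The following is a minimal sketch in Python, not GeoNetwork code: the function name `find_references`, the `<record>` wrapper element and the `mcp` namespace URI in the sample are placeholders chosen for illustration. The `gmd`, `gco` and `xlink` namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def find_references(record):
    """Return (explicit hrefs, implicit UUIDs) found in one record element."""
    # Explicit references: any element carrying an xlink:href attribute.
    explicit = [el.get(XLINK_HREF) for el in record.iter() if el.get(XLINK_HREF)]

    # Implicit references: UUIDs held as element content.
    implicit = []
    paths = (
        ".//gmd:parentIdentifier/gco:CharacterString",
        ".//gmd:aggregateDataSetIdentifier//gmd:code/gco:CharacterString",
    )
    for path in paths:
        for el in record.findall(path, NS):
            if el.text and el.text.strip():
                implicit.append(el.text.strip())
    return explicit, implicit


# Sample built from the snippets above; <record> and the mcp URI are placeholders.
SAMPLE = """
<record xmlns:gmd="http://www.isotc211.org/2005/gmd"
        xmlns:gco="http://www.isotc211.org/2005/gco"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xmlns:mcp="urn:example:mcp-placeholder">
  <mcp:party xlink:href="http://mygeonetwork.com/xml.metadata.get?uuid=urn:marine.csiro.au:marlin:person:28_person_organisation"/>
  <gmd:parentIdentifier>
    <gco:CharacterString>urn:marine.csiro.au:project:187</gco:CharacterString>
  </gmd:parentIdentifier>
  <gmd:aggregateInformation>
    <gmd:MD_AggregateInformation>
      <gmd:aggregateDataSetIdentifier>
        <gmd:MD_Identifier>
          <gmd:code>
            <gco:CharacterString>urn:marine.csiro.au:marlin:project:187</gco:CharacterString>
          </gmd:code>
        </gmd:MD_Identifier>
      </gmd:aggregateDataSetIdentifier>
    </gmd:MD_AggregateInformation>
  </gmd:aggregateInformation>
</record>
"""

explicit_refs, implicit_refs = find_references(ET.fromstring(SAMPLE))
```

Explicit references come back as URLs and implicit ones as bare UUIDs, so a harvester would still need to map each form to a catalog lookup.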

OAIPMH and most other harvesters are record-based, i.e. it is expected or assumed that a harvest will retrieve one or more metadata records. GeoNetwork's OAIPMH server returns records by resolving xlink references to metadata objects. The resolve process:

  • finds the fragment of metadata referenced by the xlink (which could be local to the catalog or external to the catalog)
  • copies the fragment of metadata into the record

Metadata objects that are implicitly referenced as UUIDs in the content are not resolved.
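
The resolve process described above can be sketched as follows. This is an illustration only, not GeoNetwork's resolver: `resolve_xlinks` and the `fetch_fragment` callable are hypothetical names, and the choice to append the fragment as a child of the referencing element and drop the `xlink:href` attribute is an assumption about what "copies the fragment into the record" means.

```python
import copy
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def resolve_xlinks(record, fetch_fragment):
    """Return a copy of the record with xlink-referenced fragments copied in.

    fetch_fragment(href) is a hypothetical callable that returns an Element
    (from the local catalog or an external one), or None if nothing is found.
    """
    resolved = copy.deepcopy(record)  # leave the stored record untouched
    # Snapshot the elements first so copied-in fragments are not revisited.
    for el in list(resolved.iter()):
        href = el.get(XLINK_HREF)
        if href is None:
            continue
        fragment = fetch_fragment(href)
        if fragment is None:
            continue  # unresolvable link: keep the xlink as it is
        el.append(copy.deepcopy(fragment))
        del el.attrib[XLINK_HREF]
    return resolved
```

Implicit references such as the content of a parent identifier pass through unchanged, because the sketch only looks at `xlink:href` attributes.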

One of the goals of this proposal is to provide an alternative OAIPMH harvester service that:

  • retrieves metadata records with unresolved references
  • retrieves and adds all referenced metadata objects to the OAIPMH harvest results
  • extends the current OAIPMH implementation: the default behaviour will be to return resolved metadata records. Referencing the alternative OAIPMH service will deliver all metadata objects in unresolved form.
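
One way to picture the second goal is a traversal that starts from the harvested records and keeps following references until every referenced object is in the result set. The sketch below is an assumption about how that could work, not a description of the proposed implementation: `harvest_with_objects`, `get_unresolved` and `references_of` are hypothetical names.

```python
from collections import deque


def harvest_with_objects(seed_ids, get_unresolved, references_of):
    """Return {id: object} for the seed records plus everything they reference.

    get_unresolved(object_id) returns the object with its references left
    unresolved, or None if it is not available.
    references_of(obj) returns the ids that the object refers to.
    """
    results = {}
    queue = deque(seed_ids)
    while queue:
        object_id = queue.popleft()
        if object_id in results:
            continue  # already harvested; also guards against reference cycles
        obj = get_unresolved(object_id)
        if obj is None:
            continue  # referenced object not available from this catalog
        results[object_id] = obj
        queue.extend(references_of(obj))
    return results
```

Each object appears once in the results however many records refer to it, which is the main difference from the resolved form, where a shared fragment is copied into every record that uses it.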

The reason for implementing this extension is to enable easy conversion to

  • Deleted Records:

Proposal Type

  • Type: GUI Change, Core Change, Module Change, Guideline and project governance procedures, ...
  • App: GeoNetwork or !Intermap
  • Module: eg. Harvester, Kernel, Data Manager, Metadata Import, Lucene Index, Search Interface ...
  • Documents:
  • Email discussions:
  • Other wiki discussions:

Voting History

  • Vote proposed by X on Y, result was +/-n (m non-voting members).
