Changes between Version 19 and Version 20 of oaipmh_improvements


Timestamp: Jul 16, 2012, 5:48:33 PM
Author: simonp
Comment: --

  • oaipmh_improvements

    v19 v20  
     81  81
     82  82 * retrieves metadata objects with '''unresolved''' references to other objects
     83     * retrieves and adds one copy of all referenced metadata objects to the OAIPMH harvest results
         83 * adds one instance of all referenced metadata objects to the OAIPMH harvest results
     84  84 * extends the current OAIPMH implementation: the default behaviour will be to return resolved metadata records (i.e. the current implementation). Referencing the alternative OAIPMH service will deliver metadata objects.
     85  85
     86     The reason for implementing this extension is to enable conversions to formats that support metadata objects and relationships eg. ANDS RIF-CS.
         86 The reason for implementing this extension is to enable conversions to other metadata schemas that use metadata objects and relationships, e.g. ANDS RIF-CS. Note that these schemas must support a subset of the objects and relationships available in the ISO19115/19139 model.
     87  87
     88  88 ==== '''Deleted Records''' ====
     
    104 104 ==== Metadata Object Harvesting ====
    105 105
    106     Currently, !GeoNetwork will store the unresolved metadata record (ie. without linked fragments) in the database and resolve all links to fragments before returning the record as a search result. This proposal will implement a :
    107      * Implement Lucene search that retrieves metadata objects
    108        * First stage returns metadata objects with unresolved references to other external metadata objects. This is not difficult as metadata objects are stored with unresolved references in the database. Note: links to internal relationships (eg. repeated metadata in the same record) will still need to be resolved.
    109        * Since XLinks and parent-child + sibling relationships are or can be indexed in Lucene, a second query will be generated to collect all related metadata objects and add those to the search results.
    110      * As RIF-CS is a metadata object standard, we can test conversion of the ISO metadata objects returned by the two stage search to the RIF-CS schema. To implement this we will need to adapt the RIF-CS XSLTs for iso19139.mcp, eml-gbif and iso19139.anzlic to handle:
        106 Currently, !GeoNetwork stores the unresolved metadata record (i.e. without linked fragments) in the database. All links to fragments are resolved before indexing the record in Lucene or returning the record as a search result. To support retrieving metadata objects, one of two approaches could be used:
        107  - Use a Lucene search to retrieve records that match the OAIPMH query, then process the search results to add any referenced metadata objects of interest
        108    * The first search returns the UUIDs of metadata records that match the OAIPMH query.
        109    * Since XLinks and parent-child + sibling relationships are indexed in Lucene, the search results can be processed to collect all referenced metadata objects and add their UUIDs to the search results.
        110    * Each object can then be retrieved from the database by UUID in '''unresolved''' form and returned as a result.
        111  - The second approach would require metadata records and the objects referenced from each record to be stored as a document block in Lucene (using Lucene 3.6). The OAIPMH query would search on the metadata record, but the results returned would include the referenced objects from the document block. A possible advantage over the first approach is that the search results would not have to be processed before they are returned, so this approach is likely to be considerably faster. A disadvantage is that the Lucene indexing process in GeoNetwork would need to be substantially modified.
        112
        113  - As RIF-CS is a metadata object standard, we can test conversion of the ISO metadata objects returned by the two-stage search to the RIF-CS schema. To implement this we will need to adapt the RIF-CS XSLTs for iso19139.mcp, eml-gbif and iso19139.anzlic to handle:
    111 114    * metadata objects such as person and organisation contact information as parties
    112 115    * metadata records with a project hierarchy level as activities
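
The first approach described in v20 (search for matching UUIDs, collect the UUIDs of referenced objects from the index, then fetch each object by UUID in unresolved form) could look roughly like the sketch below. It is illustrative only and is not GeoNetwork code: `harvest_metadata_objects`, `references` and `database` are hypothetical names, with plain dictionaries standing in for the Lucene relationship index and the metadata table. Following references transitively is also an assumption, since the wiki text does not say how deep the collection goes.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def harvest_metadata_objects(
    matching_uuids: Iterable[str],
    references: Dict[str, Set[str]],
    database: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (uuid, unresolved_xml) pairs for the matching records
    plus one instance of every metadata object they reference."""
    ordered: List[str] = []
    seen: Set[str] = set()
    queue = deque(matching_uuids)  # step 1: UUIDs from the OAIPMH query
    while queue:
        uuid = queue.popleft()
        if uuid in seen:
            continue  # each object is added to the results only once
        seen.add(uuid)
        ordered.append(uuid)
        # step 2: look up referenced objects (XLinks, parent-child, siblings)
        # Assumption: references are followed transitively.
        queue.extend(sorted(references.get(uuid, set())))
    # step 3: fetch each object by UUID in its stored (unresolved) form
    return [(u, database[u]) for u in ordered if u in database]


# Hypothetical sample data: two records sharing a contact object.
references = {
    "record-1": {"contact-a", "org-b"},
    "record-2": {"contact-a"},
    "contact-a": {"org-b"},
}
database = {
    "record-1": "<record id='1'/>",
    "record-2": "<record id='2'/>",
    "contact-a": "<contact id='a'/>",
    "org-b": "<organisation id='b'/>",
}
result = harvest_metadata_objects(["record-1", "record-2"], references, database)
harvested_uuids = [uuid for uuid, _ in result]
```

Here `contact-a` and `org-b` each appear once in `result` even though they are referenced from more than one place, which matches the "one instance of all referenced metadata objects" requirement.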