
Version 31 (modified by simonp, 15 years ago)

Composed Metadata Records

Date 2009/09/01
Contact(s) Simon Pigot
Last edited Timestamp
Status draft, being discussed, in progress, early stage complete
Assigned to release 2.5
Resources Available for first stage

Overview

For GeoNetwork to become part of an institutional metadata and data management fabric, it must be able to compose metadata from content held in databases external to GeoNetwork. This proposal would add two new components to the kernel and harvester modules of GeoNetwork:

  • WFS fragment harvester - a harvester that can import metadata fragments (also known as subtemplates) from an external database with a WFS interface
  • XLink resolver and cache - metadata records can then be composed from a skeleton with links to fragments of metadata
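As a rough illustration of how a record might be composed from a skeleton, the following Python sketch replaces every child element that carries an xlink:href attribute with a fragment taken from an in-memory cache. It is not the GeoNetwork implementation: the function name resolve_xlinks, the dictionary used as the cache and the choice to replace the whole linking element are assumptions made for this example, and links with no cached fragment are simply left in place.

```python
import copy
import xml.etree.ElementTree as ET

# Clark-notation name of the xlink:href attribute.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def resolve_xlinks(skeleton, fragment_cache):
    """Return a copy of `skeleton` with linked child elements replaced.

    `fragment_cache` is a hypothetical dict mapping an href string to an
    already-parsed fragment element. The skeleton itself is not modified.
    """
    resolved = copy.deepcopy(skeleton)
    # Collect (parent, child) pairs first so the tree is not changed
    # while it is being iterated.
    linked = [
        (parent, child)
        for parent in resolved.iter()
        for child in list(parent)
        if XLINK_HREF in child.attrib
    ]
    for parent, child in linked:
        fragment = fragment_cache.get(child.get(XLINK_HREF))
        if fragment is None:
            continue  # assumption: unresolved links stay as they are
        index = list(parent).index(child)
        parent.remove(child)
        parent.insert(index, copy.deepcopy(fragment))
    return resolved


skeleton = ET.fromstring(
    '<record xmlns:xlink="http://www.w3.org/1999/xlink">'
    "<title>Sea surface temperature</title>"
    '<contact xlink:href="http://example.org/fragments/contact-1"/>'
    "</record>"
)
cache = {
    "http://example.org/fragments/contact-1": ET.fromstring(
        "<contact><name>Data Manager</name></contact>"
    )
}
composed = resolve_xlinks(skeleton, cache)
```

Because the skeleton is copied rather than modified, the stored record keeps its links and picks up a changed fragment the next time it is resolved.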

Proposal Type

  • Type: Core Change
  • App: GeoNetwork
  • Module: Harvester, Kernel, Data Manager, Jeeves
  • Subtemplates: Metadata fragments are equivalent to subtemplates, which were only partially implemented in GeoNetwork. Subtemplates appear to have been an extension of the template concept in GeoNetwork. Templates are complete metadata records with some elements filled in. A user can clone such a record for use in the editor as a template. Once the record is cloned, all connection between the template and the cloned record is broken, i.e. changes to the template are not visible in the cloned record and vice versa. Subtemplates (as they were partially implemented in GeoNetwork) took the template concept down to the level of individual elements in the metadata record. For example, contact information could be saved as a subtemplate and then reused when editing different elements. The implementation of subtemplates did not make clear whether the link between a subtemplate and the record it had been added to was maintained. This proposal intends to implement subtemplates as fragments of metadata harvested from an external database. The link between a metadata record and a fragment will be maintained, i.e. changes in the fragment will be visible in the record.
  • Composed, Componentized and Relational Metadata: Neither the idea of composed or componentized metadata nor the term "composed metadata" is new. Both appear in many discussions on the net and in the literature, and implementing the idea is probably an aim of many metadata tools. Another term with similar aims, which borrows the concepts of reuse, normalization and removal of redundancy from relational database terminology, is "relational" metadata (e.g. LISASoft metadata report). Although there has been discussion in and around these topics, and even some implementation of fragments in GeoNetwork as subtemplates, this proposal appears to be the first to suggest mechanisms for implementing these concepts in GeoNetwork using fragments harvested from a database with a WFS interface.

Voting History

  • Not voted on as yet.

Motivations

The motivation for this proposal comes from the need to fit GeoNetwork into organisations that already manage metadata in a number of different databases external to GeoNetwork.

Proposal

The two components to be added by this proposal (in more detail):

  • WFS fragment harvester - this is a harvester that accepts (along with the usual harvester parameters) a WFS GetFeature query, a template from which a metadata record will be created for each feature and linked to metadata fragments (more than one fragment can be returned from a feature), plus permissions and categories for the fragments and records created. Note: the template is only used to create records on the first run of the harvester. The WFS response to the GetFeature request is assumed to be in the form of fragments. The current implementation uses deegree WFS, where the transformation of the WFS response can be done by the server using XSLT. (See http://geonetwork.svn.sourceforge.net/viewvc/geonetwork/sandbox/BlueNetMEST/src/org/fao/geonet/kernel/harvest/harvester/metadatafragments/ - this will be renamed wfsmetadatafragments)
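The harvester behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the BlueNetMEST code: build_getfeature_url and harvest_run are hypothetical names, the query uses the standard WFS 1.1.0 key-value parameters (service, version, request, typeName, maxFeatures), the endpoint and feature type are made up, and the template is reduced to a string with a {feature_id} placeholder. Fetching and parsing the WFS response is left out.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url, type_name, max_features=None):
    """Build a WFS 1.1.0 GetFeature request URL in key-value form.

    Assumes `base_url` has no query string of its own.
    """
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
    }
    if max_features is not None:
        params["maxFeatures"] = str(max_features)
    return base_url + "?" + urlencode(params)


def harvest_run(features, template, records, fragments):
    """Apply one harvester run to the in-memory stores.

    `features` maps a feature id to a list of (fragment_id, fragment_xml)
    pairs, since one feature can return more than one fragment.
    """
    for feature_id, feature_fragments in features.items():
        # Fragments are refreshed on every run.
        for fragment_id, fragment_xml in feature_fragments:
            fragments[fragment_id] = fragment_xml
        # The template is only used the first time a feature is seen.
        if feature_id not in records:
            records[feature_id] = template.replace("{feature_id}", feature_id)


url = build_getfeature_url("http://example.org/wfs", "app:Contact", max_features=10)

records, fragments = {}, {}
harvest_run({"f1": [("frag-a", "<contact>v1</contact>")]},
            '<record id="{feature_id}"/>', records, fragments)
# A later run updates the fragment but leaves the existing record alone.
harvest_run({"f1": [("frag-a", "<contact>v2</contact>")]},
            '<other id="{feature_id}"/>', records, fragments)
```

Keeping records and fragments in separate stores mirrors the split between the skeleton record, created once, and the fragments, which change whenever the external database changes.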

Minor changes were required to the display of metadata fragments in the GeoNetwork editor so that they cannot be edited. Future work would add:

  • Support for metadata fragments through GeoServer community schemas, including how to configure the GeoServer WFS embedded with GeoNetwork to provide the WFS interface.
  • URN resolver to provide a level of indirection that can be used to cope with changing URLs and ensure referential integrity. Metadata fragments are linked into records using XLinks. XLinks can use URLs or URNs in the link attribute (xlink:href). URNs are intended to provide a permanent, location-independent identifier. A service that can register a URN with an associated URL, and look up the URL for a given URN, would allow the implementation to use URNs in place of URLs, thus providing a measure of control over the broken links and missing content that can occur when URLs are used directly.
  • Support for updating a fragment in GeoNetwork: sometimes it makes sense for a fragment to be edited and saved back into the external database from which it was harvested. WFS-T support would be used to provide this facility.
  • Change other suitable GeoNetwork harvesters (e.g. OGC WxS capabilities harvester) to harvest fragments rather than complete metadata records, using the same approach as the WFS fragment harvester.
  • Support in the editor for fragments: the original intention of subtemplates was that they be accessible from the editor, i.e. a user could select a fragment (e.g. contact info) to use when editing that portion of a metadata record. Some work appears to have already been done in the geocat.ch sandbox on this function.
  • Access to fragments by other editor tools: Other editing tools (e.g. the wizard based ANZMETLite tool) can use fragments in their interface to ease the metadata entry and editing process. Fragments harvested into GeoNetwork should be accessible to these tools.
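The URN resolver item in the list above amounts to a registry with two operations, register and lookup. The Python sketch below shows that level of indirection with an in-memory dictionary. The class name UrnResolver, its methods and the example URN and URLs are all assumptions for illustration; a real service would need persistent storage and a network interface.

```python
class UrnResolver:
    """Hypothetical in-memory registry mapping URNs to their current URLs."""

    def __init__(self):
        self._urls = {}

    def register(self, urn, url):
        """Register `urn`, or move it to a new `url` if already registered."""
        if not urn.lower().startswith("urn:"):
            raise ValueError("not a URN: " + urn)
        self._urls[urn] = url

    def lookup(self, urn):
        """Return the URL currently registered for `urn` (KeyError if none)."""
        return self._urls[urn]


resolver = UrnResolver()
resolver.register("urn:example:fragment:contact-1",
                  "http://old.example.org/fragments/contact-1")
# The fragment moves to a new server: only the registry entry changes,
# while records keep using the same URN in xlink:href.
resolver.register("urn:example:fragment:contact-1",
                  "http://new.example.org/fragments/contact-1")
current_url = resolver.lookup("urn:example:fragment:contact-1")
```

A lookup for an unregistered URN fails loudly here, which is one way a missing fragment could be detected rather than silently producing a broken link.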

Backwards Compatibility Issues

Metadata records as traditionally handled by GeoNetwork should not be affected by the addition of this feature.

Effects on harvesting: Composed metadata records can have their XLinks resolved before they are harvested, because a harvester usually updates the set of records harvested from a remote site on a regular basis. If there were a requirement for composed metadata records to be harvested with unresolved XLinks, an option could be added to the harvester to prevent XLink resolution before harvesting.

Effects on export (e.g. MEF): Composed metadata records exported as MEF files would normally have their XLinks resolved before export. If there were a requirement for composed metadata records to be exported with unresolved XLinks, an option could be added to the MEF export service to prevent XLink resolution before export.

XLinks, and metadata records composed from XLinked fragments, can be made optional through a system configuration option that turns this feature on or off. This has been implemented in the BlueNetMEST sandbox.

Risks

Some XLink concepts are open to a number of interpretations, e.g. the notion of a relative URL with a fragment identifier such as:

<gmd:temporalExtent xlink:href="#temporalExtent">

would (I think) be interpreted as a link to a fragment within the same document. From discussions with the deegree developers (who have an advanced XLink implementation in their WFS), it appears that some organisations interpret such a link as pointing to a fragment in any document within the local database. (reference required)
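The first interpretation, a link to a fragment within the same document, can be sketched in Python as below. This is only an illustration: the function name find_same_document_target is hypothetical, and it assumes the target element carries a plain, unqualified id attribute. An href with a non-empty URL part is treated as not being a same-document link.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag


def find_same_document_target(root, href):
    """Return the element in `root` that a bare '#id' href points to.

    Returns None when the href names another document, has no fragment
    identifier, or no element with a matching id attribute exists.
    """
    url, fragment = urldefrag(href)
    if url or not fragment:
        return None
    for element in root.iter():
        if element.get("id") == fragment:  # assumption: unqualified id
            return element
    return None


document = ET.fromstring(
    "<record>"
    '<extent id="temporalExtent"><begin>2009-01-01</begin></extent>'
    "<temporalExtentRef/>"
    "</record>"
)
target = find_same_document_target(document, "#temporalExtent")
```

Under the second interpretation, the same lookup would instead have to search every document in the local database, so the two readings can return different elements for the same href.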

Participants

  • Simon Pigot CSIRO Marine and Atmospheric Research, IMOS/eMii developers
  • geocat.ch developers - have similar requirements including support for updating metadata fragments and using fragments in the editor and have implemented XLinks and caching
  • URN resolver and GeoServer community schema support: AuScope/Spatial Information Services Stack (SISS) team
  • "Relational" metadata - LISASoft developers?
  • Others?