Batch Operation to Extract Subtemplates
Date | 2012/05/03 |
Contact(s) | Simon Pigot |
Last edited | 2012/05/03 |
Status | ready to commit |
Assigned to release | 2.7.x |
Resources | Available |
Ticket # | #878 |
Overview
With the addition of XLink processing, fragment harvesting, subtemplate support (a subtemplate is a fragment with an id in the Metadata table of the GeoNetwork database) and tools for managing directories of subtemplates, GeoNetwork can now begin to support reusable fragments of metadata linked into records. However, many sites have metadata records containing common fragments that they would like to extract into directories of subtemplates. This proposal adds a batch operation for admin users that extracts subtemplates from a selected set of records. Each extracted subtemplate is identified as follows: if its root element has a uuid attribute, that value becomes the uuid of the subtemplate; otherwise a uuid is obtained by calculating a checksum of its text content.
Proposal Type
- Type: New batch function for admin users
- App: GeoNetwork
- Module: Batch Operations
Voting History
- Proposed for voting on May 3, 2012
Motivations
Many sites have existing metadata records with common information, e.g. contact information in an ISO CI_ResponsibleParty element. With the addition of subtemplate support and maintenance functions to GeoNetwork, it should be possible to extract these fragments of metadata, remove duplicates and store them as subtemplates. This proposal describes a function that does this.
Proposal
This function works as follows:
- Identify the fragments of metadata that you would like to manage as reusable subtemplates. This can be done using an XPath, e.g. the XPath /grg:RE_Register/grg:containedItem/gnreg:RE_RegisterItem identifies register items in an ISO19135 register record, such as the record describing the ANZLIC Geographic Extent Names vocabulary shown in the following example:
<grg:containedItem>
  <gnreg:RE_RegisterItem gco:isoType="grg:RE_RegisterItem" uuid="da078149-ba39-4cb9-817d-7229e479243b">
    <grg:itemIdentifier>
      <gco:Integer>59</gco:Integer>
    </grg:itemIdentifier>
    <grg:name>
      <gco:CharacterString>AUSTRALIA EXCLUDING EXTERNAL TERRITORIES</gco:CharacterString>
    </grg:name>
    <grg:status>
      <grg:RE_ItemStatus>valid</grg:RE_ItemStatus>
    </grg:status>
    <grg:dateAccepted>
      <gco:Date>2006-10-10</gco:Date>
    </grg:dateAccepted>
    <grg:definition>
      <gco:CharacterString>AUSTRALIA EXCLUDING EXTERNAL TERRITORIES|-9|-44|154|112|Australia</gco:CharacterString>
    </grg:definition>
    .......
    <gnreg:itemExtent>
      <gmd:EX_Extent>
        <gmd:geographicElement>
          <gmd:EX_GeographicBoundingBox>
            <gmd:westBoundLongitude>
              <gco:Decimal>112</gco:Decimal>
            </gmd:westBoundLongitude>
            <gmd:eastBoundLongitude>
              <gco:Decimal>154</gco:Decimal>
            </gmd:eastBoundLongitude>
            <gmd:southBoundLatitude>
              <gco:Decimal>-44</gco:Decimal>
            </gmd:southBoundLatitude>
            <gmd:northBoundLatitude>
              <gco:Decimal>-9</gco:Decimal>
            </gmd:northBoundLatitude>
          </gmd:EX_GeographicBoundingBox>
        </gmd:geographicElement>
      </gmd:EX_Extent>
    </gnreg:itemExtent>
    <gnreg:itemIdentifier>
      <gco:CharacterString>http://www.ga.gov.au/anzmeta/gen/AUS</gco:CharacterString>
    </gnreg:itemIdentifier>
  </gnreg:RE_RegisterItem>
</grg:containedItem>
- Identify a field or fields within the fragment that will be used as the title of the subtemplate. It is important to choose a set of fields that allows a human to identify the subtemplate when they choose either to reuse it in a new record or to edit it in the subtemplate directories interface (see below).
- Write a small XSLT that when applied to the fragment, will extract the title information. As an example, here is an XSLT that when applied to an ISO19135 register item (gnreg:RE_RegisterItem), will extract the name of the register item (grg:name/gco:CharacterString) for use as the title of the subtemplate.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:gco="http://www.isotc211.org/2005/gco"
                xmlns:grg="http://www.isotc211.org/2005/grg"
                xmlns:gnreg="http://geonetwork-opensource.org/register"
                xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <xsl:template match="gnreg:RE_RegisterItem">
    <title><xsl:value-of select="grg:name/gco:CharacterString"/></title>
  </xsl:template>
</xsl:stylesheet>
- On the GeoNetwork main page, search for and then select the records from which the subtemplates will be extracted. See the following example:
- Choose 'Extract subtemplates' from the 'Actions on selected set' drop down menu at the top right of the search interface.
- Enter the XPath of the fragment, the server-side path name of the XSLT that extracts a title for each subtemplate created from the fragment, and a category to which the new subtemplates will be assigned. See the following example:
- Run the command and check the logs to see whether your XPath and title extraction XSLT are doing what you expect.
- Check the checkbox alongside 'I really want to do this!' when you're sure that everything is ok. The end result is that the fragments of metadata specified by the XPath are removed from the records in the selected set, saved as subtemplates and then linked into the records that use them. Here is what part of the register record example used here looks like with an XLink replacing the original gnreg:RE_RegisterItem:
<grg:containedItem>
  <gnreg:RE_RegisterItem gco:isoType="grg:RE_RegisterItem" xlink:href="http://localhost:8080/geonetwork/src/eng/?da078149-ba39-4cb9-817d-7229e479243b"/>
</grg:containedItem>
- Check the subtemplates created in the Administration->Manage Directories function. Here is an example of how this looks after we have extracted subtemplates from the ANZLIC Geographic Extent Names register record (presentation of this subtemplate needs a little improvement - but this is just to demonstrate how subtemplates can be produced from metadata records):
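The extract-and-link behaviour described in the steps above can be sketched outside GeoNetwork with Python's standard library. This is an illustration only, not the code in the attached patch: the function name extract_subtemplates and the LINK_BASE constant are hypothetical, the link prefix is simply copied from the XLink example above, and the title is read with a path rather than by running the title XSLT. Because ElementTree does not accept an absolute path on an element, the fragment path is given relative to the record root.

```python
import copy
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the title XSLT above; xlink is the standard W3C URI.
NS = {
    "gco": "http://www.isotc211.org/2005/gco",
    "grg": "http://www.isotc211.org/2005/grg",
    "gnreg": "http://geonetwork-opensource.org/register",
    "xlink": "http://www.w3.org/1999/xlink",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

XLINK_HREF = "{%s}href" % NS["xlink"]
# Hypothetical link prefix, copied from the example in this page.
LINK_BASE = "http://localhost:8080/geonetwork/src/eng/?"


def extract_subtemplates(record, fragment_path, title_path):
    """Return {uuid: (title, fragment copy)} and leave XLink stubs in the record."""
    extracted = {}
    for fragment in record.findall(fragment_path, NS):
        uuid = fragment.get("uuid")
        if uuid is None:
            continue  # the checksum fallback is left out of this sketch
        title = fragment.findtext(title_path, default="", namespaces=NS)
        extracted[uuid] = (title, copy.deepcopy(fragment))
        # Keep every attribute except uuid, drop the children, add the link.
        kept = {k: v for k, v in fragment.attrib.items() if k != "uuid"}
        fragment.clear()
        for key, value in kept.items():
            fragment.set(key, value)
        fragment.set(XLINK_HREF, LINK_BASE + uuid)
    return extracted
```

Calling extract_subtemplates(record, "grg:containedItem/gnreg:RE_RegisterItem", "grg:name/gco:CharacterString") on a parsed register record returns the fragments keyed by uuid and leaves each gnreg:RE_RegisterItem in the record as an empty element carrying only its remaining attributes and the xlink:href.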
Removing duplicates in the extraction process
As mentioned above, a subtemplate created by an extraction takes its uuid from the uuid attribute on its root element or, if that attribute doesn't exist, from a SHA-1 checksum calculated on the text content of the subtemplate.
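A minimal Python sketch of this identification rule follows. The function name subtemplate_uuid is hypothetical, and two details are assumptions because the proposal does not state them: the text content is taken as the concatenation of all text nodes without whitespace normalisation, and the checksum is used in its hexadecimal form.

```python
import hashlib
import xml.etree.ElementTree as ET


def subtemplate_uuid(fragment):
    """Return the uuid attribute of the fragment root, else a SHA-1 of its text."""
    explicit = fragment.get("uuid")
    if explicit:
        return explicit
    # Assumption: concatenate all text nodes as-is, without normalising whitespace.
    text = "".join(fragment.itertext())
    # Assumption: the hexadecimal digest is used directly as the identifier.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()
```

Under this rule, two fragments with identical text content and no uuid attribute receive the same identifier, which is what allows duplicates to collapse into a single subtemplate.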
The advantage of this procedure is that the metadata records can be preprocessed with a batch XSLT operation in GeoNetwork that calculates a uuid and stores it as an attribute, using rules appropriate to the site. For example, when extracting contact information as subtemplates, a site may decide that all fragments of contact information with the same organisation name should be linked to one subtemplate. To achieve this, a batch XSLT operation can be run before the subtemplate extraction to assign the same uuid to all CI_ResponsibleParty fragments with a common organisation name (e.g. by calculating a checksum or by using a lookup table).
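The preprocessing rule can be sketched as follows. The proposal performs this step as a batch XSLT operation inside GeoNetwork; the Python below only illustrates the checksum variant of the rule. The function name assign_contact_uuids is hypothetical, and treating organisation names as equal when they differ only in case or whitespace is an assumption a site may not share.

```python
import hashlib
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
NS = {"gmd": GMD, "gco": GCO}


def assign_contact_uuids(record):
    """Set a uuid attribute on each CI_ResponsibleParty, derived from its organisation name."""
    count = 0
    for party in record.iter("{%s}CI_ResponsibleParty" % GMD):
        name = party.findtext(
            "gmd:organisationName/gco:CharacterString", default="", namespaces=NS
        )
        # Assumption: names differing only in case or whitespace are the same organisation.
        key = " ".join(name.split()).lower()
        if not key:
            continue  # no organisation name, leave the fragment untouched
        party.set("uuid", hashlib.sha1(key.encode("utf-8")).hexdigest())
        count += 1
    return count
```

After this pass, every contact fragment with the same organisation name carries the same uuid attribute, so the extraction links them all to one subtemplate.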
Backwards Compatibility Issues
None?
Risks
More functions are needed to manage the links between records and subtemplates.
Participants
- Simon Pigot
Attachments (5)
- extract-subtemplates-selectedset-action.png (63.9 KB)
- extract-subtemplates.png (106.4 KB)
- manage-directories-window.png (101.4 KB)
- extract-subtemplates-testresults.png (120.9 KB)
- extractSubtemplates.patch (72.5 KB)