|Version 10 (modified by heikki, 4 years ago)|
ebXML : Transforming ISO19139 metadata to ebRIM : issues in the specification
author: Heikki Doeleman
This page describes uncertainties arising from the obscurity of OGC 07-038, (mainly) section F.
The specification in OGC 07-038 section F of how to register ISO metadata in an ebRIM registry is rather obscure. Apart from a very loose use of language for specific technical concepts like XML 'elements' and 'attributes' (that document usually calls anything an 'attribute' or a 'property', regardless), several other things are unclear. This page lists our uncertainties about how to interpret that document.
- The object type of DataSet is defined in the Basic Extension package (OGC 07-144r2) as "urn:ogc:def:ebRIM-ObjectType:OGC:Dataset". The CIM spec (07-038) makes many references to DataSet, but with object type "urn:x-ogc:specification:csw-ebrim-cim:ObjectType:Dataset". This type is not defined in 07-038 (nor anywhere else that we are aware of). So which one should we use ? For now, I'm assuming the type from 07-144r2 is preferred, as it is actually defined, whereas the alternative type used in 07-038 is not.
Table F.2 describes the creation of a ResourceMetadata object.
- fileIdentifier : the table says it is not mapped, but "see Table F.1". Does this mean the information (already put in a MetadataInformation in table F.1) must be repeated in this ResourceMetadata ? In this case I opted for YES.
- language : the table says it is not mapped, but "see Table F.1". Does this mean the information (already put in a MetadataInformation in table F.1) must be repeated in this ResourceMetadata ? In this case I opted for YES.
- parentIdentifier : the table says it is not mapped, but "see Table F.1". Does this mean the information (already processed into an extra MetadataInformation in table F.1) must be repeated ? In this case I opted for NO, as there already is a parent MetadataInformation as per table F.1.
- identificationInfo : "In this profile, the cardinality of this property is restricted to 1..1 for the ISO 19139 metadata files stored in the ebRIM Repository." Very good, but what to do with perfectly valid ISO19139 documents that have more than 1 identificationInfo ? Then again, Table F.2 says to apply the Section F.3 transformation to "each instance of the property (identificationInfo)". For now, I'm applying it to each identificationInfo (for-each).
- Section F.3.1 distinguishes DataSet/DatasetCollection, Service and Application types of Information Resource by the value of 'hierarchyLevel'. In ISO19139 it is perfectly valid to have 0 or more than 1 hierarchyLevel. What to do in these cases ? For now, I'm just using the first one if there is more than one, and if there are none, I fall back to 'DataSet'.
- Section F.3.1 tries to describe cases for distinguishing DataSet from DatasetCollection when hierarchyLevel is 'dataset'. Says the spec: "in the case of an ISO 19139 compliant metadata record, the value of the MD_Metadata.hierarchyLevel property may serve as a discriminator since ISO 19139 extends the MD_ScopeCode codelist to add specific values for aggregation;" But alas, it is not defined exactly *which* codelist values map to an aggregate, and this is not obvious. I'm ignoring this for now.
- "The existence of an instance of MD_Metadata.referenceSystemInfo will possibly imply to create an instance of CitedItem along with an instance of the association Authority between IdentifiedItem and CitedItem."
*possibly* ? what is that supposed to mean ? I'm assuming : if there is an authority element in referenceSystemInfo.
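The interim choices for hierarchyLevel and identificationInfo described above can be sketched as follows. This is only a minimal illustration in Python, not the actual transformation code: the function names `pick_hierarchy_level` and `identification_infos` are made up for this example, the sample record is heavily simplified (it is not schema-valid), and using 'dataset' as the fallback code assumes that this is the hierarchyLevel value corresponding to the DataSet type.

```python
import xml.etree.ElementTree as ET

# ISO 19139 gmd namespace
NS = {"gmd": "http://www.isotc211.org/2005/gmd"}


def pick_hierarchy_level(md_metadata, default="dataset"):
    """Return the first hierarchyLevel code, or `default` if there is none.

    Hypothetical helper: the fallback value is an assumption, not something
    the spec prescribes.
    """
    codes = [
        code.get("codeListValue")
        for code in md_metadata.findall("gmd:hierarchyLevel/gmd:MD_ScopeCode", NS)
    ]
    codes = [c for c in codes if c]  # drop missing or empty code values
    return codes[0] if codes else default


def identification_infos(md_metadata):
    """Return every identificationInfo, so Section F.3 can be applied to each."""
    return md_metadata.findall("gmd:identificationInfo", NS)


# Simplified sample record with two hierarchyLevels and two identificationInfos.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:hierarchyLevel><gmd:MD_ScopeCode codeListValue="service"/></gmd:hierarchyLevel>
  <gmd:hierarchyLevel><gmd:MD_ScopeCode codeListValue="dataset"/></gmd:hierarchyLevel>
  <gmd:identificationInfo/>
  <gmd:identificationInfo/>
</gmd:MD_Metadata>"""

record = ET.fromstring(SAMPLE)
print(pick_hierarchy_level(record))       # first hierarchyLevel wins
print(len(identification_infos(record)))  # each one would be transformed
```

Taking the first hierarchyLevel is arbitrary; a record whose first value is not the most specific one would be typed differently than its author may have intended.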
- alternateTitle : this has cardinality 0..n, but this spec doesn't mention it. I'm assuming they mean to say "for each".
- date : must be mapped to <<slot>> created, <<slot>> modified or <<slot>> issued. The spec does not say *how* this must be mapped. I'm using 'creation', 'revision' and 'publication' from the codelists used in ISO.
- date : this has cardinality 1..n, but this spec doesn't mention it. I'm assuming they mean to say "for each".
- identifier.MD_Identifier.code : "Identifiers with no codespace do not carry sufficient information and are not mapped to externalIdentifier, for which the codespace is required." BUT MD_Identifier *never* has a codespace, per the XSD. Only RS_Identifier, which is in its substitution group, may have a codespace. I'm assuming they intended to say RS_Identifier.
- every time, a new Organization is created. This way, these Organizations are never re-used / shared between the data referring to them. That does not seem to make much sense to me.
- individualName : is ignored, but not "If needed". Well .. I'm ignoring it.
- organizationName : this must be organisationName (with 's') in ISO.
- organizationName : this is not a required element in ISO. What if it is absent ? The created Organization will be rather non-descript.
- about the CitedResponsibleParty Association : "The association Type has a set of subtypes operating to the same object types: PointOfCOntact, Author, Originator, Publisher." This is not true, no such subtypes are defined. From clues elsewhere in that document I take it this stuff is handled by classifying the association.
- the codelist values for gmd:role can be many other things than just 'pointOfContact', 'author', 'originator', or 'publisher'. If the role is not one of those 4, I ignore it, so no classification will be created. Does that make sense to you?
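As an illustration of two of the choices above (the date slots and the role classification), here is a minimal Python sketch. It is not the actual transformation code. The pairing creation/created, revision/modified, publication/issued is an assumption read off from the order in which they are listed above; the spec does not state it. The names `DATE_TYPE_TO_SLOT`, `CLASSIFIED_ROLES`, `date_slots` and `classification_for` are made up for this example, and the sample citation with its dates is invented.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Assumed pairing of ISO CI_DateTypeCode values with the ebRIM slot names.
DATE_TYPE_TO_SLOT = {
    "creation": "created",
    "revision": "modified",
    "publication": "issued",
}

# Roles for which a classification is created; every other role is skipped.
CLASSIFIED_ROLES = {"pointOfContact", "author", "originator", "publisher"}


def date_slots(citation):
    """Collect slot values for each gmd:date in a CI_Citation ("for each")."""
    slots = {}
    for ci_date in citation.findall("gmd:date/gmd:CI_Date", NS):
        code = ci_date.find("gmd:dateType/gmd:CI_DateTypeCode", NS)
        value = ci_date.find("gmd:date/gco:Date", NS)
        if value is None:
            value = ci_date.find("gmd:date/gco:DateTime", NS)
        if code is None or value is None or not value.text:
            continue  # incomplete date: nothing to map
        slot = DATE_TYPE_TO_SLOT.get(code.get("codeListValue"))
        if slot is not None:
            slots.setdefault(slot, []).append(value.text.strip())
    return slots


def classification_for(role):
    """Return the role used to classify the association, or None to skip it."""
    return role if role in CLASSIFIED_ROLES else None


# Invented sample citation with a creation date and a revision date.
SAMPLE = """<gmd:CI_Citation xmlns:gmd="http://www.isotc211.org/2005/gmd"
                              xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:date><gmd:CI_Date>
    <gmd:date><gco:Date>2008-01-15</gco:Date></gmd:date>
    <gmd:dateType><gmd:CI_DateTypeCode codeListValue="creation"/></gmd:dateType>
  </gmd:CI_Date></gmd:date>
  <gmd:date><gmd:CI_Date>
    <gmd:date><gco:Date>2009-06-01</gco:Date></gmd:date>
    <gmd:dateType><gmd:CI_DateTypeCode codeListValue="revision"/></gmd:dateType>
  </gmd:CI_Date></gmd:date>
</gmd:CI_Citation>"""

citation = ET.fromstring(SAMPLE)
print(date_slots(citation))
print(classification_for("custodian"))  # not one of the 4 roles, so skipped
```

Each slot holds a list because date has cardinality 1..n, so nothing in ISO prevents a record from carrying, for example, several revision dates.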
TO BE CONTINUED