wiki:CSW202Improvements

Context

The current GeoNetwork CSW implementation has some known issues and missing functionality. The purpose of this document is to analyse the missing or incomplete elements in GeoNetwork against the CSW ISO 2.0.2 specification. The work is based on the current GeoNetwork trunk and focuses on applying patches to the GeoNetwork project and GeoSource.


OGC-CSW Operation Analysis

Reference document

GetCapabilities operation

  • Automatically added GetRecords search parameters in GetCapabilities operation
  • Get capability contact from catalog users, using admin page for configuration.
  • Automatically added keywords from catalog, ordered by frequency (number of keywords in configuration, default is 10).
  • Changed GetCapabilities properties for OGC Testsuite compliance
    • Added Constraint name="IsoProfiles"
    • Added gmd:MD_Metadata Typenames
    • Added support of sections parameter
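
The keyword ordering mentioned in the list above can be pictured with a minimal sketch. It assumes the catalogue keywords are already available as a flat list of strings; the function name `top_keywords` and the sample data are hypothetical and are not taken from the GeoNetwork code base.

```python
from collections import Counter

def top_keywords(keywords, limit=10):
    """Return the most frequent keywords, most frequent first.

    `limit` stands for the configurable number of keywords
    (10 by default according to the list above).
    """
    counts = Counter(k.strip() for k in keywords if k and k.strip())
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [keyword for keyword, _ in ranked[:limit]]

# Hypothetical sample data.
sample = ["africa", "water", "africa", "soil", "water", "africa"]
print(top_keywords(sample, limit=2))  # ['africa', 'water']
```

Breaking ties alphabetically is a choice made for this sketch only; the source does not say how equal frequencies are ordered.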

DescribeRecord operation

  • Fixed the DescribeRecord operation.
  • Made the OGC test suite pass: added three SchemaComponent elements, modified default results ...
  • Added profile support.

GetRecords operation

CSW Search parameters

Here is a summary of what should be achieved on the GeoNetwork trunk CSW in order to comply with the CSW 2.0.2 specification.

Issues
  • How to handle namespaces properly? GeoNetwork should be as flexible as possible while remaining compliant with the specification regarding namespace handling.
    1. remove namespaces from the mapping (dc, csw, apiso ...)
    2. map all properties in the apiso and dc namespaces (i.e. SupportedISOQueryables, SupportedDublinCoreQueryables)
    3. handle properties regardless of namespace (i.e. whichever prefix is provided: dc, csw, apiso ..., CSW should be able to perform the operation and handle the search parameters)
  • All GeoNetwork search parameters will be supported in CSW search (even if not listed in the GetCapabilities document). E.g. any = africa will work, and so will AnyText = africa.
  • Updated the CSW202 schema to make the validate option work.
  • Added missing elements in brief/summary/full responses and reordered misplaced elements.
  • Inconsistency concerning case sensitivity between:
    • Case sensitivity for common queryables must be the same as defined in the base specification (e.g. 'apiso:title') [§8.2.2.1.1, OGC 07-045, page 71], [§4.7.2 CSW spec / INSPIRE], [§7.2.4, OGC 07-045, page 46], [§3.2.2 CSW spec / INSPIRE]
    • Parameter names in all KVP encodings must be handled in a case-insensitive manner, while parameter values shall be handled in a case-sensitive manner [§4.2 CSW spec / INSPIRE].
  • Filter Encoding: difference between PropertyIsEqualTo and PropertyIsLike: the first has a "matchCase" attribute that defaults to true, whereas PropertyIsLike has no case constraint. This is mainly because the first is a BinaryComparisonOpType whereas the second is a PropertyIsLikeType.
  • Compound parameters:
    • Degree and Specification must be coupled because they are linked (compound) attributes. Do we need to create a coupled field name and handle rules in Java to perform an adequate search? The same problem exists for operatesOn (Identifier & Name).
    • Should we add all compound elements such as CRS, SpatialResolution ... as AdditionalQueryables, stored as concatenated fields in Lucene?
  • Spatial filter only applied to LatLongBoundingBox: not to any kind of gmd:extent?
  • OperatesOnWithOpName replaced by OperatesOnName (probably a typo in the OGC 07-045 example).
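
The namespace and KVP case rules listed above can be sketched as follows. This is only an illustration under stated assumptions: the `QUERYABLES` table is a hypothetical excerpt (the `fileId` and `any` field names come from this page, `title` is assumed), and the function names do not exist in GeoNetwork.

```python
def normalize_kvp(params):
    """Lower-case KVP parameter names; values are left untouched."""
    return {name.lower(): value for name, value in params.items()}

# Hypothetical excerpt of a queryable -> Lucene field mapping,
# keyed by local name so that the namespace prefix does not matter.
QUERYABLES = {"Title": "title", "AnyText": "any", "Identifier": "fileId"}

def resolve_queryable(name):
    """Resolve 'apiso:Title', 'dc:Title' or 'Title' to the same index field.

    Names that are not in the mapping are passed through unchanged, so
    native GeoNetwork search parameters such as 'any' keep working.
    """
    local_name = name.split(":", 1)[-1]
    return QUERYABLES.get(local_name, local_name)

print(normalize_kvp({"REQUEST": "GetRecords", "Service": "CSW"}))
print(resolve_queryable("apiso:Title"), resolve_queryable("any"))
```

The lookup stays case sensitive on the local name, which is one possible reading of the case-sensitivity item above; the source leaves that point open.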
Core queryable properties
  • Type : SupportedISOQueryables ??
Name | Missing | Must be modified | Comments
Subject | No | Yes | must be mapped to keyword AND topicCategory (only mapped to keyword for now)
Title | No | No | -
Abstract | No | No | -
AnyText | No | No | -
Format | No | No | -
Identifier | No | Yes | must be mapped to FileIdentifier (fileId in lucene index)
Modified | No | Yes | Modified is currently mapped to changeDate, which is only indexed for the dc schema in GeoNetwork. Modified must be mapped to dateStamp.Date or DateTime in the ISO schema ???
Type | No | No | -
WestBoundLongitude | Yes | - | -> westBL in lucene index, check spatial search in lucene since Jesse's work
SouthBoundLatitude | Yes | - | -> southBL in lucene index, check spatial search in lucene since Jesse's work
EastBoundLongitude | Yes | - | -> eastBL in lucene index, check spatial search in lucene since Jesse's work
NorthBoundLatitude | Yes | - | -> northBL in lucene index, check spatial search in lucene since Jesse's work
Authority | Yes | - | the crs field in lucene must be modified; it currently corresponds to the concatenation of codeSpace and code. Index fields authority, crsCode, crsVersion created. Should we add a CRS property to the CSW parameters ??? In that case, should we concatenate the 3 properties ??
ID | Yes | - | see Authority
Version | Yes | - | see Authority
Additional queryable properties
common additional queryable properties
  • Type : SupportedISOQueryables
Name | Missing | Must be modified | Comments
RevisionDate | Yes | - | -> revisionDate (lucene index)
AlternateTitle | No | No | -
CreationDate | No | No | -
PublicationDate | Yes | - | -> publicationDate (lucene index)
OrganisationName | No | No | -
HasSecurityConstraints | No | No | -
Language | No | No | -
ResourceIdentifier | No | Yes | currently dc:identifier in GeoNetwork
ParentIdentifier | No | No | -
KeywordType | No | No | -
additional queryable properties for dataset, datasetcollection, application
  • Type : SupportedISOQueryables
Name | Missing | Must be modified | Comments
TopicCategory | No | Yes | topicCat lucene index should map the codelist ??
ResourceLanguage | No | Yes | currently DatasetLanguage in GeoNetwork
GeographicDescriptionCode | No | No | -
Denominator | No | No | -
DistanceValue | No | No | -
DistanceUOM | No | No | -
TempExtent_begin | No | No | -
TempExtent_end | No | No | -
additional queryable properties for services
  • Type : SupportedISOQueryables
Name | Missing | Must be modified | Comments
ServiceType | Yes | - | should map a codelist? add serviceType to lucene index
ServiceTypeVersion | Yes | - | should also map a codelist? add serviceTypeVersion to lucene index
Operation | Yes | - | add operation to lucene index
GeographicDescriptionCode | No | No | see GeographicDescriptionCode for dataset
OperatesOn | Yes | - | add OperatesOn to lucene index; should map MD_DataIdentification.citation.CI_Citation.identifier, but in GeoNetwork/GeoSource the identifier is stored in the uuidref attribute of MD_DataIdentification
OperatesOnIdentifier | Yes | - | add OperatesOnIdentifier to lucene index
OperatesOnName | Yes | - | add OperatesOnName to lucene index
CouplingType | Yes | - | add CouplingType to lucene index; should map to a code tag under SV_CouplingType, but there is no code tag under GN ???
INSPIRE additional queryable properties
  • Type = AdditionalQueryables
Name | Missing | Must be modified | Comments
Degree | Yes | - | add Degree to lucene index
SpecificationTitle | Yes | - | add SpecificationTitle to lucene index
SpecificationDate | Yes | - | add SpecificationDate to lucene index
SpecificationDateType | Yes | - | add SpecificationDateType to lucene index
AccessConstraints | Yes | - | add AccessConstraints to lucene index
OtherConstraints | Yes | - | add OtherConstraints to lucene index
Classification | Yes | - | add Classification to lucene index
ConditionApplyingToAccessAndUse | Yes | - | add ConditionApplyingToAccessAndUse to lucene index
MetadataPointOfContact | Yes | - | should it include an email contact ??
Lineage | Yes | - | add Lineage to lucene index
Spatial Request & Filter Encoding
  • Added support for spatial request.
CQL
  • Migrate from zing Parser to GeoTools CQL parser.

GetRecordById

GetDomain

  • Supported ParameterName and PropertyName.
  • Handle list of values and range of values for PropertyName.
  • Display sorted list of values.
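
As a rough illustration of the two GetDomain answer shapes listed above (a list of values or a range of values), the sketch below works on values already read from the index. The function name, the `as_range` switch and the `min`/`max` keys are assumptions made for this example; the source does not say when a range is returned instead of a list.

```python
def domain_values(values, as_range=False):
    """Return the distinct values of a property, sorted.

    With as_range=True only the lower and upper bounds are returned,
    which mimics a range-of-values answer.
    """
    distinct = sorted(set(values))
    if as_range and distinct:
        return {"min": distinct[0], "max": distinct[-1]}
    return distinct

print(domain_values(["dataset", "service", "dataset"]))
print(domain_values([50000, 10000, 25000], as_range=True))
```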

Harvest

  • Asynchronous mode only.

The main differences between the CSW Harvest operation and the GeoNetwork harvesting configuration are that, with CSW Harvest:

  • search criteria cannot be defined (i.e. all content of the remote node is harvested)
  • it is not possible to log in to the remote node
  • privileges cannot be defined
  • logos cannot be defined.

Transaction

  • Done.

OGC-CSW Configuration system

The following CSW options can be set in config-csw.xml in the WEB-INF directory.

  • List of parameters available in GetRecords operations
    <geonet>
    	<operations>
    		<operation name="GetRecords">
    			<parameters>
    				<!-- - - - - - - - - - - - - - -->
    				<!-- Core queryable properties -->
    				<!-- - - - - - - - - - - - - - -->
    				<parameter name="Subject" field="subject" type="SupportedISOQueryables" />
    				<parameter name="CswParameterName" field="LuceneFieldKey" type="CswType" />
    
  • Profile support in CSW for the DescribeRecord operation
    		<operation name="DescribeRecord">
    			<!-- schema attribute must defined an existing schema file name -->
    			<!-- located at /web/geonetwork/xml/validation/csw/2.0.2/ -->
    			<typenames>
    				<typename namespace="http://www.isotc211.org/2005/gmd" 
    				    prefix="gmd" name="MD_Metadata" schema="identification.xsd" />
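
To show how the GetRecords parameter list from the first fragment above could be turned into a lookup table, here is a minimal sketch using Python's standard library. The fragment on this page is truncated, so the closing tags in `CONFIG` are assumed, and `load_queryables` is a hypothetical helper rather than GeoNetwork code.

```python
import xml.etree.ElementTree as ET

# Copy of the fragment above; the closing tags are assumed because
# the excerpt on this page stops after the second <parameter>.
CONFIG = """
<geonet>
  <operations>
    <operation name="GetRecords">
      <parameters>
        <parameter name="Subject" field="subject" type="SupportedISOQueryables" />
        <parameter name="CswParameterName" field="LuceneFieldKey" type="CswType" />
      </parameters>
    </operation>
  </operations>
</geonet>
"""

def load_queryables(xml_text):
    """Map each CSW parameter name to its (Lucene field, queryable type)."""
    root = ET.fromstring(xml_text)
    path = "./operations/operation[@name='GetRecords']/parameters/parameter"
    return {p.get("name"): (p.get("field"), p.get("type")) for p in root.findall(path)}

for name, (field, kind) in load_queryables(CONFIG).items():
    print(name, "->", field, "(" + kind + ")")
```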
    

The following CSW options can be defined in the administrator interface:

  • Contact for the CSW service

In the GeoNetwork administration page (catalogue configuration), the administrator can configure the CSW main contact and the service title, abstract and constraints.


OGC-CSW ISO Profile management

  • Add the profile for the DescribeRecord operation in config-csw.xml
  • Add the schema to xml/validation/csw202_apiso100/csw/2.0.2
  • Add profil-brief.xsl, profil-summary.xsl, profil-full.xsl to the xml/csw/schemas/iso19139 directory for GetRecords output.

OGC-CSW TestPage

OGC-CSW TestSuite

Teamengine

Results

  • Made test csw:level1.1 pass by modifying the GetRecords-InvalidRequest operation: typeNames must be an attribute of the Query element (but according to the specification, should it be csw:Record by default ??)
  • csw:level1.2 / DescribeRecord-ValidResponseStructure
    • According to the BRGM specification, the DescribeRecordResponse must include three "SchemaComponent" elements:
      • identification.xsd for data (ISO19115/19139)
      • serviceMetadata.xsd for service (ISO19119)
      • record.xsd for CSW/DC definition
    • According to the OGC spec (07-045), the DescribeRecordResponse must include two "SchemaComponent" elements:
      • identification.xsd for data (ISO19115/19139)
      • serviceMetadata.xsd for service (ISO19119)
    • Should we add "application" and "datasetcollection" to the SchemaComponent file mapping ???
    • Fix namespace for the TypeName parameter: the value of TypeName has to be qualified by a namespace or belong to the default document namespace. There is currently an exception if any namespace is sent in the request.
  • csw:level1.3 / csw.GetCapabilities.document:
    • Is there a reason why the NullCheck ComparisonOperator is commented out in GN?
    • This test expects 7 common parameters; the same question applies: what to do with common constraints (i.e. IsoProfiles, postEncoding)?
      ...
       <xsl:if test="not($response/csw2:Capabilities/ows:OperationsMetadata/ows:Operation/ows:Parameter[@name='typeName'])">
              <message>FAILURE: the second assertion failed (9)</message>
              <fail/>
            </xsl:if>
      
      ...
      

Comments

  • Multilingual metadata in the test suite uses locale="#locale-en", but locale-en is not defined in the metadata info section.
  • hierarchyLevelName = 'Application' : should we add this value to the ISO19139 codelist ?

Configuration


OGC-CSW Interoperability Matrix

GeoNetwork 2.2 GeoNetwork trunk csw GeoSource v2 MdWeb geocatalogue.fr eXcat CSW 2.0.1
GeoNetwork 2.2
GeoNetwork trunk csw OK OK OK NOK (only 2.0.2 supported)
GeoSource v2
MdWeb
geocatalogue.fr
eXcat

Test URL

Test User Interface

Creation of a web interface for testing operation using POST method.


OGC-CSW extension in GeoNetwork

GetRecords operation

  • Added resultsType="results_with_summary" for the "narrow your search" option (see geocat.ch sandbox)
  • Added outputSchema="own" to return metadata records in their own format (dc, fgdc, iso, ...)

Added gml:id attribute support

  • Allow use of gml:id in SpatialFilter to reduce the transfer of big geometries between client and server:
    <ogc:Intersects>
      <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
      <gml:MultiPolygon xmlns:gml="http://www.opengis.net/gml" gml:id="kantone:2"/>
    </ogc:Intersects>
    
  • Requires that the server knows some well-known geographic features.
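
The server-side lookup implied by the gml:id extension could look roughly like the sketch below. Everything here is an assumption made for illustration: the `KNOWN_FEATURES` registry, its placeholder WKT geometry and the `resolve_geometry` helper are hypothetical, and the filter is completed with the namespace declarations needed to parse it on its own.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# Hypothetical registry of geometries the server already knows, keyed by gml:id.
# The WKT value is a placeholder, not a real boundary.
KNOWN_FEATURES = {
    "kantone:2": "MULTIPOLYGON (((7.0 46.0, 8.0 46.0, 8.0 47.0, 7.0 46.0)))",
}

FILTER = """
<ogc:Intersects xmlns:ogc="http://www.opengis.net/ogc"
                xmlns:gml="http://www.opengis.net/gml">
  <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
  <gml:MultiPolygon gml:id="kantone:2"/>
</ogc:Intersects>
"""

def resolve_geometry(filter_xml, registry):
    """Return the stored geometry for the first gml:id found in the filter.

    Returns None when the filter carries no gml:id; raises KeyError when
    the id is not a feature known to the server.
    """
    root = ET.fromstring(filter_xml)
    for element in root.iter():
        feature_id = element.get("{%s}id" % GML_NS)
        if feature_id is not None:
            return registry[feature_id]
    return None

print(resolve_geometry(FILTER, KNOWN_FEATURES))
```

The client sends only the identifier, and the geometry itself never travels over the wire, which is the saving described above.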

output/format

  • csw.pdf : GetRecords operation results in PDF format.
  • csw.txt : GetRecords operation results in TXT format.
  • csw.kml : GetRecords operation results in KML format.
  • csw.rss : GetRecords operation results in RSS format.

Voting history

  • Vote proposed : 20090326
  • Vote passed: 20090328
    • Jeroen Ticheler +1
    • Andrea Carboni +1
    • Patrizia Monteduro +1
    • Emanuele Tajariol +1
    • Francois Prunayre +1
    • Simon Pigot +1
    • Archie Warnock +1
Last modified on 05/03/09 10:30:03
