Changes between Version 12 and Version 13 of MimeTypeCalculationIndexing


Timestamp:
Apr 16, 2010, 3:32:02 AM
Author:
simonp
Comment:

--

  • MimeTypeCalculationIndexing

    v12 v13  
    33 33 This proposal implements:
    34 34
    35  * Mime type calculation for online resources using [http://mime-util.sourceforge.net mime-util] immediately after a metadata record is saved/imported in update-fixed-info.xsl.
    36  * Indexing of the mime type in Lucene by index-fields.xsl
     35 * Mime type calculation for online resources (gmd:protocol fields that start with WWW:DOWNLOAD or WWW:LINK - others can be added if required by individual sites) using [http://mime-util.sourceforge.net mime-util] immediately after a metadata record is saved/imported in update-fixed-info.xsl.
     36 * The calculated mime type is stored in the metadata record as a gmx:MimeFileType child of gmd:name (replacing gco:CharacterString), as in the following example:
     37{{
     38                                        <gmd:onLine>
     39                                                <gmd:CI_OnlineResource>
     40                                                        <gmd:linkage>
     41                                                                <gmd:URL>http://localhost:8080/geonetwork/srv/en/file.disclaimer?id=10&amp;fname=basins.zip&amp;access=private</gmd:URL>
     42                                                        </gmd:linkage>
     43                                                        <gmd:protocol>
     44                                                                <gco:CharacterString>WWW:DOWNLOAD-1.0-http--download</gco:CharacterString>
     45                                                        </gmd:protocol>
     46                                                        <gmd:name xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:srv="http://www.isotc211.org/2005/srv">
     47                                                                <gmx:MimeFileType type="application/x-zip">basins.zip</gmx:MimeFileType>
     48                                                        </gmd:name>
     49                                                        <gmd:description>
     50                                                                <gco:CharacterString>Hydrological basins in Africa (Shapefile Format)</gco:CharacterString>
     51                                                        </gmd:description>
     52                                                </gmd:CI_OnlineResource>
     53                                        </gmd:onLine>
     54}}
     55 * Indexing of the mime type (from the type attribute of gmx:MimeFileType) in Lucene by index-fields.xsl
    37 56 * Inclusion of the mime type Lucene field as an !AdditionalQueryable in the CSW config.
    38 57
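The selection step described in the list above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `MimeTypeSketch` and its methods are hypothetical and are not the MimeTypeFinder class from the patch, the file name is assumed to come from the `fname` query parameter (falling back to the last path segment), and the JDK's `URLConnection.guessContentTypeFromName` stands in for the mime-util lookup, so the type string it reports may differ from the one mime-util produces.

```java
import java.net.URI;
import java.net.URLConnection;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper for illustration; not the MimeTypeFinder class from the patch.
class MimeTypeSketch {

    // True when the gmd:protocol value starts with one of the two prefixes
    // named in the proposal (WWW:DOWNLOAD or WWW:LINK).
    static boolean needsMimeType(String protocol) {
        return protocol != null
                && (protocol.startsWith("WWW:DOWNLOAD") || protocol.startsWith("WWW:LINK"));
    }

    // Assumption: the file name is the "fname" query parameter when present,
    // otherwise the last segment of the URL path.
    static Optional<String> fileName(String linkage) {
        URI uri = URI.create(linkage);
        String query = uri.getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                if (pair.startsWith("fname=") && pair.length() > "fname=".length()) {
                    String raw = pair.substring("fname=".length());
                    return Optional.of(URLDecoder.decode(raw, StandardCharsets.UTF_8));
                }
            }
        }
        String path = uri.getPath();
        if (path == null) {
            return Optional.empty();
        }
        String last = path.substring(path.lastIndexOf('/') + 1);
        return last.isEmpty() ? Optional.empty() : Optional.of(last);
    }

    // Stand-in for the mime-util lookup: guesses from the file extension only.
    static Optional<String> guessMimeType(String name) {
        return Optional.ofNullable(URLConnection.guessContentTypeFromName(name));
    }

    public static void main(String[] args) {
        String protocol = "WWW:DOWNLOAD-1.0-http--download";
        String linkage = "http://localhost:8080/geonetwork/srv/en/file.disclaimer"
                + "?id=10&fname=basins.zip&access=private";
        if (needsMimeType(protocol)) {
            Optional<String> name = fileName(linkage);
            Optional<String> type = name.flatMap(MimeTypeSketch::guessMimeType);
            System.out.println(name.orElse("(no file name)") + " -> " + type.orElse("(unknown)"));
        }
    }
}
```

Keeping the protocol check separate from the lookup means that further protocol prefixes, which the proposal says individual sites may add, would only touch `needsMimeType`.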
     
    42 61
    43 62 * include adding a search field in the advanced search interface
    44  * replace mime type calculations done elsewhere (specifically in Jeeves src/jeeves/util/BinaryFile.java and in metadata-iso19139.xsl) in !GeoNetwork with mime-util code or the results of mime-util
     63 * replace mime type calculations done elsewhere (specifically in Jeeves src/jeeves/util/BinaryFile.java) in !GeoNetwork with mime-util code
    45 64
    46 65 These can be done at a later date.
    47 66
    48 Note that the patch file attached to this proposal includes some enhancements to the Lucene Index Reader provider code in !SearchManager.java and the [wiki:TemporalExtentSearch temporal extent search proposal].
     67 Note that the patch file attached to this proposal includes some enhancements to the Lucene Index Reader provider code in !SearchManager.java, a nicer file download dialog (the trunk uses the file.download service, but those who don't want that can switch back to resources.get by editing update-fixed-info.xsl for their schema) and the [wiki:TemporalExtentSearch temporal extent search proposal].
    49 68
    50 69 == Risks ==
    51 70
    52 At present the only reasonable place for storing the mime type in the onlineresource field is as a uuid attribute of the gmd:linkage element. If an element/attribute is not used then the mime type will have to be calculated at the time of indexing - this may slow down indexing of many records with attached online resources.
     71 The update-fixed-info.xsl calls Java objects in src/org/fao/geonet/util/MimeTypeFinder.java to do the mime-util based calculation. This may slow down indexing of records with attached online resources, although no significant slowdown has been noticed in the 3 months or so that this has been in the BlueNetMEST branch.
    53 72
    54 73 == Participants ==