Changes between Initial Version and Version 1 of MimeTypeCalculationIndexing


Timestamp:
Apr 14, 2010, 5:30:51 AM (14 years ago)
Author:
simonp
Comment:

--

  • MimeTypeCalculationIndexing

     1= Mime Type Calculation and Indexing =
     2
     3|| '''Date''' || 2010/04/14 ||
     4|| '''Contact(s)''' || Simon Pigot ||
     5|| '''Last edited''' || [[Timestamp]] ||
     6|| '''Status''' || Complete ||
     7|| '''Assigned to release''' || 2.5.0 ||
     8|| '''Resources''' || Available ||
     9
     10== Overview ==
     11
     12!GeoNetwork calculates mime types for files uploaded with metadata records as online resources in several places, using different code in each. The calculation is usually based on the filename alone, so the result may not reflect the true content of the file, and it may not be the correct registered mime type when a format has alternative filenames. In addition, the mime type is never indexed with the metadata record that points to the resource. This means that searches cannot be done on the mime type of online resources.
     13
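To illustrate the problem described above, here is a minimal Python sketch (not !GeoNetwork code) that compares a filename-based guess with a look at the leading bytes of the file. The upload name `chart.txt` and its content are hypothetical; `mimetypes.guess_type` is the extension-based lookup from the Python standard library.

```python
import mimetypes

# PNG files always start with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def mime_from_filename(filename):
    """Guess a mime type from the file extension only (None if unknown)."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed


def looks_like_png(content):
    """Check the leading bytes instead of trusting the filename."""
    return content.startswith(PNG_SIGNATURE)


# Hypothetical upload: PNG image data stored under a misleading name.
upload_name = "chart.txt"
upload_content = PNG_SIGNATURE + b"...image data..."

by_name = mime_from_filename(upload_name)   # extension says plain text
is_png = looks_like_png(upload_content)     # content says PNG image
```

Here the filename suggests plain text while the content is a PNG image, which is the kind of mismatch a content-based detector is meant to catch.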
     14=== Proposal Type ===
     15 * '''Sandbox''': BlueNetMEST
     16 * '''App''': !GeoNetwork
     17 * '''Module''': Lucene Index, Metadata Schemas
     18
     19=== Links ===
     20 * '''Documents''': [http://mime-util.sourceforge.net/ mime-util]
     21
     22=== Voting History ===
     23 * Vote proposed but voting not yet closed.
     24
     25----
     26
     27== Motivations ==
     28
     29The motivation for this proposal is to get the mime type into the Lucene index so that other programs can search via CSW for metadata records with attached online resources that may be of interest. The particular use case it was developed for is a data visualization program that needed to search the !GeoNetwork catalog for files of interest.
     30
     31== Proposal ==
     32
     33This proposal implements:
     34
     35 * Mime type calculation for online resources using [http://mime-util.sourceforge.net/ mime-util], performed in update-fixed-info.xsl immediately after a metadata record is saved or imported.
     36 * Indexing of the mime type in Lucene by index-fields.xsl.
     37
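The two steps above can be sketched roughly as follows. This is an illustration in Python, not the actual implementation: the proposal uses mime-util and XSLT, whereas here a tiny hand-written table of magic numbers stands in for mime-util, and the function names and the `uuid` and `mimetype` field names are assumptions made for the example.

```python
import mimetypes

# Small illustrative table of leading "magic" bytes. A real detector
# such as mime-util recognises far more formats than this.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def detect_mime_type(filename, content):
    """Prefer the file content; fall back to the filename extension."""
    for magic, mime in MAGIC_NUMBERS:
        if content.startswith(magic):
            return mime
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


def index_fields(record_uuid, resources):
    """Build (field, value) pairs for one metadata record.

    `resources` is a list of (filename, content) tuples for the attached
    online resources. Field names are hypothetical.
    """
    fields = [("uuid", record_uuid)]
    for filename, content in resources:
        fields.append(("mimetype", detect_mime_type(filename, content)))
    return fields


# Usage with two hypothetical attachments.
fields = index_fields("abc", [
    ("chart.txt", b"\x89PNG\r\n\x1a\n...image data..."),
    ("notes.txt", b"hello"),
])
```

Calculating the value once, when the record is saved, and only reading it back at indexing time keeps the detection cost out of the indexing step.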
     38At this stage the proposal does not:
     39
     40 * add a search field for the mime type to the advanced search interface
     41 * replace the mime type calculations done elsewhere in !GeoNetwork (specifically in Jeeves src/jeeves/util/BinaryFile.java and in metadata-iso19139.xsl) with mime-util code or the results of mime-util
     42
     43These can be done at a later date.
     44
     45== Risks ==
     46
     47At present the only reasonable place to store the mime type in the online resource is a uuid attribute on the gmd:linkage element. If no element or attribute is used to store it, the mime type will have to be calculated at indexing time, which may slow down the indexing of many records with attached online resources.
     48
     49== Participants ==
     50
     51 * CSIRO: Gary Carroll and Uwe Rosebrock
     52