Version 13 (modified by fxp, 15 years ago)

--

Multilingual editing

Date 2008/07/03
Contact(s) fxprunayre
Last edited Timestamp
Status Motion proposed
Assigned to release to be determined
Resources Done in geocat.ch sandbox

Overview

This proposal adds support for multilingual metadata. For ISO-based standards, all gco:CharacterString elements and their translations can be stored in GeoNetwork, and multilingual metadata can be displayed in view mode and edited in the editor. The Google translation service can be used to suggest translations while editing.

Proposal Type

  • Type: GUI Change, Module Change
  • App: GeoNetwork
  • Module: Data Manager, Editor (XSL changes mainly)
  • Documents:
  • Email discussions:
  • Other wiki discussions:

Voting History

Proposal

Allow multilingual editing in GeoNetwork. Each metadata record defines:

  • one main language (using gmd:language element)
  • n other languages (using gmd:locale elements)

Editing mode

A multilingual metadata template is added to the default installation.

In editing mode, each multilingual element is composed of:

  • text input
  • language selection list

By default, the selected language is the GUI language, if that language is defined in the metadata.

Optionally, the Google translation service can be configured. A translation can then be suggested to the editor using the small icon to the right of the language selector. The translation converts the character string in the metadata default language into the currently selected language. The Google translation service is limited to 5000 characters. Other translation services could be plugged in here if needed.
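The pluggable translation suggestion could be sketched as follows. This is only an illustration, not the GeoNetwork implementation: the function name `suggest_translation`, the `Translator` callable signature and the choice to return no suggestion for text over the limit are all assumptions. Only the 5000 character limit comes from the proposal.

```python
from typing import Callable, Optional

# Limit quoted in the proposal for the Google translation service.
MAX_TRANSLATION_CHARS = 5000

# Hypothetical plug-in signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


def suggest_translation(text: str, source_lang: str, target_lang: str,
                        translate: Translator,
                        max_chars: int = MAX_TRANSLATION_CHARS) -> Optional[str]:
    """Return a suggested translation, or None when no suggestion can be made."""
    if not text.strip() or source_lang == target_lang:
        return None  # nothing to translate
    if len(text) > max_chars:
        # Assumption: over-limit text gets no suggestion rather than a truncated one.
        return None
    return translate(text, source_lang, target_lang)


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    # Stand-in for a real service call, used only to exercise the sketch.
    return f"[{source_lang}->{target_lang}] {text}"
```

Any other translation service can be swapped in by passing a different callable with the same three arguments.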

View mode

In view mode, depending on the GUI language:

  • if the GUI language is available in the metadata, the element is displayed in this language
  • otherwise, the element is displayed in the metadata default language.

This behaviour is also applied to the Dublin Core output for CSW services.
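The fallback rule above can be sketched in a few lines. GeoNetwork does this in XSL; the Python below is only an illustration. The function name `display_text` and the mapping of a GUI code such as "fr" to the locale id "FR" are assumptions, and the sample element is a shortened copy of the example used later on this page.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def display_text(element: ET.Element, gui_lang: str) -> str:
    """Pick the string to show for one multilingual element."""
    # Assumption: the GUI code "fr" corresponds to the locale reference "#FR".
    wanted = "#" + gui_lang.upper()
    path = "gmd:PT_FreeText/gmd:textGroup/gmd:LocalisedCharacterString"
    for node in element.iterfind(path, NS):
        if node.get("locale", "").strip() == wanted and node.text:
            return node.text  # translation available in the GUI language
    # Fall back to the string in the metadata default language.
    default = element.find("gco:CharacterString", NS)
    return (default.text or "") if default is not None else ""


SAMPLE = """
<scope xmlns:gmd="http://www.isotc211.org/2005/gmd"
       xmlns:gco="http://www.isotc211.org/2005/gco">
  <gco:CharacterString>Codelists</gco:CharacterString>
  <gmd:PT_FreeText>
    <gmd:textGroup>
      <gmd:LocalisedCharacterString locale="#FR">Listes de codes</gmd:LocalisedCharacterString>
    </gmd:textGroup>
  </gmd:PT_FreeText>
</scope>
""".strip()

scope = ET.fromstring(SAMPLE)
```

With this sample, a French GUI gets the French string, while a German or English GUI falls back to the gco:CharacterString.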

Indexing

By default, full text indexing is applied to all fields. Multilingual content is indexed for the "any" search criterion. A more clever indexing mechanism could be implemented in Lucene (see MultilingualIndexMechanism). This is not part of this proposal.

Backwards Compatibility Issues

  • Multilingual support applies only to ISO-compliant metadata

Implementation

Metadata is defined by one main language (gmd:MD_Metadata/gmd:language/gco:CharacterString) and other locales (gmd:MD_Metadata/gmd:locale/*).

If a user adds a new translation, a new locale element has to be added:

<gmd:locale>
  <gmd:PT_Locale id="FR">
    <gmd:languageCode><gmd:LanguageCode codeList="#LanguageCode" codeListValue="fra">French</gmd:LanguageCode></gmd:languageCode>
    <gmd:characterEncoding><gmd:MD_CharacterSetCode codeList="#MD_CharacterSetCode" codeListValue="utf8">UTF 8</gmd:MD_CharacterSetCode></gmd:characterEncoding>
  </gmd:PT_Locale>
</gmd:locale>
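Adding such a locale programmatically could look like the sketch below. It is not the GeoNetwork code: the function name `add_locale` is hypothetical, and for brevity the new element is appended at the end of the record, ignoring the element order that the ISO schema imposes on gmd:MD_Metadata.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
ET.register_namespace("gmd", GMD)  # keep the gmd: prefix when serializing


def add_locale(metadata: ET.Element, locale_id: str, iso_code: str, label: str) -> ET.Element:
    """Append a gmd:locale/gmd:PT_Locale unless the id is already declared."""
    for existing in metadata.iterfind(f"{{{GMD}}}locale/{{{GMD}}}PT_Locale"):
        if existing.get("id") == locale_id:
            return existing  # already declared, nothing to add
    locale = ET.SubElement(metadata, f"{{{GMD}}}locale")
    pt_locale = ET.SubElement(locale, f"{{{GMD}}}PT_Locale", {"id": locale_id})
    language_code = ET.SubElement(pt_locale, f"{{{GMD}}}languageCode")
    code = ET.SubElement(language_code, f"{{{GMD}}}LanguageCode",
                         {"codeList": "#LanguageCode", "codeListValue": iso_code})
    code.text = label
    encoding = ET.SubElement(pt_locale, f"{{{GMD}}}characterEncoding")
    charset = ET.SubElement(encoding, f"{{{GMD}}}MD_CharacterSetCode",
                            {"codeList": "#MD_CharacterSetCode", "codeListValue": "utf8"})
    charset.text = "UTF 8"
    return pt_locale


# Usage: declare French on an otherwise empty record.
record = ET.Element(f"{{{GMD}}}MD_Metadata")
add_locale(record, "FR", "fra", "French")
```

Calling `add_locale` a second time with the same id leaves the record unchanged, so the editor can call it whenever a translation is added.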

After editing, the new translated element is stored with:

  • an xsi:type="gmd:PT_FreeText_PropertyType" attribute
  • a first gco:CharacterString in the metadata language (in the example "en")
  • n gmd:PT_FreeText elements with each translation
    <scope xsi:type="gmd:PT_FreeText_PropertyType">
      <gco:CharacterString>Codelists for description of metadata datasets compliant with ISO/TC 211 19115:2003 and 19139</gco:CharacterString>
      <gmd:PT_FreeText>
        <gmd:textGroup>
          <gmd:LocalisedCharacterString locale="#FR">Listes de codes pour la description de lots de métadonnées conforme ISO TC/211 19115:2003 et 19139</gmd:LocalisedCharacterString>
        </gmd:textGroup>
      </gmd:PT_FreeText>
    </scope>
    

In the editor, any gco:CharacterString can be multilingual, but an XSL template is defined to exclude elements which should not be multilingual (e.g. ISBN, ISSN). The gco:CharacterString is in the default language; the other PT_FreeText elements have to use a locale declared in the metadata document.
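The requirement that every translation uses a declared locale can be checked with a small helper. This is a sketch only: the function name `undeclared_locales` is hypothetical, and the sample record is heavily trimmed (the gmd:PT_FreeText sits directly under the root just to keep it short).

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"


def undeclared_locales(metadata: ET.Element) -> set:
    """Return locale references used by translations but not declared in gmd:locale."""
    declared = {
        "#" + pt.get("id")
        for pt in metadata.iterfind(f"{{{GMD}}}locale/{{{GMD}}}PT_Locale")
        if pt.get("id")
    }
    used = {
        node.get("locale", "").strip()
        for node in metadata.iter(f"{{{GMD}}}LocalisedCharacterString")
    }
    return used - declared


RECORD = ET.fromstring("""
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:locale><gmd:PT_Locale id="FR"/></gmd:locale>
  <gmd:PT_FreeText>
    <gmd:textGroup>
      <gmd:LocalisedCharacterString locale="#FR">Titre</gmd:LocalisedCharacterString>
    </gmd:textGroup>
    <gmd:textGroup>
      <gmd:LocalisedCharacterString locale="#DE">Titel</gmd:LocalisedCharacterString>
    </gmd:textGroup>
  </gmd:PT_FreeText>
</gmd:MD_Metadata>
""".strip())
```

Here only "FR" is declared, so the German translation is reported as using an undeclared locale.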

Risks

Participants

  • Francois
  • Jesse
  • swisstopo (support)
