GeoNetwork data directory

Date 2012/03/09
Contact(s) François Prunayre
Last edited
Status Motion passed / Done
Assigned to release 2.7.0
Resources Available
Ticket # #826

Overview

The GeoNetwork data directory is the location on the file system where GeoNetwork stores all of its custom configuration. This configuration defines, for example, which thesaurus GeoNetwork uses and which schemas are plugged in. The data directory also contains a number of support files used by GeoNetwork for various purposes (e.g. Lucene index, spatial index, logos).

It is a good idea to define an external data directory when going to production, in order to make upgrades easier.

Proposal Type

  • Type: Core Change
  • App: GeoNetwork
  • Module: Config
  • Documents:
  • Email discussions:
  • Other wiki discussions:

Voting History

  • Vote proposed by François Prunayre on 2012/03/19, result was +1 from Jeroen, Simon, Patrizia, Francois

Proposal

Creating a new data directory

The data directory needs to be created before starting the catalogue. It must be readable and writable by the user starting the catalogue. If the data directory is an empty folder, the catalogue will initialize the default directory structure. The easiest way to create a new data directory is to copy one that comes with a standard installation.
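As a rough sketch of the steps above on a Unix-like system, the following creates an empty data directory and checks that the current user can read and write it. The helper name prepare_data_dir and the example path are assumptions made for illustration; they are not part of GeoNetwork. On a real server the directory would normally be owned by the account that runs the servlet container.

```shell
#!/bin/sh
# Hypothetical helper (not part of GeoNetwork): create a data directory
# and verify that the current user can read and write it.
prepare_data_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ -r "$dir" ] && [ -w "$dir" ]; then
        echo "ready: $dir"
    else
        echo "not readable/writable: $dir" >&2
        return 1
    fi
}

# Example path only; adjust to the local layout.
prepare_data_dir "${TMPDIR:-/tmp}/geonetwork_data_example"
```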

Setting the data directory

The data directory location can be set using any of the following:

  • Java system property
  • Servlet context parameter
  • System environment variable

For the Java system property and the servlet context parameter, use:

  • <webappName>.dir, or, if that is not set:
  • geonetwork.dir

For the system environment variable, use:

  • <webappName>_dir, or, if that is not set:
  • geonetwork_dir
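
The fallback between the webapp-specific name and the generic name can be pictured with a small shell function. This is only an illustration of the lookup order described above for system environment variables, not GeoNetwork's actual implementation; the function name resolve_data_dir and the webapp name catalogue are hypothetical, and the sketch assumes the webapp name is a valid shell identifier.

```shell
#!/bin/sh
# Illustrative only: mimic the documented fallback for system environment
# variables (<webappName>_dir first, then geonetwork_dir).
resolve_data_dir() {
    webapp="$1"
    # Indirect lookup of the variable named "<webappName>_dir".
    eval "value=\${${webapp}_dir:-}"
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "${geonetwork_dir:-}"
    fi
}

# Example values (assumptions for this sketch).
geonetwork_dir=/var/lib/geonetwork_data
catalogue_dir=/var/lib/catalogue_data

resolve_data_dir catalogue   # prints /var/lib/catalogue_data
resolve_data_dir other       # prints /var/lib/geonetwork_data
```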

Java System Property

Depending on the servlet container used it is also possible to specify the data directory location with a Java System Property.

For Tomcat, configuration is:

CATALINA_OPTS="-Dgeonetwork.dir=/var/lib/geonetwork_data"

Run the web application in read-only mode

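In order to run the catalogue with the webapp folder in read-only mode, the user needs to set two variables: the data directory (geonetwork.dir) and the configuration overrides file (geonetwork.jeeves.configuration.overrides.file).

For Tomcat, the configuration could be: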

CATALINA_OPTS="-Dgeonetwork.dir=/var/lib/geonetwork_data -Dgeonetwork.jeeves.configuration.overrides.file=/var/lib/geonetwork_data/config/my-config.xml"
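
One way to keep these options out of the Tomcat start scripts is a bin/setenv.sh file, which catalina.sh sources at startup when it exists. The sketch below reuses the example paths from this page; the shell variable names GEONETWORK_DATA_DIR and GEONETWORK_OVERRIDES are assumptions introduced only for readability.

```shell
#!/bin/sh
# Sketch of a Tomcat bin/setenv.sh. Paths are the example values used on
# this page; the two helper variable names are hypothetical.
GEONETWORK_DATA_DIR=/var/lib/geonetwork_data
GEONETWORK_OVERRIDES="$GEONETWORK_DATA_DIR/config/my-config.xml"

# Append to any options that are already set rather than replacing them.
CATALINA_OPTS="${CATALINA_OPTS:-} -Dgeonetwork.dir=$GEONETWORK_DATA_DIR"
CATALINA_OPTS="$CATALINA_OPTS -Dgeonetwork.jeeves.configuration.overrides.file=$GEONETWORK_OVERRIDES"
export CATALINA_OPTS
```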

Structure of the data directory

  • data_directory/
    • data
      • metadata_data: The data related to metadata records
      • resources:
        • htmlcache
        • images
          • harvesting
          • logo
          • statTmp
      • removed: Folder with removed metadata.
      • svn_repository: The subversion repository
    • config: Extra configuration (eg. overrides)
      • schemaplugin-uri-catalog.xml
      • JZKitConfig.xml
      • codelist: The thesaurus in SKOS format
      • schemaPlugins: The directory used to store new metadata standards
    • index: All indexes used for search
      • nonspatial: Lucene index
      • spatialindex.*: ESRI Shapefile for the index (if not using PostGIS)
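
When copying or restoring a data directory, a quick check of the top-level layout listed above can catch an incomplete copy. The following is a minimal sketch, not a GeoNetwork tool; the function name missing_top_level_dirs is hypothetical and only the three top-level folders (data, config, index) are checked.

```shell
#!/bin/sh
# Illustrative check (not a GeoNetwork tool): print the name of each
# top-level folder from the structure above that is missing.
missing_top_level_dirs() {
    base="$1"
    for sub in data config index; do
        [ -d "$base/$sub" ] || echo "$sub"
    done
}

# Example path only; prints nothing when all three folders exist.
missing_top_level_dirs /var/lib/geonetwork_data
```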

Advanced configuration

All sub-directories can be configured separately using Java system properties. For example, to put the index directory in a custom location, use:

  • <webappName>.lucene.dir, or, if that is not set:
  • geonetwork.lucene.dir

Example

Add the following Java properties to the start-geonetwork.sh script:

java -Xms48m -Xmx512m -Xss2M -XX:MaxPermSize=128m -Dgeonetwork.dir=/app/geonetwork_data_dir -Dgeonetwork.lucene.dir=/ssd/geonetwork_lucene_dir

Add the following system environment variables to the start-geonetwork.sh script:

# Set custom data directory location using system environment variables
export geonetwork_dir=/app/geonetwork_data_dir
export geonetwork_lucene_dir=/ssd/geonetwork_lucene_dir

System information

Backwards Compatibility Issues

Main changes of the proposal:

  • The config.xml appHandler properties for directories are removed
  • The schema plugin URI catalogue is split into 2 files:
    • WEB-INF/schema-uri-catalog.xml, which contains the URI catalogue for core schemas (i.e. those under xml/schemas/*). URIs are relative to XSL files (e.g. ../xml/schemas/iso19115/present/metadata-iso19115.xsl).
    • <geonetwork.dir>/config/schemaplugin-uri-catalog.xml, which contains the URI catalogue for plugged schemas. URIs are absolute paths to XSL files (e.g. /var/lib/geonetwork_data/schema_plugins/iso19139.fra/present/metadata-iso19139.fra.xsl).
  • z3950.Repositories (JZKitConfig.xml) is built in the config directory.
  • LogoFilter is renamed to ResourceFilter and provides access to the following paths:
    • images/logos
    • images/harvesting
    • images/statTmp
    • htmlcache
  • htmlCacheDir is moved to <geonetwork.dir>/resources/htmlcache and published using the ResourceFilter (like logos)

Risks

Participants

  • Francois Prunayre
Last modified on Mar 21, 2012, 5:11:12 AM
