= GeoNetwork Data Catalog Vocabulary services =

|| '''Date''' || 2012/05/10 ||
|| '''Contact(s)''' || François Prunayre, Paul Hasenohr ||
|| '''Last edited''' || ||
|| '''Status''' || Motion passed - Done ||
|| '''Assigned to release''' || 2.9 ||
|| '''Resources''' || Available (funding EEA) ||
|| '''Ticket #''' || #912 ||
|| '''Github source''' || https://github.com/fxprunayre/core-geonetwork/tree/feature/dcat-rdf ||

[[PageOutline(2-3,,inline)]]

== Overview ==

Data Catalog Vocabulary services in !GeoNetwork opensource increase discoverability and enable applications to consume metadata easily. These services provide information about three types of objects:
 * the catalogue,
 * the datasets and services in the catalogue,
 * the links to distributed resources.

The description includes relations to thesauri (e.g. GEMET), keywords and organizations. The document can be used in a linked data context.

The output format produced by the services is based on DCAT, an RDF vocabulary.

Two types of services are created:
 * Metadata service, to access one metadata record
 * Search service, to search the catalogue and retrieve a set of metadata records

The Data Catalog Vocabulary services can be used by Semantic Web tools to harvest, search (e.g. using SPARQL) and link catalogue content with other interlinked resources.

A semantic portal sitemap is created so that the catalogue can be harvested.
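As an illustration of how an application might consume such output, the following is a minimal sketch that lists the datasets found in an RDF/XML document using only the Python standard library. The sample document, its `example.org` URIs and the `list_datasets` helper are assumptions made for this example; they are not actual !GeoNetwork output, and the real services may structure records differently.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the RDF, DCAT and Dublin Core vocabularies.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample document (assumption: NOT real GeoNetwork output).
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcat="http://www.w3.org/ns/dcat#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dcat:Catalog rdf:about="http://example.org/catalogue">
    <dc:title>Example catalogue</dc:title>
  </dcat:Catalog>
  <dcat:Dataset rdf:about="http://example.org/dataset/1">
    <dc:title>Road network</dc:title>
  </dcat:Dataset>
  <dcat:Dataset rdf:about="http://example.org/dataset/2">
    <dc:title>Hydrography</dc:title>
  </dcat:Dataset>
</rdf:RDF>"""


def list_datasets(rdf_xml):
    """Return (uri, title) pairs for each top-level dcat:Dataset element."""
    root = ET.fromstring(rdf_xml)
    about_attr = "{%s}about" % NS["rdf"]
    datasets = []
    for dataset in root.findall("dcat:Dataset", NS):
        title = dataset.findtext("dc:title", default="", namespaces=NS)
        datasets.append((dataset.get(about_attr), title))
    return datasets


if __name__ == "__main__":
    for uri, title in list_datasets(SAMPLE_RDF):
        print(uri, title)
```

A dedicated RDF library would be the more robust choice for real data, since RDF/XML allows several equivalent serializations of the same graph; plain XML parsing is used here only to keep the sketch dependency-free.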
=== Proposal Type ===
 * '''Type''': Discoverability
 * '''App''': !GeoNetwork
 * '''Module''': Metadata and search services

=== Links ===
 * '''Documents''':
   * Data Catalog Vocabulary (DCAT) http://www.w3.org/TR/vocab-dcat/#property--data-dictionary
   * Vocabulary of Interlinked Datasets (VoID) http://www.w3.org/TR/void/
   * http://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html
   * Semantic Web Crawling: A Sitemap Extension http://sw.deri.org/2007/07/sitemapextension/
   * http://geovocab.org/doc/neogeo.html

=== Voting history ===
Vote proposed by Francois on 2012/07/04; the result was:
 * +1 from Jeroen, Simon, Francois

== Proposal ==

=== RDF Model ===

An RDF model is defined for the ISO19139, ISO19110 and Dublin Core standards in order to cover most of the metadata in the catalogue. The model is based on DCAT, which is "an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web".

==== Vocabularies: ====

|| Prefix || Specification || Namespace ||
|| dcat || http://www.w3.org/TR/vocab-dcat/#class--catalog || http://www.w3.org/ns/dcat# ||
|| void || http://www.w3.org/TR/void/ || http://rdfs.org/ns/void# ||
|| dc || http://dublincore.org/ || http://purl.org/dc/elements/1.1/ ||
|| dcterms || || http://purl.org/dc/terms/ ||
|| dctype || || http://purl.org/dc/dcmitype/ ||
|| foaf || http://xmlns.com/foaf/spec/ || http://xmlns.com/foaf/0.1/ ||
|| skos || http://www.w3.org/2009/08/skos-reference/skos.html# || http://www.w3.org/2004/02/skos/core# ||
|| rdf || || http://www.w3.org/1999/02/22-rdf-syntax-ns# ||
|| rdfs || || http://www.w3.org/2000/01/rdf-schema# ||

{{{
}}}

==== Classes: ====

 * Catalogue (dcat:Catalog) is the local catalogue or any harvested node.
{{{
My GeoNetwork geospatial metadata catalogue My GeoNetwork geospatial metadata catalogue http://localhost:8080/geonetwork http://localhost:8080/geonetwork/srv/eng/portal.opensearch http://localhost:8080/geonetwork/search/rdf?any= en
}}}

 * Organization (foaf:Organization)
{{{
}}}

 * Dataset (dcat:CatalogRecord+dcat:Dataset)
{{{
Polygon((13.208233 50.71671, 13.208233 51.24864, 14.40099 51.24864, 14.40099 50.71671, 13.208233 50.71671))
}}}

 * Series (dcat:CatalogRecord+dcat:Dataset+dc:relation)
 * Service (dcat:CatalogRecord+rdf:Description+dc:relation)
{{{
}}}

 * Feature catalogue (rdf:Description+dc:relation)
{{{
}}}

 * Thesaurus (skos:ConceptScheme)
{{{
GEMET - INSPIRE themes, version 1.0 INSPIRE themes thesaurus for GeoNetwork opensource. EEA http://www.eionet.europa.eu/gemet/about?langcode=en http://www.eionet.europa.eu/gemet/about?langcode=en 2008-06-01 2008-06-01
}}}

 * Keyword (skos:Concept)
{{{
}}}

 * Online resources (dcat:Distribution)
{{{
text/csv CSV
}}}

=== Formats ===

 * RDF/XML is the output format for the new services.
 * RDFa is used to add annotations to HTML pages.
 * The sitemap is an XML file that uses the Semantic Crawling extension (see #81).

=== Services ===

New services:
 * Metadata service: http://://srv/eng/rdf.metadata.get?uuid=
 * RDF search service: http://://srv/eng/rdf.search?
   * All !GeoNetwork search criteria can be used to extract a subset of the catalogue.
 * Sitemap: http://://srv/eng/portal.sitemap?format=rdf

Rewriting rules for simple URLs:
 * http://://metadata/.rdf
 * http://://search/rdf?

Conversion:
 * /convert/rdf.xsl

Schemas supported:
 * ISO19139
 * ISO19110
 * dublin-core

=== Site map ===

A sitemap using the semantic crawling extension is added to the existing XML sitemap.
{{{
My GeoNetwork full content catalogue for Linked Data spiders (RDF) For 5 latests update: http://://metadata/.rdf Link to a full dump using the search API http://://search/rdf/ or provide for all catalogue record a link using http://://metadata/.rdf daily
}}}

The sitemap is accessible through the existing sitemap service, with format=rdf as a parameter: http://:/geonetwork/srv/eng/portal.sitemap?format=rdf

In robots.txt, the following line is added:
{{{
Sitemap: http://:/geonetwork/srv/eng/portal.sitemap?format=rdf
}}}

== Using RDF outputs ==

=== Save metadata as xml ===

A "Save as RDF" action is added to the metadata menu:

[[Image(rdf-save-as-action.png)]]

=== Visualization tools ===

Running a search on the catalogue using the rdf.search service returns a full or partial view of the catalogue, which can be analyzed in visualization tools.

[[Image(rdf-visual-ex.png, 700px)]]

[[Image(rdf-visual-ex-by-foaf-organization.png, 700px)]]

[[Image(rdf-visual-ex-by-inspire-themes.png, 700px)]]

=== SPARQL queries ===

Once loaded into a SPARQL endpoint, the catalogue content can be queried using SPARQL:

 * Get metadata titles
{{{
sparql select ?title where {?s <http://purl.org/dc/elements/1.1/title> ?title};
}}}

 * Get metadata about transport network
{{{
sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?title ?label
WHERE {
  ?x dc:title ?title .
  ?x dcat:theme ?theme .
  ?theme skos:prefLabel ?label .
  FILTER(?label = "Transport network")
};
}}}

== Risks ==

== Future improvement ==

This proposal does not cover the following items, which could be addressed in future work:
 * multilingual RDF output for multilingual metadata records

== Participants ==

 * Francois Prunayre