
"Narrow your search" widget

Date 2012/09/11
Contact(s) François Prunayre
Last edited 2012/09/11
Status Motion passed - Done
Assigned to release 2.9.x
Resources Resource available (Thanks to swisstopo for first implementation, BRGM and Magellium for prototyping, GeoNovum for final implementation)
Ticket # #1162
Dev branch https://github.com/fxprunayre/core-geonetwork/tree/feature/lucene-facet

  1. Overview
    1. Proposal Type
    2. Links
    3. Voting History
  2. Motivations
  3. Proposal
    1. Faceted search widgets
    2. Facet configuration
    3. Facet response
    4. Performance analysis
    5. Changes
    6. Backwards Compatibility Issues
  4. Risks
  5. Participants

Overview

This proposal aims to provide a "Narrow your search" module, also known as faceted search (http://en.wikipedia.org/wiki/Faceted_search). This module aggregates the results of the current search by criteria (e.g. keywords, organization, dates) and reports the frequency of each value. The output view could be a list (with a link to add a specific criterion to the search) and optionally a tag cloud view (http://en.wikipedia.org/wiki/Tag_cloud).

On the client side, the user interacts with facets through two components:

  • the facet summary, used to select a new filter
  • the facet breadcrumb, which shows which filters have been applied and allows the user to remove a filter

These two components are created alongside the search form and search results widgets and interact with them.

A first implementation was made on geocat.ch in 2008 using the internal summary module. This proposal uses the Lucene faceting module to compute the summary.

See http://www.youtube.com/watch?v=VOGB7ZI7ey8

Proposal Type

  • Type: GUI Change, LuceneSearcher
  • App: GeoNetwork
  • Module: Search Interface

Voting History

  • Vote proposed by Francois on 2012/11/21, result was +1 Simon, Patrizia, Jeroen, Francois, (Jesse).

Motivations

  • Improve the user search experience: facets help the user to quickly refine a search
  • Better performance: the Lucene facet module, which uses its own taxonomy index, is faster than computing the summary with the internal summary module
  • New way of presenting information: facets could be used to provide catalog indicators using charts (http://www.youtube.com/watch?v=ISEOKOq6t2Q&feature=plcp)

Proposal

Faceted search widgets

Using the narrow your search widget

The narrow your search module, also known as faceted search (http://en.wikipedia.org/wiki/Faceted_search), aggregates results by criteria (e.g. keywords, organization, dates) and gives the frequency of each category for the current search. When a search is run, all search results are analyzed and a list of the main values for each criterion is computed. This summary is displayed as a list next to the search results, where the user can filter the search by selecting a value.
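As a rough illustration of what such a summary contains, the sketch below counts value frequencies over a handful of records in plain JavaScript. It is not how the server computes facets (that is the job of the Lucene facet module); the record structure and the computeFacet helper are hypothetical and only serve to show the idea of "top values with counts".

```javascript
// Hypothetical records; on the server these would be indexed metadata documents.
const records = [
  {keyword: ['geology', 'soils'], type: ['dataset']},
  {keyword: ['geology'], type: ['dataset']},
  {keyword: ['land use'], type: ['service']}
];

// Count how often each value of `field` occurs and keep the `max` most frequent.
// Ties are broken alphabetically so the output is deterministic.
function computeFacet(records, field, max) {
  const counts = new Map();
  for (const record of records) {
    for (const value of record[field] || []) {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return Array.from(counts, ([name, count]) => ({name, count}))
    .sort((a, b) => b.count - a.count || a.name.localeCompare(b.name))
    .slice(0, max);
}

console.log(computeFacet(records, 'keyword', 10));
```

Each entry of the result pairs a value with its frequency, which is the information the facet list displays next to the search results.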

Once selected, the value is added to a breadcrumb widget which lists all filters applied on top of the first search. For example, searching for "dataset" with keyword "society" in "1998":

The user can click on each filter to:

  • remove it
  • or switch to another value

A filter is populated with the list of values computed on the first search (i.e. before any filter is applied to the search).

The filters can be reset using the reset search button.
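The breadcrumb behaviour described above (add a filter, switch its value, remove it, reset) can be sketched as a small state object. This is only an illustration under the assumption that each facet carries at most one selected value; createBreadcrumb and its methods are hypothetical names, not part of the widget API.

```javascript
// Hypothetical breadcrumb state: one entry per applied filter, in the order applied.
function createBreadcrumb() {
  const filters = []; // e.g. [{facet: 'keyword', value: 'society'}]
  return {
    // Add a filter, or switch its value if the facet is already filtered.
    select(facet, value) {
      const existing = filters.find(f => f.facet === facet);
      if (existing) {
        existing.value = value;
      } else {
        filters.push({facet, value});
      }
    },
    // Remove the filter applied on a facet, if any.
    remove(facet) {
      const index = filters.findIndex(f => f.facet === facet);
      if (index !== -1) filters.splice(index, 1);
    },
    // Equivalent of the reset search button: drop all filters.
    reset() {
      filters.length = 0;
    },
    // Copy of the current filters, e.g. to render the breadcrumb.
    list() {
      return filters.map(f => ({...f}));
    }
  };
}

const breadcrumb = createBreadcrumb();
breadcrumb.select('keyword', 'society');
breadcrumb.select('createDateYear', '1998');
breadcrumb.select('keyword', 'geology'); // switch to another value
breadcrumb.remove('createDateYear');
console.log(breadcrumb.list());
```

After each change, a real client would run the search again with the remaining filters added to the initial query.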

Configuration of the facet

The facet configuration is defined on the server side because facets are computed in a separate index (named taxonomy) which is populated when metadata records are indexed. The client application can also override the default server configuration by displaying only a subset of the information retrieved from the server.

The FacetsPanel widget provides a facetListConfig property to define:

  • the list of facets to display (facets are displayed in the order of the list)
  • and for each facet, the maximum number of top values to display

GeoNetwork.Settings.facetListConfig = [
  {name: 'orgNames', count: 5}, // First display the top 5 organizations
  {name: 'types'}, // then types
  {name: 'denominators'}, // ...
  {name: 'keywords'},
  {name: 'createDateYears'} // and in last position years.
];
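To make the effect of this client-side configuration concrete, here is a minimal sketch of how a list such as facetListConfig could be applied to a summary received from the server. The summary object (facet values already parsed into plain objects, keyed by the facet plural name) and the selectFacets helper are assumptions made for the example; the actual FacetsPanel widget may proceed differently.

```javascript
// Shortened, hypothetical configuration in the same shape as facetListConfig.
const facetListConfig = [
  {name: 'orgNames', count: 2}, // keep only the top 2 organizations
  {name: 'types'},
  {name: 'keywords'}
];

// Hypothetical summary, already parsed from the server response.
const summary = {
  keywords: [{name: 'geology', count: 1003}, {name: 'soils', count: 958}],
  denominators: [{name: '25000', count: 120}],
  types: [{name: 'dataset', count: 3200}],
  orgNames: [
    {name: 'Org A', count: 500},
    {name: 'Org B', count: 300},
    {name: 'Org C', count: 100}
  ]
};

// Keep only the configured facets, in the configured order,
// truncating each one to `count` values when a count is given.
function selectFacets(summary, config) {
  return config
    .filter(item => Array.isArray(summary[item.name]))
    .map(item => ({
      name: item.name,
      values: item.count === undefined
        ? summary[item.name].slice()
        : summary[item.name].slice(0, item.count)
    }));
}

console.log(selectFacets(summary, facetListConfig).map(f => f.name));
```

In this sketch the denominators facet is returned by the server but not displayed, because the client configuration does not list it.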

Facet configuration

The facet configuration defines which field in the index is used to compute the summary. The configuration is stored in WEB-INF/config-summary.xml. For each facet, configure an item with the following properties:

  • name: the name of the facet (i.e. the tag name in the XML response)
  • plural: the plural of the name (i.e. the parent tag of the facet values)
  • indexKey: the name of the field in the index
  • (optional) sortBy: the ordering for the facet. Default is by count.
  • (optional) sortOrder: asc or desc. Default is descending.
  • (optional) max: the number of values to be returned for the facet. Default is 10.

When an item is modified or added, the index MUST be rebuilt.

Configuration example:

<item name="keyword" plural="keywords" indexKey="keyword"/>

<item name="createDateYear" plural="createDateYears" indexKey="createDateYear"
      sortBy="value" sortOrder="asc" max="40"/>

The facet configuration is loaded on startup. It can be reloaded from the administration panel using the reload configuration button.
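The effect of the optional sortBy, sortOrder and max attributes can be illustrated with the following sketch. It mimics the documented defaults (count, descending, 10) on a small list of values in JavaScript; the applyItemConfig helper is hypothetical, and the assumption that values are sorted first and then truncated to max is ours, not something the server implementation is known to guarantee.

```javascript
// Defaults as documented for items of config-summary.xml.
const DEFAULTS = {sortBy: 'count', sortOrder: 'desc', max: 10};

// Sort facet values according to an item configuration, then keep `max` of them.
// Assumption: sorting happens before truncation.
function applyItemConfig(values, item) {
  const {sortBy, sortOrder, max} = {...DEFAULTS, ...item};
  const direction = sortOrder === 'asc' ? 1 : -1;
  const compare = sortBy === 'value'
    ? (a, b) => a.name.localeCompare(b.name)
    : (a, b) => a.count - b.count;
  return values
    .slice()
    .sort((a, b) => direction * compare(a, b))
    .slice(0, max);
}

// Hypothetical values for the createDateYear facet.
const years = [
  {name: '2005', count: 40},
  {name: '1998', count: 12},
  {name: '2012', count: 7}
];

// Same settings as the createDateYear item: chronological order.
console.log(applyItemConfig(years, {sortBy: 'value', sortOrder: 'asc', max: 40}));
// Default settings: most frequent values first.
console.log(applyItemConfig(years, {}));
```

Sorting by value in ascending order suits facets such as years, where chronological order is more useful than frequency.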

Facet response

The facet summary is returned in the XML response, which is structured as follows:

<response>
  <summary count="4797" type="local">
    <keywords>
      <keyword count="1280" name="soil types"/>
      <keyword count="1003" name="geology"/>
      <keyword count="958" name="soils"/>
      <keyword count="894" name="soil classification"/>
      <keyword count="695" name="land use"/>
      <keyword count="609" name="topography"/>
      <keyword count="595" name="land suitability"/>
      <keyword count="550" name="land"/>
      <keyword count="508" name="physiography"/>
      <keyword count="507" name="crops"/>
    </keywords>
    <spatialRepresentationTypes>
      <spatialRepresentationType count="2641" name="grid"/>
      <spatialRepresentationType count="625" name="vector"/>
    </spatialRepresentationTypes>
  </summary>
  <metadata>
   ...

The summaryOnly=true parameter returns only the summary, without the records.
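As a sketch of how a client could read this response, the function below extracts the facet values from the summary using regular expressions. It is deliberately simplistic: it assumes the flat, attribute-based structure shown above, does not decode XML entities, and groups values by their singular tag name. A browser client would more likely use the DOM parser of the platform. The parseSummary name and the shortened sample document are made up for the example.

```javascript
// Shortened sample in the same shape as the response shown above.
const sample = `
<response>
  <summary count="4797" type="local">
    <keywords>
      <keyword count="1280" name="soil types"/>
      <keyword count="1003" name="geology"/>
    </keywords>
    <spatialRepresentationTypes>
      <spatialRepresentationType count="2641" name="grid"/>
    </spatialRepresentationTypes>
  </summary>
</response>`;

// Collect every self-closing element carrying both a count and a name attribute,
// grouped by tag name (e.g. "keyword").
function parseSummary(xml) {
  const facets = {};
  const element = /<(\w+)\s+([^<>]*?)\/>/g;
  for (const [, tag, attributeText] of xml.matchAll(element)) {
    const attributes = {};
    for (const [, key, value] of attributeText.matchAll(/(\w+)="([^"]*)"/g)) {
      attributes[key] = value;
    }
    if (attributes.name === undefined || attributes.count === undefined) continue;
    (facets[tag] = facets[tag] || []).push({
      name: attributes.name,
      count: Number(attributes.count)
    });
  }
  return facets;
}

console.log(parseSummary(sample));
```

The resulting object maps each facet name to its list of values and counts, ready to be rendered as a list or a tag cloud.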

Performance analysis

Changes

  • Lucene
    • Update to version 3.6.1
    • New dependency : lucene-facet
  • LuceneSearcher.makeSummary()
  • Add an option to search services to only produce the summary
  • LuceneConfig
    • The summary configuration is part of LuceneConfig
    • The summary configuration can be reloaded dynamically when the LuceneConfig is reloaded.

Backwards Compatibility Issues

Risks

Participants

  • As above
Last modified on 12/05/12 03:16:47
