Version 8 (modified by fxp, 12 years ago)

--

"Narrow your search" widget

Date 2012/09/11
Contact(s) François Prunayre
Last edited 2012/09/11
Status Ongoing
Assigned to release 2.9.x
Resources Resource available

Overview

This proposal aims to provide a « Narrow your search » module, also known as faceted search (http://en.wikipedia.org/wiki/Faceted_search). The module aggregates the results of the current search by criteria (e.g. keywords, organization, dates, ...) and reports the frequency of each value. The output view could be a list (with a link to add a specific criterion to the search) and optionally a tag cloud view (http://en.wikipedia.org/wiki/Tag_cloud).

On the client side, the user interacts with facets using:

  • the facets summary, used to select a new filter
  • the facet breadcrumb, which indicates which filters have been applied and allows the user to remove a filter

These two components are created alongside the search form and search results widgets and interact with them.
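The interaction between the facets summary (adding a filter) and the breadcrumb (listing and removing filters) can be sketched as a small piece of state. This is only an illustration in Python, not the widget code: the class name `FacetFilters`, its methods, and the request parameter names are hypothetical and do not come from the proposal.

```python
from urllib.parse import urlencode


class FacetFilters:
    """Hypothetical holder for the filters selected through the facets summary."""

    def __init__(self):
        # (facet name, value) pairs, kept in the order they were applied
        self._filters = []

    def add(self, facet, value):
        """Called when the user clicks a value in the facets summary."""
        if (facet, value) not in self._filters:
            self._filters.append((facet, value))

    def remove(self, facet, value):
        """Called when the user removes an entry from the breadcrumb."""
        self._filters = [f for f in self._filters if f != (facet, value)]

    def breadcrumb(self):
        """Filters currently applied, in the order to display them."""
        return list(self._filters)

    def query_string(self, base_params=None):
        """Search form parameters plus the active facet filters."""
        params = list((base_params or {}).items()) + self._filters
        return urlencode(params)


filters = FacetFilters()
filters.add("keyword", "geology")
filters.add("category", "maps")
filters.remove("keyword", "geology")
print(filters.breadcrumb())
print(filters.query_string({"any": "soil"}))
```

After each add or remove, the search would be run again with the resulting parameters so that the results and the summary stay in sync.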

A first implementation was made on geocat.ch in 2008 using the internal summary module. This proposal uses the Lucene faceting module to compute the summary.

Proposal Type

  • Type: GUI Change, LuceneSearcher
  • App: GeoNetwork
  • Module: Search Interface

Voting History

  • Vote proposed by X on Y, result was +/-n (m non-voting members).

Motivations

  • Improve the user search experience: facets help the user to quickly refine a search
  • Better performance: the Lucene facet module, which uses its own taxonomy index, is faster than computing the summary with the internal module.
  • New way of presenting information: facets could be used to provide catalog indicators as charts (http://www.youtube.com/watch?v=ISEOKOq6t2Q&feature=plcp)

Proposal

Facet configuration

The facet configuration defines which fields in the index are used to compute the summary. The configuration is stored in WEB-INF/config-summary.xml. For each facet, configure the following properties:

  • name: the name of the facet (i.e. the tag name in the XML response)
  • plural: the plural form of the name (i.e. the parent tag of the facet values)
  • indexKey: the name of the field in the index
  • sortBy: the ordering for the facet. Default is by count.
  • sortOrder: asc or desc. Default is desc (descending).
  • max: the maximum number of values to be returned for the facet. Default is 10.

Configuration example:

<item name="keyword" plural="keywords" indexKey="keyword"/>

<item name="createDateYear" plural="createDateYears" indexKey="createDateYear"
      sortBy="value" sortOrder="asc" max="40"/>
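To make the defaults concrete, here is a minimal Python sketch that reads such items and applies sortBy, sortOrder and max to a set of counts. It is an illustration only, not the GeoNetwork implementation: the `<facets>` wrapper element, the `FacetConfig` class and both function names are assumptions, since the proposal does not show the overall layout of config-summary.xml.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class FacetConfig:
    name: str
    plural: str
    index_key: str
    sort_by: str = "count"    # default ordering is by count
    sort_order: str = "desc"  # default order is descending
    max_values: int = 10      # default number of values returned


def parse_facet_items(xml_text: str) -> list[FacetConfig]:
    """Read every <item> element and fill in the documented defaults."""
    root = ET.fromstring(xml_text)
    return [
        FacetConfig(
            name=item.attrib["name"],
            plural=item.attrib["plural"],
            index_key=item.attrib["indexKey"],
            sort_by=item.get("sortBy", "count"),
            sort_order=item.get("sortOrder", "desc"),
            max_values=int(item.get("max", "10")),
        )
        for item in root.iter("item")
    ]


def select_values(counts: dict[str, int], cfg: FacetConfig) -> list[tuple[str, int]]:
    """Order the (value, count) pairs as configured and keep at most max_values."""
    by_value = cfg.sort_by == "value"
    ordered = sorted(
        counts.items(),
        key=(lambda kv: kv[0]) if by_value else (lambda kv: kv[1]),
        reverse=(cfg.sort_order == "desc"),
    )
    return ordered[: cfg.max_values]


# The <facets> wrapper is hypothetical; the two items are the ones shown above.
SAMPLE = """<facets>
  <item name="keyword" plural="keywords" indexKey="keyword"/>
  <item name="createDateYear" plural="createDateYears" indexKey="createDateYear"
        sortBy="value" sortOrder="asc" max="40"/>
</facets>"""

for cfg in parse_facet_items(SAMPLE):
    print(cfg)
```

With these settings the keyword facet returns its ten most frequent values, while createDateYear returns up to 40 years in ascending order.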

Facet response

The current implementation produces the following (see the xml.search service):

<response>
  <summary count="4797" type="local">
    <keywords>
      <keyword count="1280" name="soil types"/>
      <keyword count="1003" name="geology"/>
      <keyword count="958" name="soils"/>
      <keyword count="894" name="soil classification"/>
      <keyword count="695" name="land use"/>
      <keyword count="609" name="topography"/>
      <keyword count="595" name="land suitability"/>
      <keyword count="550" name="land"/>
      <keyword count="508" name="physiography"/>
      <keyword count="507" name="crops"/>
    </keywords>
    <categories>
      <category count="165" name="datasets"/>
      <category count="2" name="interactiveresources"/>
      <category count="150" name="maps"/>
    </categories>
    <sources>
      <source count="4350" name="d5a5b43b-6e52-4ec6-94af-f95c6e4dac24"/>
      <source count="447" name="34475e9d-9b9e-48d1-b75c-08701c3a8f93"/>
    </sources>
  </summary>
  <metadata>
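A client could read the summary part of such a response roughly as follows. This is a minimal Python sketch, not part of the proposal: the function name `read_summary` is hypothetical, and the sample is a shortened copy of the response above that has been closed so it parses (the excerpt above is cut off after the opening metadata tag).

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the response shown above.
SAMPLE_RESPONSE = """<response>
  <summary count="4797" type="local">
    <keywords>
      <keyword count="1280" name="soil types"/>
      <keyword count="1003" name="geology"/>
    </keywords>
    <categories>
      <category count="165" name="datasets"/>
      <category count="150" name="maps"/>
    </categories>
  </summary>
</response>"""


def read_summary(xml_text):
    """Return the total hit count and, per facet, its (name, count) pairs."""
    root = ET.fromstring(xml_text)
    summary = root.find("summary")
    total = int(summary.get("count"))
    facets = {}
    # Each child of <summary> is a "plural" element holding the facet values.
    for group in summary:
        facets[group.tag] = [(v.get("name"), int(v.get("count"))) for v in group]
    return total, facets


total, facets = read_summary(SAMPLE_RESPONSE)
print(total)
for plural, values in facets.items():
    print(plural, values)
```

The keys of the returned dictionary are the plural names from the facet configuration, so the same code works for any facet that is configured.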

Performance analysis

Changes

  • Lucene
    • Update to version 3.6.1
    • New dependency : lucene-facet
  • LuceneSearcher.makeSummary()
  • Add an option to the search services to return only the summary (similar to the fast option, but faster still because only the summary is produced)
  • Add a config file to define
    • criteria: MUST be in the index
    • criteria type: String|Number|Date
    • aggregation type:
      • String: count
      • Number: count | equalInterval + classes ? | quantile + classes
      • Date: count|annually|monthly|daily
    • sort option (count|name)
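The aggregation types listed above could behave roughly as in the following Python sketch. It only illustrates the idea on plain in-memory values; it does not use Lucene, and the function names and the way class boundaries are computed are assumptions, not decisions made by this proposal.

```python
from collections import Counter
from datetime import date


def count_values(values):
    """'count' aggregation: frequency of each distinct value."""
    return Counter(values)


def bucket_dates(dates, granularity="annually"):
    """Date aggregation: group dates by year, month or day, then count."""
    formats = {"annually": "%Y", "monthly": "%Y-%m", "daily": "%Y-%m-%d"}
    fmt = formats[granularity]
    return Counter(d.strftime(fmt) for d in dates)


def equal_interval(numbers, classes):
    """Number aggregation: split [min, max] into equal-width classes and count."""
    lo, hi = min(numbers), max(numbers)
    width = (hi - lo) / classes or 1  # fall back to 1 when all values are equal
    counts = Counter()
    for n in numbers:
        # The maximum value is placed in the last class instead of overflowing.
        idx = min(int((n - lo) / width), classes - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(classes)]


print(count_values(["soils", "geology", "soils"]))
print(bucket_dates([date(2012, 9, 11), date(2012, 1, 1), date(2011, 5, 5)]))
print(equal_interval([0, 1, 2, 9, 10], classes=2))
```

A quantile aggregation would instead choose class boundaries so that each class holds about the same number of values; it is left out here because the proposal still marks the class parameter as an open question.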

Backwards Compatibility Issues

Risks

Participants

  • As above
