Changes between Initial Version and Version 1 of Statistics


Timestamp:
04/20/09 01:34:34 (16 years ago)
Author:
fxp
Comment:

--

  • Statistics

    v1 v1  
     1= Proposal title =
     2
     3|| '''Date''' || 2009/04/20 ||
     4|| '''Contact(s)''' || nicolas, francois ||
     5|| '''Last edited''' || [[Timestamp]] ||
     6|| '''Status''' || being discussed ||
     7|| '''Assigned to release''' || 2.5.0 ||
     8|| '''Resources''' || Done in geocat.ch sandbox ||
     9
     10== Overview ==
     11Log all searches made on the catalogue. Create an analyzer for the logged information and add an administration view for the analyzer results.
     12
     13...
     14
     15=== Proposal Type ===
     16 * '''Type''': GUI Change, New module
     17 * '''App''': !GeoNetwork
     18 * '''Module''': Lucene searcher
     19
     20=== Links ===
     21 * '''Documents''':
     22 * '''Email discussions''':
     23  * http://n2.nabble.com/Choosing-a-product-to-generate-graphics-td2371468.html
     24 * '''Other wiki discussions''':
     25
     26=== Voting History ===
     27 * Vote proposed by X on Y, result was +/-n (m non-voting members).
     28
     29----
     30
     31== Motivations ==
     32Improve catalogue usage for the administrator.
     33
     34== Proposal ==
     35
     36To be logged:
     37    * search criteria
     38    * criteria values
     39    * user IP
     40    * metadata view (done using popularity in GeoNetwork)
     41
     42Where to log:
     43    * in all searchers: Lucene searcher, Z39.50 searcher and CatalogueSearcher
     44
     45Indicators:
     46    * most used criteria (top 10)
     47    * most used values (top 10)
     48    * groups selected
     49    * type of metadata
     50    * spatial search or not
     51    * simple search (i.e. full text) or advanced
     52    * number of results per search (average, searches with 0 results)
     53    * metadata popularity and popularity by group
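The first indicators in this list reduce to simple counting over the logged searches. The Python sketch below is illustrative only and is not GeoNetwork code: the record layout (`criteria`, `hits`) and the function name `compute_indicators` are assumptions, and the "0" in the results indicator is read here as the number of searches that returned no result.

```python
from collections import Counter

def compute_indicators(searches, top_n=10):
    """Aggregate logged searches into a few of the indicators listed above.

    Each search is assumed to be a dict with:
      - "criteria": dict mapping a criterion name to the searched value
      - "hits": number of results returned by the search
    """
    criteria_counts = Counter()
    value_counts = Counter()
    total_hits = 0
    zero_hits = 0
    for search in searches:
        for name, value in search["criteria"].items():
            criteria_counts[name] += 1
            value_counts[(name, value)] += 1
        total_hits += search["hits"]
        if search["hits"] == 0:
            zero_hits += 1
    count = len(searches)
    return {
        "top_criteria": criteria_counts.most_common(top_n),
        "top_values": value_counts.most_common(top_n),
        "average_hits": total_hits / count if count else 0.0,
        "zero_result_searches": zero_hits,
    }
```

Group, metadata type and spatial/simple search indicators would follow the same pattern once the corresponding fields are logged.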
     54
     55Reports:
     56    * HTML format
     57     * Charts of the number of searches per day
     58     * tag cloud view
     59     * tables
     60
     61...
     62
     63=== Backwards Compatibility Issues ===
     64
     65== Risks ==
     66
     67== Participants ==
     68 * Nicolas (main actor)
     69 * Francois (support)
     70
     71
     72== Specification ==
     73Storing and displaying information about metadata searches involves several subcomponents or logical elements:
     74 * The '''Database schema''' that will store search information
     75 * The '''Jeeves services''' that will be impacted by the statistics registration
     76 * The '''configuration''' part in which the kind of information to present is configured (''to be specified, currently too vague'')
     77 * The '''Graphic rendering''' to display charts
     78 * The '''HTML output''' to present result tables
     79
     80
     81==== Database schema ====
     82!GeoNetwork already uses a database to store configuration information. This schema is extended with tables that store search criteria.
     83The DB schema is shown in the following picture:
     84
     85{{attachment:gn_stats_db_schema.png}}
     86
     87Druid is used to define the tables that will store search criteria.
     88
     89The database is accessed at two points in the statistics lifecycle:
     90 * When a Lucene query is performed, Requests and Params objects are created from the current context and the Lucene search terms (Params). These objects are stored in the database.
     91 * When the statistics are viewed in the GN administration page: each time a statistic is displayed, a query is run against the database to get the results.
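The two access points can be sketched with an in-memory SQLite database in Python. Only the names Requests and Params come from the text above; every column name, and the helper functions `log_search` and `most_used_fields`, are hypothetical, since the actual layout is defined by the schema picture.

```python
import sqlite3

# Hypothetical columns: the real layout is defined by the schema picture.
SCHEMA = """
CREATE TABLE Requests (
    id INTEGER PRIMARY KEY,
    requestdate TEXT NOT NULL,
    ip TEXT,
    hits INTEGER NOT NULL
);
CREATE TABLE Params (
    id INTEGER PRIMARY KEY,
    requestid INTEGER NOT NULL REFERENCES Requests(id),
    termfield TEXT NOT NULL,
    termtext TEXT NOT NULL
);
"""

def log_search(conn, request_date, ip, hits, terms):
    """Write path: store one search as a Requests row plus one Params row per term."""
    cur = conn.execute(
        "INSERT INTO Requests (requestdate, ip, hits) VALUES (?, ?, ?)",
        (request_date, ip, hits),
    )
    request_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO Params (requestid, termfield, termtext) VALUES (?, ?, ?)",
        [(request_id, field, text) for field, text in terms],
    )
    conn.commit()
    return request_id

def most_used_fields(conn, limit=10):
    """Read path: query run when a statistic is displayed (most used search fields)."""
    return conn.execute(
        "SELECT termfield, COUNT(*) AS n FROM Params "
        "GROUP BY termfield ORDER BY n DESC, termfield LIMIT ?",
        (limit,),
    ).fetchall()
```

A caller would create the tables once with `conn.executescript(SCHEMA)`, call `log_search` after each query, and call `most_used_fields` from the administration page.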
     92
     93==== Configuration/administration ====
     94Statistics can be configured at the Jeeves service level: by editing config_statistics.xml, one can change the queries used to generate the statistics reports. When doing this, also update the XSLT sheets of the corresponding service so that attribute names stay consistent with those returned by the SQL queries.
     95
     96The Graphics service can also receive HTTP parameters when it is called (image size, for instance). One could add an "advanced" tab to the stats page to allow the administrator to set these parameters.
     97
     98==== Jeeves services ====
     99Two kinds of services are created to deal with statistics:
     100 * Pure XML services, where the SQL query that fetches the statistics is sent by a Jeeves service. The result is processed by a dedicated XSLT sheet.
     101 * Java services, used when specific processing is needed before presenting the result.
     102Such services set up the graphic (date range, graphic properties, writing the graphic file) or process SQL results.
     103
     104==== Graphic Rendering ====
     105
     106The JFreeChart library is used to produce graphics. The graphic type (pie chart, time series) is hard-coded in the Java service. An image file is written on the server (web/images/statistics) and its URL is sent to the XSLT sheet.
     107
     108The service can optionally return HTML !ImageMap information, allowing tooltips on the graphic image.
     109
     110The UI lets the user define the time range and the statistic type (daily, monthly or yearly).
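Counting searches per period for a chosen time range and statistic type could look like the Python sketch below. It is an illustration only: the function name `count_searches`, the half-open range convention and the period labels are assumptions, not the behaviour of the actual Java service.

```python
from collections import Counter
from datetime import datetime

# strftime pattern used as the period label for each statistic type
BUCKET_FORMATS = {"daily": "%Y-%m-%d", "monthly": "%Y-%m", "yearly": "%Y"}

def count_searches(timestamps, start, end, stat_type):
    """Count searches per period for timestamps t with start <= t < end."""
    fmt = BUCKET_FORMATS[stat_type]
    counts = Counter(t.strftime(fmt) for t in timestamps if start <= t < end)
    # Labels sort chronologically because they are zero-padded, most significant part first.
    return sorted(counts.items())
```

The resulting list of (period, count) pairs is the kind of series a time-series chart would plot.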
     111
     112A caching mechanism reuses already-generated images.
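The proposal does not describe how the cache works. One plausible approach, sketched in Python under that caveat, is to derive the image file name from the chart parameters and render only when no such file exists yet. The function name `cached_chart_path` and the `render` callback are hypothetical.

```python
import hashlib
import os

def cached_chart_path(cache_dir, params, render):
    """Return the path of the chart image for params, rendering it only once.

    params: dict of chart settings (for example date range, type, size)
    render: callable taking (params, path) that writes the image file
    """
    # Sort the items so that equal settings always produce the same key.
    key_source = repr(sorted(params.items())).encode("utf-8")
    name = hashlib.sha1(key_source).hexdigest() + ".png"
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        render(params, path)
    return path
```

This sketch never invalidates an image. Since new searches keep being logged, a real cache would presumably need an expiry rule, which is left open here.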
     113
     114==== HTML output ====
     115
     116The statistics page contains links to the statistics services. Their results are injected into the page through Ajax calls.
     117
     118
     119Tagcloud feature:
     120
     121A JS library is used to generate a lightweight tag cloud based on an SQL query that counts searches grouped by keyword.
     122This tag cloud is currently displayed at the top of the page.
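The core of a tag cloud is mapping each keyword count to a font size. The Python sketch below shows one common way to do it (linear scaling); it does not reproduce the JS library actually used, and the function name and size bounds are assumptions. The input rows stand for the result of a `COUNT(*) ... GROUP BY` query over keywords.

```python
def tag_cloud_sizes(keyword_counts, min_size=10, max_size=30):
    """Map (keyword, count) rows to font sizes by linear scaling."""
    if not keyword_counts:
        return {}
    counts = [count for _, count in keyword_counts]
    low, high = min(counts), max(counts)
    span = high - low
    sizes = {}
    for keyword, count in keyword_counts:
        if span == 0:
            # All keywords are equally frequent: use the middle size.
            sizes[keyword] = (min_size + max_size) // 2
        else:
            sizes[keyword] = min_size + (count - low) * (max_size - min_size) // span
    return sizes
```

The least searched keyword gets `min_size`, the most searched gets `max_size`, and the others fall in between.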