Opened 12 years ago

Closed 12 years ago

#924 closed enhancement (fixed)

Upgrade thesaurus and keyword code to support multilingual keywords

Reported by: jesseeichar
Owned by: jesseeichar
Priority: minor
Milestone: v2.10.0 RC0
Component: Catalog server
Version:
Keywords:
Cc:

Description

There are a few issues I have identified with the Thesaurus and Keyword code. It seems to have been added at some point, and all development on it since then has been incremental updates:

* No documentation
* SERQL queries that are built rather haphazardly through string concatenation
* Very poor multilingual support
* It is hard to track when 2-letter language codes are used and when 3-letter codes are used

My work has been trying to address all of these issues. A summary of what has been done so far:

* Document every method in Thesaurus, KeywordBean and KeywordSearcher.
* Add unit tests for all methods in Thesaurus, KeywordSearcher and the new classes I created to support these classes.
* Change KeywordBean to be multilingual. It still has the concept of a "default" language for backwards compatibility, but it also has getValues and getDefinitions, which are maps from language code (3-letter) to the value.
* Change KeywordBean to have a fluent interface so you can write:
    new KeywordBean().addValue("eng", "Water").addValue("ger", "Wasser").setCode("http://geonetwork.net#water")
* Changed Thesaurus so that the old addElement and updateElement methods are deprecated in favor of the variants that take a KeywordBean, such as updateElement(KeywordBean), since KeywordBean carries all the same information but also handles codes and localization nicely:
    thesaurus.addElement(new KeywordBean().setCode("http://geonetwork.net#house").addValue("eng", "house"))
* Added a small generic DSL for writing SERQL queries and getting results, with specific strategies for keywords. For example:
    QueryBuilder.keywordBuilder().limit(50).offset(10).where(Wheres.prefNote("water").or(Wheres.prefNote("wasser"))).build().execute(thesaurus)
  One of the reasons I did this was so that I could write queries in my unit tests without copy-pasting SERQL queries, and so that someone who doesn't know SERQL can write simple queries more easily.
* Changed some search method names in KeywordSearcher so they are easier to understand for someone who is not an RDF specialist.

The remaining work is to migrate code that still uses the deprecated methods to the new API.
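To make the multilingual, fluent bean idea concrete, here is a minimal sketch. It is not the KeywordBean from the branch: the class name MultilingualKeyword, its constructor argument, getDefaultValue and the requireIso3 check are all assumptions made for illustration. Only addValue, setCode, getValues and getDefinitions mirror names mentioned in this ticket. The sketch rejects anything that is not a 3-letter lowercase code, which is one possible way to stop 2-letter and 3-letter codes from being mixed.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a multilingual keyword bean.
 * Illustrative only; this is NOT the GeoNetwork KeywordBean.
 */
final class MultilingualKeyword {
    private final String defaultLang;
    // Both maps are keyed by 3-letter language code, e.g. "eng" or "ger".
    private final Map<String, String> values = new LinkedHashMap<>();
    private final Map<String, String> definitions = new LinkedHashMap<>();
    private String code;

    /** Assumption: the default language is fixed when the bean is created. */
    MultilingualKeyword(String defaultLang) {
        this.defaultLang = requireIso3(defaultLang);
    }

    /** Fluent setter: stores the label for one language and returns this. */
    MultilingualKeyword addValue(String lang, String value) {
        values.put(requireIso3(lang), value);
        return this;
    }

    /** Fluent setter: stores the definition for one language and returns this. */
    MultilingualKeyword addDefinition(String lang, String definition) {
        definitions.put(requireIso3(lang), definition);
        return this;
    }

    /** Fluent setter for the keyword URI. */
    MultilingualKeyword setCode(String code) {
        this.code = code;
        return this;
    }

    String getCode() {
        return code;
    }

    /** Backwards-compatible single-language view; null if no default label was added. */
    String getDefaultValue() {
        return values.get(defaultLang);
    }

    /** Read-only view of all labels, keyed by 3-letter language code. */
    Map<String, String> getValues() {
        return Collections.unmodifiableMap(values);
    }

    /** Read-only view of all definitions, keyed by 3-letter language code. */
    Map<String, String> getDefinitions() {
        return Collections.unmodifiableMap(definitions);
    }

    /** Assumed guard: accept only 3 lowercase letters so "en" and "eng" cannot be mixed. */
    private static String requireIso3(String lang) {
        if (lang == null || !lang.matches("[a-z]{3}")) {
            throw new IllegalArgumentException("Expected a 3-letter language code, got: " + lang);
        }
        return lang;
    }
}
```

With this sketch, new MultilingualKeyword("eng").addValue("eng", "Water").addValue("ger", "Wasser").setCode("http://geonetwork.net#water") builds a keyword in one expression, while addValue("en", "Water") fails fast instead of silently storing a 2-letter key.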

You can look at the work either as a diff compared to master:

https://github.com/jesseeichar/geonetwork/compare/master...thesaurus

or the raw code at:

https://github.com/jesseeichar/geonetwork/tree/thesaurus

Attachments (1)

thesaurus.patch (209.2 KB ) - added by jesseeichar 12 years ago.

Change History (3)

by jesseeichar, 12 years ago

Attachment: thesaurus.patch added

comment:1 by jesseeichar, 12 years ago

Component: General → Catalog server
Milestone: v2.6.5 → v2.9.0
Owner: changed from geonetwork-devel@… to jesseeichar
Status: new → assigned
Version: v2.6.4

comment:2 by jesseeichar, 12 years ago

Priority: major → minor
Resolution: fixed
Status: assigned → closed
Type: defect → enhancement