----
'''WORK IN PROGRESS'''
----

= Proposal number : Persistence framework =

|| '''Date''' || 2008/06/19 ||
|| '''Contact(s)''' || etj ||
|| '''Last edited''' || [[Timestamp]] ||
|| '''Status''' || draft ||
|| '''Assigned to release''' || to be determined ||
|| '''Resources''' || ??? ||

== Overview ==

Suggestions for using a persistence framework.

=== Proposal Type ===

 * '''Type''': GUI Change, Core Change, Module Change, Guideline and project governance procedures, ...
 * '''App''': !GeoNetwork
 * '''Module''': Data Manager, DB access

=== Links ===

 * '''Documents''': [http://www.hibernatespatial.org/ Hibernate spatial], [http://www.hibernate.org/ Hibernate]
 * '''Email discussions''': [http://www.nabble.com/Re%3A-Agenda-for-GeoNetwork-hacking-in-Bolsena--release-planning--SEC%3DUNCLASSIFIED--td17770112.html#a17770112 GN-devel thread]
 * '''Other wiki discussions''':

=== Voting History ===

 * No vote requested yet.

----

== Motivations ==

Snip from the aforementioned email discussion:

> As of version 2.2.0 the GeoNetwork application cannot be deployed to a
> cluster. Existing deployments probably haven't gotten to the size where
> clustering is necessary, but if this were to happen, deployment to a cluster
> will fail.
> There are several reasons for this. Firstly, the application is storing
> non-Serializable objects in the !HttpSession. Not terribly difficult to fix
> but is still a show stopper.
> Secondly, and this is the real killer, the current mechanism of generating
> unique primary keys in jeeves.util.!SerialFactory will fail in a cluster due
> to duplicate primary keys. The !SerialFactory caches the max primary key
> values for each table. In a cluster multiple !SerialFactory instances will
> exist and are oblivious of each other. The first node to insert a record will
> succeed, other nodes will fail.
> Geoscience Australia has deployed GeoNetwork using Oracle. The correct way to
> deal with this in Oracle is to use a SEQUENCE. This requires generating
> Oracle specific SQL, something the project has avoided doing.
> In my humble opinion, if GeoNetwork is to achieve its full potential it needs
> to be scalable. Issues like in memory key generation prevent this from
> occurring. The bottom line is you need to need to be DB independent but
> scalable. The project should seriously consider the adoption of a persistence
> framework such as Hibernate.

We will also need to be independent of the spatial DBMS in use, so a persistence framework with spatial capabilities would be the better choice.

== Proposal ==

The suggested framework is Hibernate Spatial.

...etc

=== Backwards Compatibility Issues ===

== Risks ==

 * It has been reported (aaime) that Hibernate relies heavily on its cache. When a search returns a large number of results, the cache could cause an !OutOfMemory error. This may be avoided by using directQueries, which apparently do not use the cache. That is a good solution for read-only queries (and catalog queries are read-only). Queries of this kind may have drawbacks, because lazy loading (an internal Hibernate feature) becomes unusable, and this could cause problems with an ebRIM-based schema because of its high number of related objects. This issue was reported against an old Hibernate version (from about the start of 2007), so it may no longer be valid.

== Participants ==

 * ETj
 * Some ideas and discussions with A Aime, S Giannecchini, A Fabiani.
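
To make the key-generation problem quoted under Motivations more concrete, the following is a minimal, self-contained sketch in plain Java. It is not the actual jeeves.util.!SerialFactory code: `Table`, `CachedKeyGenerator` and `SharedSequence` are hypothetical stand-ins invented for this illustration. The sketch assumes that each cluster node reads the maximum key once and then counts in memory, and compares that with a single counter shared by all nodes, which is roughly the role a database SEQUENCE plays.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class KeyCollisionDemo {

    /** Hypothetical stand-in for a table with a primary-key constraint. */
    static class Table {
        private final Set<Integer> keys = new HashSet<>();

        /** Returns false when the key already exists (a duplicate-key failure). */
        synchronized boolean insert(int key) {
            return keys.add(key);
        }

        synchronized int maxKey() {
            return keys.isEmpty() ? 0 : Collections.max(keys);
        }
    }

    /** Hypothetical per-node generator: reads the max key once, then counts in memory. */
    static class CachedKeyGenerator {
        private int last;

        CachedKeyGenerator(Table table) {
            this.last = table.maxKey();
        }

        int next() {
            return ++last;
        }
    }

    /** Hypothetical stand-in for a sequence owned by the database and shared by all nodes. */
    static class SharedSequence {
        private final AtomicInteger value;

        SharedSequence(int start) {
            this.value = new AtomicInteger(start);
        }

        int next() {
            return value.incrementAndGet();
        }
    }

    private static Table tableWithTwoRows() {
        Table table = new Table();
        table.insert(1);
        table.insert(2);
        return table;
    }

    /** Two nodes, each with its own cached counter, insert one record each. */
    public static int failedInsertsWithCachedKeys() {
        Table table = tableWithTwoRows();
        CachedKeyGenerator nodeA = new CachedKeyGenerator(table);
        CachedKeyGenerator nodeB = new CachedKeyGenerator(table);
        int failures = 0;
        if (!table.insert(nodeA.next())) failures++; // first node to insert succeeds
        if (!table.insert(nodeB.next())) failures++; // second node reuses the same key
        return failures;
    }

    /** Two nodes draw their keys from one shared sequence. */
    public static int failedInsertsWithSharedSequence() {
        Table table = tableWithTwoRows();
        SharedSequence sequence = new SharedSequence(table.maxKey());
        int failures = 0;
        if (!table.insert(sequence.next())) failures++; // node A
        if (!table.insert(sequence.next())) failures++; // node B
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("Failed inserts, cached keys: " + failedInsertsWithCachedKeys());
        System.out.println("Failed inserts, shared sequence: " + failedInsertsWithSharedSequence());
    }
}
```

In this sketch the per-node counters collide on the first concurrent insert, while the shared counter does not. A persistence framework would be expected to hide the database-specific way of obtaining such a shared counter behind a common configuration, which is the DB independence the quoted email asks for.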