WORK IN PROGRESS
Proposal number : Persistence framework
Date | 2008/06/19 |
Contact(s) | etj |
Last edited | Timestamp |
Status | draft |
Assigned to release | to be determined |
Resources | ??? |
Overview
Suggestions for using a persistence framework.
Proposal Type
- Type: GUI Change, Core Change, Module Change, Guideline and project governance procedures, ...
- App: GeoNetwork
- Module: Data Manager, DB access
Links
- Documents: Hibernate spatial, Hibernate
- Email discussions: GN-devel thread
- Other wiki discussions:
Voting History
- No vote requested yet.
Motivations
Excerpt from the email discussion linked above:
As of version 2.2.0 the GeoNetwork application cannot be deployed to a cluster. Existing deployments probably haven't gotten to the size where clustering is necessary, but if this were to happen, deployment to a cluster will fail.
There are several reasons for this. Firstly, the application is storing non-Serializable objects in the HttpSession. This is not terribly difficult to fix, but it is still a show stopper.
Secondly, and this is the real killer, the current mechanism of generating unique primary keys in jeeves.util.SerialFactory will fail in a cluster due to duplicate primary keys. The SerialFactory caches the max primary key values for each table. In a cluster multiple SerialFactory instances will exist and are oblivious of each other. The first node to insert a record will succeed, other nodes will fail.
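The failure mode described above can be reproduced in a few lines. The following is a minimal sketch, not the actual jeeves.util.SerialFactory code: `SharedTable` and `CachingKeyGenerator` are hypothetical names, the "database" is an in-memory set that rejects duplicate keys, and each "node" is simply a separate generator instance that read the max key once at startup.

```java
import java.util.HashSet;
import java.util.Set;

public class SerialCollisionDemo {

    /** Hypothetical stand-in for a shared database table with a primary key. */
    static class SharedTable {
        private final Set<Integer> ids = new HashSet<>();

        /** Highest key currently stored, or 0 if the table is empty. */
        synchronized int maxId() {
            int max = 0;
            for (int id : ids) {
                max = Math.max(max, id);
            }
            return max;
        }

        /** Returns true if the row was inserted, false on a duplicate key. */
        synchronized boolean insert(int id) {
            return ids.add(id);
        }
    }

    /** Hypothetical per-node generator that caches the max key in memory. */
    static class CachingKeyGenerator {
        private int cachedMax;

        CachingKeyGenerator(SharedTable table) {
            // The max key is read once; later inserts by other nodes are never seen.
            this.cachedMax = table.maxId();
        }

        synchronized int nextId() {
            return ++cachedMax;
        }
    }

    /** Runs the two-node scenario; returns { node A insert ok, node B insert ok }. */
    static boolean[] simulate() {
        SharedTable table = new SharedTable();
        table.insert(1);
        table.insert(2);

        // Two cluster nodes start up and each caches max = 2.
        CachingKeyGenerator nodeA = new CachingKeyGenerator(table);
        CachingKeyGenerator nodeB = new CachingKeyGenerator(table);

        boolean a = table.insert(nodeA.nextId()); // node A generates key 3
        boolean b = table.insert(nodeB.nextId()); // node B also generates key 3
        return new boolean[] { a, b };
    }

    public static void main(String[] args) {
        boolean[] result = simulate();
        System.out.println("node A insert succeeded: " + result[0]);
        System.out.println("node B insert succeeded: " + result[1]);
    }
}
```

Both generators hand out the same key because neither sees the other's cached counter, which matches the "first node succeeds, other nodes fail" behaviour described in the email.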
Geoscience Australia has deployed GeoNetwork using Oracle. The correct way to deal with this in Oracle is to use a SEQUENCE. This requires generating Oracle specific SQL, something the project has avoided doing.
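What a SEQUENCE provides can be illustrated without any database: a single counter, owned by the shared store, that every node asks for its next key. The sketch below is only an assumption-laden stand-in. `SharedSequence` is a hypothetical class backed by an `AtomicInteger`, and each "node" is a thread; a real deployment would obtain the value from the database itself.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedSequenceDemo {

    /** Hypothetical stand-in for a database sequence shared by all nodes. */
    static class SharedSequence {
        private final AtomicInteger value;

        SharedSequence(int start) {
            this.value = new AtomicInteger(start);
        }

        /** Atomically returns the next key; no two callers get the same value. */
        int nextVal() {
            return value.incrementAndGet();
        }
    }

    /** Each simulated node draws keys from the one shared sequence; returns the distinct key count. */
    static int simulate(int nodes, int insertsPerNode) {
        SharedSequence sequence = new SharedSequence(0);
        Set<Integer> keys = Collections.synchronizedSet(new HashSet<>());

        Thread[] workers = new Thread[nodes];
        for (int n = 0; n < nodes; n++) {
            workers[n] = new Thread(() -> {
                for (int i = 0; i < insertsPerNode; i++) {
                    keys.add(sequence.nextVal());
                }
            });
            workers[n].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for nodes", e);
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        int distinct = simulate(4, 1000);
        System.out.println("distinct keys generated: " + distinct);
    }
}
```

Because the counter lives in one place, the number of distinct keys equals the number of inserts regardless of how many nodes run. The difficulty raised in the email is that expressing this in SQL is vendor specific, which is the gap a persistence framework is meant to cover.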
In my humble opinion, if GeoNetwork is to achieve its full potential it needs to be scalable. Issues like in-memory key generation prevent this from occurring. The bottom line is you need to be DB independent but scalable. The project should seriously consider the adoption of a persistence framework such as Hibernate.
We will also need to become independent of the spatial DBMS in use, so a persistence framework with spatial capabilities would be the better choice.
Proposal
The suggested framework is Hibernate Spatial. ...etc
Backwards Compatibility Issues
Risks
- It has been reported (aaime) that Hibernate relies heavily on its cache. When a search returns a high number of results, the cache could cause an OutOfMemory error. This may be avoided by using directQueries, which apparently bypass the cache; they are a good solution for read-only queries, and catalog queries are read-only. The drawback is that such queries may not support lazy loading (an internal Hibernate feature), which could cause problems with an ebRIM-based schema because of its high number of related objects. The issue was reported against an old Hibernate version (from around the start of 2007), so it may no longer apply.
Participants
- ETj
- Some ideas and discussions with A Aime, S Giannecchini, A Fabiani.