Proposal title
Date | 2012/03/22 |
Contact(s) | Heikki Doeleman, Jose García |
Last edited | |
Status | Proposed for vote |
Assigned to release | 2.7.0 |
Resources | Available |
Ticket # | #432 |
Overview
It is currently not possible to load-balance GeoNetwork, for various reasons described here. This proposal aims to make it possible to horizontally scale GeoNetwork by implementing the changes described below.
Proposal Type
- Module: ALL
Links
- Documents: Why you cannot scale GeoNetwork
- Email discussions:
- Other wiki discussions:
Voting History
Motivations
- Various users of GeoNetwork would like to be able to scale it horizontally for increased performance.
Proposal
This proposal assumes an infrastructure where multiple GeoNetwork nodes act as a single catalog. The nodes will share one database (this may be scaled separately but is not in scope of this proposal). Each node maintains its own local Lucene index and, if configured, SVN. This proposal does not address making GeoNetwork's HTTPSession serializable; therefore load-balancing using sticky sessions is presumed, and there is no support for transparent fail-over.
database primary key values
Currently, new records in the database receive primary key values generated in memory by the Jeeves class SerialFactory. To avoid clashes when multiple GeoNetwork nodes write to the same database, this must change. We propose to generate unique values using random UUIDs. Another option would be an auto-increment column type, but that would make database replication all but impossible.
The generated UUIDs are slightly modified: their hyphens are replaced with underscores, and the first character, if it is a digit, is replaced with a letter character mapped to that digit. The reason for this is that the resulting values might be used as identifiers in languages like JavaScript and are valid values for ID attributes in X(HT)ML.
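A minimal sketch of such an ID factory, under the assumptions above (the class name and the particular digit-to-letter mapping are illustrative; this is not the actual SerialFactory replacement):

```java
import java.util.UUID;

/**
 * Sketch of an ID factory producing UUID-based primary keys that are also
 * valid XML ID values and JavaScript identifiers (illustrative only).
 */
public class UuidIdFactory {

    // illustrative mapping of a leading digit 0-9 to a letter,
    // so the generated ID never starts with a digit
    private static final char[] DIGIT_TO_LETTER =
        { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j' };

    public static String newId() {
        // random UUID, hyphens replaced by underscores
        String id = UUID.randomUUID().toString().replace('-', '_');
        char first = id.charAt(0);
        if (Character.isDigit(first)) {
            // replace a leading digit with the letter mapped to it
            id = DIGIT_TO_LETTER[first - '0'] + id.substring(1);
        }
        return id;
    }
}
```

The result is still 36 characters long, so it fits the varchar(36) column discussed below.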
Because there is no vendor-independent database column type for UUIDs, they will be stored in a column of type varchar2(36). Current databases will be migrated to use this type for their ID columns, without changing the current values (except for their type, obviously).
A notion exists that databases perform faster if their (foreign) keys are auto-incremented integers rather than strings. However, if you google it, opinions on whether this is really the case vary wildly and almost no-one offers actual measurements. Here is a post that does show measurements in MySQL, which is reassuring. Also, a change to String UUIDs allows for the removal of thousands of places in the GeoNetwork Java code where integers are converted to strings and vice versa.
uploaded files
GeoNetwork creates directories for uploaded data associated with a metadata record. The names of these (sub-)directories are calculated from the metadata's database ID, probably to avoid a totally flat structure with too many subdirectories. To support both new UUID-based and old integer-based IDs, the code doing this calculation will be modified so that it recognizes whether an ID is old or new; for old IDs it uses the existing calculation method. For new IDs it generates a directory name of the form /ab/cd/ef, using the first 6 characters of the ID. As UUIDs really are hexadecimal numbers, each (sub-)directory will have at most 16*16 = 256 subdirectories, and the 3-level nesting creates room for in total 256^3 = 16,777,216 metadata records with uploaded files. If you think that's not safe, we could make it 4 levels, supporting 256^4 = 4,294,967,296 metadata records.
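A minimal sketch of this calculation under the assumptions above (class and method names are illustrative, and the existing integer-based calculation is only stubbed, not reproduced):

```java
/**
 * Sketch of the upload-directory name calculation supporting both old
 * integer IDs and new UUID-based IDs (illustrative only).
 */
public class UploadDirNaming {

    public static String dataDirFor(String id) {
        if (id.matches("\\d+")) {
            // old integer ID: keep using the existing calculation
            return legacyDataDirFor(Integer.parseInt(id));
        }
        // new UUID-based ID: /ab/cd/ef from its first 6 characters
        return "/" + id.substring(0, 2)
             + "/" + id.substring(2, 4)
             + "/" + id.substring(4, 6);
    }

    private static String legacyDataDirFor(int id) {
        // placeholder: delegate to the current integer-based calculation
        throw new UnsupportedOperationException("existing calculation not shown");
    }
}
```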
synchronization between nodes
In order to propagate changes made in one node to all others, each time a metadata change causes the Lucene index (and SVN) to be updated in one node, a message is sent to all other nodes causing them to do the same. Note that this does not involve a full index rebuild, just a re-index of the changed metadata, equal to what happens in the originating node.
The nodes know about the other nodes (where to broadcast the update message to) because all nodes will be listed in a new table.
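A minimal sketch of such a broadcast, assuming plain HTTP requests to a hypothetical /srv/cluster.reindex endpoint on each peer (the endpoint, class name and transport are assumptions; the proposal does not prescribe a particular messaging mechanism):

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

/**
 * Sketch of propagating a metadata change to the other cluster nodes.
 * Peer URLs are read from the new nodes table.
 */
public class ClusterNotifier {

    private final List<String> peerNodeUrls;

    public ClusterNotifier(List<String> peerNodeUrls) {
        this.peerNodeUrls = peerNodeUrls;
    }

    /** Ask every peer to re-index (not rebuild) the single changed metadata. */
    public void notifyMetadataChanged(String metadataId) {
        for (String baseUrl : peerNodeUrls) {
            try {
                // hypothetical endpoint; the real message format is not defined here
                URL url = new URL(baseUrl + "/srv/cluster.reindex?id=" + metadataId);
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setRequestMethod("GET");
                conn.getResponseCode();
                conn.disconnect();
            } catch (Exception e) {
                // a failing peer must not block the originating node
                e.printStackTrace();
            }
        }
    }
}
```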
site uuid
The site uuid identifies this catalog. It is generated at start-up of a GeoNetwork node, and we must prevent this from happening more than once (e.g. if an extra node is added months later, the site uuid should not change). To achieve this, it will be inserted by the insert-data SQL scripts with the value CHANGEME. When any node in the cluster starts up, it checks the value and only if it is still CHANGEME does it update the value to a UUID.
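A minimal sketch of this start-up check, assuming the site uuid is stored in a flat Settings table with name/value columns (the table layout, setting name and SQL are assumptions):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.UUID;

/**
 * Sketch of the start-up check that sets the site uuid exactly once
 * per cluster (table and setting names are assumptions).
 */
public class SiteUuidInitializer {

    public static void ensureSiteUuid(Connection conn) throws Exception {
        String current;
        try (PreparedStatement select = conn.prepareStatement(
                "SELECT value FROM Settings WHERE name = 'siteId'");
             ResultSet rs = select.executeQuery()) {
            current = rs.next() ? rs.getString(1) : null;
        }
        // only the first node that starts up replaces the placeholder
        if ("CHANGEME".equals(current)) {
            try (PreparedStatement update = conn.prepareStatement(
                    "UPDATE Settings SET value = ? "
                  + "WHERE name = 'siteId' AND value = 'CHANGEME'")) {
                update.setString(1, UUID.randomUUID().toString());
                update.executeUpdate();
            }
        }
    }
}
```

Repeating the CHANGEME condition in the UPDATE makes the replacement safe even if two nodes start up at the same moment.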
harvesters
Since the harvesting configuration is stored inside the database, all GeoNetwork instances inside the cluster will share the same harvesting configuration and will hence all attempt to harvest the same nodes.
To prevent the associated overhead in memory footprint and performance, each harvester configuration will be extended with a field containing the node-uuid of the GeoNetwork node where it was created. The harvester will only run on that node. An admin function to check sanity (signalling harvesters whose node-uuid does not exist in the new nodes table) will allow Administrator users to replace such a node-uuid with one that does exist in the nodes table.
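A minimal sketch of the per-node guard, assuming the harvester configuration exposes its node-uuid and the nodes table is available as a set of known uuids (all names are illustrative, not the actual harvester API):

```java
import java.util.Set;

/**
 * Sketch of the per-node harvester guard (illustrative only).
 */
public class HarvesterNodeGuard {

    /** uuid of the GeoNetwork node this code runs on, from the nodes table. */
    private final String thisNodeUuid;

    public HarvesterNodeGuard(String thisNodeUuid) {
        this.thisNodeUuid = thisNodeUuid;
    }

    /** A harvester only runs on the node where it was created. */
    public boolean shouldRun(String harvesterNodeUuid) {
        return thisNodeUuid.equals(harvesterNodeUuid);
    }

    /** Sanity check: flag harvesters whose node-uuid is not in the nodes table. */
    public boolean isOrphaned(String harvesterNodeUuid, Set<String> knownNodeUuids) {
        return !knownNodeUuids.contains(harvesterNodeUuid);
    }
}
```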
edit metadata lock
When a metadata is being edited by one user and another user also opens it for editing, the second user cannot save their changes, because GeoNetwork maintains an in-memory 'version number' to prevent this from happening. In a clustered scenario the in-memory version number is not globally available, so this strategy must change.
The current implementation is in effect a form of pessimistic locking (concurrent edit sessions cannot successfully save), with the additional disadvantage that users are not informed, when they start editing, that they will lose their changes. This will be replaced by a more direct form of pessimistic locking, making it impossible to open a metadata for editing if it is already being edited at that moment. Admin functions will be available to force-unlock metadata.
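A minimal sketch of a cluster-visible lock, assuming a new MetadataLocks table with the metadata ID as its primary key (the table, columns and class name are assumptions):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * Sketch of a database-backed metadata edit lock, visible to every
 * node in the cluster (table and column names are assumptions).
 */
public class MetadataEditLock {

    /** Try to acquire the edit lock; returns false if someone already holds it. */
    public static boolean tryLock(Connection conn, String metadataId, String userId)
            throws Exception {
        try (PreparedStatement insert = conn.prepareStatement(
                "INSERT INTO MetadataLocks (metadataId, userId, lockedAt) "
              + "VALUES (?, ?, CURRENT_TIMESTAMP)")) {
            insert.setString(1, metadataId);
            insert.setString(2, userId);
            insert.executeUpdate();
            return true;
        } catch (SQLException duplicateKey) {
            // primary key on metadataId: a second editor cannot acquire the lock
            return false;
        }
    }

    /** Admin function: force-unlock a metadata record. */
    public static void forceUnlock(Connection conn, String metadataId) throws Exception {
        try (PreparedStatement delete = conn.prepareStatement(
                "DELETE FROM MetadataLocks WHERE metadataId = ?")) {
            delete.setString(1, metadataId);
            delete.executeUpdate();
        }
    }
}
```

Relying on the primary key constraint makes acquiring the lock atomic across all nodes sharing the database.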
settings
When settings are modified from one GeoNetwork node, the other nodes will be out of date, because the settings are kept in memory from startup. To prevent this we propose two alternatives:
- no longer keep the settings in memory, but look them up every time (see the sketch after this list). This costs a few more DB selects of course, but with an index on the unique column these simple selects should not take too long. This would also allow us to delete SettingManager, which we have wanted to do for a long time.
- alternatively, a second update-broadcast to the peer nodes can instruct them to re-initialize SettingManager.
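A minimal sketch of the first alternative, assuming a flat Settings table with a unique name column (table and column names are assumptions):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/**
 * Sketch of reading a setting from the database on every access instead
 * of caching it in memory (table layout is an assumption).
 */
public class DirectSettingLookup {

    /** Returns the current value, or null if the setting does not exist. */
    public static String getSetting(Connection conn, String name) throws Exception {
        // with an index on the unique 'name' column this stays a cheap lookup
        try (PreparedStatement select = conn.prepareStatement(
                "SELECT value FROM Settings WHERE name = ?")) {
            select.setString(1, name);
            try (ResultSet rs = select.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```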
Your opinions on this are welcome.
Backwards Compatibility Issues
Are there any?
New libraries added
No new libraries.
Risks
Participants
Heikki Doeleman, Jose García
Attachments (2)
- clustering.diff (1.4 MB) - added 13 years ago. Do not use patch with existing database; migration scripts yet to be included.
- cluster.png (15.1 KB) - added 12 years ago. Example cluster.