Changes between Version 9 and Version 10 of LoadBalanceable


Timestamp:
04/13/12 05:28:33 (13 years ago)
Author:
heikki
Comment:

--

  • LoadBalanceable

    v9 v10  
    4343A notion exists that databases perform faster if their (foreign) keys are auto-incremented integers rather than strings. However, if you google it, opinions on whether this is really the case vary wildly and almost no one offers actual measurements. [http://krow.livejournal.com/497839.html Here] is a post that does show measurements in MySQL, which is reassuring. Also, a change to String UUIDs allows for the removal of thousands of places in the GeoNetwork Java code where integers are converted to strings and vice versa. Removing those conversions improves performance because the work is no longer done, and fewer short-lived objects are created, which can help reduce garbage collection delays.
    4444
    45 === uploaded files ===
     45=== shared data directory ===
    4646GeoNetwork creates directories for uploaded data that's associated with a metadata. The names of these (sub-)directories are calculated from the value of the metadata's database ID, probably to avoid having a totally flat structure with too many subdirectories. To support both new UUID-based and old integer-based IDs, the code doing this calculation will be modified so that it recognizes whether an ID is old or new. For old IDs, it uses the existing calculation method. For new IDs, it generates a directory name of the form /ab/cd/ef, using the first 6 characters of the ID. As UUIDs really are hexadecimal numbers, each (sub-)directory will have at most 16*16 = 256 subdirectories, and the 3-level nesting leaves room for a total of 256^3 = 16,777,216 metadata records with uploaded files. If you think that's not safe, we could make it 4 levels, supporting 4,294,967,296 metadata.
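
The /ab/cd/ef naming described above could look roughly like the following. This is only a sketch, not the actual GeoNetwork code: the class and method names are hypothetical, and the rule "an ID made up only of decimal digits is an old ID" is an assumption about how old and new IDs would be told apart.

```java
import java.util.Locale;

final class UploadDirSketch {

    // Assumption: old integer-based IDs consist of decimal digits only.
    static boolean isLegacyId(String id) {
        return id.matches("\\d+");
    }

    // Builds "/ab/cd/ef" from the first 6 hexadecimal characters of a UUID.
    static String dirForUuid(String uuid) {
        String hex = uuid.replace("-", "").toLowerCase(Locale.ROOT);
        if (hex.length() < 6 || !hex.substring(0, 6).matches("[0-9a-f]{6}")) {
            throw new IllegalArgumentException("not a UUID-style ID: " + uuid);
        }
        return "/" + hex.substring(0, 2)
             + "/" + hex.substring(2, 4)
             + "/" + hex.substring(4, 6);
    }

    public static void main(String[] args) {
        // Example UUID made up for illustration.
        System.out.println(dirForUuid("3f2a9c1e-7b4d-4e21-9a55-0c1d2e3f4a5b")); // /3f/2a/9c
    }
}
```

Each level uses two hexadecimal characters, which is where the limit of 256 subdirectories per level comes from.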
    4747
    48 === synchronization between nodes ===
    49 In order to propagate changes made in one node to all others, each time the Lucene index (and SVN) is updated in one node, when a metadata changes, a message is sent to all other nodes causing them to do the same (note: this does not involve a full rebuild-index, just a re-index of the changed metadata -- equal to what happens in the originating node).
     48In addition, each node has its own, local directories for Lucene, SVN and Cluster Configuration (this last folder just contains a unique identifier of the node).
    5049
    51 We'll use JMS to propagate these messages using durable subscriptions to a Topic/Subscribe queue. This decouples knowledge of the other nodes from each node and enables guaranteed delivery in correct order even after a node has been down.
     50=== synchronization between nodes: JMS topics ===
     51
     52In order to propagate changes made in one node to all others, JMS messages are published to publish/subscribe topics. Each node holds a durable subscription to each topic. This decouples knowledge of the other nodes from each node and enables guaranteed delivery in correct order even after a node has been down.
     53
     54Messages published to these topics are received by all nodes in the cluster. If a node is down, it will receive the messages
     55published during its absence when it comes back up, in correct order. When all nodes have read the message, it will be removed
     56from the topic (at some point).
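
The delivery behaviour described above can be modelled in a few lines of plain Java. This is not the JMS API and not how a JMS provider is implemented. It is a simplified in-memory stand-in, with hypothetical class, method and node names, that only illustrates the semantics: every subscribed node gets every message, in publication order, even if it was down when the message was published.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DurableTopicSketch {

    // One backlog per durable subscriber; it keeps growing while the node is down.
    private final Map<String, Deque<String>> backlog = new LinkedHashMap<>();

    void subscribe(String nodeId) {
        backlog.putIfAbsent(nodeId, new ArrayDeque<>());
    }

    // Every subscriber receives its own copy of the message.
    void publish(String message) {
        for (Deque<String> pending : backlog.values()) {
            pending.addLast(message);
        }
    }

    // Called when a node is up: returns its pending messages in publication order.
    List<String> drain(String nodeId) {
        List<String> received = new ArrayList<>();
        Deque<String> pending = backlog.get(nodeId);
        while (pending != null && !pending.isEmpty()) {
            received.add(pending.pollFirst());
        }
        return received;
    }

    public static void main(String[] args) {
        DurableTopicSketch reindex = new DurableTopicSketch();
        reindex.subscribe("node-1");
        reindex.subscribe("node-2");
        reindex.publish("md-1"); // imagine node-2 is down at this point
        reindex.publish("md-2");
        System.out.println(reindex.drain("node-1")); // [md-1, md-2]
        System.out.println(reindex.drain("node-2")); // [md-1, md-2]
    }
}
```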
     57
     58The topics are:
     59
     60 - RE-INDEX
     61  Used to synchronize the nodes' Lucene indexes when metadata is added, deleted, updated, its privileges change, etc.
     62
     63 - OPTIMIZE-INDEX
     64  Used to propagate the Optimize Index command to all nodes.
     65
     66 - RELOAD-INDEX-CONF
     67  Used to propagate the Reload Index Configuration command to all nodes.
     68
     69 - SETTINGS
     70  Used to propagate a change in Settings to all nodes.
     71
     72 - ADD-THESAURUS
     73
     74 - DELETE-THESAURUS
     75
     76 - ADD-THESAURUS-ELEMENT
     77
     78 - UPDATE-THESAURUS-ELEMENT
     79
     80 - DELETE-THESAURUS-ELEMENT
     81
     82 - MD-VERSIONING
     83  Used to invoke the nodes' SVN versioning control.
     84
     85 - HARVESTER
     86  Used to propagate changes to Harvesters to all nodes.
     87
     88 - SYSTEM_CONFIGURATION
     89  Used to request all nodes to publish their System Information.
     90
     91 - SYSTEM_CONFIGURATION_RESPONSE
     92  Used to publish System Information.
     93
     94=== synchronization between nodes: JMS queues ===
     95
     96Messages published to these queues are received by one single node in the cluster. This can be any one of the nodes, whichever
     97is first. When a node reads a message it is removed from the queue.
     98
     99The queues are:
     100
     101 - HARVEST
     102  Used to run a Harvester. When clustering is enabled, a Harvester that's set to run periodically is invoked by periodic
     103  publication of a message to this queue; whichever node in the cluster picks it up first will actually run
     104  the Harvester.
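
In contrast to a topic, a queue hands each message to exactly one node. The sketch below models that with a plain Java concurrent queue. It is not the JMS API, and the class, method and harvester names are hypothetical. It only shows that the first node to poll takes the message and later polls find nothing.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class HarvestQueueSketch {

    // Stand-in for the HARVEST queue shared by all nodes.
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();

    // The scheduler publishes the ID of the Harvester that should run.
    void publish(String harvesterId) {
        queue.add(harvesterId);
    }

    // A node tries to claim work; returns null if there is nothing left to claim.
    String tryClaim() {
        return queue.poll();
    }

    public static void main(String[] args) {
        HarvestQueueSketch harvest = new HarvestQueueSketch();
        harvest.publish("harvester-1");
        System.out.println("node A claimed: " + harvest.tryClaim());
        System.out.println("node B claimed: " + harvest.tryClaim());
    }
}
```

Node A gets the Harvester to run and node B gets nothing, which is the behaviour that spreads harvesting over the cluster without running the same job twice.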
     105
    52106
    53107=== site uuid ===
    54108The site uuid identifies this catalog. It's generated at start-up of a GeoNetwork node. We should prevent this happening more than once (e.g. if months later an extra node is added, it should not change). To achieve this, it will be inserted by the insert-data SQL scripts with a value of CHANGEME. When any node in the cluster starts up it checks the value and only if it is still CHANGEME, will it update its value to a UUID.
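
The start-up check could be sketched as follows. The field stands in for the database row that holds the site uuid, and the class and method names are hypothetical. In a real cluster the check and the update would have to happen atomically in the database (for example as a single conditional update), which is an assumption that this in-memory version only imitates with `synchronized`.

```java
import java.util.UUID;

final class SiteUuidSketch {

    static final String PLACEHOLDER = "CHANGEME";

    // Stands in for the value inserted by the insert-data SQL scripts.
    private String storedValue = PLACEHOLDER;

    // Called by every node at start-up; only the first caller replaces the placeholder.
    synchronized String ensureSiteUuid() {
        if (PLACEHOLDER.equals(storedValue)) {
            storedValue = UUID.randomUUID().toString();
        }
        return storedValue;
    }

    public static void main(String[] args) {
        SiteUuidSketch settings = new SiteUuidSketch();
        String first = settings.ensureSiteUuid();  // first node generates the UUID
        String second = settings.ensureSiteUuid(); // a node added later sees the same value
        System.out.println(first.equals(second));  // true
    }
}
```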
    55109
    56 === harvesters ===
    57 Since the harvesting configuration is stored inside the database, all GeoNetwork instances inside the cluster will share the same harvesting configuration and will hence all attempt to harvest the same nodes.
    58 
    59 To prevent the associated overhead in memory footprint and performance, the harvester configurations will be extended with a field containing the node-uuid of the GeoNetwork node where it was created. The harvesting schedule will run only in this node. To improve distribution of load, the scheduler will not start the harvest job as such, but places a message on a JMS Point-to-Point queue. All nodes are registered with this queue and one of them will be the first to pick it up, removing the message from the queue and starting the harvester job.
    60 
    61 An admin function to check sanity (signal harvesters with a node-uuid that doesn't exist in the new nodes table) will allow Administrator users to replace the node-uuid with one that does exist in the nodes table.
    62 
    63110=== edit metadata lock ===
    64111When a metadata is being edited by one user, and then another user also opens it for editing, the second user cannot save their changes because GeoNetwork maintains an in-memory 'version-number' to prevent this from happening. In a clustered scenario the in-memory version number is not globally available, so this strategy must change.
    65112
    66 The current implementation is in effect a form of pessimistic locking (concurrent edit sessions cannot successfully save), with the additional disadvantage that users are not informed when they start editing that they will lose their changes. This will be replaced by a more direct form of pessimistic locking, making it impossible to open a metadata for editing if it is being edited already at that moment. Admin functions will be available to force unlock metadata.
     113The current implementation is in effect a form of pessimistic locking (concurrent edit sessions cannot successfully save), with the additional disadvantage that users are not informed when they start editing that they will lose their changes. This will be replaced by a more direct form of pessimistic locking, making it impossible to open a metadata for editing if it is being edited already at that moment. Admin functions will be available to force unlock metadata.
     114
     115NOTE: this will not be implemented in the scope of this proposal; rather, we'll soon publish a separate proposal dedicated to improvements in locking, lifecycle and metadata state.
    67116
    68117=== settings ===
    69 When settings are modified from one GN node, the other nodes will be out of date because the settings are kept in memory from startup. To prevent this we propose two alternatives:
     118Administrator users can enable clustering in the System Configuration. When enabled, a URL to the !ActiveMQ JMS server needs to be specified.
    70119
    71  * no longer keep the settings in memory, but look them up every time. This costs a few more DB selects of course, but with an index on the unique column these simple selects should not take too long. This would also allow us to delete !SettingManager, which we have wanted to do for a long time.
    72  * alternatively, a second update-broadcast to the peer nodes can instruct them to re-initialize !SettingManager.
    73 
    74 Your opinions on this are welcome.
     120=== documentation ===
     121See the GeoNetwork User Documentation for a description of how to install and configure a cluster.
    75122
    76123
    77124=== Backwards Compatibility Issues ===
    78 Are there any ?
     125
     126Any clients relying on the integer nature of database IDs, if they exist, need to change so they expect UUIDs instead.
    79127
    80128=== New libraries added ===
    81129ActiveMQ for JMS
     130
     131== Test cluster ==
     132
     133We have a fully functional !GeoNetwork cluster which uses 2 physical machines hosting 4 !GeoNetwork nodes in 2 Tomcats and 1 Jetty. The nodes are not load-balanced to facilitate testing propagation between nodes. You may access the test nodes at [http://dev.ace.geocat.net:7080/geonetworkn1 1], [http://dev.ace.geocat.net:7080/geonetworkn2 2], [http://dev.ace.geocat.net:7070/geonetwork 3] and [http://78.46.99.131:7080/geonetwork 4].
     134
     135We do not guarantee anything about this test cluster and we'll take it down soon without notice.
    82136
    83137== Risks ==