Opened 12 years ago

Closed 12 years ago

#1098 closed defect (fixed)

GeoNetwork 2.6.x exhausts open files resources over time

Reported by: simonp Owned by: geonetwork-devel@…
Priority: major Milestone: v2.6.5
Component: General Version: v2.6.4
Keywords: Cc:

Description

Reported by Sylvain Grellet:

...

We are monitoring files opened by Geonetwork on our server. It appears that the number of files opened by Geonetwork grows steadily over time (see the 2 images attached). On "Geonetwork_opened_files.png" you'll see we had to reload Geonetwork 3 times (lsof goes down to zero).

We see some peaks when we edit metadata files (yellow peaks on Geonetwork_opened_files_third_peak_detail.png), so it seems there is some kind of garbage collection. But still the trend is not really nice.

...

Cheers Sylvain

Attachments (2)

Geonetwork_opened_files.png (13.8 KB ) - added by simonp 12 years ago.
Geonetwork_opened_files_third_peak_detail.png (13.3 KB ) - added by simonp 12 years ago.


Change History (8)

by simonp, 12 years ago

Attachment: Geonetwork_opened_files.png added

comment:1 by simonp, 12 years ago

  • GN uses a reference-counted IndexReader, as advised by Lucene in Action (2nd edition)
  • The IndexReader (via LuceneSearcher or CatalogSearcher) is stored in the user session so that the user can return to page through results (this should be the same for CSW and the user interface)
  • When changes are made to metadata records, an IndexWriter is used and the Lucene index is rewritten to include the changes
  • Lucene is a file-based index, so rewriting the index deletes some files and creates new ones
  • It seems that the deleted files are never closed, so the number of open files held by the servlet gradually grows
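
The reference counting described above can be illustrated with a small model. This is only a sketch: RefCountedReader is a hypothetical stand-in written for this note, not Lucene's IndexReader and not GeoNetwork code. It shows why a reader kept in a session keeps its files open until the last reference is released.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for a reference-counted index reader. */
final class RefCountedReader {
    // The creator of the reader holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Take an extra reference, e.g. when a search stores the reader in a session. */
    void incRef() {
        if (closed) {
            throw new IllegalStateException("reader already closed");
        }
        refCount.incrementAndGet();
    }

    /** Drop a reference; the underlying files are only closed when none remain. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // a real reader would close its file handles here
        } else if (remaining < 0) {
            throw new IllegalStateException("decRef called too many times");
        }
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        RefCountedReader reader = new RefCountedReader();
        reader.incRef(); // a search stores the reader in the user session
        reader.decRef(); // the index is rewritten and the owner drops its reference
        System.out.println("closed after owner release: " + reader.isClosed());
        reader.decRef(); // the session finally releases its reference
        System.out.println("closed after session release: " + reader.isClosed());
    }
}
```

If the session-side decRef never happens, the reader in this model never closes, which matches the symptom of deleted index files staying open.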

The files left open, which eventually cause the problem (list them with ls -l /proc/<process_id>/fd or with lsof), have paths like the following:

/usr/local/gn264/web/geonetwork/WEB-INF/lucene/nonspatial/_nnz.cfs (deleted)
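
To track this over time, the descriptors pointing at deleted files can be counted. A minimal sketch, assuming a Linux /proc filesystem and permission to read the target process's fd directory; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists descriptors whose file has been deleted but is still open. */
final class DeletedOpenFiles {

    /** On Linux, the link target of such a descriptor ends in " (deleted)". */
    static boolean isDeletedTarget(String linkTarget) {
        return linkTarget.endsWith(" (deleted)");
    }

    /** Reads every symlink in an fd directory and keeps the deleted targets. */
    static List<String> deletedTargets(Path fdDir) throws IOException {
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    String target = Files.readSymbolicLink(fd).toString();
                    if (isDeletedTarget(target)) {
                        result.add(target);
                    }
                } catch (IOException | UnsupportedOperationException e) {
                    // Descriptor was closed while scanning, or is not a symlink: skip it.
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Pass the servlet container's process id; defaults to this JVM itself.
        String pid = args.length > 0 ? args[0] : "self";
        Path fdDir = Paths.get("/proc", pid, "fd");
        if (!Files.isDirectory(fdDir)) {
            System.out.println(fdDir + " is not available on this system");
            return;
        }
        List<String> deleted = deletedTargets(fdDir);
        System.out.println(deleted.size() + " deleted-but-open files");
        for (String target : deleted) {
            System.out.println(target);
        }
    }
}
```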

comment:2 by simonp, 12 years ago

Problem apparently disappears with Lucene 3.6.1 when our code is upgraded to use the Lucene SearcherManager and SearcherLifetimeManager. A benefit of using these is that we no longer need to cache the IndexSearcher/Reader in the user session to capture the snapshot of the index at search time. Instead we can cache a token and ask Lucene to give the IndexSearcher/Reader back when we need it. Also, old search sessions are pruned (SearcherLifetimeManager) - user gets 'search expired' exception if they leave a search session idle for too long (and the index changes) - this also saves resources.
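
The token idea can be sketched without Lucene. The class below is a hypothetical, simplified model loosely based on what SearcherLifetimeManager does for us; it is not Lucene's API and not the committed GeoNetwork code. The session keeps only a long token; the registry owns the snapshot and may prune it, in which case the caller gets null back and can report that the search has expired. Unlike Lucene, this model issues its own sequential tokens and prunes purely on age.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of token-based snapshot tracking with age-based pruning. */
final class SnapshotRegistry<S> {

    private static final class Entry<T> {
        final T snapshot;
        final long recordedAtMillis;

        Entry(T snapshot, long recordedAtMillis) {
            this.snapshot = snapshot;
            this.recordedAtMillis = recordedAtMillis;
        }
    }

    private final Map<Long, Entry<S>> entries = new ConcurrentHashMap<>();
    private final AtomicLong nextToken = new AtomicLong();

    /** Registers a snapshot and returns the token to store in the user session. */
    long record(S snapshot, long nowMillis) {
        long token = nextToken.incrementAndGet();
        entries.put(token, new Entry<>(snapshot, nowMillis));
        return token;
    }

    /** Returns the snapshot for a token, or null if it has been pruned. */
    S acquire(long token) {
        Entry<S> entry = entries.get(token);
        return entry == null ? null : entry.snapshot;
    }

    /** Drops snapshots older than maxAgeMillis and returns how many were removed. */
    int prune(long maxAgeMillis, long nowMillis) {
        int removed = 0;
        Iterator<Entry<S>> it = entries.values().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().recordedAtMillis > maxAgeMillis) {
                it.remove(); // a real implementation would also release the reader here
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        SnapshotRegistry<String> registry = new SnapshotRegistry<>();
        long token = registry.record("index snapshot at search time", 0L);
        System.out.println("next page uses: " + registry.acquire(token));
        registry.prune(600_000L, 900_000L); // prune anything older than 10 minutes
        String snapshot = registry.acquire(token);
        System.out.println(snapshot == null ? "search expired" : "still valid: " + snapshot);
    }
}
```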

comment:3 by simonp, 12 years ago

Initial commit

https://github.com/geonetwork/core-geonetwork/commit/8a2f3b5ed7ab33236c766c3861255a121889ff60

Some more testing is needed before this ticket can be closed. Stay tuned.

comment:4 by jesseeichar, 12 years ago

I have a change ready for trunk as well. It was a bit complicated because on trunk there is one index per language when dealing with multilingual metadata, which I do frequently. That meant a single searcher manager couldn't work.

You can take a look at:

https://github.com/jesseeichar/core-geonetwork/compare/improvement/lucene-searchmanager

comment:6 by simonp, 12 years ago

Resolution: fixed
Status: new → closed

Fixed in commit 8a2f3b5ed7ab33236c766c3861255a121889ff60
