Changes between Initial Version and Version 1 of RemoteSearch


Timestamp:
Dec 21, 2009, 10:00:22 AM
Author:
awarnock

     1= Remote & Distributed Search =
     2
     3|| '''Date''' || 2009/12/20 ||
     4|| '''Contact(s)''' || A. Warnock (A/WWW Enterprises), D. Nebert (USGS/FGDC), L. Miao (GMU), Z. Li (GMU), H. Wu (GMU)  ||
     5|| '''Last edited''' || [[Timestamp]] ||
     6|| '''Status''' || Proposed for voting ||
     7|| '''Assigned to release''' || 2.5 ||
     8|| '''Resources''' || No additional resources needed ||
     9
     10== Overview ==
     11Some collections are too large or too dynamic to harvest directly, yet users still need to search
     12them through the general search interface.  This proposal adds the ability to refer search requests
     13to remote collections and to return their results together with results from local collections.
     14
     15=== Proposal Type ===
     16 * '''Type''': Core Change
     17 * '''App''': !GeoNetwork
     18 * '''Module''': Search Interface
     19
     20=== Links ===
     21 * '''Documents''':
     22 * '''Email discussions''': See thread on ''Proposal: Local harvesting and remote search'' on geonetwork-devel list
     23 * '''Other wiki discussions''':
     24
     25=== Voting History ===
     26 * Vote proposed by A. Warnock on 2009-12-20.
     27
     28----
     29
     30== Motivations ==
     31The current configuration implements search only on locally-held collections.  In order to provide
     32full functionality as either a geospatial portal or clearinghouse, it would be desirable for !GeoNetwork
     33to have the ability to search remote collections (at least via CSW or Z39.50) without harvesting them
     34locally.
     35
     36== Proposal ==
     37A number of metadata sources in GEOSS are either too large
     38(250,000 - 1 million records or more) or too dynamic (several
     39updates/hour, perhaps related to emergency conditions) to harvest and
     40hold locally.  In these cases, we anticipate that searching through the
     41clearinghouse instance should be performed as a distributed search
     42against the original collection, rather than against a locally-held
     43harvested collection, and further, that this process should be
     44transparent to the end user.  That is, while such collections may be
     45presented to the end user as an optional source to be searched, the end
     46user should not be expected to know which collections are held in the
     47clearinghouse and which are searched remotely, nor should the end user
     48be directed away from the clearinghouse site to search these remote
     49sites separately.
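
The transparency requirement above can be illustrated with a minimal sketch, which is not part of the proposal itself. It assumes that local and remote hits share a record identifier and carry relevance scores that are comparable across sources; the `Hit` type, its field names, and `merge_hits` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    """One search result. All field names are illustrative assumptions."""
    identifier: str  # metadata record identifier
    title: str
    score: float     # relevance score, assumed comparable across sources


def merge_hits(local: Iterable[Hit], remote: Iterable[Hit], limit: int = 10) -> List[Hit]:
    """Combine local and remote hits into a single ranked list.

    Hits with the same identifier are collapsed, keeping the higher score.
    The merged list carries no marker of where a hit came from, so the
    caller cannot tell harvested records from remotely searched ones.
    """
    best: Dict[str, Hit] = {}
    for hit in list(local) + list(remote):
        current = best.get(hit.identifier)
        if current is None or hit.score > current.score:
            best[hit.identifier] = hit
    # Highest score first; ties are broken by identifier for a stable order.
    ranked = sorted(best.values(), key=lambda h: (-h.score, h.identifier))
    return ranked[:limit]


local_hits = [Hit("a1", "Road network", 0.9), Hit("b2", "Land cover", 0.4)]
remote_hits = [Hit("c3", "Flood extent", 0.7), Hit("b2", "Land cover", 0.6)]
merged = merge_hits(local_hits, remote_hits)
print([h.identifier for h in merged])
```

Scores from independent catalogues are rarely comparable in practice, so a real implementation would likely need some normalisation or a different interleaving rule; the sketch only shows the shape of a merged, source-agnostic result list.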
     50
     51We are fully cognizant of the network latencies involved in such a
     52scenario, having had direct experience with it in the FGDC Clearinghouse
     53network in years past.  Nonetheless, support for distributed, remote
     54searching is seen to be unavoidable within the GEOSS framework.  The
     55basic client functions for distributed, remote search are already
     56in !GeoNetwork; we propose to implement such search as part of the search
     57interface, at least through the CSW API.  Note that, in GEOSS anyway,
     58clearinghouse and portal functions are separate - portals provide the
     59user interface, clearinghouses provide the programmatic search interface
     60to the portals via CSW.
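
One common way to keep the network latency mentioned above from stalling the whole search is to query all sources in parallel and stop waiting after a fixed deadline. The following is a minimal sketch of that idea, not the design adopted by this proposal. The function `federated_search`, the `timeout_s` parameter, and the three stand-in sources are hypothetical; a real remote source would issue a CSW request over HTTP rather than call a local function.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, Dict, List, Tuple

# A source is modelled as a function from a query string to a list of results.
SearchFn = Callable[[str], List[str]]


def federated_search(
    query: str, sources: Dict[str, SearchFn], timeout_s: float
) -> Tuple[List[str], List[str]]:
    """Query every source in parallel.

    Returns (results, skipped), where skipped names the sources that
    failed or did not answer within timeout_s seconds.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fn, query): name for name, fn in sources.items()}
    done, _not_done = wait(futures, timeout=timeout_s)
    # Return without blocking on slow sources; their threads finish on their own.
    pool.shutdown(wait=False)

    results: List[str] = []
    skipped: List[str] = []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            results.extend(future.result())
        else:
            skipped.append(name)
    return results, skipped


def local_index(query: str) -> List[str]:
    return [f"local:{query}"]


def slow_remote(query: str) -> List[str]:
    time.sleep(1.0)  # simulates a remote catalogue with high latency
    return [f"remote:{query}"]


def failing_remote(query: str) -> List[str]:
    raise ConnectionError("remote catalogue unreachable")


results, skipped = federated_search(
    "flood",
    {"local": local_index, "slow": slow_remote, "failing": failing_remote},
    timeout_s=0.2,
)
print(results, skipped)
```

With this approach the response time is bounded by the deadline rather than by the slowest remote collection, at the cost of occasionally returning incomplete results; the list of skipped sources lets the caller decide whether to report that to the user.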
     61
     62=== Backwards Compatibility Issues ===
     63None anticipated.
     64
     65== Risks ==
     66None foreseen.  Development will take place on a separate branch so that testing can take place before merging into the trunk.
     67
     68== Participants ==
     69 * A. Warnock (A/WWW Enterprises)
     70 * D. Nebert (USGS/FGDC)
     71 * L. Miao (GMU)
     72 * Z. Li (GMU)
     73 * H. Wu (GMU)