Changes between Version 8 and Version 9 of XPATH


Timestamp:
Apr 17, 2009, 2:46:17 AM (15 years ago)
Author:
heikki
Comment:

--

  • XPATH

    v8 v9  
    88----
    99
    10 === Introduction ===
     10== Introduction ==
    1111
    1212XPATH is a language to precisely select a set of nodes in an XML document. It is also sometimes used to address certain parts of Java object graphs. One of the requirements for the ebRIM project is to support XPATH queries in OGC Filters.
    1313
    1414
    15 === Issue: XPATH on Lucene ===
     15== Issue: XPATH on Lucene ==
    1616
    1717It would be straightforward to evaluate XPATH against complete ebRIM documents (domain object graphs). However, we do not have all ebRIM data in memory, so that is not an option. This suggests that the best option is to evaluate XPATH queries against the Lucene index. How should we go about doing that?
     
    2828 }}}
    2929
    30 I'm currently looking into their sources to learn more about their implementation, specificaly [http://dev.alfresco.com/resource/docs/java/repository/org/alfresco/util/SearchLanguageConversion.html org.alfresco.util.SearchLanguageConversion].
     30I'm currently looking into their sources to learn more about their implementation, specifically [http://dev.alfresco.com/resource/docs/java/repository/org/alfresco/util/SearchLanguageConversion.html org.alfresco.util.SearchLanguageConversion].
    3131
    3232An exchange on the [http://www.mail-archive.com/solr-user@lucene.apache.org/msg07186.html Solr mailing list] points in the same direction (i.e. storing XPATH info in the Lucene index).
    3333
    3434Then there's [http://acs.lbl.gov/nux/ Nux], which also seems to support XPATH on Lucene: "Arbitrary Lucene fulltext queries can be run from Java or from XQuery/XPath/XSLT via a simple extension function." I have no clue yet how they do it, but I'm taking a peek at their source very soon.
     35
     36=== Storing XPATH information in the index vs. mapping XPATH queries directly to structured Lucene queries ===
     37
     38Erik and I have started a highly polemical debate about which approach is better: storing XPATH info in Lucene or mapping XPATH queries to more structured Lucene queries.
     39 {{{
     40Stored XPATH approach
     41
     42 * use an extra field in the Lucene index to store XPATH information
     43 * the information stored at index time could be a !LocationPath to the object being indexed, which could be matched at search time to XPATH queries
     44 * Erik thinks this approach would 'pollute' our domain driven development model, as there is no intrinsic justification for this extra index field in our domain.
     45 * Heikki thinks that that doesn't matter, because the structure of a Lucene index is inherently not domain-driven (for example there are
     46   fields that are repeated TOKENIZED and NON-TOKENIZED to allow for both full text search and ordering of results).
     47 }}}
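To make the stored XPATH approach more concrete, here is a minimal sketch of the matching step. It is not Lucene code: a plain dict stands in for the extra index field, and the document ids, the stored paths and the names `xpath_to_regex` and `search` are all made up for illustration. It only handles child steps, `//` and `*`; predicates and other axes are left out.

```python
import re


def xpath_to_regex(xpath):
    """Translate a restricted XPath (child steps, '//' and '*') into a regex
    that can be matched against a stored location path."""
    pattern = ""
    i = 0
    while i < len(xpath):
        if xpath.startswith("//", i):
            # '//' matches the separator plus any number of intermediate steps
            pattern += r"/(?:[^/]+/)*"
            i += 2
        elif xpath[i] == "/":
            pattern += "/"
            i += 1
        else:
            j = i
            while j < len(xpath) and xpath[j] != "/":
                j += 1
            step = xpath[i:j]
            # '*' matches exactly one step; any other step must match literally
            pattern += r"[^/]+" if step == "*" else re.escape(step)
            i = j
    return re.compile(pattern)


def search(index, xpath):
    """Return the ids whose stored location path matches the XPath query.

    `index` maps a document id to the location path stored at index time
    (a hypothetical stand-in for the extra Lucene field)."""
    regex = xpath_to_regex(xpath)
    return sorted(doc_id for doc_id, path in index.items() if regex.fullmatch(path))


# Hypothetical index content: one stored location path per indexed object.
index = {
    "doc1": "/ExtrinsicObject/Classification",
    "doc2": "/ExtrinsicObject/Name",
    "doc3": "/RegistryPackage/ExtrinsicObject/Classification",
}

print(search(index, "//Classification"))
print(search(index, "/ExtrinsicObject/*"))
```

In a real index the regex would have to be turned into something Lucene can evaluate efficiently; this sketch only shows that a stored path carries enough information to answer simple location path queries.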
     48 {{{
     49Structured Lucene query approach
     50
     51 * use Lucene queries that refer to properties of the queried object using the dot notation
     52 * example : !ExtrinsicObject.classificationList.Classification.id=someClassificationId
     53 * this would obviate the need for an extra field in the index
     54 * it is not known whether such Lucene queries actually work with our index structure. Jose ?
     55 }}}
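As a rough illustration of the structured query approach, the sketch below flattens an object graph into dotted field names and answers a `field=value` query against them. It uses an in-memory dict instead of Lucene, so it says nothing about whether such queries work with our actual index structure. The sample object and the names `flatten`, `index_document` and `search` are assumptions made up for this example.

```python
def flatten(obj, prefix):
    """Yield (dotted_field_name, value) pairs for a nested dict/list object graph."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}.{key}")
    elif isinstance(obj, list):
        # list items share the field name of the list itself
        for item in obj:
            yield from flatten(item, prefix)
    else:
        yield prefix, str(obj)


def index_document(index, doc_id, root_name, obj):
    """Add every dotted field of `obj` to the index under `doc_id`."""
    for field, value in flatten(obj, root_name):
        index.setdefault((field, value), set()).add(doc_id)


def search(index, query):
    """Evaluate a single 'dotted.field=value' query and return matching ids."""
    field, value = query.split("=", 1)
    return sorted(index.get((field.strip(), value.strip()), set()))


# Hypothetical object graph, loosely modelled on the example in the list above.
extrinsic_object = {
    "name": "map",
    "classificationList": [
        {"Classification": {"id": "someClassificationId"}},
        {"Classification": {"id": "otherClassificationId"}},
    ],
}

index = {}
index_document(index, "doc1", "ExtrinsicObject", extrinsic_object)

print(search(index, "ExtrinsicObject.classificationList.Classification.id=someClassificationId"))
```

No extra XPATH field is needed here, but every property that should be queryable has to be indexed under its full dotted name.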
     56We should decide on this matter very soon.
    3557
    3658
     
    3961
    4062
    41