Opened 12 years ago

Closed 12 years ago

#1105 closed defect (fixed)

postgis query too large

Reported by: plcking Owned by: jeichar
Priority: major Milestone: v2.10.0 RC0
Component: General Version: v2.6.4
Keywords: Cc:

Description

The query that is being sent to postgres/postgis from geonetwork(as a result of a csw spatial query) has over 50,000 "OR" clauses(the query just "hangs") - here is the postgres log :

LOG: execute fetch from S_4/C_5: SELECT "fid", asText("the_geom") FROM "public"."spatialindex" WHERE ("fid" = '10') OR ("fid" = '100') OR ("fid" = '1000') OR ("fid" = '10001') OR ("fid" = '10005') OR ("fid" = '10011') OR ("fid" = '10014') OR ("fid" = '10015') OR ("fid" = '10018') OR ("fid" = '10020') OR ("fid" = '10021') OR ("fid" = '10023') OR ("fid" = '10025') OR ("fid" = '10027') OR ("fid" = '10032') OR ("fid" = '10038') OR ("fid" = '10041') OR ("fid" = '10042') OR ("fid" = '10044') OR ("fid" = '10045') OR ("fid" = '10047') OR ("fid" = '10048') OR ("fid" = '1005') OR ("fid" = '10050') OR ("fid" = '10051') OR ("fid" = '10052') OR ("fid" = '10053') OR ("fid" = '10054') OR ("fid" = '10055') OR ("fid" = '10058') OR ("fid" = '10061') OR ("fid" = '10062') OR ("fid" = '10063') OR ("fid" = '10066') OR ("fid" = '10067') OR ("fid" = '1007') OR ("fid" = '10070') OR ("fid" = '10076') OR ("fid" = '10080')........................

What can be done to fix the issue ?

Postgres seems to have a limit of 50,000 on the number of clauses(and this makes sense from an efficiency standpoint) - I tested the query when there were 49K clauses and it was successful.

Pat

Change History (19)

comment:1 by jeichar, 12 years ago

Do you have a stack trace so I can see what series of calls are causing this error?

in reply to:  1 ; comment:2 by plcking, 12 years ago

Replying to jeichar:

Do you have a stack trace so I can see what series of calls are causing this error?

The query jusy "hangs". Here is the output from geonetwork.log :

2012-10-09 10:55:58,036 INFO [jeeves.request] - ========================================================== 2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from 132.156.10.107) : /geonetwork/srv/en/csw 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */* 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is 3BB6476D1ABB2409C67968E8F52F185D 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for client : 132.156.10.107 2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw 2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are : <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">

<csw:Query typeNames="gmd:MD_Metadata">

<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">

<ogc:Filter>

<ogc:Intersects>

<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>

<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>

</gml:Envelope>

</ogc:Intersects>

</ogc:Filter>

</csw:Constraint>

</csw:Query>

</csw:GetRecords> 2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received: <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">

<csw:Query typeNames="gmd:MD_Metadata">

<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">

<ogc:Filter>

<ogc:Intersects>

<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>

<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>

</gml:Envelope>

</ogc:Intersects>

</ogc:Filter>

</csw:Constraint>

</csw:Query>

</csw:GetRecords> 2012-10-09 10:55:58,049 DEBUG [jeeves.engine] - TransformerFactoryFactory: de.fzi.dbs.xml.transform.CachingTransformerFactory 2012-10-09 10:55:58,064 DEBUG [jeeves.engine] - TransformerFactoryFactory: produces transformer implementation net.sf.saxon.IdentityTransform er

Pat

in reply to:  2 comment:3 by plcking, 12 years ago

Replying to plcking:

Replying to jeichar:

Do you have a stack trace so I can see what series of calls are causing this error?

The query jusy "hangs". Here is the output from geonetwork.log :

2012-10-09 10:55:58,036 INFO [jeeves.request] - ========================================================== 2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from 132.156.10.107) : /geonetwork/srv/en/csw 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */* 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is 3BB6476D1ABB2409C67968E8F52F185D 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for client : 132.156.10.107 2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw 2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are : <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">

<csw:Query typeNames="gmd:MD_Metadata">

<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">

<ogc:Filter>

<ogc:Intersects>

<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>

<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>

</gml:Envelope>

</ogc:Intersects>

</ogc:Filter>

</csw:Constraint>

</csw:Query>

</csw:GetRecords> 2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received: <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">

<csw:Query typeNames="gmd:MD_Metadata">

<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">

<ogc:Filter>

<ogc:Intersects>

<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>

<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>

</gml:Envelope>

</ogc:Intersects>

</ogc:Filter>

</csw:Constraint>

</csw:Query>

</csw:GetRecords> 2012-10-09 10:55:58,049 DEBUG [jeeves.engine] - TransformerFactoryFactory: de.fzi.dbs.xml.transform.CachingTransformerFactory 2012-10-09 10:55:58,064 DEBUG [jeeves.engine] - TransformerFactoryFactory: produces transformer implementation net.sf.saxon.IdentityTransform er

Pat

How does geonetwork construct the query ? Using so many feature Id's ?

Pat

comment:4 by jeichar, 12 years ago

My understanding of what is happening is the following:

The envelope query is converted to a lucene query which returns the ID of all the metadata in question. Then the ids together form a database query. Obviously some checks on the number of ids need to be added and multiple queries will be made.

I would have expected the maxRecords to reduce the number of clauses. Sounds like it is not taken into account at the right time.

Are you using version 2.6.4 as the bug report indicates?

in reply to:  4 comment:5 by plcking, 12 years ago

Replying to jeichar:

My understanding of what is happening is the following:

The envelope query is converted to a lucene query which returns the ID of all the metadata in question. Then the ids together form a database query. Obviously some checks on the number of ids need to be added and multiple queries will be made.

I would have expected the maxRecords to reduce the number of clauses. Sounds like it is not taken into account at the right time.

Are you using version 2.6.4 as the bug report indicates?

Hi :

First of all, I'd like to thank you for your help so far. I really appreciate it !

The number of records that I have imported is in excess of 400,000. I never had the problem when the db had 330,000 records.

I am definitely using geonetwork v2.6.4.

Pat

comment:6 by jeichar, 12 years ago

Owner: changed from geonetwork-devel@… to jeichar
Status: newassigned

Looking closer I see I made an incorrect assumption. It is a bug with the spatial search which explains why the max records did not take effect.

in reply to:  6 comment:7 by plcking, 12 years ago

Replying to jeichar:

Looking closer I see I made an incorrect assumption. It is a bug with the spatial search which explains why the max records did not take effect.

ok. On a different note I want to re-load only a portion of my original data. How can I delete the existing Lucene indexes (I have already deleted and re-created my postgis db) ?

comment:8 by jeichar, 12 years ago

to delete the indexes you can delete the <datadir>/index/nonspatial directory. That should work. You can also just restart after a clean db and trigger a rebuild index operation.

comment:9 by simonp, 12 years ago

We did come across this as well in building catalogs with a couple of million records. A work around is to increase the operating system stack size resources and max_stack_depth in postgis config as explained in the section of the users manual on advanced configuration. The section of the manual that explains how to do this is available online at:

http://geonetwork-opensource.org/manuals/trunk/eng/users/admin/advanced-configuration/index.html#advanced-configuration-for-larger-catalogs

Even better though if the query size can be reduced somehow by fixing the bug Jesse!

comment:10 by jeichar, 12 years ago

Resolution: fixed
Status: assignedclosed

Fixes applied to 2.6.x, 2.8.x and master

comment:11 by plcking, 12 years ago

Resolution: fixed
Status: closedreopened

in reply to:  9 comment:12 by plcking, 12 years ago

Replying to simonp:

We did come across this as well in building catalogs with a couple of million records. A work around is to increase the operating system stack size resources and max_stack_depth in postgis config as explained in the section of the users manual on advanced configuration. The section of the manual that explains how to do this is available online at:

http://geonetwork-opensource.org/manuals/trunk/eng/users/admin/advanced-configuration/index.html#advanced-configuration-for-larger-catalogs

Even better though if the query size can be reduced somehow by fixing the bug Jesse!

The stack size (ulimit -s) was increased to 200 Mb. and postgres config was modified to increase its stack depth. This was done when I had 300K records and was tryig to do a csw spatial search(the geonetwork log revealed a specific postgres error with mention of the stack size). The query was then successful. Now, with over 400K records, the query itself became too large (over 50K "OR" clauses) and this query was not digestible by postgres(there is no parameter "fix" within postgres for this).

Pat

comment:13 by jeichar, 12 years ago

If I make a new war for you can you test it to see if my fix works for you?

in reply to:  13 ; comment:14 by plcking, 12 years ago

Replying to jeichar:

If I make a new war for you can you test it to see if my fix works for you?

Thank you for your help !!!!

I think a war file would help, and I will re-edit(after tomcat war expansion) the ./web/geonetwork/WEB-INF/config.xml to point to my postgis db. Of special note is that I had to modify(when I had 300k records) ./WEB_INF/classes/org/fao/geonet/kernel/search/LuceneSearcher.java (BooleanQuery.setMaxClauseCount had to be set to a large value due to a "too many clauses" error). If you don't include the change for this, I have my CLASSPATH env var set to recompile the code "tomcat in-situ" (I couldn't build the distribution with maven due to numerous errors).

Pat

in reply to:  14 comment:15 by plcking, 12 years ago

Replying to plcking:

Replying to jeichar:

If I make a new war for you can you test it to see if my fix works for you?

Thank you for your help !!!!

I think a war file would help, and I will re-edit(after tomcat war expansion) the ./web/geonetwork/WEB-INF/config.xml to point to my postgis db. Of special note is that I had to modify(when I had 300k records) ./WEB_INF/classes/org/fao/geonet/kernel/search/LuceneSearcher.java (BooleanQuery.setMaxClauseCount had to be set to a large value due to a "too many clauses" error). If you don't include the change for this, I have my CLASSPATH env var set to recompile the code "tomcat in-situ" (I couldn't build the distribution with maven due to numerous errors).

Pat

Hi :

Any luck ?

Pat

comment:16 by jeichar, 12 years ago

I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.

Let me know if you find bugs.

in reply to:  16 ; comment:17 by plcking, 12 years ago

Replying to jeichar:

I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.

Let me know if you find bugs.

yes, thank you very much. I will be installing it today !....Pat

in reply to:  17 comment:18 by plcking, 12 years ago

Replying to plcking:

Replying to jeichar:

I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.

Let me know if you find bugs.

yes, thank you very much. I will be installing it today !....Pat

my tests so far have been successful ! On a different note, I have just posted a ticket regarding a CSW temporal query error - ticket #1113. Is it possible for you to have a look at this one ? I am working on a project that involves several Federal Government agencies internationally who acquire satellite scenes with their own ground stations....Pat

comment:19 by jesseeichar, 12 years ago

Resolution: fixed
Status: reopenedclosed

Great. I have been using it on master as well so I think I can close this issue. As for the csw temporal queries, I will take a look at the ticket and we can continue that discussion there

Note: See TracTickets for help on using tickets.