Opened 12 years ago
Closed 12 years ago
#1105 closed defect (fixed)
postgis query too large
Reported by: | plcking | Owned by: | jeichar |
---|---|---|---|
Priority: | major | Milestone: | v2.10.0 RC0 |
Component: | General | Version: | v2.6.4 |
Keywords: | Cc: |
Description
The query that is being sent to postgres/postgis from geonetwork(as a result of a csw spatial query) has over 50,000 "OR" clauses(the query just "hangs") - here is the postgres log :
LOG: execute fetch from S_4/C_5: SELECT "fid", asText("the_geom") FROM "public"."spatialindex" WHERE ("fid" = '10') OR ("fid" = '100') OR ("fid" = '1000') OR ("fid" = '10001') OR ("fid" = '10005') OR ("fid" = '10011') OR ("fid" = '10014') OR ("fid" = '10015') OR ("fid" = '10018') OR ("fid" = '10020') OR ("fid" = '10021') OR ("fid" = '10023') OR ("fid" = '10025') OR ("fid" = '10027') OR ("fid" = '10032') OR ("fid" = '10038') OR ("fid" = '10041') OR ("fid" = '10042') OR ("fid" = '10044') OR ("fid" = '10045') OR ("fid" = '10047') OR ("fid" = '10048') OR ("fid" = '1005') OR ("fid" = '10050') OR ("fid" = '10051') OR ("fid" = '10052') OR ("fid" = '10053') OR ("fid" = '10054') OR ("fid" = '10055') OR ("fid" = '10058') OR ("fid" = '10061') OR ("fid" = '10062') OR ("fid" = '10063') OR ("fid" = '10066') OR ("fid" = '10067') OR ("fid" = '1007') OR ("fid" = '10070') OR ("fid" = '10076') OR ("fid" = '10080')........................
What can be done to fix the issue ?
Postgres seems to have a limit of 50,000 on the number of clauses(and this makes sense from an efficiency standpoint) - I tested the query when there were 49K clauses and it was successful.
Pat
Change History (19)
follow-up: 2 comment:1 by , 12 years ago
follow-up: 3 comment:2 by , 12 years ago
Replying to jeichar:
Do you have a stack trace so I can see what series of calls are causing this error?
The query jusy "hangs". Here is the output from geonetwork.log :
2012-10-09 10:55:58,036 INFO [jeeves.request] - ========================================================== 2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from 132.156.10.107) : /geonetwork/srv/en/csw 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */* 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is 3BB6476D1ABB2409C67968E8F52F185D 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for client : 132.156.10.107 2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw 2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are : <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords> 2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received: <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords> 2012-10-09 10:55:58,049 DEBUG [jeeves.engine] - TransformerFactoryFactory: de.fzi.dbs.xml.transform.CachingTransformerFactory 2012-10-09 10:55:58,064 DEBUG [jeeves.engine] - TransformerFactoryFactory: produces transformer implementation net.sf.saxon.IdentityTransform er
Pat
comment:3 by , 12 years ago
Replying to plcking:
Replying to jeichar:
Do you have a stack trace so I can see what series of calls are causing this error?
The query jusy "hangs". Here is the output from geonetwork.log :
2012-10-09 10:55:58,036 INFO [jeeves.request] - ========================================================== 2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from 132.156.10.107) : /geonetwork/srv/en/csw 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */* 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is 3BB6476D1ABB2409C67968E8F52F185D 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for client : 132.156.10.107 2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw 2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are : <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords> 2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received: <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml " service="CSW" version="2.0.2" resultType="results" outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName> <csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName> <gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords> 2012-10-09 10:55:58,049 DEBUG [jeeves.engine] - TransformerFactoryFactory: de.fzi.dbs.xml.transform.CachingTransformerFactory 2012-10-09 10:55:58,064 DEBUG [jeeves.engine] - TransformerFactoryFactory: produces transformer implementation net.sf.saxon.IdentityTransform er
Pat
How does geonetwork construct the query ? Using so many feature Id's ?
Pat
follow-up: 5 comment:4 by , 12 years ago
My understanding of what is happening is the following:
The envelope query is converted to a lucene query which returns the ID of all the metadata in question. Then the ids together form a database query. Obviously some checks on the number of ids need to be added and multiple queries will be made.
I would have expected the maxRecords to reduce the number of clauses. Sounds like it is not taken into account at the right time.
Are you using version 2.6.4 as the bug report indicates?
comment:5 by , 12 years ago
Replying to jeichar:
My understanding of what is happening is the following:
The envelope query is converted to a lucene query which returns the ID of all the metadata in question. Then the ids together form a database query. Obviously some checks on the number of ids need to be added and multiple queries will be made.
I would have expected the maxRecords to reduce the number of clauses. Sounds like it is not taken into account at the right time.
Are you using version 2.6.4 as the bug report indicates?
Hi :
First of all, I'd like to thank you for your help so far. I really appreciate it !
The number of records that I have imported is in excess of 400,000. I never had the problem when the db had 330,000 records.
I am definitely using geonetwork v2.6.4.
Pat
follow-up: 7 comment:6 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Looking closer I see I made an incorrect assumption. It is a bug with the spatial search which explains why the max records did not take effect.
comment:7 by , 12 years ago
Replying to jeichar:
Looking closer I see I made an incorrect assumption. It is a bug with the spatial search which explains why the max records did not take effect.
ok. On a different note I want to re-load only a portion of my original data. How can I delete the existing Lucene indexes (I have already deleted and re-created my postgis db) ?
comment:8 by , 12 years ago
to delete the indexes you can delete the <datadir>/index/nonspatial directory. That should work. You can also just restart after a clean db and trigger a rebuild index operation.
follow-up: 12 comment:9 by , 12 years ago
We did come across this as well in building catalogs with a couple of million records. A work around is to increase the operating system stack size resources and max_stack_depth in postgis config as explained in the section of the users manual on advanced configuration. The section of the manual that explains how to do this is available online at:
Even better though if the query size can be reduced somehow by fixing the bug Jesse!
comment:10 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixes applied to 2.6.x, 2.8.x and master
comment:11 by , 12 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:12 by , 12 years ago
Replying to simonp:
We did come across this as well in building catalogs with a couple of million records. A work around is to increase the operating system stack size resources and max_stack_depth in postgis config as explained in the section of the users manual on advanced configuration. The section of the manual that explains how to do this is available online at:
Even better though if the query size can be reduced somehow by fixing the bug Jesse!
The stack size (ulimit -s) was increased to 200 Mb. and postgres config was modified to increase its stack depth. This was done when I had 300K records and was tryig to do a csw spatial search(the geonetwork log revealed a specific postgres error with mention of the stack size). The query was then successful. Now, with over 400K records, the query itself became too large (over 50K "OR" clauses) and this query was not digestible by postgres(there is no parameter "fix" within postgres for this).
Pat
follow-up: 14 comment:13 by , 12 years ago
If I make a new war for you can you test it to see if my fix works for you?
follow-up: 15 comment:14 by , 12 years ago
Replying to jeichar:
If I make a new war for you can you test it to see if my fix works for you?
Thank you for your help !!!!
I think a war file would help, and I will re-edit(after tomcat war expansion) the ./web/geonetwork/WEB-INF/config.xml to point to my postgis db. Of special note is that I had to modify(when I had 300k records) ./WEB_INF/classes/org/fao/geonet/kernel/search/LuceneSearcher.java (BooleanQuery.setMaxClauseCount had to be set to a large value due to a "too many clauses" error). If you don't include the change for this, I have my CLASSPATH env var set to recompile the code "tomcat in-situ" (I couldn't build the distribution with maven due to numerous errors).
Pat
comment:15 by , 12 years ago
Replying to plcking:
Replying to jeichar:
If I make a new war for you can you test it to see if my fix works for you?
Thank you for your help !!!!
I think a war file would help, and I will re-edit(after tomcat war expansion) the ./web/geonetwork/WEB-INF/config.xml to point to my postgis db. Of special note is that I had to modify(when I had 300k records) ./WEB_INF/classes/org/fao/geonet/kernel/search/LuceneSearcher.java (BooleanQuery.setMaxClauseCount had to be set to a large value due to a "too many clauses" error). If you don't include the change for this, I have my CLASSPATH env var set to recompile the code "tomcat in-situ" (I couldn't build the distribution with maven due to numerous errors).
Pat
Hi :
Any luck ?
Pat
follow-up: 17 comment:16 by , 12 years ago
I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.
Let me know if you find bugs.
follow-up: 18 comment:17 by , 12 years ago
Replying to jeichar:
I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.
Let me know if you find bugs.
yes, thank you very much. I will be installing it today !....Pat
comment:18 by , 12 years ago
Replying to plcking:
Replying to jeichar:
I have uploaded a file to: http://dev.mapfish.org/geonetwork.war. You can download that and try it out. It is a 2.6.x build from the github 2.6.x branch made yesterday so it should have the issue fixed with luck.
Let me know if you find bugs.
yes, thank you very much. I will be installing it today !....Pat
my tests so far have been successful ! On a different note, I have just posted a ticket regarding a CSW temporal query error - ticket #1113. Is it possible for you to have a look at this one ? I am working on a project that involves several Federal Government agencies internationally who acquire satellite scenes with their own ground stations....Pat
comment:19 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Great. I have been using it on master as well so I think I can close this issue. As for the csw temporal queries, I will take a look at the ticket and we can continue that discussion there
Do you have a stack trace so I can see what series of calls are causing this error?