Opened 10 years ago

Closed 10 years ago

#2709 closed defect (worksforme)

Crash (more often under high loads)

Reported by: bshankle Owned by: pramsey
Priority: blocker Milestone: PostGIS 2.1.4
Component: postgis Version: 2.1.x
Keywords: Cc: bshankle

Description

Hi, I will apologize in advance if I don't provide enough information at first, but I will try to provide enough to help in creating a repro for those who feel they have time to work on this.

Repro steps:

  1. Load the Natural Earth file ne_10m_land using shp2pgsql (with SRID 4326)
  2. Issue this query from 20 different connections simultaneously:

SELECT ST_AsText(ST_Intersection(ST_MakeValid(ST_SimplifyPreserveTopology(ST_Force_2d(result.geom), 0.00456872)), GeomFromEWKT('SRID=4326; POLYGON((-98.437499999999957367 33.13749663089078723,-97.031249999999971578 33.13749663089078723,-97.031249999999971578 34.307087880329206087,-98.437499999999957367 34.307087880329206087,-98.437499999999957367 33.13749663089078723))'))) as geom FROM (select geom from ne_10m_land) as result WHERE ST_Intersects( GeomFromEWKT('SRID=4326; POLYGON((-98.437499999999957367 33.13749663089078723,-97.031249999999971578 33.13749663089078723,-97.031249999999971578 34.307087880329206087,-98.437499999999957367 34.307087880329206087,-98.437499999999957367 33.13749663089078723))'), result.geom)
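A minimal sketch of how step 2 could be scripted, assuming Python with the psycopg2 driver installed. The helper names `run_concurrently` and `make_postgis_worker` are hypothetical and not part of PostGIS or the original report. A barrier releases all workers at once so the queries really do overlap.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(worker, connections=20):
    """Run `worker(i)` on `connections` threads that all start together.

    Returns one ("ok", value) or ("error", message) tuple per worker,
    in worker-index order.
    """
    barrier = threading.Barrier(connections)

    def task(index):
        barrier.wait()  # hold every thread until all of them are ready
        try:
            return ("ok", worker(index))
        except Exception as exc:  # a crashed backend surfaces here as a driver error
            return ("error", repr(exc))

    with ThreadPoolExecutor(max_workers=connections) as pool:
        return list(pool.map(task, range(connections)))


def make_postgis_worker(dsn, query):
    """Build a worker that opens its own connection and runs `query` once.

    Assumption: psycopg2 is available; it is imported lazily so the
    harness above can be exercised without a database.
    """
    def worker(_index):
        import psycopg2

        conn = psycopg2.connect(dsn)
        try:
            with conn.cursor() as cur:
                cur.execute(query)
                return len(cur.fetchall())  # number of rows returned
        finally:
            conn.close()

    return worker
```

To use it, pass the query above and your own connection string to `make_postgis_worker`, then hand the result to `run_concurrently`. Any "error" entries in the returned list mark connections that failed, for example because the server terminated them after a backend crash.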

OS: CentOS release 6.5 (Final), but I can make it crash on Ubuntu 12.04 and OS X as well.

PostGIS full version: "POSTGIS="2.1.1 r12113" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.1, released 2013/08/26" LIBXML="2.7.6" TOPOLOGY RASTER"

PostgreSQL log:

< 2014-04-09 03:08:39.591 EDT >DETAIL: Failed process was running: SELECT ST_AsText(ST_Intersection(ST_MakeValid(ST_SimplifyPreserveTopology(ST_Force_2d(result.geom), 0.00456872)), GeomFromEWKT('SRID=4326; POLYGON((-98.437499999999957367 33.13749663089078723,-97.031249999999971578 33.13749663089078723,-97.031249999999971578 34.307087880329206087,-98.437499999999957367 34.307087880329206087,-98.437499999999957367 33.13749663089078723))'))) as geom FROM (select geom from ne_10m_land) as result WHERE ST_Intersects( GeomFromEWKT('SRID=4326; POLYGON((-98.437499999999957367 33.13749663089078723,-97.031249999999971578 33.13749663089078723,-97.031249999999971578 34.307087880329206087,-98.437499999999957367 34.307087880329206087,-98.437499999999957367 33.13749663089078723))'), result.geom)
< 2014-04-09 03:08:39.591 EDT >LOG: terminating any other active server processes
< 2014-04-09 03:08:39.591 EDT >WARNING: terminating connection because of crash of another server process
< 2014-04-09 03:08:39.591 EDT >DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

Some background: we have a vector map design tool called MapShop (http://www.ba3.us/index.php?page=pages/tutorials-mapshop&id=0). We never really meant this to be a public tool, but it seems to have a bit of a following now, despite being very difficult to set up all the pieces. It lets you style data directly from PostGIS and show it right in our mapping engine, bypassing the raster phase altogether.

When the map design is complete, we have a tool called metool (soon to be renamed to AltusVector) which generates a bunch of vector tiles for our mapping engine.

I think PostGIS is absolutely amazing and I'd like to help fix this issue.

I'm a proficient C++ programmer and I'm capable of digging into this as deep as you guys need me to. Is there a starter debugging document somewhere I can begin with?

Thanks, Bruce Shankle Founder http://www.ba3.us

Change History (5)

comment:1 by robe, 10 years ago

Cc: bshankle added

Bruce,

Sorry, I haven't had a chance to look at this yet. A link to the Natural Earth file you are using would be helpful. Ideally, we prefer smaller tests that fit easily into our regression suite, so if you can isolate an offending geometry and build such a test from it, that would be best.

As far as debugging goes, check out the following (both are a bit dated, sorry):

http://trac.osgeo.org/postgis/wiki/DevWikiGettingABackTrace

http://blog.cleverelephant.ca/2008/08/valgrinding-postgis.html

Thanks, Regina

comment:2 by robe, 10 years ago

BTW, can you also test this in 2.1.2, if it's not too much trouble? I think we fixed a bunch of crashers in that release, so hopefully this is no longer an issue.

comment:3 by pramsey, 10 years ago

In general, I think the only way to track this down will be to valgrind the backend while running this query. Because it's a concurrency problem, I tend to think it's the result of a function writing outside its allocated memory, on top of the working memory of another backend, then boom. So getting a stack trace from the dying backend won't tell you where the problem happened (the write outside allocated memory). What's needed is to carefully watch what happens when a single backend runs the query: does it write outside the memory it should? PostgreSQL 9.4 does include valgrind support (they say), so hopefully it can be used to figure out if we have a memory fubar.
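As a rough illustration of watching a single backend, the sketch below assembles a command line that starts PostgreSQL in single-user mode under valgrind. The function name `valgrind_single_backend_cmd` and the default log file name are hypothetical. The data directory and database name are placeholders you supply. The options shown are standard valgrind and `postgres` options, but the right set for a given build is an assumption to verify locally.

```python
def valgrind_single_backend_cmd(pgdata, dbname, log_file="valgrind-%p.log"):
    """Return an argv list that runs one standalone backend under valgrind.

    pgdata   -- path to the PostgreSQL data directory (placeholder)
    dbname   -- database holding the test table (placeholder)
    log_file -- valgrind log path; valgrind expands %p to the process ID
    """
    return [
        "valgrind",
        "--leak-check=no",       # looking for invalid writes, not leaks
        "--track-origins=yes",   # report where uninitialised values came from
        f"--log-file={log_file}",
        "postgres",
        "--single",              # single-user mode: one backend, no postmaster
        "-D",
        pgdata,
        dbname,
    ]
```

The list can be passed to `subprocess.run`, with the query supplied on standard input. The server must be stopped first, since single-user mode needs exclusive access to the data directory.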

comment:4 by robe, 10 years ago

Could be the same issue as #2725.

comment:5 by pramsey, 10 years ago

Resolution: worksforme
Status: new → closed

Having no way to duplicate this, I'm closing it as worksforme until that changes.
