Opened 9 years ago
Last modified 3 years ago
#708 assigned defect
GEOS memory explosion in intersects
Reported by: | pramsey | Owned by: | strk |
---|---|---|---|
Priority: | blocker | Milestone: | Upstream |
Component: | Default | Version: | 3.4.2 |
Severity: | Unassigned | Keywords: | |
Cc: | royseto |
Description
Described in detail here, with a GEOS test case,
Attachments (5)
Change History (26)
comment:1 by , 9 years ago
Cc: | added |
---|
comment:2 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:3 by , 9 years ago
comment:5 by , 9 years ago
by , 9 years ago
Attachment: | inside_the_nest.png added |
---|
comment:6 by , 9 years ago
comment:7 by , 9 years ago
JTS is somewhat ligther on RAM. Only used ~4GB so far, but it's keeping 8 cores at 100% right now, not sure for what (GC, maybe ?). Fun, for an Intersects(A,A) operation (but it's RelateOp under the hood)
comment:8 by , 9 years ago
Indeed it was GC for JTS:
Running tests... Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded at com.vividsolutions.jts.algorithm.RobustLineIntersector.intersectionWithNormalization(RobustLineIntersector.java:279) at com.vividsolutions.jts.algorithm.RobustLineIntersector.intersection(RobustLineIntersector.java:223) at com.vividsolutions.jts.algorithm.RobustLineIntersector.computeIntersect(RobustLineIntersector.java:167) at com.vividsolutions.jts.algorithm.LineIntersector.computeIntersection(LineIntersector.java:248) at com.vividsolutions.jts.geomgraph.index.SegmentIntersector.addIntersections(SegmentIntersector.java:163) at com.vividsolutions.jts.geomgraph.index.MonotoneChainEdge.computeIntersectsForChain(MonotoneChainEdge.java:131) at com.vividsolutions.jts.geomgraph.index.MonotoneChainEdge.computeIntersectsForChain(MonotoneChainEdge.java:112) at com.vividsolutions.jts.geomgraph.index.MonotoneChain.computeIntersections(MonotoneChain.java:53) at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.processOverlaps(SimpleMCSweepLineIntersector.java:159) at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.computeIntersections(SimpleMCSweepLineIntersector.java:140) at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.computeIntersections(SimpleMCSweepLineIntersector.java:75) at com.vividsolutions.jts.geomgraph.GeometryGraph.computeSelfNodes(GeometryGraph.java:354)
comment:9 by , 9 years ago
the monotone chain sweep line intersector catches 88k+ events... SimpleMCSweepLineIntersector::computeIntersection looping over 88224 events
comment:10 by , 9 years ago
Thank you very much for working on this issue. Yes, this linestring looks like a cat toy to me, too, but it is actually a GPS track. Cases like this are not common in our dataset, but out of many millions of examples, we do have some of these.
We are working around this for now by filtering out linestrings with more than 10,000 vertices in our data cleaning. This seems to avoid the PostGIS crash (http://trac.osgeo.org/postgis/ticket/2975). Can you think of a way to implement a filter that better focuses on the problem case of high numbers of self nodes?
comment:11 by , 9 years ago
The total length of the line is 4.2975 m Simplifying the line with a tolerance of 1 mm (1/4000 of total length) brings down the number of points of that line from 52k to 55. Half a millimeter gives you 108. I'm not sure simplification is an acceptable process for the problem being approached.
The real improvement here (beside safety against crash) would be to have a short-circuit for the "intersect" case. Full "relate" computation would still be affected though.
comment:12 by , 9 years ago
Thanks. Yes, adding a short-circuit for the "intersect" case would definitely help in this example.
The linestring in the testcase is in SRID 4326, so it needs to be transformed to an equal-area projection like SRID 2163 to get length in meters. By my calculations it is 412km long. Simplifying to a 5 meter tolerance gives 17362 points; simplifying to a 10 meter tolerance gives 5112 points; simplifying to a 25 meter tolerance gives 444 points. I don't think it is acceptable for us to simplify that much for our use case.
dell4db=# \dt st_intersects_testcase.* List of relations Schema | Name | Type | Owner ------------------------+--------------------+-------+---------- st_intersects_testcase | linestring | table | postgres st_intersects_testcase | linestring_hex_wkb | table | postgres st_intersects_testcase | mpoly | table | postgres st_intersects_testcase | mpoly_hex_wkb | table | postgres (4 rows) dell4db=# \d st_intersects_testcase.linestring Table "st_intersects_testcase.linestring" Column | Type | Modifiers --------+---------------------------+----------- geom | geometry(LineString,4326) | dell4db=# select st_length(geom) from st_intersects_testcase.linestring; st_length ------------------ 4.29752878655587 (1 row) dell4db=# select st_length(st_transform(geom, 2163)) dell4db-# from st_intersects_testcase.linestring; st_length ------------------ 411596.637780254 (1 row) dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 5)) dell4db-# from st_intersects_testcase.linestring; st_npoints ------------ 17362 (1 row) dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 10)) dell4db-# from st_intersects_testcase.linestring; st_npoints ------------ 5112 (1 row) dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 25)) dell4db-# from st_intersects_testcase.linestring; st_npoints ------------ 444 (1 row)
comment:14 by , 9 years ago
Short-circuiting aspect will be possible after porting a recent change in JTS, See #718 and http://sourceforge.net/p/jts-topo-suite/mailman/message/33252299/
comment:19 by , 6 years ago
Milestone: | 3.6.3 → 3.6.4 |
---|
I haven't tested if this is still an issue or not. I'll test later but push for now.
comment:20 by , 4 years ago
Milestone: | 3.6.4 → GEOS Fund Me |
---|
comment:21 by , 3 years ago
Milestone: | GEOS Fund Me → Upstream |
---|
Computing self-nodes of the linestring geometry seems to be the problem