Opened 9 years ago

Last modified 3 years ago

#708 assigned defect

GEOS memory explosion in intersects

Reported by: pramsey Owned by: strk
Priority: blocker Milestone: Upstream
Component: Default Version: 3.4.2
Severity: Unassigned Keywords:
Cc: royseto

Description

Described in detail here, with a GEOS test case,

http://trac.osgeo.org/postgis/ticket/2975

Attachments (5)

line1.png (7.9 KB ) - added by strk 9 years ago.
overview of the line
line2.png (13.2 KB ) - added by strk 9 years ago.
closer look at the line
line3.png (118.4 KB ) - added by strk 9 years ago.
higher zoom (cat toy?)
inside_the_nest.png (251.9 KB ) - added by strk 9 years ago.
intersects.xml.bz2 (155.0 KB ) - added by strk 9 years ago.
XML tester for use with GEOS and JTS

Download all attachments as: .zip

Change History (26)

comment:1 by royseto, 9 years ago

Cc: royseto added

comment:2 by strk, 9 years ago

Owner: changed from geos-devel@… to strk
Status: newassigned

comment:3 by strk, 9 years ago

Computing self-nodes of the linestring geometry seems to be the problem

comment:4 by strk, 9 years ago

IsSimpleOp is equally troubled with that specific linestring

by strk, 9 years ago

Attachment: line1.png added

overview of the line

by strk, 9 years ago

Attachment: line2.png added

closer look at the line

by strk, 9 years ago

Attachment: line3.png added

higher zoom (cat toy?)

comment:5 by strk, 9 years ago

Surely that line is not a common one. Did you have your cat play with it ? higher zoom (cat toy?)

by strk, 9 years ago

Attachment: inside_the_nest.png added

comment:6 by strk, 9 years ago

This is one of those cases in which short-circuits would be useful (answering the "intersects" question could stop at first intersection found...)

by strk, 9 years ago

Attachment: intersects.xml.bz2 added

XML tester for use with GEOS and JTS

comment:7 by strk, 9 years ago

JTS is somewhat ligther on RAM. Only used ~4GB so far, but it's keeping 8 cores at 100% right now, not sure for what (GC, maybe ?). Fun, for an Intersects(A,A) operation (but it's RelateOp under the hood)

comment:8 by strk, 9 years ago

Indeed it was GC for JTS:

Running tests...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at com.vividsolutions.jts.algorithm.RobustLineIntersector.intersectionWithNormalization(RobustLineIntersector.java:279)
        at com.vividsolutions.jts.algorithm.RobustLineIntersector.intersection(RobustLineIntersector.java:223)
        at com.vividsolutions.jts.algorithm.RobustLineIntersector.computeIntersect(RobustLineIntersector.java:167)
        at com.vividsolutions.jts.algorithm.LineIntersector.computeIntersection(LineIntersector.java:248)
        at com.vividsolutions.jts.geomgraph.index.SegmentIntersector.addIntersections(SegmentIntersector.java:163)
        at com.vividsolutions.jts.geomgraph.index.MonotoneChainEdge.computeIntersectsForChain(MonotoneChainEdge.java:131)
        at com.vividsolutions.jts.geomgraph.index.MonotoneChainEdge.computeIntersectsForChain(MonotoneChainEdge.java:112)
        at com.vividsolutions.jts.geomgraph.index.MonotoneChain.computeIntersections(MonotoneChain.java:53)
        at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.processOverlaps(SimpleMCSweepLineIntersector.java:159)
        at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.computeIntersections(SimpleMCSweepLineIntersector.java:140)
        at com.vividsolutions.jts.geomgraph.index.SimpleMCSweepLineIntersector.computeIntersections(SimpleMCSweepLineIntersector.java:75)
        at com.vividsolutions.jts.geomgraph.GeometryGraph.computeSelfNodes(GeometryGraph.java:354)

comment:9 by strk, 9 years ago

the monotone chain sweep line intersector catches 88k+ events... SimpleMCSweepLineIntersector::computeIntersection looping over 88224 events

comment:10 by royseto, 9 years ago

Thank you very much for working on this issue. Yes, this linestring looks like a cat toy to me, too, but it is actually a GPS track. Cases like this are not common in our dataset, but out of many millions of examples, we do have some of these.

We are working around this for now by filtering out linestrings with more than 10,000 vertices in our data cleaning. This seems to avoid the PostGIS crash (http://trac.osgeo.org/postgis/ticket/2975). Can you think of a way to implement a filter that better focuses on the problem case of high numbers of self nodes?

comment:11 by strk, 9 years ago

The total length of the line is 4.2975 m Simplifying the line with a tolerance of 1 mm (1/4000 of total length) brings down the number of points of that line from 52k to 55. Half a millimeter gives you 108. I'm not sure simplification is an acceptable process for the problem being approached.

The real improvement here (beside safety against crash) would be to have a short-circuit for the "intersect" case. Full "relate" computation would still be affected though.

comment:12 by royseto, 9 years ago

Thanks. Yes, adding a short-circuit for the "intersect" case would definitely help in this example.

The linestring in the testcase is in SRID 4326, so it needs to be transformed to an equal-area projection like SRID 2163 to get length in meters. By my calculations it is 412km long. Simplifying to a 5 meter tolerance gives 17362 points; simplifying to a 10 meter tolerance gives 5112 points; simplifying to a 25 meter tolerance gives 444 points. I don't think it is acceptable for us to simplify that much for our use case.

dell4db=# \dt st_intersects_testcase.*
                       List of relations
         Schema         |        Name        | Type  |  Owner
------------------------+--------------------+-------+----------
 st_intersects_testcase | linestring         | table | postgres
 st_intersects_testcase | linestring_hex_wkb | table | postgres
 st_intersects_testcase | mpoly              | table | postgres
 st_intersects_testcase | mpoly_hex_wkb      | table | postgres
(4 rows)
 
dell4db=# \d st_intersects_testcase.linestring
   Table "st_intersects_testcase.linestring"
 Column |           Type            | Modifiers
--------+---------------------------+-----------
 geom   | geometry(LineString,4326) |
 
dell4db=# select st_length(geom) from st_intersects_testcase.linestring;
    st_length
------------------
 4.29752878655587
(1 row)
 
dell4db=# select st_length(st_transform(geom, 2163))
dell4db-# from st_intersects_testcase.linestring;
    st_length
------------------
 411596.637780254
(1 row)
 
dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 5))
dell4db-# from st_intersects_testcase.linestring;
 st_npoints
------------
      17362
(1 row)
 
dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 10))
dell4db-# from st_intersects_testcase.linestring;
 st_npoints
------------
       5112
(1 row)
 
dell4db=# select st_npoints(st_simplify(st_transform(geom, 2163), 25))
dell4db-# from st_intersects_testcase.linestring;
 st_npoints
------------
        444
(1 row)

comment:13 by strk, 9 years ago

The interruptibility aspect of this will be dealt with in ticket #711

comment:14 by strk, 9 years ago

Short-circuiting aspect will be possible after porting a recent change in JTS, See #718 and http://sourceforge.net/p/jts-topo-suite/mailman/message/33252299/

comment:15 by royseto, 9 years ago

Thank you for moving this issue forward.

comment:16 by strk, 7 years ago

Milestone: 3.4.33.6.1

Ticket retargeted after milestone deleted

comment:17 by strk, 7 years ago

Milestone: 3.6.13.6.2

Ticket retargeted after milestone closed

comment:18 by strk, 7 years ago

Milestone: 3.6.23.6.3

Ticket retargeted after milestone closed

comment:19 by robe, 6 years ago

Milestone: 3.6.33.6.4

I haven't tested if this is still an issue or not. I'll test later but push for now.

comment:20 by strk, 4 years ago

Milestone: 3.6.4GEOS Fund Me

comment:21 by pramsey, 3 years ago

Milestone: GEOS Fund MeUpstream
Note: See TracTickets for help on using tickets.