Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#982 closed defect (worksforme)

Postgis ST_ConcaveHull Runtime GEOS Problem?

Reported by: tsw Owned by: geos-devel@…
Priority: major Milestone:
Component: Default Version: 3.7.0
Severity: Unassigned Keywords:
Cc:

Description

A concave hull query with the attached data set under Postgresql 9.4 / PostGIS 2.2 / Geos 3.5.0 runs in under a second.

The same query using the same data on Postgresql 9.6 / PostGIS 2.5.2 / GEOS 3.7.1 is still running after 30 minutes.

I've also attached a perf top screen shot which to me implies this is a GEOS problem.

Attachments (2)

Screenshot from 2019-08-02 11-25-34.png (224.5 KB ) - added by tsw 3 years ago.
screenshot of perf top while query is running
brokenlines.zip (70.6 KB ) - added by tsw 3 years ago.
Data set used to generate error

Download all attachments as: .zip

Change History (24)

by tsw, 3 years ago

screenshot of perf top while query is running

by tsw, 3 years ago

Attachment: brokenlines.zip added

Data set used to generate error

comment:1 by robe, 3 years ago

This might be related to #867

comment:2 by dbaston, 3 years ago

Resolution: invalid
Status: newclosed

I'm more inclined to think this is a PostGIS issue: note the addition of a ST_Union call at https://github.com/postgis/postgis/commit/ef537b71ca46bfab44f481877b98bd59020f6acb

Feel free to reopen if I'm mistaken.

comment:3 by komzpa, 3 years ago

Resolution: invalid
Status: closedreopened

PostGIS ticket: https://trac.osgeo.org/postgis/ticket/4469

Here author says he reverted this change and it made no difference for him: https://www.mail-archive.com/postgis-users@lists.osgeo.org/msg07593.html

I tried commenting out the line in the ST_ConcaveHull function you
suggested but that doesn't make a difference. I suspect there is some
underlying library that has changed.

@dbaston would be cool if there's a way to express your intention without closing tickets as "invalid", as this one is a valid one.

comment:4 by dbaston, 3 years ago

@Komzpa, would be cool if you could help out by running this query on a single version of PostGIS with two different versions of GEOS to confirm the performance difference.

comment:5 by dbaston, 3 years ago

FWIW, if I run ST_ConcaveHull(geom, 0.99) using GEOS master against PostGIS trunk, the query takes forever. If I run it using GEOS master against the ST_ConcaveHull implementation from PostGIS 2.3.10 it completes instantly.

comment:6 by tsw, 3 years ago

I've further tested this today and calls to public.ST_UnaryUnion in ST_Concave hull don't make any difference as they are only applied to polygon geometries, and in this case, lines are the problem.

HOWEVER, the ST_UnaryUnion call made at the end of _st_concavehull added in response to issue 3638 is the source of this problem. If this is commented out the query runs instantly. Not sure if there is another way to address 3638, but this change causes the slowdown I document here.

comment:7 by komzpa, 3 years ago

So, slow part is ST_Union / GEOSUnaryUnion, unioning the attached lines and their hull posted below (almost-hull, but that's PostGIS side issue). Brokenlines may seem to require insane amount of noding, but in the actual call almost all of it can be skipped if some kind of shortcut like "drop the line if it's fully covered by polygon" is implemented.

01030000000100000042000000F5B206C11EB668C1A3DB06B4F3945241D6A3BD30FD8E69C1752C2DB10AE35241C5741F4549836AC18B2952CD51AF5241314B24B75C7E6BC1F2B3CF46A09D5241AE80CE197F826CC1E5ED0FCC82DB52410737024C2A816DC1FAA64EE3ACA9524144E7B718A0AC6DC1847E7CB252F2524148614DCBEEAF6DC12A6C413F20B753416D0E894BC8AC6DC1004BD635398E5541B5EF09D66FB76DC11333D39C3C4F5741EC4FC50C1DB96DC1B95A8CC3E62C59416C03A8C8878D6DC1782FDBAE04385B41EA03332696976DC1DA94C47A224D5D411E83C76FB58E6DC1E1998398DA4D5F4157B3DB9D3AA26DC1D070D3CF58A260414953F11917976DC1D397E6734CB76141AAC276EA79906DC1C01C80BB4DB762411E2B0F39F9A26DC13AA71480AAB6634176542C93C8AA6DC101E95A312AC26441F680C35BBFCC6CC1659DBD4FFE06654106057F131FEF6BC14C35E89109E16441FD69DEC1D7E96AC1C51759BC8BF06441E93D903AFD056AC12608740469E56441A176EE44DDE168C15A5AEAF0CC1A65419658C592C70368C13FC14B62D3E464412FE2810180C166C1C2C51F205A0C6541FE6ACDEBDDAF65C12DFBC60E700E6541754B4E2C7EC964C1EB2698C6C6FE644197EDEFD64BB563C1EAB8CFAB641B6541E54E22C5808A62C19D4EA09C8BEA64418C1AB0FDF86B61C1D7B1D2793B126541AF8D3442203160C1FF9EB62A31FB6441D7F24CE0CF425EC1C6575CA2050665418A4A823671025CC1DB0EB7D6D61B65412A02FA1DF2AF5AC1A0094F9F94196541575AD1585FA859C1C2B7C7BF80FC6441BA448FB464BF57C1919B60618EF46441ADA11AC9B19856C17B221DC0E7A664413712C6B19E7355C132F217CF469464414AF1AC3BE53F55C17958BDA8337C644140733177D43D55C1B455E12AB07C6341B69FE6EFFC7E55C10DE6DC90087D6241B43C89C96E9D55C1446A6575597C6141B2D92BA3E0BB55C17AEEED59AA7B604115195715BB8955C1D2B05EDBF89C5E415D24EBE8D14055C123F095A0AFE95C41D8BDDB89584D55C114AEA4EB16185B415D43D6E2D37755C1920F016981BB5841ED2607869F5255C1DF5EE00522015741B1252C42104955C1C6ABF4692D375541138E4257DB4855C12D505ED8BE4053419CEC2D0428A555C1D126215F88D05241DEF29B3AF91A57C1A48934D951B95241C1862ADB8C4258C1C7C23F5AFDDB5241B34E72F21B8959C1666EE238F4A052416D6F05E4074C5BC1F95119FF039C5241F0FCAC5B079C5DC18700A026DCC65241E11D8AFDA8C95FC18E9F026CC1965241B6E579402F0C61C17C1ADF3145BF5241CAB7CD09683362C167145B3757EB52411B8F5A4D976963C1D462702E9B9D524166AA2F87D67364C1BA913E32BEC0524119F88B6B61B665C18236FAB2B4B952411579F9CB17DA66C1E2ADB1D30B955241F7D62477E6C467C1EDA55BB3FCC45241F5B206C11EB668C1A3DB06B4F3945241

comment:8 by dbaston, 3 years ago

So I am again inclined to close this as a PostGIS issue.

comment:9 by komzpa, 3 years ago

@dbaston PostGIS is calling GEOSUnion in natural way as part of higher level algorithm.

Single GEOSUnaryUnion call is working unexpectedly slow. This manifests as slow PostGIS. That manifestation is problem of PostGIS. The slowness of GEOSUnaryUnion is GEOS problem, will be there whether ST_ConcaveHull is rewritten or not, and this ticket has everything to reproduce it.

If GEOS is not concerned of performance of overlay ops then this ticket can be closed together with #867. I hope this is not the case.

comment:10 by dbaston, 3 years ago

@Komzpa This ticket description claims that there is a performance regression in GEOS. Is there a performance regression, or something unique about these inputs, or are you just saying that GEOS is generally not as fast as you'd like it to be?

comment:11 by dbaston, 3 years ago

In other words, should the ticket be reframed as "please optimize the case of unary union of lines with many intersections" ?

comment:12 by komzpa, 3 years ago

@dbaston this ticket does not claim there's performance regression in GEOS. It claims there's a performance issue in current PostGIS profiling of which ends up in GEOS.

Best way of handling such issues it to link (or block) them to possible fix implementation tickets and recheck whether the issue still stands after each of those. This one can be blocked on overlays-ng implementation if that's ticketed, and on in-GEOS concave hull if that might become a thing. This way original conversation is not lost, and complexity of solution space is captured.

comment:13 by dbaston, 3 years ago

But this has nothing to do with concave hull, in the end? The issue is that you'd like the unary union of many lines to be faster, no? Would that not be better captured in a ticket that leaves PostGIS, concave hull, and all other noise out of it?

comment:14 by tsw, 3 years ago

It may be the concave hull with this data set identified an underlying issue.

As I don't understand the underlying algorithm however I'll point out a few things. If you run the test ST_Concavehull query as it currently stands in PostGIS and GEOS and buffer the lines by 10 meters, it is still never completes. I'm not sure that indicates a GEOS problem or not.

Manipulating the _st_concavehull code to remove ST_UnaryUnion fixes the performance but then not everything is encapsulated by the hull.

I'll leave it to you folks to determine the cause and the fix, but wanted to provide as much information as possible to help diagnose the cause.

Last edited 3 years ago by tsw (previous) (diff)

comment:15 by dbaston, 3 years ago

Thanks for the report, @tsw, and sorry for leaving you caught in the crossfire.

comment:16 by mdavis, 3 years ago

Wading in here... to my mind the use of ST_Union to cover up deficiencies in the Concave Hull algorithm seems very fishy. That should not be needed if the Concave Hull algorithm was correct.

So while this is hitting the GEOS issue with slow UnaryUnion (probably #867), it's not a direct example of that problem.

Since the Concave Hull algorithm is in PostGIS this should be ticketed there as a request for an improved algorithm.

comment:17 by mdavis, 3 years ago

Now that I've looked at the data in this test case, I can see that this dataset is absolutely a worst-case scenario for union. The reason is that there are so many intersections between the lines, it is going to produce on the order of O(n2) node points. This means the noding process has to run in slow O(n2) time. Also, memory usage is going to be very large.

So this is an example of where using Union within the Concave Hull algorithm is not a good idea.

comment:18 by dbaston, 3 years ago

(If I recall, it was something like 2.3 million intersections)

comment:19 by mdavis, 3 years ago

From input containing about 3000 line segments. So yep, O(n2).

comment:20 by mdavis, 3 years ago

Here's the stats from JTS:

A : GEOMETRYCOLLECTION - 1007 elements, 3485 pts

Len: 1.0414378207800608E10

Result : MULTILINESTRING - 1370751 elements, 2742973 pts

Len: 1.041437820780099E10

Time: 72.4 s

Because there are a lot of >= 3 point linestrings, the result component count is more than the square of the input component count. It's good to see the close correlation of the length of the input and output datasets.

Last edited 3 years ago by mdavis (previous) (diff)

comment:21 by dbaston, 3 years ago

Resolution: worksforme
Status: reopenedclosed

I'm not sure what success looks like for this ticket, some comments to which seem to be asking for arbitrary operations to complete in no measurable time, but hopefully the overlay performance improvements in 3.8 will be adequate.

comment:22 by mdavis, 3 years ago

Would success look like a performance which is much closer to the old time value? Maybe not "under a second", but no more than a few seconds (hopefully).

Note: See TracTickets for help on using tickets.