Opened 19 months ago

Last modified 19 months ago

#5401 new defect

ST_Difference silently giving us wrong results

Reported by: Lars Aksel Opsahl Owned by: pramsey
Priority: medium Milestone: PostGIS GEOS
Component: postgis Version:
Keywords: Cc: Lars Aksel Opsahl

Description

In this case, we have a fixed restricted area (blue), where users can select an area to use that are not already used by another user (red area), so we need to remove the area taken. The result the user saw is here https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/pictures/difference_test_01_POSTGIS_3_3_3.png (output from ST_Difference, the green rectangles).

The ST_Difference test and data used can be found here https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/sql/regress/rog_sf_difference.sql .

If a I use latest source compiled by Sandro a couple weeks ago on a server locally, this test case works, but then another test case fails longer down in the script.

Attachments (3)

acceptable_result.png (187.8 KB ) - added by strk 19 months ago.
Screenshot 2023-06-15 at 12.47.51.png (436.1 KB ) - added by Lars Aksel Opsahl 19 months ago.
before_fix_user_test_02
Screenshot 2023-06-15 at 13.02.31.png (1.1 MB ) - added by Lars Aksel Opsahl 19 months ago.
after_fix_test02

Download all attachments as: .zip

Change History (23)

comment:1 by Lars Aksel Opsahl, 19 months ago

In the script we check the input data are valid before ST_Difference is dobe

CREATE TABLE difference_test_01 AS 
SELECT (ST_Dump((ST_Difference(pt.geo,u.geo)))).geom AS geom FROM 
( SELECT ST_Collect(distinct pt.geo) AS geo FROM all_area pt WHERE ST_IsValid(pt.geo)) AS pt, 
( SELECT ST_Union(geo) AS geo FROM used_area u WHERE ST_IsValid(geo) ) AS u;

comment:2 by Lars Aksel Opsahl, 19 months ago

Summary: ST_Differnce silently giving us wrong resultsST_Difference silently giving us wrong results

comment:3 by strk, 19 months ago

The shown image suggests it's the GEOS heuristic overlay reducing the precision of input way too much. What's the output of SELECT postgis_full_version() ?

comment:4 by strk, 19 months ago

The referenced data is in these files:

The query runs a difference between the UNION of used area polygons and the COLLECTION of all area polygons. This is technically an unsupported overlay operation as it involves collections. The fact it works is accidental and currently under discussion here: https://github.com/libgeos/geos/issues/797

That said, the result on my system is acceptable. I'm running:

POSTGIS="3.4.0dev 3.3.0rc2-960-g8b6d264b9" GEOS="3.12.0dev-CAPI-1.18.0"

by strk, 19 months ago

Attachment: acceptable_result.png added

comment:5 by strk, 19 months ago

A portion of the acceptable result (result is green, the brown part is the only exposed part of the all_areas input otherwise fully covered by the green; the yellow is the other input)

in reply to:  3 comment:6 by Lars Aksel Opsahl, 19 months ago

Replying to strk:

The shown image suggests it's the GEOS heuristic overlay reducing the precision of input way too much. What's the output of SELECT postgis_full_version() ?

Here is an an output from the quary below on he different systems

SELECT 'difference_test_01 sum', COUNT(*), sum(round(ST_Area(r.geom,true))), sum(ST_NPoints(r.geom)) FROM difference_test_01 r;

--Wrong result db09test
--POSTGIS="3.2.0 c3e3cc0" [EXTENSION] PGSQL="120" GEOS="3.10.1-CAPI-1.16.0" SFCGAL="1.3.7" PROJ="8.2.0" GDAL="GDAL 3.4.0, released 2021/11/04" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER
--ON Postgresql 12
--result = difference_test_01 sum|9|12374747|72

--Wrong result localhost
--POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY
--ON Postgresql 14
--result = difference_test_01 sum|9|12374747|72

--Wrong result CI
--POSTGIS="3.3.3 2355e8e" [EXTENSION] PGSQL="120" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY
--ON Postgresql 12, 13 and 14
--result = difference_test_01 sum|8|11848089|67

--But it seems to work on a local server by Sandro with lates code from source (db01tempdb)
--POSTGIS="3.4.0dev 3.3.0rc2-958-g4c776d418" [EXTENSION] PGSQL="120" GEOS="3.9.1-CAPI-1.14.2" SFCGAL="1.3.7" PROJ="7.2.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY
--ON Postgresql 12
--result = difference_test_01 sum|1551|15799749|412840


Last edited 19 months ago by Lars Aksel Opsahl (previous) (diff)

comment:7 by Lars Aksel Opsahl, 19 months ago

This code suggested by Sandro seems to work

CREATE TABLE difference_test_01 AS
SELECT
  ST_Difference(
    a.geo,
    ( SELECT ST_Union(u.geo) FROM used_area u )
  )
AS geom
FROM all_area a;

comment:8 by Lars Aksel Opsahl, 19 months ago

I have noticed that I did not get any emails about changes in this ticket, so I checked and may email was not in this system any more. (Yes it's long time since last email yes, added it again now)

comment:9 by pramsey, 19 months ago

For what it's worth I just ran the original example (not Sandro's SQL) and it output what (at least visually) seem to be perfectly correct results (no crazy precision reduction) against GEOS 3.12. At a minimum, the mixed collection update in 3.12 at least allows processing to conclude.

comment:10 by Lars Aksel Opsahl, 19 months ago

Maybe add this as test case since it also worked with GEOS="3.9.1-CAPI-1.14.2" but not with 3.10, 3.11.

comment:11 by pramsey, 19 months ago

Unfortunately it's a little large to be a test case. Something smaller that exhibits the same behaviour would be nice.

comment:12 by pramsey, 19 months ago

I just pulled up PostGIS under 3.11 and I also didn't see precision reduction artefacts there either, using the

CREATE TABLE difference_test_01 AS 
SELECT (ST_Dump((ST_Difference(pt.geo,u.geo)))).geom AS geom FROM 
( SELECT ST_Collect(distinct pt.geo) AS geo FROM all_area pt WHERE ST_IsValid(pt.geo)) AS pt, 
( SELECT ST_Union(geo) AS geo FROM used_area u WHERE ST_IsValid(geo) ) AS u;

formula to test.

Version 0, edited 19 months ago by pramsey (next)

comment:13 by Lars Aksel Opsahl, 19 months ago

On this CI job failed https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/pipelines/893035719

formula to test.

difference_test_01 sum|9|12374747|72

Versions

POSTGIS="3.3.3 2355e8e" [EXTENSION] PGSQL="120" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY

But on my local pipeline we see this

formula to test.

difference_test_01 sum|8|11848089|67

Versions

POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY

Using Sandro code

CREATE TABLE difference_test_01 AS
SELECT
  ST_Difference(
    a.geo,
    ( SELECT ST_Union(u.geo) FROM used_area u )
  )
AS geom
FROM all_area a

formula to test.

difference_test_01 sum |  1719 | 15799744 | 442752

Versions

POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY

Another thing with the Sandro code the time used on my local mac

Time: 1173.537 ms (00:01.174)

But with original code it was much slower (and giving the wrong result the systems I tested on)

Time: 48452.738 ms (00:48.453)

At your test you get 3874 in the count column so something has changed yes, because with the Sandro code count number is 1719 but area is almost the same.

Tests that behave this differently on system that are almost the same is not a good thing, since seems difficult to exactly pin point exact code on where this problem originates from (works on older geos GEOS="3.9.1-CAPI-1.14.2").

With exceptions and maybe a hint on how to try rewrite the SQL in some way and where in your data the problems is, gives us a nice option to fix "bad" data like we have here.

The result from using Postgis Topology is below and thats almost the same as from the Sandro way code.

formula to test.

topo_difference_test_01 sum|1734|15799747|440021

Versions

POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY

by Lars Aksel Opsahl, 19 months ago

before_fix_user_test_02

by Lars Aksel Opsahl, 19 months ago

after_fix_test02

comment:14 by Lars Aksel Opsahl, 19 months ago

This seems like a much common problem then I expected, because testers yesterday was testing different places before the patch was out and they did find more areas like this, where borders has been snapped quite hard.

before_fix_user_test_02

After the fix it follows the property borders correctly

after_fix_test02

The test server used by tester are running on postgresql 12

POSTGIS="3.2.0 c3e3cc0" [EXTENSION] PGSQL="120" GEOS="3.10.1-CAPI-1.16.0" SFCGAL="1.3.7" PROJ="8.2.0" GDAL="GDAL 3.4.0, released 2021/11/04" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER
Last edited 19 months ago by Lars Aksel Opsahl (previous) (diff)

comment:15 by strk, 19 months ago

Lars this ticket is best used to tackle the unexpected precision reduction, so the aim here would be to reproduce the issue, not to work around it by using a different SQL statement.

As neither me nor Paul are able to reproduce the issue it isn't easy for us to check if the this bug was already fixed or it depends on some other characteristic on those systems that are still affected.

It would help in trying to reproduce the issue to first of all avoid movable parts, so for example you could save the two operands of the ST_Difference call in a table have some static data to work with. This could be done as such:

CREATE TABLE t5401 AS SELECT 
( SELECT ST_Collect(distinct pt.geo) FROM all_area pt WHERE ST_IsValid(pt.geo)) g1, 
( SELECT ST_Union(geo) AS geo FROM used_area u WHERE ST_IsValid(geo) ) g2;

The check if you can reproduce the bug with something like:

SELECT ST_Difference(g1,g2) FROM t5401;

Then attach a dump of that table in this ticket.

comment:16 by strk, 19 months ago

I've installed GEOS-3.10.1 and enabled debug, here's the interesting log, with input that I can attach later, if needed:

Trying with OverlayNGRobust
OverlayNGRobust: IllegalArgumentException: Argument must be Polygonal or LinearRing
Trying with original input.
Original exception: TopologyException: no outgoing dirEdge found at 10.907027451935953 64.968978126148144
Trying with Common Bits Remover (CBR)
CBR: TopologyException: no outgoing dirEdge found at 0.90702745193595213 0.96897812614814161
Trying with snapping 
Computed snap tolerance: 7.5270739785750608223e-11
Computed common bits: 10 64
...

It's interesting to note that IllegalArgumentException is NOT interrupting the operations, but it's rather considered as a triggering event to move to further attempts, which is probably WRONG. That's the link with https://github.com/libgeos/geos/issues/797 I had in mind, although I do realize that one is about mixed-typed collections really.

comment:17 by strk, 19 months ago

Just to complete the picture, the debugging logs continue to the precision reduction heuristics which end up with the garbage result. Detailed logs are in https://github.com/libgeos/geos/issues/924#issuecomment-1593160531

It turns out this problem is 2-fold:

  1. GEOMETRYCOLLECTION should not be passed to ST_Difference but GEOS fails to report this —→ https://github.com/libgeos/geos/issues/925
  2. GEOS attempts to still support the operation goes too far —→ https://github.com/libgeos/geos/issues/924

comment:18 by strk, 19 months ago

Milestone: PostGIS 3.3.4PostGIS GEOS
Version: 3.3.x

comment:19 by strk, 19 months ago

For the record: GEOS-3.10.4 added support for simple (uniform?) GeometryCollection inputs, which is why builds from 3.10 branch are not capable of reproducing the problem.

in reply to:  17 comment:20 by Lars Aksel Opsahl, 19 months ago

Replying to strk:

Just to complete the picture, the debugging logs continue to the precision reduction heuristics which end up with the garbage result. Detailed logs are in https://github.com/libgeos/geos/issues/924#issuecomment-1593160531

It turns out this problem is 2-fold:

  1. GEOMETRYCOLLECTION should not be passed to ST_Difference but GEOS fails to report this —→ https://github.com/libgeos/geos/issues/925
  2. GEOS attempts to still support the operation goes too far —→ https://github.com/libgeos/geos/issues/924

Thanks, very nice. I then assume then you have what you need, but if you need more we create branch for you on https://gitlab.com/nibioopensource/resolve-overlap-and-gap .

Note: See TracTickets for help on using tickets.