Opened 19 months ago
Last modified 19 months ago
#5401 new defect
ST_Difference silently giving us wrong results
Reported by: | Lars Aksel Opsahl | Owned by: | pramsey |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS GEOS |
Component: | postgis | Version: | |
Keywords: | Cc: | Lars Aksel Opsahl |
Description
In this case, we have a fixed restricted area (blue), where users can select an area to use that are not already used by another user (red area), so we need to remove the area taken. The result the user saw is here https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/pictures/difference_test_01_POSTGIS_3_3_3.png (output from ST_Difference, the green rectangles).
The ST_Difference test and data used can be found here https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/sql/regress/rog_sf_difference.sql .
If a I use latest source compiled by Sandro a couple weeks ago on a server locally, this test case works, but then another test case fails longer down in the script.
Attachments (3)
Change History (23)
comment:1 by , 19 months ago
comment:2 by , 19 months ago
Summary: | ST_Differnce silently giving us wrong results → ST_Difference silently giving us wrong results |
---|
follow-up: 6 comment:3 by , 19 months ago
The shown image suggests it's the GEOS heuristic overlay reducing the precision of input way too much. What's the output of SELECT postgis_full_version()
?
comment:4 by , 19 months ago
The referenced data is in these files:
- https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/sql/regress/data/all_area.sql
- https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/blob/develop/src/test/sql/regress/data/used_area.sql
The query runs a difference between the UNION of used area polygons and the COLLECTION of all area polygons. This is technically an unsupported overlay operation as it involves collections. The fact it works is accidental and currently under discussion here: https://github.com/libgeos/geos/issues/797
That said, the result on my system is acceptable. I'm running:
POSTGIS="3.4.0dev 3.3.0rc2-960-g8b6d264b9" GEOS="3.12.0dev-CAPI-1.18.0"
by , 19 months ago
Attachment: | acceptable_result.png added |
---|
comment:5 by , 19 months ago
comment:6 by , 19 months ago
Replying to strk:
The shown image suggests it's the GEOS heuristic overlay reducing the precision of input way too much. What's the output of
SELECT postgis_full_version()
?
Here is an an output from the quary below on he different systems
SELECT 'difference_test_01 sum', COUNT(*), sum(round(ST_Area(r.geom,true))), sum(ST_NPoints(r.geom)) FROM difference_test_01 r;
--Wrong result db09test --POSTGIS="3.2.0 c3e3cc0" [EXTENSION] PGSQL="120" GEOS="3.10.1-CAPI-1.16.0" SFCGAL="1.3.7" PROJ="8.2.0" GDAL="GDAL 3.4.0, released 2021/11/04" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER --ON Postgresql 12 --result = difference_test_01 sum|9|12374747|72 --Wrong result localhost --POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY --ON Postgresql 14 --result = difference_test_01 sum|9|12374747|72 --Wrong result CI --POSTGIS="3.3.3 2355e8e" [EXTENSION] PGSQL="120" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY --ON Postgresql 12, 13 and 14 --result = difference_test_01 sum|8|11848089|67 --But it seems to work on a local server by Sandro with lates code from source (db01tempdb) --POSTGIS="3.4.0dev 3.3.0rc2-958-g4c776d418" [EXTENSION] PGSQL="120" GEOS="3.9.1-CAPI-1.14.2" SFCGAL="1.3.7" PROJ="7.2.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY --ON Postgresql 12 --result = difference_test_01 sum|1551|15799749|412840
comment:7 by , 19 months ago
This code suggested by Sandro seems to work
CREATE TABLE difference_test_01 AS SELECT ST_Difference( a.geo, ( SELECT ST_Union(u.geo) FROM used_area u ) ) AS geom FROM all_area a;
comment:8 by , 19 months ago
I have noticed that I did not get any emails about changes in this ticket, so I checked and may email was not in this system any more. (Yes it's long time since last email yes, added it again now)
comment:9 by , 19 months ago
For what it's worth I just ran the original example (not Sandro's SQL) and it output what (at least visually) seem to be perfectly correct results (no crazy precision reduction) against GEOS 3.12. At a minimum, the mixed collection update in 3.12 at least allows processing to conclude.
comment:10 by , 19 months ago
Maybe add this as test case since it also worked with GEOS="3.9.1-CAPI-1.14.2" but not with 3.10, 3.11.
comment:11 by , 19 months ago
Unfortunately it's a little large to be a test case. Something smaller that exhibits the same behaviour would be nice.
comment:12 by , 19 months ago
I just pulled up PostGIS under 3.11 and I also didn't see precision reduction artefacts there either, using the
CREATE TABLE difference_test_01 AS SELECT (ST_Dump((ST_Difference(pt.geo,u.geo)))).geom AS geom FROM ( SELECT ST_Collect(distinct pt.geo) AS geo FROM all_area pt WHERE ST_IsValid(pt.geo)) AS pt, ( SELECT ST_Union(geo) AS geo FROM used_area u WHERE ST_IsValid(geo) ) AS u;
formula to test.
comment:13 by , 19 months ago
On this CI job failed https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/pipelines/893035719
formula to test.
difference_test_01 sum|9|12374747|72
Versions
POSTGIS="3.3.3 2355e8e" [EXTENSION] PGSQL="120" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY
But on my local pipeline we see this
formula to test.
difference_test_01 sum|8|11848089|67
Versions
POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY
Using Sandro code
CREATE TABLE difference_test_01 AS SELECT ST_Difference( a.geo, ( SELECT ST_Union(u.geo) FROM used_area u ) ) AS geom FROM all_area a
formula to test.
difference_test_01 sum | 1719 | 15799744 | 442752
Versions
POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY
Another thing with the Sandro code the time used on my local mac
Time: 1173.537 ms (00:01.174)
But with original code it was much slower (and giving the wrong result the systems I tested on)
Time: 48452.738 ms (00:48.453)
At your test you get 3874 in the count column so something has changed yes, because with the Sandro code count number is 1719 but area is almost the same.
Tests that behave this differently on system that are almost the same is not a good thing, since seems difficult to exactly pin point exact code on where this problem originates from (works on older geos GEOS="3.9.1-CAPI-1.14.2").
With exceptions and maybe a hint on how to try rewrite the SQL in some way and where in your data the problems is, gives us a nice option to fix "bad" data like we have here.
The result from using Postgis Topology is below and thats almost the same as from the Sandro way code.
formula to test.
topo_difference_test_01 sum|1734|15799747|440021
Versions
POSTGIS="3.2.3 2f97b6c" [EXTENSION] PGSQL="140" GEOS="3.11.0-CAPI-1.17.0" PROJ="9.0.1" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY
comment:14 by , 19 months ago
This seems like a much common problem then I expected, because testers yesterday was testing different places before the patch was out and they did find more areas like this, where borders has been snapped quite hard.
After the fix it follows the property borders correctly
The test server used by tester are running on postgresql 12
POSTGIS="3.2.0 c3e3cc0" [EXTENSION] PGSQL="120" GEOS="3.10.1-CAPI-1.16.0" SFCGAL="1.3.7" PROJ="8.2.0" GDAL="GDAL 3.4.0, released 2021/11/04" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY RASTER
comment:15 by , 19 months ago
Lars this ticket is best used to tackle the unexpected precision reduction, so the aim here would be to reproduce the issue, not to work around it by using a different SQL statement.
As neither me nor Paul are able to reproduce the issue it isn't easy for us to check if the this bug was already fixed or it depends on some other characteristic on those systems that are still affected.
It would help in trying to reproduce the issue to first of all avoid movable parts, so for example you could save the two operands of the ST_Difference call in a table have some static data to work with. This could be done as such:
CREATE TABLE t5401 AS SELECT ( SELECT ST_Collect(distinct pt.geo) FROM all_area pt WHERE ST_IsValid(pt.geo)) g1, ( SELECT ST_Union(geo) AS geo FROM used_area u WHERE ST_IsValid(geo) ) g2;
The check if you can reproduce the bug with something like:
SELECT ST_Difference(g1,g2) FROM t5401;
Then attach a dump of that table in this ticket.
comment:16 by , 19 months ago
I've installed GEOS-3.10.1 and enabled debug, here's the interesting log, with input that I can attach later, if needed:
Trying with OverlayNGRobust OverlayNGRobust: IllegalArgumentException: Argument must be Polygonal or LinearRing Trying with original input. Original exception: TopologyException: no outgoing dirEdge found at 10.907027451935953 64.968978126148144 Trying with Common Bits Remover (CBR) CBR: TopologyException: no outgoing dirEdge found at 0.90702745193595213 0.96897812614814161 Trying with snapping Computed snap tolerance: 7.5270739785750608223e-11 Computed common bits: 10 64 ...
It's interesting to note that IllegalArgumentException is NOT interrupting the operations, but it's rather considered as a triggering event to move to further attempts, which is probably WRONG. That's the link with https://github.com/libgeos/geos/issues/797 I had in mind, although I do realize that one is about mixed-typed collections really.
follow-up: 20 comment:17 by , 19 months ago
Just to complete the picture, the debugging logs continue to the precision reduction heuristics which end up with the garbage result. Detailed logs are in https://github.com/libgeos/geos/issues/924#issuecomment-1593160531
It turns out this problem is 2-fold:
- GEOMETRYCOLLECTION should not be passed to ST_Difference but GEOS fails to report this —→ https://github.com/libgeos/geos/issues/925
- GEOS attempts to still support the operation goes too far —→ https://github.com/libgeos/geos/issues/924
comment:18 by , 19 months ago
Milestone: | PostGIS 3.3.4 → PostGIS GEOS |
---|---|
Version: | 3.3.x |
comment:19 by , 19 months ago
For the record: GEOS-3.10.4 added support for simple (uniform?) GeometryCollection inputs, which is why builds from 3.10 branch are not capable of reproducing the problem.
comment:20 by , 19 months ago
Replying to strk:
Just to complete the picture, the debugging logs continue to the precision reduction heuristics which end up with the garbage result. Detailed logs are in https://github.com/libgeos/geos/issues/924#issuecomment-1593160531
It turns out this problem is 2-fold:
- GEOMETRYCOLLECTION should not be passed to ST_Difference but GEOS fails to report this —→ https://github.com/libgeos/geos/issues/925
- GEOS attempts to still support the operation goes too far —→ https://github.com/libgeos/geos/issues/924
Thanks, very nice. I then assume then you have what you need, but if you need more we create branch for you on https://gitlab.com/nibioopensource/resolve-overlap-and-gap .
In the script we check the input data are valid before ST_Difference is dobe