Opened 6 months ago

Closed 6 months ago

#5737 closed defect (fixed)

ST_SimplifyPreserveTopology non-deterministic behavior?

Reported by: julius6
Owned by: pramsey
Priority: critical
Milestone: PostGIS 3.4.3
Component: postgis
Version: 3.4.x
Keywords:
Cc:

Description

Hi,

I came across what seems to be a bug in ST_SimplifyPreserveTopology. Any advice/help is much appreciated.

In my workflow, deterministic behavior (the same output from the same input data) is crucial, but ST_SimplifyPreserveTopology returns different geometries across consecutive runs.

I put together a test case to demonstrate the issue.

test_simplify_fun.sql contains the function test_simplify().

postgis333.log and postgis341.log are the outputs of running the script with psql against postgres 14/postgis 3.3.3 and postgres 15/postgis 3.4.1 respectively.

For more details on the various library versions (geos, …), the log files also contain the output of select version(), postgis_full_version() on the two systems.

Quick workflow explanation:

  • poly.the_geom is the test geometry: a valid polygon, srid 3857, 4613 points
  • ST_SimplifyPreserveTopology(poly.the_geom, 200) is called in subqueries s1 and s2.
  • ST_Simplify(poly.the_geom, 200) would produce an invalid geometry, so this is a case where "preserve topology" comes into play.
  • A few reports are run on the two simplified geometries
    • wkt_difference: difference, as text
    • simp_equals: result of st_equals
    • simp_ordering_equals: result of st_orderingequals
    • eq: result of bare equality
    • wkb_eq: result of equality between wkbs
    • n1simp: how many points in first simplified geometry
    • n2simp: how many points in second simplified geometry
    • n1rem: how many points after removing repeated points from first simplified geometry
    • n2rem: how many points after removing repeated points from second simplified geometry
  • test_simplify() is called 10 times in the same database session
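
The comparison described above could be sketched roughly as follows. This is not the attached test_simplify_fun.sql: the table poly, the column the_geom, the tolerance 200, and the report column names come from the description, while the query layout, the single-row assumption, and the use of MATERIALIZED are assumptions made for illustration.

```sql
-- Sketch only: assumes a table poly holding a single row with geometry column the_geom.
-- MATERIALIZED (PostgreSQL 12+) forces each simplified geometry to be computed once,
-- so every report column below compares the same two values.
WITH s1 AS MATERIALIZED (
    SELECT ST_SimplifyPreserveTopology(the_geom, 200) AS g FROM poly
),
s2 AS MATERIALIZED (
    SELECT ST_SimplifyPreserveTopology(the_geom, 200) AS g FROM poly
)
SELECT
    ST_AsText(ST_Difference(s1.g, s2.g))      AS wkt_difference,       -- difference, as text
    ST_Equals(s1.g, s2.g)                     AS simp_equals,          -- topological equality
    ST_OrderingEquals(s1.g, s2.g)             AS simp_ordering_equals, -- same points, same order
    s1.g = s2.g                               AS eq,                   -- bare geometry equality
    ST_AsBinary(s1.g) = ST_AsBinary(s2.g)     AS wkb_eq,               -- byte-wise WKB comparison
    ST_NPoints(s1.g)                          AS n1simp,
    ST_NPoints(s2.g)                          AS n2simp,
    ST_NPoints(ST_RemoveRepeatedPoints(s1.g)) AS n1rem,
    ST_NPoints(ST_RemoveRepeatedPoints(s2.g)) AS n2rem
FROM s1 CROSS JOIN s2;
```

Without MATERIALIZED, the planner is allowed to inline a subquery and evaluate the simplification separately for each column that references it. If the function is non-deterministic, that alone could make columns within a single row disagree with each other, so pinning the evaluation may help separate that effect from the underlying problem.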

The logs have been generated like this: psql -d <db_uri> -f test_simplify_fun.sql > some.log

As the logs show, the results are inconsistent across multiple runs:

  • the two simplified geometries have different point count (n1simp, n2simp change)
  • the difference between the two simplified geometries changes (wkt_difference changes)
  • simp_equals=t, wkt_difference ≠ 'POLYGON EMPTY'. st_equals is true but the difference is not empty.
  • simp_equals=f, simp_ordering_equals=t. st_orderingequals is the more restrictive test, so it should not be true when st_equals is false.
  • n1rem > n1simp (or n2rem > n2simp). st_removerepeatedpoints appears to add points.
  • simp_equals=f, wkb_eq=t. The WKB is the same but the geometries are reported as not equal.

I also ran a test calling an old geos (3.5) directly from C++, and the results are the same (limited to simplify preserve topology not returning the same geometry across multiple calls).

Could it be a bug in geos? Uninitialized variables? Some geos internal state that doesn't reset?

Any suggestions for solving the non-determinism problem?

Thanks in advance,
Alessandro

Attachments (3)

test_simplify_fun.sql (146.7 KB ) - added by julius6 6 months ago.
postgis333.log (4.1 KB ) - added by julius6 6 months ago.
postgis341.log (5.4 KB ) - added by julius6 6 months ago.

Change History (6)

by julius6, 6 months ago

Attachment: test_simplify_fun.sql added

by julius6, 6 months ago

Attachment: postgis333.log added

by julius6, 6 months ago

Attachment: postgis341.log added

comment:1 by mdavis, 6 months ago

I ran this with

POSTGIS="3.4.0 0874ea3" [EXTENSION] PGSQL="160" GEOS="3.12.0-CAPI-1.18.0" PROJ="9.2.1 NETWORK_ENABLED=OFF URL_ENDPOINT=https://cdn.proj.org USER_WRITABLE_DIRECTORY=/Users/mdavis/Library/Application Support/proj DATABASE_PATH=/Applications/Postgres.app/Contents/Versions/16/share/proj/proj.db" LIBXML="2.11.5" LIBJSON="0.17" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)"

and got a clean run (see output below).

There may have been some changes which either fixed a subtle bug, or made the operation more stable. Are you able to upgrade to GEOS 3.12?

 run | valid | npoints |   gtype    | wkt_difference | simp_equals | simp_ordering_equals | eq | wkb_eq | n1simp | n2simp | n1rem | n2rem 
-----+-------+---------+------------+----------------+-------------+----------------------+----+--------+--------+--------+-------+-------
   1 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   2 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   3 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   4 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   5 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   6 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   7 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   8 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
   9 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605
  10 | t     |    4613 | ST_Polygon | POLYGON EMPTY  | t           | t                    | t  | t      |    605 |    605 |   605 |   605

comment:2 by julius6, 6 months ago

I just ran a test on Windows with the latest bundle (postgres 15.7, postgis 3.4.2, geos 3.12.1) and I can confirm it works.

I guess the root cause was fixed here:

https://github.com/libgeos/geos/pull/718

comment:3 by mdavis, 6 months ago

Resolution: fixed
Status: new → closed