Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#3777 closed defect (fixed)

POINT EMPTY, POINT 0 0 anomaly

Reported by: robe Owned by: strk
Priority: medium Milestone: PostGIS 2.3.3
Component: postgis Version: 2.3.x
Keywords: Cc:

Description (last modified by robe)

As Pierre pointed out in:

https://lists.osgeo.org/pipermail/postgis-users/2017-June/042203.html

We've got what appears to be a bug with our handling of POINT EMPTY, POINT 0 0.

Tested on PostGIS 2.3 and 2.4

SELECT ST_AsText(geom) txt, count(*)
FROM (
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 1)'::geometry geom
UNION ALL
SELECT 'LINESTRING(0 0,0 1)'::geometry geom
UNION ALL
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
UNION ALL
SELECT 'POINT EMPTY'::geometry geom
) foo
GROUP BY geom;

Yields:

           txt            | count
--------------------------+-------
 POINT(0 0)               |     2
 POINT EMPTY              |     1
 POINT(0 0)               |     1
 LINESTRING(0 0,0 1)      |     1
 GEOMETRYCOLLECTION EMPTY |     1
 POINT(0 1)               |     1

when it should be yielding something along the lines of:

           txt            | count
--------------------------+-------
 POINT(0 0)               |     3
 LINESTRING(0 0,0 1)      |     1
 GEOMETRYCOLLECTION EMPTY |     2
 POINT(0 1)               |     1


Since POINT EMPTY and GEOMETRYCOLLECTION have null boxes they should aggregate together and since points and 2 point lines should have no boxes, they should be taken literally.

I did the same exercise in PostGIS 1.5.5 and got this:

           txt            | count
--------------------------+-------
 GEOMETRYCOLLECTION EMPTY |     4
 LINESTRING(0 0,0 1)      |     1
 GEOMETRYCOLLECTION EMPTY |     1
 POINT(0 1)               |     1
(4 rows)

which I attributed to the fact that at one point of time, since we had no concept of POINT EMPTY we always output the representation as GEOMETRYCOLLECTION EMPTY and I think there was one function that used POINT(0 0) as rep for POINT EMPTY. Though I could be mistaken.

Change History (17)

comment:1 by robe, 8 years ago

Description: modified (diff)

comment:2 by robe, 8 years ago

to remove ST_AsText out of the equation I did this:

SELECT geom, count(*)
FROM (
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 1)'::geometry geom
UNION ALL
SELECT 'LINESTRING(0 0,0 1)'::geometry geom
UNION ALL
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
UNION ALL
SELECT 'POINT EMPTY'::geometry geom
) foo
GROUP BY geom;

In PostGIS 2.3.2 and PostgreSQL 10beta1 on x86_64-w64-mingw32, compiled by gcc.exe (x86_64-win32-seh-rev1, Built by MinGW-W64 project) 4.8.3, 64-bit POSTGIS="2.4.0dev r15435" GEOS="3.7.0dev-CAPI-1.11.0 8fe2ce6" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.2.0, released 2017/04/28" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER

                                        geom                                        | count
------------------------------------------------------------------------------------+-------
 010100000000000000000000000000000000000000                                         |     2
 0101000000000000000000F87F000000000000F87F                                         |     1
 010100000000000000000000000000000000000000                                         |     1
 010200000002000000000000000000000000000000000000000000000000000000000000000000F03F |     1
 010700000000000000                                                                 |     1
 01010000000000000000000000000000000000F03F                                         |     1
(6 rows)

In PostgreSQL 9.0.17, compiled by Visual C++ build 1500, 32-bit POSTGIS="1.5.5" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.6.1, 21 August 2008" LIBXML="2.7.8" USE_STATS (procs from 1.5 r7360 need upgrade), this returns:

                                        geom                                        | count
------------------------------------------------------------------------------------+-------
 010700000000000000                                                                 |     4
 010200000002000000000000000000000000000000000000000000000000000000000000000000F03F |     1
 010700000000000000                                                                 |     1
 01010000000000000000000000000000000000F03F                                         |     1
(4 rows)

comment:3 by strk, 8 years ago

GROUP BY uses btree opclass, which uses bounding boxes, any EMPTY has the same bounding box (null).

comment:4 by robe, 8 years ago

strk yes, but that's not the point. The bug is the empties are not collapsing into null and neither are the POINT(0 0) when you do a group by. Why is one POINT(0 0) not the same as the other POINT(0 0)?

           txt            | count
--------------------------+-------
 POINT(0 0)               |     2
 POINT EMPTY              |     1
 POINT(0 0)               |     1
 LINESTRING(0 0,0 1)      |     1
 GEOMETRYCOLLECTION EMPTY |     1
 POINT(0 1)               |     1

comment:5 by robe, 8 years ago

okay strk per your suggestion, here is what UNION does:

SELECT ST_AsText(geom) txt
FROM (
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 1)'::geometry geom
UNION
SELECT 'LINESTRING(0 0,0 1)'::geometry geom
UNION 
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
UNION 
SELECT 'POINT EMPTY'::geometry geom
) foo;

In 2.4.0dev/2.3.2 it returns:

           txt
--------------------------
 POINT(0 0)
 POINT EMPTY
 POINT(0 0)
 LINESTRING(0 0,0 1)
 GEOMETRYCOLLECTION EMPTY
 POINT(0 1)
(6 rows)

In my PostGIS 1.5.5 returns

           txt
--------------------------
 GEOMETRYCOLLECTION EMPTY
 LINESTRING(0 0,0 1)
 GEOMETRYCOLLECTION EMPTY
 POINT(0 1)
(4 rows)

comment:6 by robe, 8 years ago

We must have had this problem for sometime. PostGIS 1.5.5 is doing something different as well when change order of items. Check this out:

SELECT ST_AsText(geom) txt
FROM (
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
UNION 
SELECT 'POINT EMPTY'::geometry geom
UNION
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 0)'::geometry geom
UNION 
SELECT 'POINT(0 1)'::geometry geom
UNION
SELECT 'LINESTRING(0 0,0 1)'::geometry geom

) foo;

In 2.3.2 and 2.4.0dev

           txt
--------------------------
 POINT(0 0)
 GEOMETRYCOLLECTION EMPTY
 POINT(0 0)
 POINT EMPTY
 POINT(0 0)
 LINESTRING(0 0,0 1)
 POINT(0 1)
(7 rows)

In 1.5.5

         txt
---------------------
 POINT(0 0)
 LINESTRING(0 0,0 1)
 POINT(0 1)
(3 rows)

comment:7 by robe, 7 years ago

Milestone: PostGIS 2.3.3PostGIS 2.4.0

Since this issue has been around sometime and no solution in sight, pushing to 2.4 to clear way for 2.3.3.

comment:8 by strk, 7 years ago

I took a look a the btree code, many functions simply don't check for EMPTY case, so they rely on uninitialized values in analyzed BBOXes. I'm trying a fix (would require a testcase, not sure this one would make it).

comment:9 by strk, 7 years ago

In 15447:

Fix handling of EMPTY geometries in btree operator

Includes regress test

See #3777

comment:10 by strk, 7 years ago

Milestone: PostGIS 2.4.0PostGIS 2.3.3

See how you like it now Regina. I'll backport to 2.3.3

comment:11 by strk, 7 years ago

Resolution: fixed
Status: newclosed

In 15448:

Fix handling of EMPTY geometries in btree operator

Includes regress test

Closes #3777 for 2.3 branch

comment:12 by robe, 7 years ago

Resolution: fixed
Status: closedreopened

I think there is still something wrong here. I'm testing on

PostgreSQL 9.6.1 on x86_64-w64-mingw32, compiled by gcc.exe (x86_64-win32-seh-rev4, Built by MinGW-W64 project) 4.9.2, 64-bit POSTGIS="2.3.3dev r15449" GEOS="3.6.1-CAPI-1.10.1 r4317" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.2.0, released 2017/04/28" LIBXML="2.7.8" LIBJSON="0.12" RASTER

The first query returns the right answer now:

SELECT ST_AsText(geom) txt, count(*)
FROM (
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 1)'::geometry geom
UNION ALL
SELECT 'LINESTRING(0 0,0 1)'::geometry geom
UNION ALL
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
UNION ALL
SELECT 'POINT EMPTY'::geometry geom
) foo
GROUP BY geom;

Yields right answer:

           txt            | count
--------------------------+-------
 POINT(0 0)               |     3
 LINESTRING(0 0,0 1)      |     1
 POINT(0 1)               |     1
 GEOMETRYCOLLECTION EMPTY |     2
(4 rows)

But if I switch the order of geometries, I get a different answer and again POINT(0 0) is not fully aggregated.

SELECT ST_AsText(geom) txt, count(*)
FROM (
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT EMPTY'::geometry geom
UNION ALL
SELECT 'POINT(0 0)'::geometry geom
UNION ALL
SELECT 'POINT(0 1)'::geometry geom
UNION ALL
SELECT 'LINESTRING(0 0,0 1)'::geometry geom
UNION ALL
SELECT 'GEOMETRYCOLLECTION EMPTY'::geometry geom
) foo
GROUP BY geom;

Gives:

           txt            | count
--------------------------+-------
 POINT(0 0)               |     2
 POINT EMPTY              |     1
 POINT(0 0)               |     1
 GEOMETRYCOLLECTION EMPTY |     1
 LINESTRING(0 0,0 1)      |     1
 POINT(0 1)               |     1
(6 rows)


Time: 4.227 ms

comment:13 by robe, 7 years ago

Owner: changed from pramsey to strk
Status: reopenednew

comment:14 by strk, 7 years ago

In 15450:

Fix lwgeom_cmp return code

See #3777

comment:15 by strk, 7 years ago

See if you like r15450 better

comment:16 by strk, 7 years ago

Resolution: fixed
Status: newclosed

In 15451:

Fix lwgeom_cmp return code

Closes #3777 once again (for 2.3 branch)

comment:17 by robe, 7 years ago

yap that looks to have fixed it.

thanks

Note: See TracTickets for help on using tickets.