Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#4736 closed defect (fixed)

ST_NumGeometries is zero for GeometryCollection with empty parts

Reported by: mwtoews Owned by: strk
Priority: medium Milestone: PostGIS 3.2.0
Component: postgis Version: master
Keywords: Cc:

Description

Take a GeometryCollection with mixed empty/non-empty parts:

db=> select ST_IsEmpty(g), ST_NumGeometries(g), ST_AsText(g), ST_AsText(ST_GeometryN(g, 1)), ST_AsText(ST_GeometryN(g, 2))
db-> from (select 'GeometryCollection(Point empty, point(1 2), LineString empty)'::geometry g) f;
-[ RECORD 1 ]----+------------------------------------------------------------
st_isempty       | f
st_numgeometries | 3
st_astext        | GEOMETRYCOLLECTION(POINT EMPTY,POINT(1 2),LINESTRING EMPTY)
st_astext        | POINT EMPTY
st_astext        | POINT(1 2)

that seems fine. Now with two empty parts:

db=> select ST_IsEmpty(g), ST_NumGeometries(g), ST_AsText(g), ST_AsText(ST_GeometryN(g, 1)), ST_AsText(ST_GeometryN(g, 2))
db-> from (select 'GeometryCollection(Point empty, LineString empty)'::geometry g) f;
-[ RECORD 1 ]----+-------------------------------------------------
st_isempty       | t
st_numgeometries | 0
st_astext        | GEOMETRYCOLLECTION(POINT EMPTY,LINESTRING EMPTY)
st_astext        | POINT EMPTY
st_astext        | LINESTRING EMPTY

why is ST_NumGeometries now suggesting there are zero geometries, but ST_GeometryN is able to retrieve the two empty parts? An expected result would have ST_NumGeometries return 2 instead of 0.


postgis_full_version | POSTGIS="2.4.0 r15853" PGSQL="96" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER

Change History (12)

comment:1 by strk, 4 years ago

Milestone: PostGIS 3.1.0PostGIS 2.4.9
Owner: changed from pramsey to strk
Status: newassigned

We're hitting a short-circuit: if the geometry is empty, return 0 components. I'd call it a bug.

comment:2 by strk, 4 years ago

The shortcut was introduced with [c35cc257f983151aade662d81689423190c97417/git] with an overall behavioural review of empty geometries handling, referencing ticket #692 and wiki page DevWikiEmptyGeometry which I guess should be UPDATED if we do NOT want empty collections to return 0 from ST_NumGeometries ..

comment:3 by robe, 4 years ago

Milestone: PostGIS 2.4.9PostGIS 3.0.3

comment:4 by robe, 4 years ago

Milestone: PostGIS 3.0.3PostGIS 3.0.4

comment:5 by pramsey, 3 years ago

Hm, the wiki says that ST_GeometryN(empty, 1) should return empty, but that is a bit fiddly, since ST_NumGeometries(empty) returns 0, which maybe it should return 1? Since there's one geometry there, and it's an empty one?

postgis_reg=# select st_numgeometries('POINT EMPTY');
 st_numgeometries 
------------------
                0
(1 row)

postgis_reg=# select st_numgeometries('POINT(1 1)');
 st_numgeometries 
------------------
                1
(1 row)

Also, this is inconsistent with num geometries for empty being zero

postgis_reg=# select st_geometryn('POINT EMPTY', 1);
                st_geometryn                
--------------------------------------------
 0101000000000000000000F87F000000000000F87F
(1 row)

I'm half-tempted to go with num geometries is 1 for simple objects and n-sub-geometries for any collection, even a collection of empties, and make the behaviour consistent. Would require updates to the wiki page, but I think the angels on the pin dance better that way. Thoughts?

comment:6 by pramsey, 3 years ago

Meanwhile, there's some extra behavior lying around in more obscure types. Returns NULL on geometryn for empty input. There's a simplicity to that and it does match with other out-of-range behaviour of geometryn, so if we *do* think that numgeometries(empty) == 0, then geometryn(empty) should be NULL and not empty.

postgis32=# SELECT 'patchN_01', ST_AsEWKT(ST_patchN('POLYHEDRALSURFACE EMPTY'::geometry, 1));
 ?column?  | st_asewkt 
-----------+-----------
 patchN_01 | 

comment:7 by pramsey, 3 years ago

This case suggests that (a) empties count, so numgeometries(point empty) == 1 and numgeometries(collection empty) == number_of_subelements.

postgis_reg=# select st_numgeometries('MULTIPOINT(0 0, EMPTY, 1 1)');
 st_numgeometries 
------------------
                3
(1 row)

postgis_reg=# select st_astext(st_geometryn('MULTIPOINT(0 0, EMPTY, 1 1)', 2));
  st_astext  
-------------
 POINT EMPTY
(1 row)

comment:8 by pramsey, 3 years ago

The overloading of ST_NumGeometries to return non-null on simple geometry means that we end up with an odd situation where behaviours are slightly different with respect to element counts on empty things. I feel like the below is the least disruptive thing that involves the least change.

ST_NumGeometries('POINT EMPTY') => 1
ST_NumGeometries('GEOMETRYCOLLECTION EMPTY') => 0
ST_NumGeometries('GEOMETRYCOLLECTION(POINT EMPTY)') => 1
ST_GeometryN('POINT EMPTY', 1) => POINT EMPTY
ST_GeometryN('GEOMETRYCOLLECTION EMPTY', 1) => NULL
ST_GeometryN('GEOMETRYCOLLECTION(POINT EMPTY)', 1) => POINT EMPTY
}}
So just adjusting the return of ST_NumGeometries to be non-zero for collections of empty sub-elements.
Version 0, edited 3 years ago by pramsey (next)

comment:9 by pramsey, 3 years ago

Although, equally, it's possible that zero elements and null-on-access is better…

ST_NumGeometries('POINT EMPTY') => 0
ST_NumGeometries('GEOMETRYCOLLECTION EMPTY') => 0
ST_NumGeometries('GEOMETRYCOLLECTION(POINT EMPTY)') => 0
ST_GeometryN('POINT EMPTY', 1) => NULL
ST_GeometryN('GEOMETRYCOLLECTION EMPTY', 1) => NULL
ST_GeometryN('GEOMETRYCOLLECTION(POINT EMPTY)', 1) => NULL

Matches out-of-range behaviour and zero-count behaviour in existing code. Makes it impossible to access sub-geometries of empty. I think I like this one more.

comment:10 by pramsey, 3 years ago

Resolution: fixed
Status: assignedclosed

OK, this is changed in main at f885c3354. ST_GeometryN(empty) will return NULL for all values of N, which matches up with ST_NumGeometries(empty) returning 0 for all values of empty, collections and singletons both, since there's no in-range value of N when the count is 0.

I have left the stable branches alone in the interests of preserving existing behaviour.

comment:11 by pramsey, 3 years ago

Milestone: PostGIS 3.0.4PostGIS 3.2.0
Version: 2.4.xmain

in reply to:  5 comment:12 by mdavis, 3 years ago

Replying to pramsey:

I'm half-tempted to go with num geometries is 1 for simple objects and n-sub-geometries for any collection, even a collection of empties, and make the behaviour consistent. Would require updates to the wiki page, but I think the angels on the pin dance better that way. Thoughts?

FWIW, that is the semantics that JTS implements. It allows seamless iteration over atomic and collection geometries, with no special cases. OGC only defines ST_NumGeometries and ST_GeometryN on GeometryCollection and subclasses. IMO it's nicer to have functions operate on any geometry type in a consistent way (especially in a language like SQL, where it's awkward to handle special cases, and NULL has a semantic slightly different to EMPTY). This seemed like the right way to extend them to atomic geometries.

Note: See TracTickets for help on using tickets.