Opened 4 years ago

Last modified 4 years ago

#4749 new enhancement

Improve behaviour of spatial predicates with GeometryCollection inputs

Reported by: mdavis Owned by: pramsey
Priority: medium Milestone: PostGIS Fund Me
Component: postgis Version: 2.5.x -- EOL
Keywords: Cc:

Description

The behaviour of PostGIS spatial predicate operations with geometry collection arguments appears to be ill-defined, inconsistent, and error-prone. 

For instance, this query causes a segfault:

SELECT st_Intersects(
    'LINESTRING (0 0, 1 0)', 
    'GEOMETRYCOLLECTION (MULTIPOLYGON (((922337 -922337, 922337 922337, -922337 922337, -922337 922337, 922337 -922337))))'
);

This query causes an ERROR GEOSContains: TopologyException, which is not helpful to a typical user:

SELECT ST_Contains(
    'GEOMETRYCOLLECTION (POLYGON ((100 300, 300 100, 100 100, 100 300)), POLYGON ((250 250, 250 200, 100 200, 250 250)), LINESTRING (130 180, 220 220))',
    'LINESTRING (130 180, 220 220)'
);

There are other cases which succeed.  There may also be cases which appear to succeed, but return an incorrect answer.

This is due to the behaviour of the underlying GEOS library.  The GEOS spatial predicate code has some fundamental limitations which mean that it cannot handle some cases of this kind (although it could handle more than it does now, and should at least ensure that hard failure cannot occur). (See GEOS tickets 1047 and 1042)

GEOS may be improved, but there may be limits to how many cases it will handle.  So PostGIS could implement some work-arounds

and checks to make the semantics of spatial predicates on GeometryCollections more clear and more robust.

The following cases can be handled with relatively simple code:

  • GCs containing a single non-GC element can be "unwrapped" to convert to just that element
  • GCs containing a set of LineStrings or Points can be converted into a Multi-geometry
  • GCs containing a set of Polygons which form a VALID MultiPolygon can be converted to a MultiPolygon. This requires a potentially-expensive validity check, but at least provides functionality 

The following cases cannot be handled easily. They should report a appropriate, informative error:

  • GCs containing mixed-dimension elements
  • GCs containing Polygons which form an invalid MultiPolygon

Change History (6)

comment:1 by mdavis, 4 years ago

Update 1: It is possible to handle GCs which form invalid MultiPolygons (i.e. have overlapping polygon elements) by unioning the polygons first.

Last edited 4 years ago by mdavis (previous) (diff)

comment:2 by Algunenano, 4 years ago

The following cases cannot be handled easily. They should report a appropriate, informative error:

The main issue I see with this is that chaining calls might bug out for no good reason as current geos predicates return GeometryCollections (I know you tried to change it, but in the end it's still there). For example, with this change ST_Intersects(ST_Intersection(A, B), C) will fail if the result of ST_Intersection(A, B) returns a GC of a polygon and a line.

I'm not saying the old behaviour was correct (since it's obvious it was failing per your examples), but we should have a good base so we stop getting those random TopologyException that everybody dislikes.

in reply to:  2 comment:3 by mdavis, 4 years ago

Replying to Algunenano:

The following cases cannot be handled easily. They should report a appropriate, informative error:

The main issue I see with this is that chaining calls might bug out for no good reason as current geos predicates return GeometryCollections. For example, with this change ST_Intersects(ST_Intersection(A, B), C) will fail if the result of ST_Intersection(A, B) returns a GC of a polygon and a line.

To be clear, it is the overlay functions which can return mixed-type geometry. The current spatial predicates do not handle mixed geometry inputs, so this is not regressing that behaviour.

The goals of this suggestion are:

  • eliminate segfaults
  • provide better error messages to the user
  • (perhaps) extend the predicate domain to handle more inputs (at the price of more processing)

I agree that it would be useful to extend the domain of some of the predicates (in particular, ST_Intersects) to handle mixed geometry inputs. This should be feasible, and hopefully can be done in subsequent development.

we should have a good base so we stop getting those random TopologyException that everybody dislikes.

To be clear, the TopologyException thrown by the example above occurs because the predicate code is exposing too much of the internal logic. That's why I suggest replacing that error message with something useful, such as "Mixed-type collections not allowed" or "Collection is not a valid MultiPolygon".

Don't confuse this with the TopologyExceptions thrown by the overlay code. Those are robustness issues rather than input limitations. We are hoping that OverlayNG eliminates them completely.

comment:4 by pramsey, 4 years ago

I'm not sure I understand why even things like invalid multipolygons cannot be amenable to predicate evaluation?

  • A Intersects B… if any element A intersects any element of B, A intersects B
  • A Contains B… if all elements of B are contained by at least one element of A, and are disjoint from all other elenents of A, then A contains B
  • A Touches B… if all elements of A touch an element of B, and are disjoint from all other elements of B, then A touches B

I mean, Relate() is not possible, but it feels like the boolean predicates, which are made up of some "obvious" interactions and other "ok, I can kind of believe that" intersection (do we really believe "crosses" is well defined between two polygons?) do have plausible rules that can be at least made up for GCs.

Feel free to tear me apart on this.

Last edited 4 years ago by pramsey (previous) (diff)

in reply to:  4 comment:5 by mdavis, 4 years ago

Replying to pramsey:

why even things like invalid multipolygons cannot be amenable to predicate evaluation?

Can we say "GCs containing overlapping polygons"? "invalid" includes far too much craziness…

  • A Intersects B… if any element A intersects any element of B, A intersects B

Agreed

  • A Contains B… if all elements of B are contained by at least one element of A, and are disjoint from all other elenents of A, then A contains B

Some might say that "contains" should test against the entire area covered by the polygons (this is consistent with the notion of point-set topology). In which case this can't be evaluated component-wise - it needs to use the (effective) union of the polygons. But this semantic is certainly a subset of the full test.

  • A Touches B… if all elements of A touch an element of B, and are disjoint from all other elements of B, then A touches B

Sounds reasonable.

I mean, Relate() is not possible, but it feels like the boolean predicates, which are made up of some "obvious" interactions and other "ok, I can kind of believe that" intersection (do we really believe "crosses" is well defined between two polygons?) do have plausible rules that can be at least made up for GCs.

So agreed, can probably make up some semantics which allow extension to all GC inputs. I'm not sure the phone has been ringing off the hook to do this though?

This ticket was really just about cleaning up how the current semantics are implemented.

comment:6 by pramsey, 4 years ago

Milestone: PostGIS 3.1.0PostGIS Fund Me
Note: See TracTickets for help on using tickets.