Geometry Cleaning

Cleaning is fundamentally a difficult problem, because geometries can be dirty in so many ways. Here is a list of cases that any geometry cleaning routine needs to address.

  • POLYGON rings must not self-touch
    • The classic "bow-tie" polygon would have to be re-written as a "polygon with hole that touches once"
    • A figure-8 polygon would have to be re-written as a MULTIPOLYGON
    • Sometimes one lobe of the figure-8 is very small, and it is best to just drop it
    • Similarly, sometimes one half of the bow-tie is very small and should just be dropped
  • POLYGON rings should not have zero area
  • POLYGONs should probably not have zero area
  • POLYGON rings must be properly nested and only touch once
    • POLYGONs with rings that touch along a segment should have the inner ring and zero-width corridor removed
  • LINESTRINGs and POLYGON rings should not have duplicate vertices, and probably should not have vertices within a tolerance of one another
    • For POLYGONs in a coverage, removing or merging such vertices will break edge-matching with neighboring polygons
  • LINESTRINGs and POLYGON rings should probably not have "spikes" or "gores"
    • These elements create two nearly parallel segments in the feature, which can lead to topology failures later on
  • MULTIPOLYGONs are not allowed to have parts that touch along an edge (touching at isolated points is allowed)
    • Fixing this in general is hard because it requires dissolving, which is itself a topologically sensitive operation
  • POLYGON rings must not cross
    • Again, fixing this is hard because the intent behind crossing rings is difficult to discern.
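
Several of the simpler cases above can be checked with plain coordinate arithmetic. The following is a minimal Python sketch, not an existing implementation: lines and rings are assumed to be lists of (x, y) tuples (rings closed, first == last), and the function names dedupe_vertices, signed_area and find_spikes, along with the default tolerance and angle threshold, are made up for illustration.

```python
import math


def dedupe_vertices(coords, tolerance=0.0):
    """Drop vertices lying within `tolerance` of the previously kept vertex.

    The first and last vertices are always kept, so a closed ring stays closed.
    """
    if len(coords) < 2:
        return list(coords)
    kept = [coords[0]]
    for pt in coords[1:-1]:
        if math.dist(kept[-1], pt) > tolerance:
            kept.append(pt)
    last = coords[-1]
    # If the vertex kept just before the end is too close to the final
    # vertex, drop it instead of dropping the final vertex.
    while len(kept) > 1 and math.dist(kept[-1], last) <= tolerance:
        kept.pop()
    kept.append(last)
    return kept


def signed_area(ring):
    """Shoelace formula for a closed ring; counter-clockwise rings are positive."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


def find_spikes(coords, max_angle_deg=1.0):
    """Return indexes of interior vertices where the path doubles back on itself.

    A vertex counts as a spike when the angle between its two adjacent
    segments is at most `max_angle_deg` (an arbitrary illustrative threshold).
    """
    spikes = []
    for i in range(1, len(coords) - 1):
        a, b, c = coords[i - 1], coords[i], coords[i + 1]
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        if v1 == (0, 0) or v2 == (0, 0):
            continue  # duplicate vertex; handled by dedupe_vertices
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        angle = math.degrees(math.atan2(abs(cross), dot))
        if angle <= max_angle_deg:
            spikes.append(i)
    return spikes
```

A ring whose signed_area is zero, or below some small threshold, is a candidate for removal. find_spikes does not examine the closing vertex of a ring, so a spike at the start point would only be found after rotating the ring's start vertex.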

The potential for breaking edge-matching calls for a further cleaning function that takes two geometries and snaps the edges of one to the other, within a tolerance.

  • ST_SnapToReference(referencegeometry, geometry, tolerance)
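
What "snapping" should mean here is open to interpretation. As one possible reading, the Python sketch below moves each vertex of the geometry onto the nearest vertex of the reference when the two are within the tolerance. It assumes geometries are lists of (x, y) tuples and snaps vertex to vertex only; snapping onto the interior of a reference segment, which a full version would likely need, is left out.

```python
import math


def snap_to_reference(reference, geometry, tolerance):
    """Return a copy of `geometry` with vertices snapped to `reference` vertices.

    A vertex moves only if its nearest reference vertex is within `tolerance`;
    otherwise it is left where it is.
    """
    snapped = []
    for pt in geometry:
        # Nearest reference vertex, or None when the reference is empty.
        nearest = min(reference, key=lambda ref: math.dist(ref, pt), default=None)
        if nearest is not None and math.dist(nearest, pt) <= tolerance:
            snapped.append(nearest)
        else:
            snapped.append(pt)
    return snapped


reference = [(0, 0), (10, 0), (10, 10)]
geometry = [(0.01, -0.02), (10.03, 0.0), (5, 5)]
print(snap_to_reference(reference, geometry, 0.05))
```

The first two vertices fall within the tolerance of a reference vertex and are moved onto it; the third is too far from any reference vertex and is returned unchanged.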

A self-join could be used to run this function on every pair of geometries in a table to "complete the cleaning" of a coverage.
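
In SQL that would be a self-join; the same pairwise pattern is sketched in Python below. The name snap_all_pairs is hypothetical, and snap stands for any caller-supplied function taking (reference, geometry, tolerance), matching the argument order proposed above.

```python
from itertools import permutations


def snap_all_pairs(geometries, tolerance, snap):
    """Snap every geometry to every other geometry in the collection.

    `snap(reference, geometry, tolerance)` is supplied by the caller and
    must return the snapped version of `geometry`.
    """
    result = list(geometries)
    # Visit every ordered pair (reference, geometry) with distinct indexes,
    # the equivalent of a self-join with "a.id <> b.id".
    for ref_idx, geom_idx in permutations(range(len(result)), 2):
        result[geom_idx] = snap(result[ref_idx], result[geom_idx], tolerance)
    return result
```

The result depends on the order in which pairs are visited, because each geometry is snapped to references that may themselves already have moved. A real coverage would also want a spatial index so that only neighboring geometries are paired.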