# Proposal for clustering functions

`geometry[] ST_ClusterIntersecting(geometry geom)`

Aggregate function returning an array of `GeometryCollection`s representing the connected components of a set of geometries.

- accepts `[Multi]Point`, `[Multi]LineString`, and `[Multi]Polygon` geometries, and in general geometries of any type that can be converted into GEOS (I can't think of a situation where `[Multi]Point` would be useful, but that doesn't mean there isn't one...)
- returns a geometry array (my current implementation returns a `GeometryCollection`, but the recursive semantics of `ST_Dump` then undo all of the hard work)

Example: if run on a table containing all of the `LineString`s in the image below, this would return an array with two `MultiLineString` geometries (red and blue).
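
To make the "connected components" semantics concrete, here is a minimal Python sketch, not the proposed implementation. It assumes each geometry is a single two-point line segment, tests every pair for intersection, and merges intersecting pairs with a union-find structure. The names `segments_intersect` and `cluster_intersecting` are hypothetical and exist only for this illustration.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): -1, 0, or 1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _in_box(p, q, r):
    """True if r lies inside the bounding box of segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(s, t):
    """True if segments s and t share at least one point (touching counts)."""
    (a, b), (c, d) = s, t
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear endpoints: check whether they fall on the other segment.
    return ((o1 == 0 and _in_box(a, b, c)) or (o2 == 0 and _in_box(a, b, d))
            or (o3 == 0 and _in_box(c, d, a)) or (o4 == 0 and _in_box(c, d, b)))


def cluster_intersecting(segments):
    """Group segments into connected components (hypothetical helper)."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Naive O(n^2) pairwise test; a spatial index could prune candidate pairs.
    for i, j in combinations(range(len(segments)), 2):
        if segments_intersect(segments[i], segments[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return list(groups.values())


segments = [
    ((0, 0), (2, 2)), ((0, 2), (2, 0)), ((2, 0), (4, 0)),  # one component
    ((10, 10), (12, 12)), ((11, 11), (13, 11)),            # another component
]
clusters = cluster_intersecting(segments)
```

Note that the third segment never touches the first one, yet both end up in the same cluster because each intersects the second. That transitive grouping is the behaviour the aggregate is meant to expose.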

`geometry[] ST_ClusterWithin(geometry geom, double precision distance)`

Aggregate function returning an array of `GeometryCollection`s (or `MultiPoint`s?), where any component is reachable from any other component by jumps of no more than the specified distance.

- like `ST_ClusterIntersecting`, but uses a distance threshold rather than intersection when determining whether two geometries should be included in the same component. Could have an implementation very similar to `ST_ClusterIntersecting`, or could be restricted to points and maybe have a more efficient implementation.
- differs from k-means in that a distance is provided, not a number of clusters

Example: in the picture below, an array of five `MultiPoint`s would be returned (color-coded). The threshold distance in this case was more than the orange line but less than the pink line.
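
The distance-based grouping can be sketched for the points-only case as follows. This is an illustrative Python sketch under stated assumptions, not the proposed implementation: the function name `cluster_within` is hypothetical, inputs are plain `(x, y)` tuples, distance is Euclidean, and every pair is compared, which is quadratic in the number of points.

```python
import math
from itertools import combinations


def cluster_within(points, distance):
    """Group points so that each cluster is connected by hops <= distance."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of points that lies within the threshold distance.
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        if math.hypot(x2 - x1, y2 - y1) <= distance:
            parent[find(i)] = find(j)

    groups = {}
    for i, pt in enumerate(points):
        groups.setdefault(find(i), []).append(pt)
    return list(groups.values())


points = [(0, 0), (1, 0), (2, 0), (10, 0), (10, 1), (50, 50)]
loose = cluster_within(points, 1.5)   # chains (0,0)-(1,0)-(2,0) together
tight = cluster_within(points, 0.5)   # no pair is close enough to link
```

With a threshold of 1.5, `(0, 0)` and `(2, 0)` land in the same cluster even though they are 2 units apart, because `(1, 0)` bridges them. The number of clusters falls out of the threshold rather than being supplied up front, which is the contrast with k-means noted above.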