= Proposal for clustering functions =
== `geometry[] ST_ClusterIntersecting(geometry geom)` ==
Aggregate function returning an array of `GeometryCollection`s representing the connected components of a set of geometries.
- accepts geometries of any type that can be converted into GEOS, including `[Multi]Point`, `[Multi]LineString`, and `[Multi]Polygon` (I can't think of a situation where `[Multi]Point` would be useful, but that doesn't mean there isn't one...)
- returns a geometry array (my current implementation returns a `GeometryCollection`, but the recursive semantics of `ST_Dump` then flatten it back into individual geometries, undoing all of the hard work)
Example: if run on a table containing all of the `LineString`s in the image below, the function would return an array with two `MultiLineString` geometries (red and blue).
[[Image(http://i.stack.imgur.com/WNlxX.png)]]
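To make "connected components" concrete, here is a minimal sketch in plain Python rather than in PostGIS/GEOS. It assumes each geometry is a single two-point line segment and uses a union-find structure over all pairs; the names `segments_intersect` and `connected_components` are hypothetical and not part of the proposal, and a real implementation would presumably use a spatial index instead of testing every pair.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _in_box(p, q, r):
    """True if r lies inside the bounding box of segment p-q."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(s, t):
    """True if segments s and t, each ((x1, y1), (x2, y2)), share a point."""
    a, b = s
    c, d = t
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _in_box(a, b, c)) or (o2 == 0 and _in_box(a, b, d))
            or (o3 == 0 and _in_box(c, d, a)) or (o4 == 0 and _in_box(c, d, b)))


def connected_components(items, linked):
    """Group items so that linked(x, y) pairs end up in the same group."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if linked(items[i], items[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


# Two separate groups of segments, interleaved in the input.
segments = [
    ((0, 0), (2, 2)),      # crosses the next "red" segment
    ((10, 10), (12, 10)),  # "blue"
    ((0, 2), (2, 0)),      # "red", touches the last segment at (2, 0)
    ((11, 9), (11, 11)),   # "blue"
    ((2, 0), (4, 0)),      # "red"
]
clusters = connected_components(segments, segments_intersect)
```

Note that the first and last segments do not intersect each other directly; they land in the same cluster only because both intersect the third one, which is the transitive behaviour the proposal describes.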
----
== `geometry[] ST_ClusterWithin(geometry geom, double precision distance)` ==
Aggregate function returning an array of `GeometryCollection`s (or possibly `MultiPoint`s?), where any member of a cluster is reachable from any other member of the same cluster through jumps of no more than the specified distance.
- like `ST_ClusterIntersecting`, but uses a distance threshold rather than intersection to decide whether two geometries belong in the same component. It could have an implementation very similar to `ST_ClusterIntersecting`, or it could be restricted to points and maybe have a more efficient implementation.
- differs from k-means in that a distance is provided, not a number of clusters
Example: in the picture below, an array of five `MultiPoint`s would be returned (color-coded). The threshold distance in this case was longer than the orange line but shorter than the pink line.
[[Image(http://ibin.co/1oH1ApWCoW8L)]]
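The "reachable through jumps" rule can be sketched the same way for the points-only case. This is an illustration in plain Python under stated assumptions, not the proposed implementation: the function name `cluster_within` is hypothetical, points are `(x, y)` tuples, distance is Euclidean, the threshold is treated as inclusive, and every pair is compared (quadratic time) where a real implementation would likely use an index.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_within(points, distance):
    """Group points so that each group is connected by jumps <= distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair that is close enough.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= distance:
            parent[find(i)] = find(j)

    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)
    return list(groups.values())


points = [(0, 0), (1, 0), (2, 0), (10, 0), (10, 1), (20, 20)]
clusters = cluster_within(points, 1.5)
# (0, 0) and (2, 0) are 2 units apart, yet share a cluster via (1, 0).
```

Unlike k-means, the number of clusters is an output here: the same points yield more or fewer clusters as the threshold shrinks or grows.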