Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#3965 closed defect (fixed)

KMeans provides fewer than K clusters

Reported by: komzpa
Owned by: pramsey
Priority: high
Milestone: PostGIS 2.4.3
Component: postgis
Version: master
Keywords:
Cc:

Description (last modified by komzpa)

Clustering 25 distinct points into 25 clusters yields only 24 clusters:

SELECT count(DISTINCT cid)
FROM (
    WITH points AS (
        SELECT ST_MakePoint(x, y) AS geom
        FROM generate_series(1, 5) x, generate_series(1, 5) y
    )
    SELECT ST_ClusterKMeans(geom, 25) OVER () AS cid, geom
    FROM points
) z;

The larger K is, the more clusters are lost.
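
One way to check how the shortfall varies with K is to run the same 5x5 grid through every K from 1 to 25 and compare the requested count with the number of distinct cluster ids returned. This is only a sketch: the column aliases `requested` and `produced` are illustrative, and it assumes that `ST_ClusterKMeans` accepts a K taken from the outer query in a correlated subquery.

```sql
-- Sketch: for each K, count the distinct cluster ids produced on a 5x5 grid.
-- "requested" and "produced" are illustrative aliases, not part of the ticket.
WITH points AS (
    SELECT ST_MakePoint(x, y) AS geom
    FROM generate_series(1, 5) x, generate_series(1, 5) y
)
SELECT k AS requested,
       (SELECT count(DISTINCT cid)
        FROM (SELECT ST_ClusterKMeans(geom, k) OVER () AS cid
              FROM points) z) AS produced
FROM generate_series(1, 25) k
ORDER BY k;
```

Any row where `produced` is smaller than `requested` is an instance of the defect reported here. On a build that includes the fix, the two columns would be expected to match for this input.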

Change History (5)

comment:1 by komzpa, 6 years ago

Description: modified (diff)

comment:3 by komzpa, 6 years ago

Resolution: fixed
Status: new → closed

In 16212:

Fix KMeans initialization issue that lost clusters sometimes.

Closes #3965
Closes https://github.com/postgis/postgis/pull/179

comment:4 by komzpa, 6 years ago

In 16213:

Fix KMeans initialization issue that lost clusters sometimes.

Closes #3965
Closes https://github.com/postgis/postgis/pull/179

comment:5 by komzpa, 6 years ago

In 16214:

Fix KMeans initialization issue that lost clusters sometimes.

Closes #3965
Closes https://github.com/postgis/postgis/pull/179
