Opened 10 years ago
Last modified 10 years ago
#3127 closed defect
KNN geog distance doesn't match ST_Distance spheroid — at Version 3
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.2.0 |
Component: | documentation | Version: | master |
Keywords: | Cc: |
Description (last modified by )
I'm not sure if this is expected or not, but I was trying to write up regress tests for KNN geog and found inconsistencies between what ↔ returns and what ST_Distance geography returns which changes the ranking. It seems the ↔ is returning the sphere distance instead of spheroid (e.g. I get same answer if I do
ST_Distance(geog1,geog2, false) )
If that is the case, and by design, we just need to make a note of it in the docs, so people don't falsely report its bugged.
e.g.:
CREATE TABLE knn_recheck_geog(gid serial primary key, geog geography); INSERT INTO knn_recheck_geog(gid,geog) SELECT ROW_NUMBER() OVER(ORDER BY x,y) AS gid, ST_Point(x*1.11,y*0.95)::geography As geog FROM generate_series(-100,100, 1) AS x CROSS JOIN generate_series(-90,90,1) As y; SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY ST_Distance(a.geog, b.geog) ) As true_rn, b.rn As knn_rn, ST_Distance(a.geog, b.geog) As dist_spheroid, ST_Distance(a.geog, b.geog, false) As dist_sphere, a.geog <-> b.geog As knn_dist FROM knn_recheck_geog As a LEFT JOIN LATERAL ( SELECT gid, geog, RANK() OVER(ORDER BY a.geog <-> g.geog) As rn FROM knn_recheck_geog As g WHERE a.gid <> g.gid ORDER BY 'POINT(-105.45 3.8)'::Geography <-> g.geog LIMIT 5) As b ON true WHERE a.gid = 1000 ORDER BY a.gid, b.rn;
yields:
gid | match | true_rn | knn_rn | dist_spheroid | dist_sphere | knn_dist ------+-------+---------+--------+------------------+------------------+------------------ 1000 | 1001 | 2 | 1 | 105051.447591582 | 105635.3257479 | 105635.3257479 1000 | 999 | 1 | 2 | 105049.134514627 | 105635.3257479 | 105635.3257479 1000 | 819 | 3 | 3 | 123294.778176043 | 123155.172887459 | 123155.172887459 1000 | 1181 | 4 | 4 | 123294.778176044 | 123155.17288746 | 123155.17288746 1000 | 820 | 5 | 5 | 161924.039877904 | 162197.178690157 | 162197.178690157 (5 rows)
Hmm and I should add — I can't get this to use an index (or a simple query with a constant anymore so guess I may need to reopen that other ticket.
Change History (3)
comment:1 by , 10 years ago
Component: | documentation → postgis |
---|---|
Resolution: | → fixed |
Status: | new → closed |
comment:2 by , 10 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Summary: | KNN geog distance doesn't match ST_Distance spheroid but uses sphere → KNN geog distance doesn't match ST_Distance spheroid |
comment:3 by , 10 years ago
Description: | modified (diff) |
---|
Okay how far off are these supposed to be, I'm still getting cases where the distances don't line up with or without an index. Case in point:
CREATE TABLE knn_recheck_geog_small_2 ( gid integer, geog geography ); INSERT INTO knn_recheck_geog_small_2 VALUES (818, '0101000020E6100000A4703D0AD7A35AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (819, '0101000020E6100000A4703D0AD7A35AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (820, '0101000020E6100000A4703D0AD7A35AC00000000000001340'); INSERT INTO knn_recheck_geog_small_2 VALUES (998, '0101000020E6100000CDCCCCCCCC5C5AC0666666666666FE3F'); INSERT INTO knn_recheck_geog_small_2 VALUES (999, '0101000020E6100000CDCCCCCCCC5C5AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (1000, '0101000020E6100000CDCCCCCCCC5C5AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (1001, '0101000020E6100000CDCCCCCCCC5C5AC00000000000001340'); INSERT INTO knn_recheck_geog_small_2 VALUES (1180, '0101000020E6100000F6285C8FC2155AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (1181, '0101000020E6100000F6285C8FC2155AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (1182, '0101000020E6100000F6285C8FC2155AC00000000000001340');
Now without even creating an index, these distances don't match up:
SELECT a.gid, ARRAY(SELECT (gid,a.geog <-> g.geog) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY a.geog <-> g.geog, g.gid LIMIT 2) As knn , ARRAY(SELECT (gid, ST_Distance(a.geog, g.geog ) ) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY ST_Distance(a.geog, g.geog), g.gid LIMIT 2) As dist FROM knn_recheck_geog_small_2 As a WHERE a.gid = 1000 ORDER BY a.gid;
—gives
gid | knn | dist ------+--------------------------------------------------+----------------------------- 1000 | {"(1001,105635.3257479)","(999,105635.3257479)"} | {"(999,105049.134514627)",(1001,105051.447591582)"}
That could be expected behavior I guess. I should add, I can't get this or any simple derivative to use the spatial index UNLESS one side is a LITERAL (something like 'POINT(…)'. I think I was able to before (we tried to fix this issue). I may need to go to an old version to verify that assumption. Even enable_seqscan=false doesn't do it.
pramsey fixed this at r13563 not sure if its related or something I just missed before but spatial index is no longer being used. Will post a separate ticket.