#3127 closed defect (fixed)
KNN geog distance doesn't match ST_Distance spheroid
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.2.0 |
Component: | documentation | Version: | master |
Keywords: | Cc: |
Description (last modified by )
I'm not sure if this is expected or not, but I was trying to write up regress tests for KNN geog and found inconsistencies between what ↔ returns and what ST_Distance geography returns which changes the ranking. It seems the ↔ is returning the sphere distance instead of spheroid (e.g. I get same answer if I do
ST_Distance(geog1,geog2, false) )
If that is the case, and by design, we just need to make a note of it in the docs, so people don't falsely report its bugged.
e.g.:
CREATE TABLE knn_recheck_geog(gid serial primary key, geog geography); INSERT INTO knn_recheck_geog(gid,geog) SELECT ROW_NUMBER() OVER(ORDER BY x,y) AS gid, ST_Point(x*1.11,y*0.95)::geography As geog FROM generate_series(-100,100, 1) AS x CROSS JOIN generate_series(-90,90,1) As y; SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY ST_Distance(a.geog, b.geog) ) As true_rn, b.rn As knn_rn, ST_Distance(a.geog, b.geog) As dist_spheroid, ST_Distance(a.geog, b.geog, false) As dist_sphere, a.geog <-> b.geog As knn_dist FROM knn_recheck_geog As a LEFT JOIN LATERAL ( SELECT gid, geog, RANK() OVER(ORDER BY a.geog <-> g.geog) As rn FROM knn_recheck_geog As g WHERE a.gid <> g.gid ORDER BY 'POINT(-105.45 3.8)'::Geography <-> g.geog LIMIT 5) As b ON true WHERE a.gid = 1000 ORDER BY a.gid, b.rn;
yields:
gid | match | true_rn | knn_rn | dist_spheroid | dist_sphere | knn_dist ------+-------+---------+--------+------------------+------------------+------------------ 1000 | 1001 | 2 | 1 | 105051.447591582 | 105635.3257479 | 105635.3257479 1000 | 999 | 1 | 2 | 105049.134514627 | 105635.3257479 | 105635.3257479 1000 | 819 | 3 | 3 | 123294.778176043 | 123155.172887459 | 123155.172887459 1000 | 1181 | 4 | 4 | 123294.778176044 | 123155.17288746 | 123155.17288746 1000 | 820 | 5 | 5 | 161924.039877904 | 162197.178690157 | 162197.178690157 (5 rows)
Hmm and I should add — I can't get this to use an index (or a simple query with a constant anymore so guess I may need to reopen that other ticket.
Change History (8)
comment:1 by , 10 years ago
Component: | documentation → postgis |
---|---|
Resolution: | → fixed |
Status: | new → closed |
comment:2 by , 10 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Summary: | KNN geog distance doesn't match ST_Distance spheroid but uses sphere → KNN geog distance doesn't match ST_Distance spheroid |
comment:3 by , 10 years ago
Description: | modified (diff) |
---|
Okay how far off are these supposed to be, I'm still getting cases where the distances don't line up with or without an index. Case in point:
CREATE TABLE knn_recheck_geog_small_2 ( gid integer, geog geography ); INSERT INTO knn_recheck_geog_small_2 VALUES (818, '0101000020E6100000A4703D0AD7A35AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (819, '0101000020E6100000A4703D0AD7A35AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (820, '0101000020E6100000A4703D0AD7A35AC00000000000001340'); INSERT INTO knn_recheck_geog_small_2 VALUES (998, '0101000020E6100000CDCCCCCCCC5C5AC0666666666666FE3F'); INSERT INTO knn_recheck_geog_small_2 VALUES (999, '0101000020E6100000CDCCCCCCCC5C5AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (1000, '0101000020E6100000CDCCCCCCCC5C5AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (1001, '0101000020E6100000CDCCCCCCCC5C5AC00000000000001340'); INSERT INTO knn_recheck_geog_small_2 VALUES (1180, '0101000020E6100000F6285C8FC2155AC0CDCCCCCCCCCC0640'); INSERT INTO knn_recheck_geog_small_2 VALUES (1181, '0101000020E6100000F6285C8FC2155AC06666666666660E40'); INSERT INTO knn_recheck_geog_small_2 VALUES (1182, '0101000020E6100000F6285C8FC2155AC00000000000001340');
Now without even creating an index, these distances don't match up:
SELECT a.gid, ARRAY(SELECT (gid,a.geog <-> g.geog) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY a.geog <-> g.geog, g.gid LIMIT 2) As knn , ARRAY(SELECT (gid, ST_Distance(a.geog, g.geog ) ) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY ST_Distance(a.geog, g.geog), g.gid LIMIT 2) As dist FROM knn_recheck_geog_small_2 As a WHERE a.gid = 1000 ORDER BY a.gid;
—gives
gid | knn | dist ------+--------------------------------------------------+----------------------------- 1000 | {"(1001,105635.3257479)","(999,105635.3257479)"} | {"(999,105049.134514627)",(1001,105051.447591582)"}
That could be expected behavior I guess. I should add, I can't get this or any simple derivative to use the spatial index UNLESS one side is a LITERAL (something like 'POINT(…)'. I think I was able to before (we tried to fix this issue). I may need to go to an old version to verify that assumption. Even enable_seqscan=false doesn't do it.
comment:4 by , 10 years ago
Disregard my index comment — not sure if its an issue of concern or not, but it seems the extra order by clause prevented use of spatial index so getting rid of the addtional order by a.gid seems to allow index to kick in.
SELECT a.gid, ARRAY(SELECT (gid,a.geog <-> g.geog) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY a.geog <-> g.geog LIMIT 2) As knn , ARRAY(SELECT (gid, ST_Distance(a.geog, g.geog ) ) FROM knn_recheck_geog_small_2 As g WHERE a.gid <> g.gid ORDER BY ST_Distance(a.geog, g.geog) LIMIT 2) As dist FROM knn_recheck_geog_small_2 As a WHERE a.gid = 1000 ORDER BY a.gid;
but answer with index use is still the same with it thinking these two geometries are same distance. Seems like its still using sphere to me at r13589
SELECT a.geog <-> g.geog As knn_dist, ST_Distance(a.geog,g.geog) As dist, ST_Distance(a.geog,g.geog, false) As dist_sphere FROM (SELECT 'POINT(-105.45 3.8)'::geography) As a(geog) , (VALUES ('POINT(-105.45 2.85)'::geography ) , ( 'POINT(-105.45 4.75)'::geography) ) g(geog); knn_dist | dist | dist_sphere ----------------+------------------+---------------- 105635.3257479 | 105049.134514627 | 105635.3257479 105635.3257479 | 105051.447591582 | 105635.3257479
comment:5 by , 10 years ago
Hey wait a minute — looks like in r13588 you switched it back to not use spheroid.
comment:6 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Okay fixed at r13590 and the knn_recheck regress test I had put in is now passing now without crashing.
comment:7 by , 10 years ago
I'm beginning to think this is a sunk cause - should we just have it use sphere? I was all happy until I went to cleanup my tests (to get rid of the additional order by which was preventing index from kicking in) and was able to trigger the index out of order problem again. I'll confirm it's the forcing the spheroid that is causing this before we decide if we need to sacrifice this to make #3131 work.
comment:8 by , 10 years ago
Component: | postgis → documentation |
---|
reverted the fix, and amended the documentation to say its on sphere instead of spheroid at r13593
pramsey fixed this at r13563 not sure if its related or something I just missed before but spatial index is no longer being used. Will post a separate ticket.