Opened 11 years ago
Closed 10 years ago
#2703 closed enhancement (fixed)
KNN gist with recheck for 9.5?
Reported by: | robe | Owned by: | pramsey |
---|---|---|---|
Priority: | high | Milestone: | PostGIS 2.2.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
I noticed this feature is on the commitfest 9.4 block for consideration.
https://commitfest.postgresql.org/action/patch_view?id=1367
Would be cool if we can push it along and leverage it in 2.2. As best I understand it, it would mean our KNN would act as a true distance check.
From patch thread
KNN-GiST provides ability to get ordered results from index, but this order is based only on index information. For instance, GiST index contains bounding rectangles for polygons, and we can't get exact distance to polygon from index (similar situation is in PostGIS). In attached patch, GiST distance method can set recheck flag (similar to consistent method). This flag means that distance method returned lower bound of distance and we should recheck it from heap.
Am I missing anything here?
Change History (25)
comment:1 by , 11 years ago
Summary: | KNN gist with recheck for 9.4? → KNN gist with recheck for 9.5? |
---|
comment:2 by , 10 years ago
Still no reviewer for current patch https://commitfest.postgresql.org/4/156/
comment:3 by , 10 years ago
Priority: | medium → high |
---|
Now that knn gist with recheck is officially going to be in 9.5, can push this to high. pramsey push back if you don't think we can get it in for 2.2
comment:5 by , 10 years ago
Update to ↔ committed into trunk at r13530, supports real KNN from PgSQL 9.5+
comment:6 by , 10 years ago
Geography ↔ support with recheck committed at r13538. Regina, some testing would be appreciated. I don't know what to do for <↔> and <<#>>, which are already committed by strk, but are harder to move to re-check since we don't have a generic n-d exact distance calculation available. See me -devel email on the topic.
comment:7 by , 10 years ago
I need to recompile Postgresql 9.5 with latest source to confirm it's not bugged here. My instance is about a week old, but so far not looking good. I pulled my airports database of 14,479 someodd records and this query makes my instance go BOOM.
-- POSTGIS="2.2.0dev r13538" GEOS="3.5.0dev-CAPI-1.9.0 r4034" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.1, released 2014/09/24 GDAL_DATA not found" LIBXML="2.7.8" LIBJSON="0.12" RASTER SELECT * FROM airports ORDER BY (SELECT geog FROM airports WHERE airport_id = 'KBOS') <-> geog limit 10;
Logs just show:
LOG: server process (PID 7856) was terminated by exception 0xC0000005 DETAIL: Failed process was running: SELECT * from airports ORDER BY (SELECT geog FROM airports WHERE airport_id = 'KBOS') <-> geog limit 10
comment:8 by , 10 years ago
Hmm nightly snapshot failed too. Perhaps I need to pull right from git.
In meantime can you test this on your dev. This dummy data example crashes for me too:
CREATE TABLE test_geog_knn AS SELECT ROW_NUMBER()OVER(ORDER BY x,y) AS gid, ST_Point(x,y)::geography As geog FROM generate_series(-10,70, 10) As x , generate_series(0,90, 10) As y; SELECT * FROM test_geog_knn ORDER BY ST_GeogFromText('POINT(-71.0518888888889 42.4934722222222)') <-> geog limit 1;
Equivalent example using geometry doesn't crash though
comment:9 by , 10 years ago
Okay I built with latest commit — http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c5dd8ead403f85bd041590d2e3e79b72830472d4
and still crashes for me.
SELECT * FROM test_geog_knn ORDER BY ST_GeogFromText('POINT(-71.0518888888889 42.4934722222222)') <-> geog limit 1;
My back trace looks like this:
Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 10004.0x4b64] 0x00000000708a3263 in geography_distance (fcinfo=0x4b90610) at geography_measurement.c:139 139 tolerance = PG_GETARG_FLOAT8(2); (gdb) bt #0 0x00000000708a3263 in geography_distance (fcinfo=0x4b90610) at geography_measurement.c:139 #1 0x000000000057bf4b in ExecMakeFunctionResultNoSets (fcache=0x4b905a0, econtext=0x4b90340, isNull=0x4b9170a "", isDone=<optimized out>) at execQual.c:2018 #2 0x0000000000581eb0 in ExecTargetList (isDone=0x273ebec, itemIsDone=0x4b91878, isnull=0x4b91708 "", values=0x4b916d0, econtext=0x4b90340, targetlist=0x4b91840) at execQual.c:5363 #3 ExecProject (projInfo=projInfo@entry=0x4b91728, isDone=isDone@entry=0x273ebec) at execQual.c:5578 #4 0x0000000000582300 in ExecScan (node=node@entry=0x4b90228, accessMtd=accessMtd@entry=0x597590 <SeqNext>, recheckMtd=recheckMtd@entry=0x597580 <SeqRecheck>) at execScan.c:207 #5 0x0000000000597603 in ExecSeqScan (node=node@entry=0x4b90228) at nodeSeqscan.c:114 #6 0x000000000057ada8 in ExecProcNode (node=node@entry=0x4b90228) at execProcnode.c:412 #7 0x0000000000598129 in ExecSort (node=node@entry=0x4b8ff90) at nodeSort.c:103 #8 0x000000000057ae18 in ExecProcNode (node=node@entry=0x4b8ff90) at execProcnode.c:488 #9 0x0000000000590a60 in ExecLimit (node=node@entry=0x4b8fc50) at nodeLimit.c:91 #10 0x000000000057acc8 in ExecProcNode (node=node@entry=0x4b8fc50) at execProcnode.c:520 #11 0x0000000000577a6e in ExecutePlan (dest=0x4cd2b50, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x4b8fc50, estate=0x4b8fb38) at execMain.c:1550 #12 standard_ExecutorRun (queryDesc=0xde81138, direction=<optimized out>, count=0) at execMain.c:337 #13 0x000000000068ba68 in PortalRunSelect (portal=portal@entry=0x4b85ae8, forward=forward@entry=1 '\001', count=0, count@entry=41153104, dest=dest@entry=0x0) at pquery.c:952 #14 0x000000000068d0c6 in PortalRun (portal=0x273eeb0, portal@entry=0x4b85ae8, count=41153104, count@entry=2147483647, isTopLevel=isTopLevel@entry=0 '\000', dest=0x0, dest@entry=0x4cd2b50, altdest=altdest@entry=0x4cd2b50, completionTag=0x273f210 "", completionTag@entry=0x40000000075 <error: Cannot access memory at address 0x40000000075>) at pquery.c:796 #15 0x000000000068a964 in exec_simple_query (query_string=0x0) at postgres.c:1104 #16 PostgresMain (argc=<optimized out>, argv=argv@entry=0x2e7f90, dbname=0x10000f000e000d <error: Cannot access memory at address 0x10000f000e000d>, username=<optimized out>) at postgres.c:4025 #17 0x000000000062a913 in BackendRun (port=0x273f400) at postmaster.c:4162 #18 SubPostmasterMain (argc=argc@entry=3, argv=argv@entry=0x357fa0) at postmaster.c:4649 #19 0x00000000007c6830 in main (argc=3, argv=0x357fa0) at main.c:198 (gdb)
comment:10 by , 10 years ago
Reverse the order of your KNN arguments (put the literal argument second instead of first).
Tom said that the crasher was fixed in the latest tip, so I'm disturbed you're seeing a crash on the very latest version.
comment:11 by , 10 years ago
Yah I was too when I saw that note. I specifically pulled that tar ball associated with the commit and used that and wiped out my old before I reinstalled. You have the latest built on your machine?
comment:12 by , 10 years ago
Still crashes with the literal being second. I sure wish postgresql had a hash embedded in their versions for extra confirmation I'm running the right version.
comment:13 by , 10 years ago
Your last commit r13542 fixed the crashing and made my simple contrived test work even after I put on a spatial index.
However, my real data query:
SELECT * FROM airports ORDER BY geog <-> (SELECT geog FROM airports WHERE airport_id = 'KBOS') limit 10; SELECT * FROM airports ORDER BY ST_GeogFromText('POINT(-71.0518888888889 42.4934722222222)') <-> geog limit 10;
Both give error:
ERROR: index returned tuples in wrong order ********** Error ********** ERROR: index returned tuples in wrong order SQL state: XX000
Explain plan shows index trying to be used:
Limit (cost=8.58..12.64 rows=10 width=279) InitPlan 1 (returns $0) -> Index Scan using pk_airports on airports airports_1 (cost=0.29..8.30 rows=1 width=32) Index Cond: (airport_id = 'KBOS'::bpchar) -> Index Scan using idx_airports_geog on airports (cost=0.28..5877.41 rows=14479 width=279) Order By: (geog <-> $0)
comment:14 by , 10 years ago
Okay seems an issue if I do limit more than 1 and I have an index on table:
CREATE TABLE test_geog_knn AS SELECT ROW_NUMBER()OVER(ORDER BY x,y) AS gid, ST_Point(x,y)::geography As geog FROM generate_series(-10,70, 10) As x , generate_series(0,90, 10) As y; CREATE INDEX idx_test_geog_knn ON test_geog_knn USING gist(geog); SELECT * FROM test_geog_knn ORDER BY (SELECT geog FROM test_geog_knn WHERE gid = 1) <-> geog limit 10;
outputs:
ERROR: index returned tuples in wrong order ********** Error ********** ERROR: index returned tuples in wrong order SQL state: XX000
comment:15 by , 10 years ago
Yes, it reproduces, and it's the commit that fixed your crash r13542 that causes it for me. Prior to that, it *worked* for me.
comment:17 by , 10 years ago
Yap that did it . I'll add some regress checks for this and then close this ticket out.
comment:18 by , 10 years ago
pramsey,
I added recheck tests at r13546 just for geometry2d (still have to add for 3D and geography). The last test currently fails because it replicates the issue nicklas brought up in dev about how with lateral (and an index in place, the answers are wrong).
I'm going to try to replicate the issue with the built in polygon types in postgresql to see if I can make them fail the same way to rule out postgis as the culprit.
comment:19 by , 10 years ago
I'm having trouble getting the built in geometry types to bulk the same way, but then again theirs is harder to test, because they don't have a distance funciton, just the ↔ distance operator (so hard to do a compare). In running the tests though, I realized the problem may not be the lateral, but that the lateral allows for testing multiple records at a time.
For example the one in the regress I put in with the index in place:
SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY ST_Distance(a.geom, b.geom) ) As true_rn, b.rn As knn_rn FROM knn_recheck_geom As a LEFT JOIN LATERAL ( SELECT gid, geom, RANK() OVER(ORDER BY a.geom <-> g.geom) As rn FROM knn_recheck_geom As g WHERE a.gid <> g.gid ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true WHERE a.gid IN(50000,50001,70000,61000) ORDER BY a.gid, b.rn; -- returns gid | match | true_rn | knn_rn ------+--------+---------+-------- 50000 | 48969 | 1 | 1 50000 | 51031 | 2 | 2 50000 | 47938 | 3 | 3 50000 | 52062 | 4 | 4 50000 | 46907 | 5 | 5 50001 | 53093 | 5 | 1 50001 | 48970 | 1 | 2 50001 | 51032 | 2 | 3 50001 | 47939 | 3 | 4 50001 | 52063 | 4 | 5 61000 | 62031 | 1 | 1 61000 | 59969 | 1 | 1 61000 | 63062 | 3 | 3 61000 | 58938 | 4 | 4 61000 | 46908 | 5 | 5 70000 | 64093 | 4 | 1 70000 | 57907 | 5 | 2 70000 | 600002 | 1 | 3 70000 | 600001 | 1 | 3 70000 | 600004 | 1 | 3
Note the problem children are 50001, 70000 (the true distance sort and knn sort should match but they don't. Now if I reduce to just testing one problem child:
SELECT a.gid, b.gid As match, RANK() OVER(PARTITION BY a.gid ORDER BY ST_Distance(a.geom, b.geom) ) As true_rn, b.rn As knn_rn -- , ST_Distance(a.geom, b.geom) , ST_GeometryType(a.geom) FROM knn_recheck_geom As a LEFT JOIN LATERAL ( SELECT gid, geom, RANK() OVER(ORDER BY a.geom <-> g.geom) As rn FROM knn_recheck_geom As g WHERE a.gid <> g.gid ORDER BY a.geom <-> g.geom LIMIT 5) As b ON true WHERE a.gid IN(70000) ORDER BY a.gid, b.rn; -- the order agrees -- gid | match | true_rn | knn_rn -------+--------+---------+-------- 70000 | 600001 | 1 | 1 70000 | 600002 | 1 | 1 70000 | 600004 | 1 | 1 70000 | 71031 | 4 | 4 70000 | 68969 | 5 | 5 (5 rows)
If I do both problem children, the first one is right the second one is screwed up.
comment:22 by , 10 years ago
reported upstream: http://www.postgresql.org/message-id/20150525052538.4705.92464@wrigleys.postgresql.org If they fix, then debbie's 9.5 daily run should pass. She is currently failing this recheck test.
comment:23 by , 10 years ago
Confirmed. Fix upstream seems to have fixed issue. Switching debbie to read from git since daily tar is not new enough
comment:24 by , 10 years ago
added geography tests at r13356
comment:25 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I don't think this feature made it into the 9.4 cut.