#1991 closed defect (fixed)
geocode really slow on PostgreSQL 9.2
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.0.2 |
Component: | tiger geocoder | Version: | master |
Keywords: | history | Cc: |
Description
I'm not sure if this is just something I'm missing in settings with the new PostgreSQL or a fundamental issue with the new PostgreSQL planner.
I have 3 parallel windows PostgreSQL instances — 8.4 32-bit running PostGIS 1.5 (on older clunky 32-bit windows 2003 box)
on new Windows 2008 R2 same box: 9.1 64-bit running PostGIS 2.1.0 SVN, 9.2 64-bit running PostGIS 2.1.0 SVN.
On the 9.1 and 8.4, my MA data standard geocodes even with supplementary augmented parcel check completes in about 30-200ms.
On 9.2 instance with same dataset. It takes 8-30 seconds. Yap.
I compared the plans and it seems 9.2 is not using a bitmap index scan for this particular query I have. I have narrowed it to this construct.
-- 10 seconds WITH a AS ( SELECT * FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn, RANK() OVER(ORDER BY diff_zip(ad.zip,'02124') + CASE WHEN lower(f.name) = lower('Evans') THEN 0 ELSE levenshtein_ignore_case(f.name, lower('Evans') ) END + levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' || COALESCE('St','')) ) + CASE WHEN (greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0 ELSE 1 END + CASE WHEN '22'::integer BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn) THEN 0 ELSE 4 END + CASE WHEN lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv) THEN 0 ELSE 1 END + rate_attributes(NULL, f.predirabrv, 'Evans', f.name , 'St', suftypabrv , NULL, sufdirabrv, prequalabr) ) As rank FROM featnames As f INNER JOIN addr As ad ON (f.tlid = ad.tlid) WHERE '25' = f.statefp AND '25' = ad.statefp AND lower(f.name) = lower('Evans') AND ( ad.zip = ANY('{02123,02124,02125}'::varchar[]) ) ) AS foo ORDER BY rank LIMIT 3 ) SELECT a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv, a.sufdirabrv, a.prequalabr, a.pretypabrv, b.the_geom, tfidr, tfidl, a.side , a.fromhn, a.tohn, a.zip, p.name as place FROM a INNER JOIN edges As b ON (a.statefp = b.statefp AND a.tlid = b.tlid ) INNER JOIN faces AS f ON ('25' = f.statefp AND ( (b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid AND a.side = 'R' ) )) INNER JOIN place p ON ('25' = p.statefp AND f.placefp = p.placefp ) WHERE a.statefp = '25' AND b.statefp = '25' ;
Which can be worked around by switching to this:
-- 92 ms -- WITH a AS ( SELECT * FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn, RANK() OVER(ORDER BY diff_zip(ad.zip,'02124') + CASE WHEN lower(f.name) = lower('Evans') THEN 0 ELSE levenshtein_ignore_case(f.name, lower('Evans') ) END + levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' || COALESCE('St','')) ) + CASE WHEN (greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0 ELSE 1 END + CASE WHEN '22'::integer BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn) THEN 0 ELSE 4 END + CASE WHEN lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv) THEN 0 ELSE 1 END + rate_attributes(NULL, f.predirabrv, 'Evans', f.name , 'St', suftypabrv , NULL, sufdirabrv, prequalabr) ) As rank FROM featnames As f INNER JOIN addr As ad ON (f.tlid = ad.tlid) WHERE '25' = f.statefp AND '25' = ad.statefp AND lower(f.name) = lower('Evans') AND ( ad.zip = ANY('{02123,02124,02125}'::varchar[]) ) ) AS foo ORDER BY rank LIMIT 3 ) SELECT a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv, a.sufdirabrv, a.prequalabr, a.pretypabrv, b.the_geom, tfidr, tfidl, a.side , a.fromhn, a.tohn, a.zip, p.name as place FROM a INNER JOIN edges As b ON (a.statefp = b.statefp AND a.tlid = b.tlid ) INNER JOIN faces AS f ON ('25' = f.statefp AND ( (b.tfidl = f.tfid ) OR (b.tfidr = f.tfid ) )) INNER JOIN place p ON ('25' = p.statefp AND f.placefp = p.placefp ) WHERE a.statefp = '25' AND b.statefp = '25' AND ( (b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid AND a.side = 'R' ) );
I'll encode this revision if I conclude it's not just something screwy with my setup and also report it up the chain if I conclude it is a PostgreSQL 9.2 issue, which I suspect it is.
Note: there is nothing spatial in the above query, so I think PostGIS is off the hook.
Change History (3)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Okay rather than change my code, I'm just going to set join_collapse_limit='2' for both the geocode_address and geocode_intersection functions. As this yields a dramatic improvement on my PostgreSQL 9.2 instance, and doesn't seem to reduce speed of my 8.4 instance.
comment:3 by , 12 years ago
Keywords: | history added |
---|
Okay it seems in 9.2, reducing the join collapse limit makes it behave as fast as the old versions
e.g.:
Make both queries work at the faster speed and use the bitmapor strategy again.So not sure this is a real issue or just something to be cognizant of.
The default is 8, but if I increase above 4 it gets bad.