Ticket #1669 (closed defect: wontfix)
Missing street type w/ Internal component makes geocode() so SLOOOW
| Reported by: | mikepease | Owned by: | robe |
|---|---|---|---|
| Priority: | medium | Milestone: | PostGIS 2.1.0 |
| Component: | tiger geocoder | Version: | trunk |
| Keywords: | | Cc: | woodbri |
Description
It appears that certain types of address strings make the geocode() function run 100 to 1000 times slower than it does on "normal" addresses.
Consider these variations on
651 Nicollet Ave FL 4, Minneapolis, MN 55402
select (addy).*, rating from geocode('651 Nicollet Ave FL 4, Minneapolis, MN 55402') -- Fast
select (addy).*, rating from geocode('651 Nicollet FL 4, Minneapolis, MN 55402') -- SLOW! 60+ sec
select (addy).*, rating from geocode('651 Nicollet, FL 4, Minneapolis, MN 55402') -- SLOW! 60+ sec
select (addy).*, rating from geocode('651 Nicollet, Minneapolis, MN 55402') -- Fast
From what I can surmise, the big slow-down happens when an address DOESN'T specify the street type but DOES specify an internal component (such as "FL 4").
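One way to probe this hypothesis is to look at how each variant is parsed before any street lookup happens. The query below is a minimal sketch, assuming the tiger geocoder's normalize_address() function is installed alongside geocode() and returns a norm_addy record with streettypeabbrev and internal fields. It only shows the parse; it does not by itself explain the slow-down.

```sql
-- Compare how the four variants are parsed by the address normalizer.
-- Assumption: normalize_address(varchar) is available and returns norm_addy.
SELECT input,
       (na).address,           -- house number
       (na).streetname,
       (na).streettypeabbrev,  -- expected to be empty when the street type is missing
       (na).internal,          -- unit/floor component, e.g. "FL 4"
       (na).location,
       (na).stateabbrev,
       (na).zip
FROM (
    SELECT v.input, normalize_address(v.input::varchar) AS na
    FROM (VALUES
            ('651 Nicollet Ave FL 4, Minneapolis, MN 55402'),
            ('651 Nicollet FL 4, Minneapolis, MN 55402'),
            ('651 Nicollet, FL 4, Minneapolis, MN 55402'),
            ('651 Nicollet, Minneapolis, MN 55402')
         ) AS v(input)
) AS parsed;
```

If the two slow variants come back with an odd streetname, internal, or location value, that would point at the parse rather than the search as the place to start looking.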
When I run a list of several thousand medium-quality addresses, I hit dozens or more of these slow addresses, and they end up accounting for most of the total run time.
Say I have a list of 10,000 medium-quality addresses. Maybe 9,950 of them will run fine, but just 50 of them go slow. This tiny minority ends up taking the vast majority of the batch time.
9,950 x 0.1 sec = ~17 min. 50 x 90 sec = 75 min. Total ~92 min.
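The estimate can be worked through exactly in a single query. This is only the arithmetic from the example above; the 0.1 sec and 90 sec per-address timings are the rough figures quoted there, not measurements.

```sql
-- Batch-time estimate: 9,950 fast addresses at 0.1 sec, 50 slow ones at 90 sec.
SELECT round(9950 * 0.1 / 60.0, 1)             AS fast_minutes,   -- 16.6
       round(50 * 90 / 60.0, 1)                AS slow_minutes,   -- 75.0
       round((9950 * 0.1 + 50 * 90) / 60.0, 1) AS total_minutes;  -- 91.6
```

On these figures, 0.5% of the addresses account for about 82% of the batch time.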
Given the effect this has on running through a list, I think it's important to find a fix for this.
