Opened 13 years ago
Closed 12 years ago
#1669 closed defect (wontfix)
Missing street type w/ Internal component makes geocode() so SLOOOW
Reported by: | mikepease | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.1.0 |
Component: | tiger geocoder | Version: | master |
Keywords: | Cc: | woodbri |
Description
It appears that certain types of address strings make the geocode() function run 100 to 1000 times slower than it does on "normal" addresses.
Consider these variations on
651 Nicollet Ave FL 4, Minneapolis, MN 55402
select (addy).*, rating from geocode('651 Nicollet Ave FL 4, Minneapolis, MN 55402'); -- Fast
select (addy).*, rating from geocode('651 Nicollet FL 4, Minneapolis, MN 55402'); -- SLOW! 60+ sec
select (addy).*, rating from geocode('651 Nicollet, FL 4, Minneapolis, MN 55402'); -- SLOW! 60+ sec
select (addy).*, rating from geocode('651 Nicollet, Minneapolis, MN 55402'); -- Fast
From what I can surmise, it seems that if you have an address that DOESN'T specify the street type but DOES specify an internal component, then the big slow-down happens.
When I run a list of several thousand addresses of medium quality, I hit dozens or more of these slow addresses, and they end up accounting for most of the total run time.
Say I have a list of 10,000 medium-quality addresses. Maybe 9,950 of them will run fine, but just 50 of them go slow. This tiny minority ends up taking the vast majority of the batch time.
9,950 x 0.1 sec = ~17 min. 50 x 90 sec = 75 min. Total ~90 min.
Given the effect this has on running through a list, I think it's important to find a fix for this.
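The batch estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the counts and per-address timings are the illustrative figures from the description (0.1 sec for a normal address, 90 sec for a slow one), not measurements, and batch_minutes is a hypothetical helper name.

```python
def batch_minutes(n_fast, fast_sec, n_slow, slow_sec):
    """Return (fast, slow, total) batch time in minutes."""
    fast_min = n_fast * fast_sec / 60
    slow_min = n_slow * slow_sec / 60
    return fast_min, slow_min, fast_min + slow_min


# Figures taken from the ticket description (assumed, not measured).
fast_min, slow_min, total_min = batch_minutes(9950, 0.1, 50, 90)
slow_share = slow_min / total_min

print(f"fast: {fast_min:.1f} min, slow: {slow_min:.1f} min, total: {total_min:.1f} min")
print(f"slow addresses: {50 / 10000:.1%} of rows, {slow_share:.0%} of run time")
```

With these numbers the 50 slow addresses are 0.5% of the rows but take roughly four fifths of the run time.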
Change History (6)
comment:1 by , 13 years ago
Component: | postgis → tiger geocoder |
---|---|
Owner: | changed from | to
comment:2 by , 12 years ago
Milestone: | PostGIS 2.0.1 → PostGIS 2.1.0 |
---|
comment:3 by , 12 years ago
Version: | 1.5.X → trunk |
---|
comment:4 by , 12 years ago
Cc: | added |
---|
comment:5 by , 12 years ago
Regina,
Using the PAGC tools that I wrapped into PostgreSQL, I can handle all of these. http://tinyurl.com/bxpnnvc
comment:6 by , 12 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I can probably put a timeout on these, and I'll experiment with that. I recall trying it before, but unfortunately I think a timeout might roll back the whole batch process, which isn't ideal.
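One possible way around the rollback concern is to apply the timeout on the client side, running each address as its own statement so that a cancelled query loses only that one address. The sketch below is an assumption, not anything the geocoder provides: geocode_batch is a hypothetical helper, conn is assumed to be a psycopg2-style DB-API connection already in autocommit mode, and the 5000 ms default is arbitrary. It relies only on PostgreSQL's statement_timeout setting and SQLSTATE 57014 (query_canceled).

```python
QUERY_CANCELED = "57014"  # PostgreSQL SQLSTATE for query_canceled

GEOCODE_SQL = "SELECT (addy).*, rating FROM geocode(%s)"


def geocode_batch(conn, addresses, timeout_ms=5000):
    """Geocode addresses one statement at a time.

    Returns (results, timed_out): results maps address -> rows,
    timed_out lists the addresses whose query was cancelled.
    Assumes conn is in autocommit mode, so a cancelled statement
    does not leave an aborted transaction behind.
    """
    results, timed_out = {}, []
    cur = conn.cursor()
    # Session-level setting: applies to every later statement on this connection.
    cur.execute(f"SET statement_timeout = {int(timeout_ms)}")
    for address in addresses:
        try:
            cur.execute(GEOCODE_SQL, (address,))
            results[address] = cur.fetchall()
        except Exception as exc:
            # psycopg2 errors expose the SQLSTATE as .pgcode; re-raise anything else.
            if getattr(exc, "pgcode", None) != QUERY_CANCELED:
                raise
            timed_out.append(address)
    return results, timed_out
```

The addresses collected in timed_out could then be retried separately, for example after cleaning them up or with a longer timeout.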