Opened 12 years ago

Closed 11 years ago

#1669 closed defect (wontfix)

Missing street type w/ Internal component makes geocode() so SLOOOW

Reported by: mikepease
Owned by: robe
Priority: medium
Milestone: PostGIS 2.1.0
Component: tiger geocoder
Version: master
Keywords:
Cc: woodbri

Description

It appears there are certain types of address strings that make the geocode() function run 100 - 1000X slower than "normal addresses".

Consider these variations on
651 Nicollet Ave FL 4, Minneapolis, MN 55402

select (addy).*, rating from geocode('651 Nicollet Ave FL 4, Minneapolis, MN 55402') -- Fast

select (addy).*, rating from geocode('651 Nicollet FL 4, Minneapolis, MN 55402') -- SLOW! 60+ sec

select (addy).*, rating from geocode('651 Nicollet, FL 4, Minneapolis, MN 55402') -- SLOW! 60+ sec

select (addy).*, rating from geocode('651 Nicollet, Minneapolis, MN 55402') -- Fast

From what I can surmise, it seems that if you have an address that DOESN'T specify the street type but DOES specify an internal component, then the big slow-down happens.

When I run a list of several thousand addresses of medium quality, I run into dozens or more of these slow addresses and it ends up taking a majority of the time to run through these addresses.

Say I have a list of 10,000 medium-quality addresses. Maybe 9,950 of them will run fine, but just 50 of them go slow. This tiny minority ends up taking the vast majority of the batch time.

9,950 x 0.1 sec = ~17 min. 50 x 90 sec = 75 min. Total ~92 min.
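The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the per-address timings (0.1 sec and 90 sec) are the rough figures quoted in this ticket, not measurements, and the helper name `batch_minutes` is made up for illustration.

```python
def batch_minutes(count, seconds_each):
    """Total wall-clock minutes for `count` addresses at `seconds_each` apiece."""
    return count * seconds_each / 60


# Rough figures from the ticket description (assumed, not measured).
fast_min = batch_minutes(9_950, 0.1)   # "normal" addresses
slow_min = batch_minutes(50, 90)       # addresses that hit the slow path
total_min = fast_min + slow_min
slow_share = slow_min / total_min      # fraction of the run spent on slow addresses

print(f"fast: {fast_min:.1f} min, slow: {slow_min:.1f} min, "
      f"total: {total_min:.1f} min, slow share: {slow_share:.0%}")
```

With these inputs, 0.5% of the addresses account for roughly four fifths of the total run time.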

Given the effect this has on running through a list, I think it's important to find a fix for this.

Change History (6)

comment:1 by robe, 12 years ago

Component: postgis → tiger geocoder
Owner: changed from pramsey to robe

comment:2 by robe, 12 years ago

Milestone: PostGIS 2.0.1 → PostGIS 2.1.0

comment:3 by robe, 12 years ago

Version: 1.5.X → trunk

I can probably put a timeout on these. I'll play with that. I recall trying it before, but unfortunately I think it might roll back a whole batch process, which isn't ideal.
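One way to keep a timeout from rolling back a whole batch is to apply it on the client side and send each address as its own statement. The following is a minimal sketch of that idea, not something from the geocoder itself. It assumes a DB-API connection in autocommit mode (for example psycopg2 with `conn.autocommit = True`), so a cancelled statement affects only its own address. The function name `geocode_each` and the `canceled_exc` parameter are hypothetical; with psycopg2 the exception to pass would be `psycopg2.errors.QueryCanceled`.

```python
def geocode_each(conn, addresses, *, timeout_ms, canceled_exc):
    """Geocode addresses one statement at a time, skipping any that time out.

    Assumes `conn` is in autocommit mode, so the session-level timeout
    persists and a cancelled statement does not abort earlier results.
    Returns {address: rows}, with None for addresses that were cancelled.
    """
    cur = conn.cursor()
    # Session-level setting: the server cancels any later statement that
    # runs longer than timeout_ms milliseconds.
    cur.execute(f"SET statement_timeout = {int(timeout_ms)}")

    results = {}
    for address in addresses:
        try:
            cur.execute("SELECT (addy).*, rating FROM geocode(%s)", (address,))
            results[address] = cur.fetchall()
        except canceled_exc:
            # Only this address is lost; it can be retried or cleaned up later.
            results[address] = None
    return results
```

The trade-off is one round trip per address instead of a single set-based statement, which seems acceptable here because the slow addresses dominate the run time anyway.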

comment:4 by woodbri, 12 years ago

Cc: woodbri added

comment:5 by woodbri, 11 years ago

Regina,

Using the PAGC tools that I wrapped into PostgreSQL, I can handle all of these: http://tinyurl.com/bxpnnvc

comment:6 by robe, 11 years ago

Resolution: wontfix
Status: new → closed