Ticket #1461 (new defect)

Opened 16 months ago

Last modified 2 weeks ago

Tiger Geocoder doesn't anticipate irregular spacing in road name

Reported by: arencambre
Owned by: robe
Priority: medium
Milestone: PostGIS 2.2.0
Component: pagc_address_parser
Version: trunk
Keywords:
Cc: aren@…, woodbri

Description

The tx_edges table uses I- 635 instead of the more consistent I-635.

This looks weird but works:

SELECT ST_AsEWKT(geomout) FROM geocode_intersection('N. Belt Line', 'I- 635', 'TX', 'Coppell') ORDER BY rating ASC LIMIT 1;

This looks correct but doesn't work:

SELECT ST_AsEWKT(geomout) FROM geocode_intersection('N. Belt Line', 'I-635', 'TX', 'Coppell') ORDER BY rating ASC LIMIT 1;

Seems like the geocoder needs to work around these kinds of errors in the Tiger data.
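One possible workaround, sketched here in Python as an illustration only, is to normalize spacing around the hyphen in route designators on both the stored name and the user input before comparing them. The function name `normalize_route` and the regular expression are assumptions made for this sketch; nothing like this is known to exist in the Tiger Geocoder itself.

```python
import re

# Hypothetical helper: collapse whitespace around a hyphen that sits between
# a letter prefix and a number, e.g. "I- 635" or "I - 635" -> "I-635".
_ROUTE = re.compile(r"\b([A-Za-z]+)\s*-\s*(\d+)\b")

def normalize_route(name):
    """Return name with route designators rewritten as PREFIX-NUMBER."""
    return _ROUTE.sub(r"\1-\2", name)

# Both spellings from this ticket reduce to the same form.
print(normalize_route("I- 635"))
print(normalize_route("I-635"))
```

The pattern is deliberately narrow (letters, hyphen, digits) so that ordinary names such as 'N. Belt Line' pass through unchanged.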

Change History

Changed 16 months ago by robe

  • milestone changed from PostGIS 2.0.0 to PostGIS 2.1.0

yah -- this may not be that trivial, as spacing is used to designate separation of elements, so putting logic like this in is liable to break something else without some extensive testing. I'll push to 2.1.0 but may get to it before then.

Changed 15 months ago by robe

  • milestone changed from PostGIS 2.1.0 to PostGIS 2.0.1

Changed 12 months ago by robe

  • milestone changed from PostGIS 2.0.1 to PostGIS 2.1.0

Changed 5 months ago by woodbri

  • cc woodbri added

PAGC tools handle this correctly.

Changed 7 weeks ago by robe

  • component changed from tiger geocoder to pagc_address_parser

Changed 7 weeks ago by woodbri

I should correct my last comment: PAGC still parses by token, so a name like "SUN VALLEY" will parse as two tokens and "SUNVALLEY" will parse as one token. In my geocoder, I handle this under the fuzzy search by joining all the name tokens, then picking the best match to the input by scoring the results.
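The join-then-score idea in the comment above can be sketched roughly as follows. This is not woodbri's code: the helper names `joined_key` and `best_match` are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever scoring his geocoder actually uses.

```python
from difflib import SequenceMatcher

def joined_key(name):
    """Join all whitespace-separated name tokens into one uppercase key."""
    return "".join(name.upper().split())

def best_match(query, candidates):
    """Return (candidate, score) for the candidate closest to the query.

    Similarity is measured between joined keys, so "SUN VALLEY" and
    "SUNVALLEY" compare as identical. The scoring function is an assumption.
    """
    key = joined_key(query)
    scored = [
        (SequenceMatcher(None, key, joined_key(c)).ratio(), c)
        for c in candidates
    ]
    score, name = max(scored)
    return name, score

name, score = best_match("SUNVALLEY", ["SUN VALLEY", "SUNSET", "VALLEY VIEW"])
print(name, score)
```

Joining tokens removes the dependence on where the input happens to contain spaces, at the cost of scoring every candidate returned by the fuzzy search.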

Changed 7 weeks ago by woodbri

Also, I have found at least one very pathological case where the name is "MAINSTREET" and if it is entered as "MAIN STREET" then it is impossible to match, because "STREET" is classified as a SUFFIX_TYPE token and "MAIN" is too short to match the fuzzy key of "MAINSTREET".

There are ways to find this, but they tend to make everything else slower and return too many unwanted results. -- Good Times!
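The "MAINSTREET" case can be reproduced in a small sketch. Everything here is an assumption made for illustration: the tiny `SUFFIX_TYPES` set stands in for a real parser lexicon, and `difflib.SequenceMatcher` stands in for the fuzzy key comparison, which the comment does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical suffix lexicon; a real address parser's list is far larger.
SUFFIX_TYPES = {"STREET", "ST", "AVENUE", "AVE", "ROAD", "RD"}

def name_part(address_name):
    """Drop tokens classified as suffix types, keeping only the name tokens."""
    tokens = address_name.upper().split()
    return "".join(t for t in tokens if t not in SUFFIX_TYPES)

def similarity(a, b):
    """Similarity ratio in [0, 1]; an assumed stand-in for fuzzy matching."""
    return SequenceMatcher(None, a, b).ratio()

stored = "MAINSTREET"    # one token in the data, so nothing is stripped
entered = "MAIN STREET"  # "STREET" is treated as a suffix and removed

# Token-based comparison: "MAIN" against "MAINSTREET" scores poorly.
token_score = similarity(name_part(entered), name_part(stored))

# Comparing the fully joined input, suffix included, matches exactly.
joined_score = similarity("".join(entered.split()), stored)

print(token_score, joined_score)
```

Always retrying with the suffix folded back into the name would catch this case, but it adds a second lookup for every input that has a suffix, which is the kind of slowdown and extra noise the comment warns about.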

Changed 2 weeks ago by robe

  • milestone changed from PostGIS 2.1.0 to PostGIS 2.2.0