Ticket #1600 (assigned defect)

Opened 15 months ago

Last modified 7 weeks ago

normalize_address() confused by whitespace

Reported by: mikepease
Owned by: robe
Priority: medium
Milestone: PostGIS 2.1.0
Component: pagc_address_parser
Version: 1.5.X
Keywords:
Cc: woodbri

Description

If there is a tab or newline at the beginning or end of an address string, normalize_address() incorrectly parses the address components.

select * from normalize_address('212 n 3rd ave, Minneapolis, mn 55401 ')

select * from normalize_address(' 212 n 3rd ave, Minneapolis, mn 55401')

Could the function start by cleaning up whitespace in the input string? Unless I strip the whitespace myself before calling it, I get poor results.

Something like: trim(regexp_replace(' 212 n 3rd ave, Minneapolis, mn 55401', '[\r\n\t]+', ' ', 'g'))

e.g. select normalize_address(trim(raw_address))
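The cleanup suggested above could be wrapped in a small helper so callers don't have to repeat it. This is only a sketch: clean_address_whitespace is a hypothetical name, not part of PostGIS or the geocoder. It uses the POSIX class [[:space:]] so the pattern does not depend on how the server treats backslashes in string literals, and it collapses interior runs of whitespace as well as the leading and trailing ones.

```sql
-- Hypothetical helper (not shipped with PostGIS): replace every run of
-- whitespace (spaces, tabs, carriage returns, newlines) with a single
-- space, then strip the spaces left at either end.
CREATE OR REPLACE FUNCTION clean_address_whitespace(text)
RETURNS text AS $$
    SELECT btrim(regexp_replace($1, '[[:space:]]+', ' ', 'g'));
$$ LANGUAGE sql IMMUTABLE STRICT;

-- Usage: clean the raw input before handing it to the parser.
-- The E'' literal puts a real tab and a real newline around the address.
SELECT *
FROM normalize_address(
    clean_address_whitespace(E'\t212 n 3rd ave, Minneapolis, mn 55401\n')
);
```

A plain trim() on its own only strips spaces by default, which is why the whitespace is converted to spaces first.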

Change History

Changed 15 months ago by robe

  • status changed from new to assigned

Changed 15 months ago by robe

  • milestone changed from PostGIS 2.0.0 to PostGIS 2.1.0

Changed 5 months ago by woodbri

  • cc woodbri added

Changed 7 weeks ago by robe

  • component changed from tiger geocoder to pagc_address_parser

Changed 7 weeks ago by robe

This works okay with pagc_normalize_address, except that instead of 3rd it returns 3.

Debating if that is okay or not. I'll have to see how geocoder handles it. I don't think it will care.

e.g.

select (a).address, a.predirabbrev, a.streetname from pagc_normalize_address(' 212 n 3rd ave, Minneapolis, mn 55401') As a ;

Yields:

 address | predirabbrev | streetname
---------+--------------+------------
     212 | N            | 3