Opened 13 years ago
Last modified 7 years ago
#1599 assigned defect
normalize_address() confused by country
Reported by: | mikepease | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS Fund Me |
Component: | pagc_address_parser | Version: | master |
Keywords: | Cc: |
Description
If you include a country at the end of an address, normalize_address() incorrectly parses the state and zip.
212 n 3rd ave, Minneapolis, mn 55401, USA (This is the format of the "formatted address" returned by Google's XML webservice)
select * from normalize_address('212 n 3rd ave, Minneapolis, mn 55401, USA')
Result:
stateabbrev
212;"N";"3rd";"Ave";"";"";"mn 55401";"";"";t
Change History (12)
comment:1 by , 13 years ago
Milestone: | PostGIS 2.0.0 → PostGIS 2.1.0 |
---|---|
Status: | new → assigned |
comment:2 by , 12 years ago
Cc: | added |
---|
comment:3 by , 12 years ago
Cc: | removed |
---|
comment:4 by , 12 years ago
Component: | tiger geocoder → pagc_address_parser |
---|
comment:5 by , 12 years ago
Version: | 1.5.X → trunk |
---|
Steve,
This one crashes my backend, (mingw32 installed on windows 7 32 VC++), though all the others I have tried work fine:
select * from pagc_normalize_address('212 n 3rd ave, Minneapolis, mn 55401, USA')
Gets to start load rules and then hits the bucket. Works fine it I take off the , USA.
I tried the original set to rule out my changes to the tables.
-- crashes -- SELECT (SELECT standardize_address( 'select seq, word::text, stdword::text, token from tiger.pagc_gaz union all select seq, word::text, stdword::text, token from tiger.pagc_lex ' , 'select seq, word::text, stdword::text, token from tiger.pagc_gaz order by id' , 'select * from tiger.pagc_rules order by id' , 'select 0::int4 as id, ' || quote_literal(address1) || '::text As micro, ' || quote_literal(city || ', ' || state || ' ' || zip) || '::text As macro') As pagc_addr FROM (SELECT * FROM parse_address('212 n 3rd ave, Minneapolis, mn 55401, USA') ) As a );
-- also crashes -- SELECT (SELECT standardize_address( 'select seq, word::text, stdword::text, token from gaz union all select seq, word::text, stdword::text, token from lex ' , 'select seq, word::text, stdword::text, token from gaz order by id' , 'select * from rules order by id' , 'select 0::int4 as id, ' || quote_literal(address1) || '::text As micro, ' || quote_literal(city || ', ' || state || ' ' || zip) || '::text As macro') As pagc_addr FROM (SELECT * FROM parse_address('212 n 3rd ave, Minneapolis, mn 55401, USA') ) As a );
The log:
NOTICE: Start standardize_address NOTICE: start load_lex NOTICE: Time to read 3764 lexicon records: 10.0 ms. NOTICE: start load_lex NOTICE: Time to read 834 lexicon records: 0.0 ms. NOTICE: start load_rules
comment:6 by , 12 years ago
Crash went away after r11247 , but not parsing right first parse seems wrong. Is this the issue you were talking about with country?
-- using hstore here for easier output -- SELECT (each(hstore(a))).* FROM parse_address('212 n 3rd ave, Minneapolis, mn 55401, USA') As a yields: key | value ----------+-------------------------------------- num | 212 zip | city | USA state | street | n 3rd ave, Minneapolis, mn 55401 country | US street2 | zipplus | address1 | 212 n 3rd ave, Minneapolis, mn 55401
This may have to do with how I'm concatenating.
comment:7 by , 12 years ago
I'll have to look into this more, but I'm out of town the rest of this week, with limited access.
comment:8 by , 12 years ago
Milestone: | PostGIS 2.1.0 → PostGIS 2.2.0 |
---|
comment:9 by , 9 years ago
Milestone: | PostGIS 2.2.0 → PostGIS 2.3.0 |
---|
comment:10 by , 9 years ago
Summary: | normalize_address() confused by coutry → normalize_address() confused by country |
---|
comment:11 by , 9 years ago
Milestone: | PostGIS 2.3.0 → PostGIS Future |
---|
PAGC tools handle this, but the single line parsers has a trivial bug to fix related to this.