Opened 10 years ago

Closed 9 years ago

Last modified 7 years ago

#2979 closed defect (wontfix)

address standardizer doesn't correctly parse street suffix if no ,

Reported by: robe
Owned by: robe
Priority: medium
Milestone: PostGIS Fund Me
Component: pagc_address_parser
Version: master
Keywords:
Cc:

Description

Sorry for this barrage of issues. Leo had a batch of addresses he needed to standardize and was comparing performance between normalize_address and standardize_address and came up with these issues.

Here is one that normalize_address parses correctly but standardize_address (and pagc_normalize_address by extension) does not:

SELECT 'std' AS parser, house_num, name, suftype, state, postcode
FROM standardize_address('lex', 'gaz', 'rules',
    '25 PINE ST PROVIDENCE RI 99999') AS std
UNION ALL
SELECT 'norm' AS parser, address::text, streetname, streettypeabbrev, stateabbrev, zip
FROM normalize_address('25 PINE ST PROVIDENCE RI 99999');

Output is:

 parser | house_num | name | suftype |    state     | postcode
--------+-----------+------+---------+--------------+----------
 std    | 25        | PINE |         | RHODE ISLAND | ST 99999
 norm   | 25        | PINE | St      | RI           | 99999

Note how, with standardize_address, the street type ends up in postcode.

However, both behave correctly if I put in commas:

SELECT 'std' AS parser, house_num, name, suftype, state, postcode
FROM standardize_address('lex', 'gaz', 'rules',
    '25 PINE ST, PROVIDENCE, RI 99999') AS std
UNION ALL
SELECT 'norm' AS parser, address::text, streetname, streettypeabbrev, stateabbrev, zip
FROM normalize_address('25 PINE ST, PROVIDENCE, RI 99999');

Output is:

 parser | house_num | name | suftype |    state     | postcode
--------+-----------+------+---------+--------------+----------
 std    | 25        | PINE | STREET  | RHODE ISLAND | 99999
 norm   | 25        | PINE | St      | RI           | 99999

Change History (5)

comment:1 by robe, 9 years ago

Owner: changed from woobri to robe

On closer inspection, this seems to be the fault of parse_address, which for this:

SELECT * FROM parse_address('25 PINE ST PROVIDENCE RI 99999');

gives:

 num | street | street2 | address1 |     city      | state |  zip  | zipplus | country
-----+--------+---------+----------+---------------+-------+-------+---------+---------
 25  | PINE   |         | 25 PINE  | ST PROVIDENCE | RI    | 99999 |         | US
(1 row)

but for this:

SELECT * FROM parse_address('25 PINE ST, PROVIDENCE, RI 99999');

gives the right answer:

 num | street  | street2 |  address1  |    city    | state |  zip  | zipplus | country
-----+---------+---------+------------+------------+-------+-------+---------+---------
 25  | PINE ST |         | 25 PINE ST | PROVIDENCE | RI    | 99999 |         | US

comment:2 by robe, 9 years ago

Milestone: changed from PostGIS 2.2.0 to PostGIS 2.3.0

comment:3 by robe, 9 years ago

Resolution: wontfix
Status: changed from new to closed

On closer inspection, this seems specific to PINE. Presumably this is because there is a regex match for PINE in the parseaddress-stcities.ht

If I change the street name to, say, OAK, the address parses correctly:

SELECT * FROM parse_address('25 OAK ST PROVIDENCE RI 99999');

 num | street | street2 | address1  |    city    | state |  zip  | zipplus | country
-----+--------+---------+-----------+------------+-------+-------+---------+---------
 25  | OAK ST |         | 25 OAK ST | PROVIDENCE | RI    | 99999 |         | US

That said, trying to fix this would probably do more harm than good.
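
One way to check whether other street names hit the same problem is to run parse_address over a handful of candidates in the comma-less form. This is only a sketch: PINE and OAK come from the examples above, ELM and MAPLE are arbitrary additions whose results are not known, and the city_has_suffix column is a hypothetical flag, not something parse_address returns. It assumes PostgreSQL 9.3 or later for LATERAL.

```sql
-- Probe parse_address with several street names, no commas in the input.
-- ELM and MAPLE are arbitrary test values chosen for illustration.
SELECT s.street_name,
       p.street,
       p.city,
       p.city LIKE 'ST %' AS city_has_suffix  -- hypothetical flag: suffix leaked into city
FROM (VALUES ('PINE'), ('OAK'), ('ELM'), ('MAPLE')) AS s(street_name)
CROSS JOIN LATERAL parse_address('25 ' || s.street_name || ' ST PROVIDENCE RI 99999') AS p;
```

Going by the results shown above, PINE should come back flagged and OAK should not. Note that the flag is a rough heuristic, since a legitimate city name can also begin with ST.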

comment:4 by robe, 9 years ago

Milestone: changed from PostGIS 2.3.0 to PostGIS Future

comment:5 by robe, 7 years ago

Milestone: changed from PostGIS Future to PostGIS Fund Me

Milestone renamed
