Opened 7 years ago

Closed 7 years ago

#3968 closed defect (worksforme)

postgis geocode bug - invalid input syntax for type numeric

Reported by: hammeryosi Owned by: robe
Priority: medium Milestone: PostGIS 2.4.3
Component: tiger geocoder Version: 2.4.x
Keywords: Cc:

Description

I am using postgis with the tiger extension to geocode addresses in our database at Dataline. I have successfully used it for tens of millions of addresses before running into this issue with a specific address in North Carolina. The geocoder function fails when I specify max_results=1.

Here’s what happens without specifying it:

select geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'))
'("(6709,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD10000027ACEC8778C053C09480511ED27F4140,5)'
'("(198,,Sarah,Ln,,,Rockfish,NC,28306,t,,)",0101000020AD1000007EBC9E12A8C353C07DC16057347D4140,71)'

And with max_results=1

select geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'), 1)
ERROR:  invalid input syntax for type numeric: " "
CONTEXT:  PL/pgSQL function geocode_address(norm_addy,integer,geometry) line 204 at assignment
PL/pgSQL function geocode(norm_addy,integer,geometry) line 14 at FOR over SELECT rows
SQL state: 22P02

This does not happen for other addresses, e.g.,

select geocode(normalize_address('6922 SIMS SCHOOL RD, SIMS, NC 27880'), 1)
'("(6922,,""Sims School"",Rd,,,Sims,NC,27880,t,,)",0101000020AD100000D0EB95D6808353C0364C7D2CFEDF4140,0)'

My system info:

'PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit'
'POSTGIS="2.4.1 r16012" PGSQL="100" GEOS="3.6.2-CAPI-1.10.2 4d2925d" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.2, released 2017/09/15" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER'

Any help would be appreciated. Thanks! Yosi Hammer, Data scientist at Dataline (http:
www.datalinedata.com)

Change History (4)

comment:1 by robe, 7 years ago

I'll take a look at this hopefully in next 2 days. I thought this issue I had already fixed, but might be another instance of it. The issue is data specific probably some strange stuff in one of the street numbers in addr table it happens to hit during filtering.

comment:2 by robe, 7 years ago

Hammer,

I loaded NC tiger 2017 data and tested, and not seeing an issue.

SELECT geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'), 1);

outputs:

                                                geocode

--------------------------------------------------------------------------------
------------------------
 ("(6922,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD100000B4AED5924DC1
53C0DFB3212DF17E4140,0)
(1 row)
SELECT geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'));

outputs

                                                geocode

--------------------------------------------------------------------------------
------------------------
 ("(6709,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD10000027ACEC8778C0
53C09480511ED27F4140,5)
 ("(198,,Sarah,Ln,,,Rockfish,NC,28306,t,,)",0101000020AD1000007EBC9E12A8C353C07D
C16057347D4140,71)
(2 rows)

I'm running:

PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit

POSTGIS="2.4.2 r16113" PGSQL="100" GEOS="3.6.2-CAPI-1.10.2 4d2925d" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.2, released 2017/09/15" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER

Very similar to yours, so I'm guessing you might be running an unpatched tiger geocoder. To check which version you are running, check what this outputs:

SELECT name, installed_version 
FROM pg_available_extensions WHERE name = 'postgis_tiger_geocoder';
          name          | installed_version
------------------------+-------------------
 postgis_tiger_geocoder | 2.4.1
(1 row)

I thought maybe it was something I fixed in 2.4.2, but as you can see I neglected to even upgrade my tiger to 2.4.2, so in theory I should be running the same version as you.

If you see something less than 2.4.1, then run this:

ALTER EXTENSION postgis_tiger_geocoder UPDATE;

and that should bring you up to date. Hopefully will fix your issue.

comment:3 by hammeryosi, 7 years ago

Hi, Thanks for your answer and sorry it took me a while to reply. Haven't got back to this until now.

Since posting this I set up another database with all states included. (Before I used different databases for different states to improve performance working on millions of addresses) The new database does not have this issue. The only difference between the setup process was using a different version of wget for downloading the census data. Could this have been something to do with a bad line in one of the census tables? I don't see how that would cause such a strange bug appearing only when specifying max_results=1.

In any case, since this bug is not reproducible even on my machine. I think we can close it.

Thanks again!

Yosi

comment:4 by robe, 7 years ago

Resolution: worksforme
Status: newclosed

hammeryosi,

No shouldn't have anything to do with wget. Perhaps your two different systems had two different versions of postgis_tiger_geocoder. Like I said, this bug looked familiar and one I had fixed.

Note: See TracTickets for help on using tickets.