Opened 16 years ago
Last modified 16 years ago
#121 closed task (fixed)
update tiger_geocoder to run in PostgreSQL 8.3 (patch attached)
Reported by: | cdwinslow | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 1.3.6 |
Component: | postgis | Version: | 1.3.X |
Keywords: | Cc: |
Description
I was playing with the TIGER geocoder this weekend and had to make a few changes for it to work in PostgreSQL 8.3. If I remember correctly they are all type casts or variable declaration changes to avoid comparing text to numeric values.
Attachments (2)
Change History (12)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Looked at this again today long enough to make a patch with only one cast per comparison. As part of the process, lpad() is no longer needed outside the pretty-printer function.
It seems the problem is that the lookup tables built/used by the geocoder use numeric fields for (zip,state,county…) codes while the analogous fields in the TIGER data are text.
comment:4 by , 16 years ago
This version looks much better. I am still confused by the presence of some of the casts, as text ≠ varchar should just work fine, and some of the ::integer additions shouldn't be required if the variable is already defined in PLPGSQL as type integer. Regina will be able to comment further though.
ATB,
Mark.
comment:5 by , 16 years ago
CD — yes this does look better. Regarding Mark's comment, the only redundancy I see is the below. That change you made seems unnecessary.
geocode_address_zip.sql — line 15 don't think is necessary since your tempInt is already an integer
ideally I would have rather changed the lookup tables to match the datatype of the census tables, but maybe we can consider that for a second pass. I think this is better than what we have already so maybe we can accept and revisit later.
Am I correct in saying this uses the old proprietary tiger format and doesn't use the new census 2007/2008 shapefile structure, or is the documentation just out of date? Looking at the table structurs loooks like the old format. In which case this needs a lot of rework anyway.
comment:6 by , 16 years ago
in reply to comment 5: agree, line 15 is redundant
agree, it would be nicer to fix up the tables, but I've just been looking at this in my spare time. Also…
yes, the geocoder currently works against just the TIGER data from 2006, so redoing the tables without updating them to use more current data formats seems a bit of a time-waste.
comment:7 by , 16 years ago
Okay so I'll go ahead and commit your changes minus line 15 and commit to 1.3 and 1.4. Mark do you have any objections? We can then work on the 1.5 to get it to use the new 2008. I'm willing to put in a bit more time into that effort.
comment:8 by , 16 years ago
No, that works for me as long as it still works on older PostgreSQL versions.
ATB,
Mark.
comment:9 by , 16 years ago
Applied to trunk and 1.3 minus line 15. I couldn't really test since I don't have all the tables, but can't see how any of these changes would break 8.2 and below. And quick checks on 8.2 with jury rigged examples seem to work fine.
by , 16 years ago
Attachment: | tiger_geocoder-pg-8.3.patch added |
---|
Thanks for the patch.
On review, there are several things in this patch which don't make sense with respect to the definition of the geocoding tables. My main confusion is as to why the geocoder feels the need to cast integer fields to padded text for comparision with another integer field to begin with? These parts of the query will never be able to use an index to start with…
I think that we need to go over the geocoder in conjunction with this patch and re-write several of the joins to remove some of the text padding. I suspect when this is done then the errors under 8.3 will just go away.
Setting to 1.3 as people have mentioned this several times before, and hence this should get picked up into 1.4 too.
ATB,
Mark.