Opened 5 years ago
Closed 4 years ago
#4655 closed defect (wontfix)
TIGER 'addrfeat' table definition vs. current 2019 shapefiles
Reported by: | jlrobins | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.0.3 |
Component: | tiger geocoder | Version: | 3.0.x |
Keywords: | Cc: |
Description
I'm bootstrapping a PostGIS 3.0.1, PostgreSQL 12.2, TIGER 2019 database from scratch against all possible layers for the southeast US. I've met an ounce of resistance when trying to load the 'addrfeat' layer for FL, which happened to be my first state. The following section of the results of
SELECT loader_generate_script(ARRAY['FL','NC', 'SC', 'GA'],'sh');`
…
${PSQL} -c "CREATE TABLE tiger_data.FL_addrfeat(CONSTRAINT pk_FL_addrfeat PRIMARY KEY (gid)) INHERITS(tiger.addrfeat);ALTER TABLE tiger_data.FL_addrfeat ALTER COLUMN statefp SET DEFAULT '12';" for z in *addrfeat*.dbf; do ${SHP2PGSQL} -D -S -D -s 4269 -g the_geom -W "latin1" $z tiger_staging.FL_addrfeat | ${PSQL} ${PSQL} -c "SELECT loader_load_staged_data(lower('FL_addrfeat'), lower('FL_addrfeat'));" done
ends up croaking within the loader_load_staged_data()
call due to a mismatch in columns from the shapefile vs. columns in base table addrfeat
:
ALTER TABLE Shapefile type: Arc Postgis type: LINESTRING[2] SET SET BEGIN CREATE TABLE ALTER TABLE addgeometrycolumn ---------------------------------------------------------------------- tiger_staging.fl_addrfeat.the_geom SRID:4269 TYPE:LINESTRING DIMS:2 (1 row) COPY 32505 COMMIT ANALYZE NOTICE: INSERT INTO tiger_data.fl_addrfeat(aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,rtohn,rtotyp,the_geom,tlid,zipl,zipr) SELECT aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,zipr FROM tiger_staging.fl_addrfeat; ERROR: INSERT has more expressions than target columns LINE 1: ...mtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,... ^ QUERY: INSERT INTO tiger_data.fl_addrfeat(aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,rtohn,rtotyp,the_geom,tlid,zipl,zipr) SELECT aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,zipr FROM tiger_staging.fl_addrfeat; CONTEXT: PL/pgSQL function loader_load_staged_data(text,text,text[]) line 26 at EXECUTE SQL function "loader_load_staged_data" statement 1
shp2pgsql from the shapefile within https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_13001_addrfeat.zip
has 3 columns not present in base table addrfeat
:
CREATE TABLE "tl_2019_12001_addrfeat" ( ... "tfidl" int8, "tfidr" int8, ... "road_mtfcc" varchar(5), ...
When I adjust tiger_geocoder.sql.in's definition of addrfeat
with how I see tfidl
, tfidr
, and similar concept edge_mtfcc
modeled:
CREATE TABLE addrfeat ( ... tfidl numeric(10), tfidr numeric(10), ... road_mtfcc character varying(5), ... );
Then initial population proceeds past this point.
I've not yet explored what an 'mftcc' is, but there are at least some differences in road_ vs. edge_ values in FL:
tiger=# select (edge_mtfcc = road_mtfcc) as same_mtfcc_value, count(*) from tiger_data.fl_addrfeat group by 1 order by 2 desc; same_mtfcc_value | count ------------------+--------- t | 1268547 f | 2615 (2 rows) tiger=# select gid, edge_mtfcc, road_mtfcc from tiger_data.fl_addrfeat where edge_mtfcc != road_mtfcc order by 1, 2, 3 limit 5; gid | edge_mtfcc | road_mtfcc -------+------------+------------ 1180 | P0001 | S1400 16533 | P0001 | S1200 16534 | P0001 | S1200 36457 | P0001 | S1200 36500 | P0001 | S1400 (5 rows)
Change History (6)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
ijlrobins thanks for the report. Just out of curiosity what do you use addrfeat for?
The geocoder doesn't use it at all and I think I have it turned off by default. I had originally thought about using it but after some experiments decided joining with addr and features as far as geocoding and disk space usage was the best way to go. At a glance looked redundant.
follow-up: 4 comment:3 by , 5 years ago
Component: | postgis → tiger geocoder |
---|---|
Owner: | changed from | to
comment:4 by , 5 years ago
Replying to robe:
Absolutely nothing at this point. Having manually imported ~2012-era census boundaries manually in my prior shop, and realizing just what a chore it was, I was excited to start out with someone else's repeatable efforts and then pick through the results to figure out what layers I'd like to keep. So I started from the perspective of 'import every possible layer; sort 'em out later.'
So at this point take it as advisement that some written code isn't working as hoped, regardless of its possible utility.
Thanks!
comment:5 by , 4 years ago
Milestone: | → PostGIS 3.0.3 |
---|
comment:6 by , 4 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Sorry, that last informational query should have projected the tlid of some interesting rows, not my own gid serial values: