Opened 21 years ago
Closed 9 years ago
#372 closed defect (fixed)
OGR/OGDI: All (?) string column field lengths shortened by 1 character
Reported by: | Markus Neteler | Owned by: | warmerdam |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | OGR_SF | Version: | unspecified |
Severity: | normal | Keywords: | |
Cc: | krbaker@…, Mateusz Łoskot |
Description (last modified by )
Frank,
after digging around with the extraction of political boundaries from VMAP0, I discovered, that the column length are not correct.
id: Integer (10.0) f_code: String (5.0) nam: String (64.0) na2: String (2.0) na3: String (1.0) tile_id: Integer (5.0) fac_id: Integer (10.0)
At least the column 'na2' is one character too short.
This works however (extraction of Botswana):
ogrinfo -ro -where "na2 = ' BC'" gltp:/vrf/grass0/warmerdam/v0soa/vmaplv0/soamafr 'polbnda@bnd(*)_area' INFO: Open of `gltp:/vrf/grass0/warmerdam/v0soa/vmaplv0/soamafr' using driver `OGDI' successful. Layer name: polbnda@bnd(*)_area Geometry: Polygon Feature Count: 10 Layer SRS WKT: GEOGCS["NAD83", [...]
while this fails:
ogrinfo -ro -where "na2 = ' B'" gltp:/vrf/grass0/warmerdam/v0soa/vmaplv0/soamafr 'polbnda@bnd(*)_area' INFO: Open of `gltp:/vrf/grass0/warmerdam/v0soa/vmaplv0/soamafr' using driver `OGDI' successful. Layer name: polbnda@bnd(*)_area Geometry: Polygon Feature Count: 0 Layer SRS WKT: [...]
It should be a 3 chars string just to be able to store UNK = No entry present
When exporting to SHAPE, and then using ARCExplorer, one character in the 'na2' column is lost.
To fetch the USA from NOAMER, I have to run
ogrinfo -ro -where "na2 = ' US'" gltp:/vrf/grass0/warmerdam/v0noa/vmaplv0/noamer 'polbnda@bnd(*)_area'
The bug may also be related to the fact, that the columns f_code, nam, na2, na3 start with a white space. Maybe the reading of the string is shifted by one character.
Hope it is easy to fix.
Thanks,
Markus
Change History (11)
comment:2 by , 21 years ago
I was asked to append to this problem with my issues. First my comments on what is below. The fields look the right length... I think "unk" is the representation of some sort of null... The space at the beginning of the string seems like the real problem. Though I don't know what UNK should be in the translation... (I could be out to lunch on this) IIRC the VPF file defines the field sizes in some header... its not like it goes parsing through... if we really want to keep the unk you might have to check if its less than three and set it to three... just a thought.... So I first tried this: > > ogr2ogr -f "PostgreSQL" PG:dbname=geoDB \ > > gltp:/vrf/mapdata/vmaplv0/v0eur/vmaplv0/eurnasia \ > > -nln political_bondries 'polbnda@bnd(*)_area' and it failed on > > ERROR 1: INSERT command for new feature failed. > > ERROR: value too long for type character(5) > > > > ERROR 1: Terminating translation prematurely after failed > > translation of layer polbnda@bnd(*)_area 1st try with debugging on yeilded: OGR_PG: PQexec(SELECT AddGeometryColumn('gazgeoDB','first_order','wkb_geometry',104306,'POLYGON',3)) OGR_PG: PQexec(INSERT INTO "first_order" (wkb_geometry , "id", "f_code", "nam", "na2", "na3", "tile_id", "fac_id") VALUES (GeometryFromText('POLYGON ((-179.12876892 71.60120392,-179.19979858 71.60615540,-179.32174683 71.60820770,-179.32791138 71.60778046,-179.34704590 71.60645294,-179.35140991 71.60566711,-179.36265564 71.60363007,-179.39033508 71.59860992,-179.40625000 71.59832764,-179.38592529 71.60366821,-179.35977173 71.60800934,-179.29605103 71.61209106,-179.13288879 71.60758209,-179.12300110 71.60483551,-179.12876892 71.60120392))'::TEXT,104306) , 1, ' FA001', ' OSTROV NAKHODKA#CHUKOTSKIY AVTONOMNYY OKRUG', ' RS', ' A', 1, 32)) ERROR 1: INSERT command for new feature failed. ERROR: value too long for type character(5) this was the table that created: Column | Type | Modifiers --------------+---------------+------------------------------------------------------------- ogc_fid | integer | not null default nextval('"first_order_ogc_fid_seq"'::text) wkb_geometry | geometry | id | numeric(10,0) | f_code | character(5) | nam | character(64) | na2 | character(2) | na3 | character(1) | tile_id | numeric(5,0) | fac_id | numeric(10,0) | then I defined my own table and changed the length of f_code to 6 and it worked failed on the na2 field which has the same problem (the leading whitespace) I then noticed that all of the text fields have this problem... This is the only solution I have come up with... Load into a custom built table (with fields that are 1 bigger than they should be) and then munge out all of the leading spaces. but thats a pain. Even more pain... I almost got it working. I dumped the vpf to shapefile with ogr2ogr and then loaded that into a postgres DB. This worked, but I just noticed that it hacked off one char in the na2 and na3 columns in the process of going to shapefile. I think this points to the OGDI interface or OGDI's VPF driver. I'll try and poke at it but I am not up on the code at all. Frank asked that I update my code to the latest and greatest as he thought he had fixed the problem. I got both CVS trees for OGDI and GDAL and I still have the same problem. My gdal tree is at least 7/16/03 (last changelog) and my OGDI tree is at least 1/29/03 (last changelog) the OGDI is from the source forge cvs tree and GDAL is from remote sensing.
comment:3 by , 21 years ago
Keith, the UNK will be simply "unknown". In fact the leading space character is the problem. The UNK was just an example. The main problem is that the strings are cut, so that e.g. the countries in the map polbnda@bnd(*)_area such as 'US' are cut to ' U' ^-- shifting space char inserted by OGR/OGDI driver which makes queries impossible. All strings should be shifted one position to left, then the problem will be fixed. Probably in a loop the index is 1 and not 0 or so (guessing). Comparison 'ogrinfo'/'ogdi_info': ogdi_info -u gltp:/vrf/grass0/warmerdam/v0noa/vmaplv0/noamer -l 'polbnda@bnd(*)_area' -f area -id 123 [...] Object ID:123 Object Attributes:124 { FA001 } { NORTHWEST TERRITORIES } { CA } { N } 2 12 0: (0.000000 , 0.000000) -> CA = Canada Now we do the same with ogr2ogr: ogrinfo -ro gltp:/vrf/grass0/warmerdam/v0noa/vmaplv0/noamer 'polbnda@bnd(*)_area' | less -> now search for 'NORTHWEST TERRITORIES' Do you get 'CA' for na2? I am using yesterday's CVS of OGR and OGDI from End of May. Best regards Markus
comment:4 by , 21 years ago
Guys, I just want to confirm that I have reproduced the problem of the extra space with the NOAMER dataset this morning, and will dig into it today. Sorry for all the delays.
comment:5 by , 21 years ago
I have patched gdal/ogr/ogrsf_frmts/ogdi/ogrogdilayer.cpp to trim all leading white space. VRF objects where being returned from OGDI with attribute lists like this: Object Attributes:1 { FA001 } { OSTROV NAKHODKA#CHUKOTSKIY AVTONOMNYY OKRUG } { RS } { A } 1 32 The ecs_FindElement() function used to parse the attribute lists was preserving the spaces around the attribute values even though, as far as I know, they are not part of the proper attribute valuein VRF. While I suspect the VRF driver should not have been returning the extra spaces in the first place, I am hesitant to fiddle with that code without being more sure. In any event, the implemented change resolves the problems seen. The change is in CVS.
comment:6 by , 21 years ago
Fixed the north american vpf data set... but Europe still has issues. Here is what I am seeing. Not 100% sure why this is happening... maybe some of the char's in the text string aren't really counted as a true char in vpf? krbaker@mosor:~$ CPL_DEBUG=ON /usr/bin/ogr2ogr -f PostgreSQL -nln vpfeur PG:dbname=gazgeoDB gltp:/vrf/afs/metacarta.com/map/public/vmaplv0/vpf/v0eur/vmaplv0/eurnasia 'polbnda@bnd(*)_area' OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: ~OGRDDataSource(0x804e060) OGR: OGROpen(gltp:/vrf/afs/metacarta.com/map/public/vmaplv0/vpf/v0eur/vmaplv0/eu rnasia) succeeded (0x804e060). OGR_PG: DBName="gazgeoDB" OGR_PG: PQexec(CREATE TABLE "vpfeur" ( OGC_FID SERIAL )) OGR_PG_NOTICE: NOTICE: CREATE TABLE will create implicit sequence 'vpfeur_ogc_f id_seq' for SERIAL column 'vpfeur.ogc_fid' OGR_PG_NOTICE: NOTICE: CREATE TABLE / UNIQUE will create implicit index 'vpfeur _ogc_fid_key' for table 'vpfeur' OGR_PG: PQexec(SELECT AddGeometryColumn('gazgeoDB','vpfeur','wkb_geometry',10430 6,'POLYGON',3)) OGR_PG: PQexec(INSERT INTO "vpfeur" (wkb_geometry , "id", "f_code", "nam", "na2" , "na3", "tile_id", "fac_id") VALUES (GeometryFromText('POLYGON ((83.61547089 74 .05305481,83.61058044 74.05364227,83.60916901 74.05919647,83.60236359 74.0624694 8,83.58677673 74.06372070,83.57983398 74.06427765,83.56016541 74.06913757,83.562 11090 74.07141876,83.57908630 74.07250214,83.59864044 74.06858063,83.60969543 74 .07038879,83.61058044 74.07875061,83.59558105 74.08491516,83.57338715 74.0905532 8,83.55777740 74.08980560,83.54805756 74.08666992,83.55371857 74.07788849,83.549 61395 74.07519531,83.53080750 74.07252502,83.49994659 74.07653046,83.47408295 74 .07736206,83.42522430 74.07375336,83.40255737 74.06983185,83.38527679 74.0649185 2,83.38138580 74.06114197,83.36988831 74.04994202,83.37863922 74.04900360,83.418 55621 74.05449677,83.43777466 74.06005859,83.45211029 74.05452728,83.44916534 74 .04907990,83.44255829 74.04714203,83.39152527 74.04241943,83.36538696 74.0427246 1,83.32577515 74.04766846,83.30449677 74.04738617,83.28072357 74.04955292,83.188 46893 74.06461334,83.17028046 74.06758118,83.14008331 74.07491302,83.09447479 74 .08122253,83.05180359 74.08396912,83.02561188 74.08258057,82.99508667 74.0780029 3,82.94633484 74.06383514,82.89636230 74.06047058,82.88488770 74.06314087,82.904 33502 74.06994629,82.93386078 74.06900024,82.96069336 74.07711029,82.98872375 74 .08555603,83.06880188 74.10530853,83.19869232 74.12564087,83.24594116 74.1300811 8,83.34505463 74.13055420,83.35436249 74.13008118,83.43444824 74.12616730,83.493 63708 74.12083435,83.56138611 74.10717010,83.63122559 74.07963562,83.63752747 74 .07350159,83.63697052 74.06397247,83.62777710 74.05449677,83.61547089 74.0530548 1))'::TEXT,104306) , 2160, 'FA001', 'OSTROV VOSTOCHNYY KAMENNYY#TAYMYRSKIY(DOLGA NO-NENETSKIY) AVT.OKRUG', 'RS', 'A', 4, 128)) ERROR 1: INSERT command for new feature failed. ERROR: value too long for type character(64) OGR_PG: SoftRollback() with no transaction active. ERROR 1: Terminating translation prematurely after failed translation of layer polbnda@bnd(*)_area
comment:7 by , 21 years ago
Hmm. Indeed "OSTROV VOSTOCHNYY KAMENNYY#TAYMYRSKIY(DOLGANO-NENETSKIY) AVT.OKRUG" is more than 64 characters, but I don't know why. I think the best fix for this might be to have the PostGIS driver truncate and warn while constructing the insert command rather than leaving it to Postgres to fail. I *suspect* this is some sort of issue with multi-byte characters, but none of the above string looks like a multi-byte escape sequence or anything like that. Does this approach sound OK to you?
comment:9 by , 17 years ago
Description: | modified (diff) |
---|
comment:10 by , 16 years ago
Cc: | added |
---|
comment:11 by , 16 years ago
Reporter: | changed from | to
---|
comment:12 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
With a pretty old GDAL 1.7.0b2, FWTools 2.4.7, released 2010/01/19 I get correct result if I query na2 column as 'BC'. No need to write it as ' BC' (which finds zero results). It looks like the core issue of this ticket is fixed long time ago.
ogrinfo -where "na2 = 'BC'" gltp:/vrf/C:/temp/v0soa_5/vmaplv0/soamafr "polbnda@bnd(*)_area"
Layer name: polbnda@bnd(*)_area Geometry: Polygon Feature Count: 10 Extent: (19.999535, -26.907246) - (29.360781, -17.780813) Layer SRS WKT: GEOGCS["NAD83", DATUM["North_American_Datum_1983", SPHEROID["GRS 1980",6378137,298.257222101, AUTHORITY["EPSG","7019"]], TOWGS84[0,0,0,0,0,0,0], AUTHORITY["EPSG","6269"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9108"]], AUTHORITY["EPSG","4269"]] id: Integer (10.0) f_code: String (5.0) nam: String (64.0) na2: String (2.0) na3: String (1.0) tile_id: Integer (5.0) fac_id: Integer (10.0) OGRFeature(polbnda@bnd(*)_area):2159 id (Integer) = 2160 f_code (String) = FA001 nam (String) = CHOBE na2 (String) = BC na3 (String) = F tile_id (Integer) = 31 fac_id (Integer) = 16