Opened 12 years ago

Closed 11 years ago

#1484 closed defect (duplicate)

NHD (ESRI) geodatabase reading problems

Reported by: rkgeorge@…
Owned by: Mateusz Łoskot
Priority: normal
Milestone:
Component: OGR_SF
Version: 1.4.0
Severity: normal
Keywords: PGeo "ESRI Geodatabase"
Cc: Jeff McKenna, springmeyer, MarkHirschi

Description (last modified by warmerdam)

The USGS NHD data is in ESRI geodatabase mdb format. Example cmd line:

ogr2ogr -f PostgreSQL PG:"user=user dbname=NHD host=localhost password=pass port=5432" NHDH0104.mdb

The command above creates a set of PostGIS tables, but the wkb_geometry fields of some tables are empty, for example NHDArea and HYDRO_NET_Junctions. The NHD shape fields appear to have type 19 for polygons and type 9 for points, neither of which I recognize. The shape object signatures appear to match PolygonM (25) and PointM (21), but since the type codes do not match the measure types, the Mmin, Mmax, and Marray values may mean something different.

thanks

Attachments (2)

CirTest.zip (118.3 KB) - added by hobu 12 years ago.
Two small .mdb's with three features each that do not return all geometries
arcmap-export.tar.gz (8.6 KB) - added by Mateusz Łoskot 12 years ago.
The cirtest23 data exported to shapefile and XML file using ESRI ArcMap (thanks to Albert for these files)


Change History (24)

comment:1 Changed 12 years ago by Mateusz Łoskot

Randy,

I cannot find the NHD data set.
I tried to find some samples on the NHD Data website:
http://nhd.usgs.gov/data.html
but it seems the FTP is down:
ftp://nhdftp.usgs.gov/SubRegions/

Could you provide me with the NHDH0104.mdb file, so I can try to reproduce your problems and fix them?

Thanks

comment:2 Changed 12 years ago by Mateusz Łoskot

Frank,
I'm re-assigning this bug to myself, as it's on my TODO list.

comment:3 Changed 12 years ago by Mateusz Łoskot

Description: modified (diff)
Status: new → assigned

Frank, I have a few considerations about NHD database support in the OGR that I'd like to discuss before I start fixing it.

1) Incomplete geometry types coverage

In OGRPGeoTableLayer::Initialize(), the switch block that decodes SHPT_* values to OGRwkbGeometryType does not include surface types like Polygon or MultiPolygon. Only the Point, LineString and MultiPoint types are decoded.

Is this intentional and correct, or is something unfinished here?

2) OGR geometry type is not set

The type decoded in point 1) is not set for the PGeo layer. The poFeatureDefn->SetGeomType() call in ogr/ogrsf_frmts/pgeo/ogrpgeotablelayer.cpp:188 is commented out and has no effect. As a result, every layer in a PGeo database gets the wkbUnknown type.

There is a comment that seems to explain it:

So for now we just always return wkbUnknown.

but I'm not sure if it should apply to all geometry types or it is supposed to apply only to linear types.

Anyway, setting geometry to wkbUnknown, for PGeo->PG translation, results in setting PostGIS geometry type to GEOMETRY.

3) Reading Shape of unknown type

As far as I have debugged and analysed the problem, the main reason empty geometries are transferred to PostGIS is that OGR does not recognize the PGeo Shape features correctly, so their geometries are never passed on.

For example, layer NHDArea:

  • PGeo features include geometry
  • layer NHDArea reports Shape type = 19, a value that does not exist among the Shapefile types (shapefil.h:152)
  • OGRPGeoLayer::createFromShapeBin() returns NULL geometry

What is the geometry of ShapeType = 19? Does the list of Shape types defined in Shapelib cover all types from ESRI Personal Geodatabase, or can PGeo extend the Shapefile types?
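The check described in point 3 can be sketched roughly as follows. This is an illustration in Python, not the driver's code: it assumes the shape blob begins with a 4-byte little-endian integer type code, as shapefile record contents do, and the names `SHAPELIB_TYPES` and `read_shape_type` are hypothetical. The code values are the SHPT_* constants from Shapelib's shapefil.h.

```python
import struct

# Shape type codes defined in Shapelib's shapefil.h.
SHAPELIB_TYPES = {
    0: "SHPT_NULL",
    1: "SHPT_POINT",
    3: "SHPT_ARC",
    5: "SHPT_POLYGON",
    8: "SHPT_MULTIPOINT",
    11: "SHPT_POINTZ",
    13: "SHPT_ARCZ",
    15: "SHPT_POLYGONZ",
    18: "SHPT_MULTIPOINTZ",
    21: "SHPT_POINTM",
    23: "SHPT_ARCM",
    25: "SHPT_POLYGONM",
    28: "SHPT_MULTIPOINTM",
    31: "SHPT_MULTIPATCH",
}


def read_shape_type(blob):
    """Return (code, name); name is None when Shapelib does not define the code."""
    if len(blob) < 4:
        raise ValueError("shape blob too short to contain a type code")
    # Assumption: the type code is the first 4 bytes, little-endian.
    (code,) = struct.unpack_from("<i", blob, 0)
    return code, SHAPELIB_TYPES.get(code)


if __name__ == "__main__":
    # 3 is a known code; 19 and 50 are the codes reported in this ticket.
    for code in (3, 19, 50):
        print(read_shape_type(struct.pack("<i", code)))
```

A lookup like this only tells us that a code is outside the Shapelib list; it says nothing about what the geometry actually is.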

4) Commented blocks of PGeo driver

Why is the code that reads MultiPoint geometries in createFromShapeBin() commented out? Also, there is code that reads surface geometries, but as I wrote in point 1) above, types like Polygon and MultiPolygon are not set anywhere.

I think the issues above should be reviewed and resolved in order to fix the geometry transfer problem.

comment:4 Changed 12 years ago by warmerdam

Cc: warmerdam added
Description: modified (diff)
Keywords: geodatabase added
Milestone: 1.4.1
Priority: highest → high
Summary: ogr2ogr translation of NHD geodatabase to PostGIS is incomplete → NHD (ESRI) geodatabase reading problems

There seem to be potentially serious issues with the geodatabase driver. I have moved this to 1.4.1 milestone, and I will review the points that Mateusz has raised.

comment:5 in reply to:  3 Changed 12 years ago by warmerdam

Milestone: 1.4.1 → 1.5.0
Priority: high → normal

Replying to mloskot:

1) Incomplete geometry types coverage

In OGRPGeoTableLayer::Initialize(), the switch block that decodes SHPT_* values to OGRwkbGeometryType does not include surface types like Polygon or MultiPolygon. Only the Point, LineString and MultiPoint types are decoded.

Is this intentional and correct, or is something unfinished here?

Mateusz,

This is intentional. SHPT_ values do not distinguish between polygons and multipolygons, so we have no way to know which to set, or even whether we will receive a mix of them. In fact, it is likely that multipoint or multilinestrings could occur in the other types in which case we likely ought to not be setting explicit geometry types for them either.
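To illustrate why a single polygon type code cannot tell the two cases apart: in the shapefile model a polygon record is just a list of rings, and the shapefile specification stores outer rings clockwise and holes counterclockwise. A record with more than one outer ring is effectively a multipolygon, which can only be discovered by inspecting the rings. The sketch below assumes PGeo polygon blobs follow the same ring convention, which this ticket does not confirm; `signed_area` and `classify_polygon_record` are hypothetical names.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def classify_polygon_record(rings):
    """Return "MultiPolygon" if the record has more than one outer (clockwise) ring."""
    outer_rings = sum(1 for ring in rings if signed_area(ring) < 0)
    return "MultiPolygon" if outer_rings > 1 else "Polygon"


if __name__ == "__main__":
    outer = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]  # clockwise
    hole = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75), (0.25, 0.25)]  # counterclockwise
    other = [(5, 5), (5, 6), (6, 6), (6, 5), (5, 5)]  # clockwise, disjoint from `outer`
    print(classify_polygon_record([outer, hole]))
    print(classify_polygon_record([outer, other]))
```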

2) OGR geometry type is not set

The type decoded in point 1) is not set for the PGeo layer. The poFeatureDefn->SetGeomType() call in ogr/ogrsf_frmts/pgeo/ogrpgeotablelayer.cpp:188 is commented out and has no effect. As a result, every layer in a PGeo database gets the wkbUnknown type.

There is a comment that seems to explain it:

So for now we just always return wkbUnknown.

but I'm not sure if it should apply to all geometry types or it is supposed to apply only to linear types.

Anyway, setting geometry to wkbUnknown, for PGeo->PG translation, results in setting PostGIS geometry type to GEOMETRY.

Ah, this was commented out for exactly the reason mentioned above. Even for point and line layers you can't be sure that multi versions will not occur. So we are left in the position of not knowing the layer type without scanning all features.

So, this code is operating properly, though it seems we really ought to remove some of the cruft from the Initialize() method (in trunk only).
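For illustration, here is roughly what "scanning all features" to pick a layer type would involve. This is not what the driver does (it returns wkbUnknown); it is a Python sketch with a hypothetical helper `layer_geometry_type` that works on plain geometry type names rather than OGR objects. It shows that the answer is only known after every feature has been visited.

```python
def layer_geometry_type(feature_types):
    """Collapse per-feature geometry type names into a single layer type.

    Returns the common type, the Multi* variant when single and multi forms
    of the same family are mixed, and "Unknown" otherwise.
    """
    seen = {t for t in feature_types if t is not None}
    if not seen:
        return "Unknown"
    if len(seen) == 1:
        return next(iter(seen))
    # Strip the "Multi" prefix to compare geometry families.
    families = {t[len("Multi"):] if t.startswith("Multi") else t for t in seen}
    if len(families) == 1:
        return "Multi" + families.pop()
    return "Unknown"


if __name__ == "__main__":
    print(layer_geometry_type(["LineString", "LineString"]))
    print(layer_geometry_type(["Polygon", "MultiPolygon", "Polygon"]))
    print(layer_geometry_type(["Point", "LineString"]))
```

For a large table this full pass is expensive, which is one reason reporting the unknown type up front is a reasonable trade-off.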

3) Reading Shape of unknown type

As far as I have debugged and analysed the problem, the main reason empty geometries are transferred to PostGIS is that OGR does not recognize the PGeo Shape features correctly, so their geometries are never passed on.

For example, layer NHDArea:

  • PGeo features include geometry
  • layer NHDArea reports Shape type = 19, a value that does not exist among the Shapefile types (shapefil.h:152)
  • OGRPGeoLayer::createFromShapeBin() returns NULL geometry

What is the geometry of ShapeType = 19? Does the list of Shape types defined in Shapelib cover all types from ESRI Personal Geodatabase, or can PGeo extend the Shapefile types?

I don't know what this type is, but it would appear to be the crux of the problem. This geometry binary chunk should be isolated for examination. We should also find out what ArcGIS thinks this geometry is.

4) Commented blocks of PGeo driver

Why is the code that reads MultiPoint geometries in createFromShapeBin() commented out? Also, there is code that reads surface geometries, but as I wrote in point 1) above, types like Polygon and MultiPolygon are not set anywhere.

I think the issues above should be reviewed and resolved in order to fix the geometry transfer problem.

I never had an example of the multipoint type, and the code was never ported from shapelib for use in pgeo. It is desirable for us to get an example of this, and complete this code.

I have retargeted this report to 1.5.0. I believe the improvements will take substantial work, and may change too much code to belong in the stable branch. Depending on how it goes we might port it back for 1.4.2.

comment:6 Changed 12 years ago by Mateusz Łoskot

Keywords: PostgreSQL PGeo "ESRI Geodatabase" added; geodatabase removed

Changed 12 years ago by hobu

Attachment: CirTest.zip added

Two small .mdb's with three features each that do not return all geometries

comment:7 Changed 12 years ago by hobu

re: the mdb's I've attached. Here's what ogrinfo's output looks like when these layers are in ArcSDE:

hobu@ubuntu:~/gdal$ ogrinfo SDE:localhost,5151,sde,sde,sde SDE.CIRTEST23
ERROR 4: SDE Driver doesn't support update.
Had to open data source read-only.
INFO: Open of `SDE:localhost,5151,sde,sde,sde'
     using driver `SDE' successful.

Layer name: SDE.CIRTEST23
Geometry: Unknown (any)
Feature Count: 3
Extent: (413989.682700, 4818278.332800) - (414088.684200, 4818642.400500)
Layer SRS WKT:
PROJCS["NAD_1983_UTM_Zone_12N",
   GEOGCS["GCS_North_American_1983",
       DATUM["North_American_Datum_1983",
           SPHEROID["GRS_1980",6378137.0,298.257222101]],
       PRIMEM["Greenwich",0.0],
       UNIT["Degree",0.0174532925199433]],
   PROJECTION["Transverse_Mercator"],
   PARAMETER["False_Easting",500000.0],
   PARAMETER["False_Northing",0.0],
   PARAMETER["Central_Meridian",-111.0],
   PARAMETER["Scale_Factor",0.9996],
   PARAMETER["Latitude_Of_Origin",0.0],
   UNIT["Meter",1.0]]
OBJECTID: Integer (10.0)
CIRCUITID: WideString (10.0)
SUBTYPE: Integer (4.0)
PHASE: WideString (10.0)
OPKV: Real (38.8)
WIRESIZE: WideString (15.0)
FEEDER: WideString (5.0)
YEARINSTALL: WideString (10.0)
GIS_DATE: String (0.0)
COMMENTS: WideString (255.0)
ENABLED: Integer (4.0)
CONDUITTYPE: WideString (25.0)
CONDUITSIZE: WideString (25.0)
OGRFeature(SDE.CIRTEST23):1
 OBJECTID (Integer) = 1
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 07/06/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (414080.956599999975879 4818351.567099999636412,414080.741299999994226 4818345.819799999706447,414079.563100000028498 4818301.640700000338256,414023.187800000014249 4818302.431099999696016,414022.79859999998007 4818295.745299999602139,414021.574499999987893 4818287.231599999591708,414020.571100000001024 4818282.66440000012517,414020.399600000004284 4818281.14159999974072,414020.301399999996647 4818278.332799999974668)

OGRFeature(SDE.CIRTEST23):2
 OBJECTID (Integer) = 2
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 07/06/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (414088.684200000017881 4818642.400500000454485,414084.396499999973457 4818482.867200000211596,414083.178000000014435 4818437.184700000099838,414081.959499999997206 4818391.502299999818206,414080.956599999975879 4818351.567099999636412)

OGRFeature(SDE.CIRTEST23):3
 OBJECTID (Integer) = 3
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 07/01/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (414080.956599999975879 4818351.567099999636412,414078.951000000000931 4818350.008200000040233,414078.609400000015739 4818345.876799999736249,414077.471100000024308 4818303.193300000391901,413989.682700000004843 4818304.457100000232458)

*********************************************************************


hobu@ubuntu:~/gdal$ ogrinfo SDE:localhost,5151,sde,sde,sde SDE.CIRTEST38
ERROR 4: SDE Driver doesn't support update.
Had to open data source read-only.
INFO: Open of `SDE:localhost,5151,sde,sde,sde'
     using driver `SDE' successful.

Layer name: SDE.CIRTEST38
Geometry: Unknown (any)
Feature Count: 3
Extent: (413285.716300, 4817350.113700) - (413399.000700, 4817546.133400)
Layer SRS WKT:
PROJCS["NAD_1983_UTM_Zone_12N",
   GEOGCS["GCS_North_American_1983",
       DATUM["North_American_Datum_1983",
           SPHEROID["GRS_1980",6378137.0,298.257222101]],
       PRIMEM["Greenwich",0.0],
       UNIT["Degree",0.0174532925199433]],
   PROJECTION["Transverse_Mercator"],
   PARAMETER["False_Easting",500000.0],
   PARAMETER["False_Northing",0.0],
   PARAMETER["Central_Meridian",-111.0],
   PARAMETER["Scale_Factor",0.9996],
   PARAMETER["Latitude_Of_Origin",0.0],
   UNIT["Meter",1.0]]
OBJECTID: Integer (10.0)
CIRCUITID: WideString (10.0)
SUBTYPE: Integer (4.0)
PHASE: WideString (10.0)
OPKV: Real (38.8)
WIRESIZE: WideString (15.0)
FEEDER: WideString (5.0)
YEARINSTALL: WideString (10.0)
GIS_DATE: String (0.0)
COMMENTS: WideString (255.0)
ENABLED: Integer (4.0)
CONDUITTYPE: WideString (25.0)
CONDUITSIZE: WideString (25.0)
OGRFeature(SDE.CIRTEST38):1
 OBJECTID (Integer) = 1
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 06/30/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (413373.621700000017881 4817393.042899999767542,413373.349199999996927 4817396.403500000014901,413371.517300000006799 4817396.264999999664724,413369.401300000026822 4817396.450100000016391,413367.349600000015926 4817396.999900000169873,413365.424500000022817 4817397.897599999792874,413363.684599999978673 4817399.115899999625981,413362.182600000000093 4817400.61780000012368,413360.964299999992363 4817402.357800000347197,413360.722599999979138 4817402.79700000025332,413360.571500000020023 4817403.071600000374019,413359.810100000002421 4817404.158999999985099,413358.871400000003632 4817405.097699999809265,413357.783899999980349 4817405.859199999831617,413356.580699999991339 4817406.420199999585748,413355.298400000028778 4817406.763799999840558,413353.975900000019465 4817406.879499999806285,413353.309300000022631 4817406.850300000049174,413351.14069999998901 4817406.659799999557436,413346.353100000007544 4817420.869300000369549,413346.835599999991246 4817470.018400000408292,413344.673999999999069 4817473.106599999591708)

OGRFeature(SDE.CIRTEST38):2
 OBJECTID (Integer) = 2
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 06/30/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (413344.673999999999069 4817473.106599999591708,413345.116900000022724 4817504.437699999660254,413337.863899999996647 4817508.734600000083447,413322.043400000024121 4817526.840699999593198,413312.560200000007171 4817545.736399999819696,413285.716299999970943 4817546.133399999700487)

OGRFeature(SDE.CIRTEST38):3
 OBJECTID (Integer) = 3
 CIRCUITID (WideString) = (null)
 SUBTYPE (Integer) = 0
 PHASE (WideString) = (null)
 OPKV (Real) =                            12.50000000
 WIRESIZE (WideString) = (null)
 FEEDER (WideString) = (null)
 YEARINSTALL (WideString) = (null)
 GIS_DATE (String) = 00:00:00 06/30/2005
 COMMENTS (WideString) = (null)
 ENABLED (Integer) = 1
 CONDUITTYPE (WideString) = (null)
 CONDUITSIZE (WideString) = (null)
 LINESTRING (413399.000699999975041 4817350.113699999637902,413374.038300000014715 4817355.196899999864399,413373.621700000017881 4817393.042899999767542)



comment:8 Changed 12 years ago by Mateusz Łoskot

Priority: normal → high

comment:9 Changed 12 years ago by Mateusz Łoskot

I added a debug message that reports problems translating a binary shape to an OGR geometry (r12718). When testing Hobu's dataset, I found that the geometry type of some features, like FID=1 from cirtest23, is reported as nShapeType = 50. There is no such number within the range of shape types; the maximum is SHPT_MULTIPATCH = 31.

I'm not sure, but to me this looks like a bug in the PGeo data.

comment:10 Changed 12 years ago by Mateusz Łoskot

Following Frank's suggestion in his reply to my long comment above, we need to know how ArcGIS reports the geometry type of this feature (FID=1, table cirtest23).

I added extra CPLDebug messages (r12718 and r12719), and here is what they say for cirtest23:

ogr2ogr -f PostgreSQL PG:dbname=bugs2 test.mdb
...
OGR_PG: Layer 'cirtest23' geometry type: GEOMETRY:Unknown (any), Dim=2
...
PGeo: Shape type read from PGeo data is nSHPType = 50
PGeo: Translation shape binary to OGR geometry failed (FID=1)
PGeo: Shape type read from PGeo data is nSHPType = 3
PGeo: Shape type read from PGeo data is nSHPType = 3

Note that the last two messages confirm that the correct shape type is detected for FID=2 and FID=3. The shape type for FID=1 is incorrect.

I've pointed out that the same problem is reported for the NHDH0104.mdb database; here is the output with the new CPLDebug messages added:

ogr2ogr -f PostgreSQL PG:dbname=bugs2 NHDH0104.mdb NHDArea
...
PGeo: Shape type read from PGeo data is nSHPType = 19
PGeo: Translation shape binary to OGR geometry failed (FID=1)
PGeo: Shape type read from PGeo data is nSHPType = 19
PGeo: Translation shape binary to OGR geometry failed (FID=2)
...
PGeo: Shape type read from PGeo data is nSHPType = 19
PGeo: Translation shape binary to OGR geometry failed (FID=126)
PGeo: 126 features read on layer 'NHDArea'.

Again, we don't know what ArcGIS says about the geometry type reported here as nSHPType=19.
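When the debug output gets long, as with the 126 NHDArea features, a small script can tally it. The following is a sketch only: it assumes the message wording shown in the logs above stays exactly as is, and the function name `summarize_pgeo_debug` is hypothetical.

```python
import re
from collections import Counter

# Patterns copied from the CPLDebug messages shown in this comment.
TYPE_RE = re.compile(r"PGeo: Shape type read from PGeo data is nSHPType = (\d+)")
FAIL_RE = re.compile(r"PGeo: Translation shape binary to OGR geometry failed \(FID=(\d+)\)")


def summarize_pgeo_debug(lines):
    """Return (Counter of shape type codes, list of FIDs whose translation failed)."""
    type_counts = Counter()
    failed_fids = []
    for line in lines:
        match = TYPE_RE.search(line)
        if match:
            type_counts[int(match.group(1))] += 1
            continue
        match = FAIL_RE.search(line)
        if match:
            failed_fids.append(int(match.group(1)))
    return type_counts, failed_fids


if __name__ == "__main__":
    log = [
        "PGeo: Shape type read from PGeo data is nSHPType = 50",
        "PGeo: Translation shape binary to OGR geometry failed (FID=1)",
        "PGeo: Shape type read from PGeo data is nSHPType = 3",
        "PGeo: Shape type read from PGeo data is nSHPType = 3",
    ]
    counts, failed = summarize_pgeo_debug(log)
    print(dict(counts), failed)
```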

comment:11 Changed 12 years ago by warmerdam

BTW, my guess is that geometry types 19 and 50 are new specialized geometries, perhaps CAD related. Deducing how to parse them will depend on someone with ArcGIS (or perhaps FME) giving details of what the geometry of these features *should* be.

Changed 12 years ago by Mateusz Łoskot

Attachment: arcmap-export.tar.gz added

The cirtest23 data exported to shapefile and XML file using ESRI ArcMap (thanks to Albert for these files)

comment:12 Changed 12 years ago by Mateusz Łoskot

Frank,

It looks like we've realized this problem is not a bug but missing support for some features of the PGeo format. Are we going to implement it (as a feature enhancement) for 1.5.0, or defer this task?

comment:13 Changed 12 years ago by warmerdam

Mateusz,

If we can implement the missing geometry types in a manageable amount of time, we should. If it is impractical, then defer.

comment:14 Changed 12 years ago by hobu

/me is hopeful this can make it into 1.5

comment:15 Changed 12 years ago by Mateusz Łoskot

OK, I will do my best to implement it.

comment:16 Changed 12 years ago by warmerdam

Keywords: PostgreSQL removed
Priority: high → normal

not a show stopper.

comment:17 Changed 12 years ago by hobu

Milestone: 1.5.0 → 1.6.0

Pushing forward to 1.6...

comment:18 Changed 11 years ago by Jeff McKenna

Cc: Jeff McKenna added

comment:19 Changed 11 years ago by springmeyer

Cc: springmeyer added

Thanks to Randy for posting this issue and the effort going into it. I'm just starting to work with NHD data, so this is helpful info. It seems that for my purposes the NHD data provided in shapefile format may suffice for now. In case anyone else would benefit from accessing shapefiles of NHD until this defect is fixed it is possible to extract them for one or multiple subbasins through the USGS mapping interface here: http://nhdgeo.usgs.gov/viewer.htm. (look in the lower left for the extraction tools and instructions).

comment:20 Changed 11 years ago by MarkHirschi

Cc: MarkHirschi added

comment:21 Changed 11 years ago by warmerdam

Cc: warmerdam removed
Milestone: 1.6.0 → 1.5.4

A careful review of an additional sample database (TARP_Calumet.mdb) has led me to believe that the type 50 geometries look a lot like line strings. A test of this theory has apparently been successful, and I have incorporated the logic into trunk (r15662) and the 1.5 branch (r15663).
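As a rough illustration of the "looks like a line string" theory (not the committed logic from r15662), a blob can be read with the shapefile PolyLine record layout regardless of its type code: a type code, a bounding box of four doubles, the part and point counts, the part start indexes, then the x/y pairs. The sketch assumes a type 50 blob starts with exactly this layout and ignores any data after the points array; `parse_polyline_layout` is a hypothetical name.

```python
import struct


def parse_polyline_layout(blob):
    """Parse a blob with the shapefile PolyLine layout; return (type_code, list of lines)."""
    # Header: type (int32), bbox (4 doubles), NumParts (int32), NumPoints (int32) = 44 bytes.
    shape_type, _xmin, _ymin, _xmax, _ymax, num_parts, num_points = struct.unpack_from(
        "<i4d2i", blob, 0
    )
    offset = 44
    parts = list(struct.unpack_from("<%di" % num_parts, blob, offset))
    offset += 4 * num_parts
    coords = struct.unpack_from("<%dd" % (2 * num_points), blob, offset)
    points = list(zip(coords[0::2], coords[1::2]))
    # Each part runs from its start index up to the start of the next part.
    bounds = parts + [num_points]
    lines = [points[bounds[i]:bounds[i + 1]] for i in range(num_parts)]
    return shape_type, lines


if __name__ == "__main__":
    # Synthetic blob: type code 50, one part, two points.
    blob = (
        struct.pack("<i4d2i", 50, 0.0, 0.0, 1.0, 2.0, 1, 2)
        + struct.pack("<i", 0)
        + struct.pack("<4d", 0.0, 0.0, 1.0, 2.0)
    )
    print(parse_polyline_layout(blob))
```

Anything stored beyond the vertex list, such as curve information, would be silently dropped by a reader like this.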

I am going to do a bit of testing with the NHD dataset ...

comment:22 Changed 11 years ago by warmerdam

Milestone: 1.5.4
Resolution: duplicate
Status: assigned → closed

I see that type 50 geometries are more complicated and include circular arcs. I have moved this whole topic off to ticket #2643.

The core of the NHD problem appears to be type 9/19/29 geometries. I'm moving this problem off to ticket #1918.

I am closing this ticket now since it has grown unwieldy and unfocused. See the other two tickets for followup.
