Opened 16 years ago

Closed 16 years ago

#2124 closed defect (invalid)

Handling of Shapefile Z and M data in shapelib is incorrect

Reported by: odegaard Owned by: warmerdam
Priority: normal Milestone:
Component: OGR_SF Version: 1.5.0 betas/RCs
Severity: normal Keywords: shapelib shape
Cc: Mateusz Łoskot

Description (last modified by warmerdam)

I discovered a bug in how shapelib handles Z and M data. I converted a Polygon shapefile to PolygonZ using:

 OGR2OGR.exe -f "ESRI Shapefile" countriesZ.shp countries.shp -lco SHPT=PolygonZ

On the shapefile page, there is this warning: "Shapefiles with measure values are not supported". The thing is that you cannot create a shapefile with Z values without also having measure values. It's either XY, XYM or XYZM (see the spec). I confirmed this with files created using ArcCatalog, and sure enough that's how it is (in ArcCatalog you can choose to only have Z values in the file, but it really creates a Z+M file).

The problem is that the shapetype header is set to 15 (=PolygonZ*), so a shapefile reader will think there are both Z and M values in the file. Shape readers that ignore Z and M don't have a problem, but readers that do read them will either get incorrect values or risk an EOF exception.

*The same thing of course goes for points and polylines.

Technically what OGR really creates with the above command is PolygonM, but with an incorrect shapetype value. So the quickest fix would be to change the shapetype to 25 (generally add 10) and change the warning to "Shapefiles with Z values are not supported" (although adding support for both Z and M would of course be the best, and still fairly straightforward).
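For reference, the shape type numbering being discussed can be sketched as below. The codes are the ones listed in the ESRI shapefile specification; the names BASE_TYPES, z_variant and m_variant are made up for this illustration and are not part of shapelib or OGR.

```python
# Shape type codes as listed in the ESRI shapefile specification.
# BASE_TYPES, z_variant and m_variant are illustrative names only,
# not shapelib or OGR identifiers.
BASE_TYPES = {"Point": 1, "PolyLine": 3, "Polygon": 5, "MultiPoint": 8}


def z_variant(base_code: int) -> int:
    """Code of the Z variant of a 2D shape type (base + 10)."""
    return base_code + 10


def m_variant(base_code: int) -> int:
    """Code of the M variant of a 2D shape type (base + 20)."""
    return base_code + 20


if __name__ == "__main__":
    for name, code in BASE_TYPES.items():
        print(f"{name}: {code}  {name}Z: {z_variant(code)}  {name}M: {m_variant(code)}")
```

With this numbering, PolygonZ is 15 and PolygonM is 25, which is where the "add 10" in the suggestion above comes from.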

Change History (7)

comment:1 by warmerdam, 16 years ago

Component: Utilities → OGR_SF
Description: modified (diff)
Keywords: shape added
Status: new → assigned

comment:2 by warmerdam, 16 years ago

Morten,

I have reviewed OGR briefly, and Shapelib more carefully, and it seems that Shapelib is creating "M" objects correctly with XYZM.

"Z" objects are created with XYZM if an M value is passed into SHPCreateObject(). They are created as XYZ if no measure is provided. This is done under the assumption that it is legal to write shapefiles with the "M" tuple omitted from the object and that applications will detect this properly.
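One way a reader could detect the omitted "M" block is by comparing the record's content length with the two sizes the layout allows. The following is only a sketch of that idea, assuming the PolyLineZ/PolygonZ record layout from the shapefile specification; the function names are hypothetical and this is not how Shapelib itself is written.

```python
def polygonz_content_length(num_parts: int, num_points: int, with_m: bool) -> int:
    """Content length in bytes of a PolyLineZ/PolygonZ record.

    Layout assumed from the shapefile specification; the function name
    is illustrative only.
    """
    length = 4                       # shape type
    length += 32                     # bounding box: Xmin, Ymin, Xmax, Ymax
    length += 4 + 4                  # NumParts, NumPoints
    length += 4 * num_parts          # Parts index array
    length += 16 * num_points        # X, Y pairs
    length += 16 + 8 * num_points    # Zmin, Zmax, Z array
    if with_m:
        length += 16 + 8 * num_points  # Mmin, Mmax, M array
    return length


def has_measures(content_length_bytes: int, num_parts: int, num_points: int) -> bool:
    """Decide from the record size whether the M block is present.

    The record header stores the content length in 16-bit words, so the
    caller is expected to multiply that value by 2 before passing it in.
    """
    if content_length_bytes == polygonz_content_length(num_parts, num_points, True):
        return True
    if content_length_bytes == polygonz_content_length(num_parts, num_points, False):
        return False
    raise ValueError("record length matches neither the XYZ nor the XYZM layout")
```

For a single ring of 5 points this gives 184 bytes without measures and 240 bytes with them, so the two cases cannot be confused. A reader that skips this check and always reads the M block would run past the end of an XYZ record.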

Whether this is "in spec" or not is not entirely clear to me. The shapefile specification does not seem to address this possibility (at least on brief review). But the code was written this way based on experience with some ESRI created files, and in consultation with Craig Bruce (at Cubewerx) who seemed convinced this was appropriate behavior.

Are you running into specific applications that can't read these XYZ files? It seems that Shapelib is reading it properly, and I haven't had reports of problems reading such files with ArcGIS.

Applications using Shapelib can avoid creating these XYZ files just by providing an "M" array in SHPCreateObject().

So I'm not sure how to analyse whether the XYZ type files are appropriate or not. If you are running into applications which crash on these files, then I'd be willing to at least change OGR to create Point/Arc/Polygon25D files as XYZM instead of XYZ (by passing a zeroed measure array into SHPCreateObject()).
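To illustrate what "XYZM with a zeroed measure array" would look like on disk, here is a rough sketch that packs the content of one PolygonZ record and fills in zeros when no measures are given. It assumes the record layout from the shapefile specification and at least one vertex; pack_polygonz is a hypothetical helper, not the SHPCreateObject() implementation.

```python
import struct


def pack_polygonz(parts, points, m_values=None):
    """Pack the content of one PolygonZ record (without the record header).

    parts:    list of start indices, one per ring
    points:   non-empty list of (x, y, z) tuples
    m_values: optional list of measures; zeros are written when omitted,
              so the M block is always present.
    Hypothetical helper for illustration, not shapelib code.
    """
    if m_values is None:
        m_values = [0.0] * len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    zs = [p[2] for p in points]

    out = struct.pack("<i", 15)                                    # shape type: PolygonZ
    out += struct.pack("<4d", min(xs), min(ys), max(xs), max(ys))  # bounding box
    out += struct.pack("<2i", len(parts), len(points))             # NumParts, NumPoints
    out += struct.pack(f"<{len(parts)}i", *parts)                  # Parts array
    for x, y, _ in points:
        out += struct.pack("<2d", x, y)                            # X, Y pairs
    out += struct.pack("<2d", min(zs), max(zs))                    # Z range
    out += struct.pack(f"<{len(zs)}d", *zs)                        # Z array
    out += struct.pack("<2d", min(m_values), max(m_values))        # M range
    out += struct.pack(f"<{len(m_values)}d", *m_values)            # M array
    return out


ring = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (1.0, 1.0, 3.0), (0.0, 1.0, 4.0), (0.0, 0.0, 1.0)]
record = pack_polygonz([0], ring)
```

The trailing M range and M array are all zero bytes here, so a reader that expects XYZM finds the full record, and a reader that only wants X, Y and Z can ignore the tail.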

Awaiting feedback...

comment:3 by odegaard, 16 years ago

Just because some applications have built-in error-checking doesn't make it right. I think the spec is very clear on this, given that it nowhere describes XYZ data, and shapetype 15 is XYZM polygons.

You are correct that ArcMap will read these files. The most common reason that it goes well is that most apps only bother with X and Y data, sometimes Z, and rarely M. So if you never read the M value, you will never see the problem.

The problem will be with any library that doesn't perform extra error checking but just adheres to the spec and tries to read M values (in which case your assumption that applications detect this is wrong, and to be honest it is a poor assumption that apps should check for non-standard files and guess the format). Passing in zeroes for M is fine. That's for instance what Arc* does if you choose to make a Z-only file (which, as I said earlier, is not really Z-only).

As an enhancement it would be nice also to be able to create the XYM format.

I agree that it's kinda weird that you can make XY, XYM and XYZM, but not XYZ (given that Z is probably more common than M). That's why I re-read the spec and analyzed the files ArcMap creates to confirm that it wasn't an error in the spec.

comment:4 by warmerdam, 16 years ago

http://bugzilla.maptools.org/show_bug.cgi?id=1249 relates to this issue.

I have re-reviewed the shapefile specification at:

http://shapelib.maptools.org/dl/shapefile.pdf

Note that the Mmin, Mmax, and Marray fields in the MultiPointZ record (Table 13, page 16) are marked with an asterisk, which the note below the table explains as "optional". The same notation is used for PolyLineZ and PolygonZ records and, interestingly, for MultiPointM, PolyLineM and PolygonM. Oddly this is not done for PointZ.

So, I disagree with your position that the M optionality is not in the specification. I believe it is. Possibly a case could be made that PointZ should always include the M value.

comment:5 by Mateusz Łoskot, 16 years ago

Cc: Mateusz Łoskot added

comment:6 by odegaard, 16 years ago

Wow, I completely overlooked the asterisks. I bow in the dust. Sorry I wasted your time. (funny that ArcMap writes them out then)

comment:7 by warmerdam, 16 years ago

Resolution: invalid
Status: assigned → closed