Opened 16 years ago

Closed 16 years ago

#1933 closed enhancement (wontfix)

GeoTIFF GTIFMemBufFromWkt data is platform dependent but should always be MSB ordered

Reported by: castalia Owned by: warmerdam
Priority: high Milestone:
Component: default Version: 1.4.2
Severity: major Keywords: GeoTIFF "data order"
Cc:

Description (last modified by warmerdam)

GeoTIFF binary data should be MSB (big-endian) ordered. However, the GTIFMemBufFromWkt function (found in frmts/gtiff/gt_wkt_srs.cpp) which generates the GeoTIFF data - using TIFFSetField, GTIFSetFromOGISDefn and GTIFWriteKeys functions - produces LSB ordered binary data on LSB (little-endian) platforms. Since the contents of the data buffer returned by GTIFMemBufFromWkt are opaque to the user of the function, it, or its dependencies, should detect the platform data order and ensure that MSB ordered binary data is always returned. Note that the function user cannot reliably reorder the data since what is returned is a mix of text and binary data.

The sample data below (non-printable byte values are printed in hex format), generated by GTIFMemBufFromWkt from the same source on each platform, demonstrates the problem.

MSB platform (Darwin.PowerPC):

GeoTIFF data, 499 bytes -
MM 0x00 * 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x0f 0x01 0x00 0x00 0x03
0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0x01 0x01 0x00 0x03 0x00 0x00
0x00 0x01 0x00 0x01 0x00 0x00 0x01 0x02 0x00 0x03 0x00 0x00 0x00 0x01
0x00 0x08 0x00 0x00 0x01 0x03 0x00 0x03 0x00 0x00 0x00 0x01 0x00 0x01
0x00 0x00 0x01 0x06 0x00 0x03 0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00
0x01 0x11 0x00 0x04 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x08 0x01 0x15
0x00 0x03 0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0x01 0x16 0x00 0x03
0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0x01 0x17 0x00 0x04 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x01 0x01 0x1c 0x00 0x03 0x00 0x00 0x00 0x01
0x00 0x01 0x00 0x00 0x83 0x0e 0x00 0x0c 0x00 0x00 0x00 0x03 0x00 0x00
0x00 0xc4 0x84 0x82 0x00 0x0c 0x00 0x00 0x00 0x06 0x00 0x00 0x00 0xdc
0x87 0xaf 0x00 0x03 0x00 0x00 0x00 L 0x00 0x00 0x01 0x0c 0x87 0xb0 0x00
0x0c 0x00 0x00 0x00 0x06 0x00 0x00 0x01 0xa4 0x87 0xb1 0x00 0x02 0x00
0x00 0x00 0x1f 0x00 0x00 0x01 0xd4 0x00 0x00 0x00 0x00 ? 0xd0 0x00 0x00
0x00 0x00 0x00 0x00 ? 0xd0 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0xc0 0x85 'M 0xf3 0xb6 E 0xa2 0xc1 0x14 0x89 &' 0xcc : 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0x00
0x12 0x04 0x00 0x00 0x00 0x00 0x01 0x00 0x01 0x04 0x01 0x00 0x00 0x00
0x01 0x00 0x01 0x04 0x02 0x87 0xb1 0x00 0x15 0x00 0x00 0x08 0x00 0x00
0x00 0x00 0x01 0x7f 0xff 0x08 0x01 0x87 0xb1 0x00 0x09 0x00 0x15 0x08
0x02 0x00 0x00 0x00 0x01 0x7f 0xff 0x08 0x06 0x00 0x00 0x00 0x01 # 0x8e
0x08 0x08 0x00 0x00 0x00 0x01 0x7f 0xff 0x08 0x09 0x87 0xb0 0x00 0x01
0x00 0x04 0x08 0x0a 0x87 0xb0 0x00 0x01 0x00 0x05 0x0c 0x00 0x00 0x00
0x00 0x01 0x7f 0xff 0x0c 0x02 0x00 0x00 0x00 0x01 0x7f 0xff 0x0c 0x03
0x00 0x00 0x00 0x01 0x00 0x11 0x0c 0x04 0x00 0x00 0x00 0x01 #) 0x0c
0x0a 0x87 0xb0 0x00 0x01 0x00 0x02 0x0c 0x0b 0x87 0xb0 0x00 0x01 0x00
0x03 0x0c 0x10 0x87 0xb0 0x00 0x01 0x00 0x01 0x0c 0x11 0x87 0xb0 0x00
0x01 0x00 0x00 0xc0 0x14 0x00 0x00 0x00 0x00 0x00 0x00 0xc0 R 0xdb C
0x95 0x81 0x06 % 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 AI 0xe8 0xe2 h 0x10 bNAI 0xe8 0xe2 h 0x10
bNEquirectangular MARS|GCS_MARS| 0x00

LSB platform (Linux.X86_64):

GeoTIFF data, 499 bytes -
MM* 0x00 0x0a 0x00 0x00 0x00 0x00  0x0f 0x00 0x00 0x01 0x03 0x00 0x01
0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x01 0x01 0x03 0x00 0x01 0x00 0x00
0x00 0x00 0x00 0x01 0x00 0x02 0x01 0x03 0x00 0x01 0x00 0x00 0x00 0x00
0x00 0x08 0x00 0x03 0x01 0x03 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x01
0x00 0x06 0x01 0x03 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x11
0x01 0x04 0x00 0x01 0x00 0x00 0x00 0x08 0x00 0x00 0x00 0x15 0x01 0x03
0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x16 0x01 0x03 0x00 0x01
0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x17 0x01 0x04 0x00 0x01 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x1c 0x01 0x03 0x00 0x01 0x00 0x00 0x00 0x00
0x00 0x01 0x00 0x0e 0x83 0x0c 0x00 0x03 0x00 0x00 0x00 0xc4 0x00 0x00
0x00 0x82 0x84 0x0c 0x00 0x06 0x00 0x00 0x00 0xdc 0x00 0x00 0x00 0xaf
0x87 0x03 0x00 L 0x00 0x00 0x00 0x0c 0x01 0x00 0x00 0xb0 0x87 0x0c 0x00
0x06 0x00 0x00 0x00 0xa4 0x01 0x00 0x00 0xb1 0x87 0x02 0x00 0x1f 0x00
0x00 0x00 0xd4 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0xd0 ? 0x00 0x00 0x00 0x00 0x00 0x00 0xd0 ? 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0xa2 E 0xb6 0xf3 M' 0x85 0xc0 0x00 : 0xcc '& 0x89 0x14 0xc1 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 0x00 0x01 0x00 0x00 0x00 0x12
0x00 0x00 0x04 0x00 0x00 0x01 0x00 0x01 0x00 0x01 0x04 0x00 0x00 0x01
0x00 0x01 0x00 0x02 0x04 0xb1 0x87 0x15 0x00 0x00 0x00 0x00 0x08 0x00
0x00 0x01 0x00 0xff 0x7f 0x01 0x08 0xb1 0x87 0x09 0x00 0x15 0x00 0x02
0x08 0x00 0x00 0x01 0x00 0xff 0x7f 0x06 0x08 0x00 0x00 0x01 0x00 0x8e #
0x08 0x08 0x00 0x00 0x01 0x00 0xff 0x7f 0x09 0x08 0xb0 0x87 0x01 0x00
0x04 0x00 0x0a 0x08 0xb0 0x87 0x01 0x00 0x05 0x00 0x00 0x0c 0x00 0x00
0x01 0x00 0xff 0x7f 0x02 0x0c 0x00 0x00 0x01 0x00 0xff 0x7f 0x03 0x0c
0x00 0x00 0x01 0x00 0x11 0x00 0x04 0x0c 0x00 0x00 0x01 0x00 )# 0x0a
0x0c 0xb0 0x87 0x01 0x00 0x02 0x00 0x0b 0x0c 0xb0 0x87 0x01 0x00 0x03
0x00 0x10 0x0c 0xb0 0x87 0x01 0x00 0x01 0x00 0x11 0x0c 0xb0 0x87 0x01
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x14 0xc0 % 0x06 0x81 0x95
C 0xdb R 0xc0 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 Nb 0x10 h 0xe2 0xe8 IANb 0x10 h 0xe2 0xe8
IAEquirectangular MARS|GCS_MARS| 0x00

Bradford Castalia
Senior Systems Analyst
Planetary Image Research Laboratory
University of Arizona
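
As printed, the MSB dump above begins with the bytes M M 0x00 * (4D 4D 00 2A), while the LSB dump begins with M M * 0x00 (4D 4D 2A 00). In a TIFF header the first two bytes declare the byte order ("MM" for big-endian, "II" for little-endian) and the next two hold the number 42 in that order, so the second buffer appears to declare one order and use the other. The following is a minimal sketch of such a consistency check. The names tiff_check_header and tiff_hdr_status are hypothetical and are not part of libtiff, libgeotiff or GDAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of inspecting the first four bytes of a TIFF buffer. */
typedef enum {
    TIFF_HDR_BIG_ENDIAN,     /* "MM" marker, magic stored as 00 2A */
    TIFF_HDR_LITTLE_ENDIAN,  /* "II" marker, magic stored as 2A 00 */
    TIFF_HDR_INCONSISTENT,   /* valid marker, magic stored in the other order */
    TIFF_HDR_INVALID         /* too short, or not a TIFF header at all */
} tiff_hdr_status;

/* Hypothetical helper (not a libtiff or GDAL function): checks that the
 * byte-order marker and the magic number 42 agree with each other. */
static tiff_hdr_status tiff_check_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 4)
        return TIFF_HDR_INVALID;

    int is_mm = (buf[0] == 'M' && buf[1] == 'M');
    int is_ii = (buf[0] == 'I' && buf[1] == 'I');
    if (!is_mm && !is_ii)
        return TIFF_HDR_INVALID;

    /* Decode bytes 2..3 in both orders using shifts, so the result does
     * not depend on the byte order of the machine running the check. */
    uint16_t as_be = (uint16_t)((buf[2] << 8) | buf[3]);
    uint16_t as_le = (uint16_t)((buf[3] << 8) | buf[2]);

    if (is_mm && as_be == 42)
        return TIFF_HDR_BIG_ENDIAN;
    if (is_ii && as_le == 42)
        return TIFF_HDR_LITTLE_ENDIAN;
    if (as_be == 42 || as_le == 42)
        return TIFF_HDR_INCONSISTENT;
    return TIFF_HDR_INVALID;
}
```

Passing the first four bytes of each dump to tiff_check_header would classify the MSB buffer as big-endian and the LSB buffer as inconsistent, which is one plausible reason a strict reader would reject the latter.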

Change History (9)

comment:1 by warmerdam, 16 years ago

Description: modified (diff)
Priority: highest → high
Severity: blocker → major
Status: new → assigned

comment:2 by warmerdam, 16 years ago

Description: modified (diff)

Bradford,

I'm not at all clear on why you write "GeoTIFF binary data should be MSB (high-endian) ordered."

On the face of it, I see no problem with LSB GeoTIFFs being generated on LSB platforms.

comment:3 by castalia, 16 years ago

Caveat: The test for correctness of the GeoTIFF data is being done using the IDL/ENVI application. This application correctly reports and uses the GeoTIFF data when it is from a file generated on an MSB platform, but does not recognize the same, but re-ordered, GeoTIFF data when it is from a file generated on an LSB platform.

Further testing of the GTIFMemBufFromWkt results by seeing if GTIFWktFromMemBuf can interpret it shows that this succeeds on both MSB and LSB platforms.

The workaround for this gotcha would be to coerce the GeoTIFF data to always be MSB ordered. Is there a way to instruct the GDAL functions to do this?

Bradford Castalia

comment:4 by castalia, 16 years ago

The source of the architecture-specific inconsistencies in the GeoTIFF data has been traced to cross-platform GDAL configuration interaction during the build process. Here at PIRL we use a heterogeneous environment with four primary system architectures and a highly integrated shared filesystem organization. In particular, software include header files are presumed to be sharable across platforms since they are not expected to contain architecture-specific content (I've only encountered one other case in about twenty years managing software development at PIRL where this problem occurred).

By running numerous test builds during which configure settings were adjusted, I was able to build and install GDAL such that the subsequent build of the application using GDAL produced correct GeoTIFF data on each platform. I remain concerned, however, that there are installed GDAL include files with architecture-specific settings that lie in wait like land mines for the unwary developer who would like to use GDAL in an application. It also remains confusing to me why the loop-back test of the GTIFMemBufFromWkt results by GTIFWktFromMemBuf always succeeded even though the data written to the file was indigestible by all applications that were used to test the file content. I'm presuming this is a result of the cross-platform inconsistencies propagating to the data parsing algorithms such that the data generation problems were effectively canceled by the corresponding problem in the data interpreter.

A useful outcome of this experience was a change that I implemented in the GTIFMemBufFromWkt function to force the GeoTIFF data to always be MSB order (the choice of MSB over LSB is because all of our other instrument data is MSB). The ability to generate data with a consistent data order, regardless of the architecture of the system hosting the data generation software, is very important to data providers. The GTIFMemBufFromWkt function should have modality control via an additional argument for this capability rather than the forced MSB coding I used.
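
The patch itself is not shown in this ticket. As an illustration only of what "a consistent data order regardless of the host architecture" with "modality control via an additional argument" can look like, here is a minimal sketch that serializes integers by shifting values rather than copying host memory. The names byte_order, put_u16, put_u32 and write_tiff_header are hypothetical and do not exist in GDAL or libtiff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector for the order of the bytes written out. */
typedef enum { ORDER_MSB_FIRST, ORDER_LSB_FIRST } byte_order;

/* Store a 16-bit value at dst in the requested order. The shifts act on
 * the value, not on memory, so the output is identical on any host. */
static void put_u16(unsigned char *dst, uint16_t v, byte_order order)
{
    assert(dst != NULL);
    if (order == ORDER_MSB_FIRST) {
        dst[0] = (unsigned char)(v >> 8);
        dst[1] = (unsigned char)(v & 0xff);
    } else {
        dst[0] = (unsigned char)(v & 0xff);
        dst[1] = (unsigned char)(v >> 8);
    }
}

/* Store a 32-bit value at dst in the requested order. */
static void put_u32(unsigned char *dst, uint32_t v, byte_order order)
{
    assert(dst != NULL);
    for (int i = 0; i < 4; i++) {
        int shift = (order == ORDER_MSB_FIRST) ? 8 * (3 - i) : 8 * i;
        dst[i] = (unsigned char)((v >> shift) & 0xff);
    }
}

/* Write the 8-byte TIFF header: byte-order marker, the number 42, and the
 * offset of the first IFD. Returns the number of bytes written. */
static size_t write_tiff_header(unsigned char *dst, uint32_t ifd_offset,
                                byte_order order)
{
    assert(dst != NULL);
    dst[0] = dst[1] = (order == ORDER_MSB_FIRST) ? 'M' : 'I';
    put_u16(dst + 2, 42, order);
    put_u32(dst + 4, ifd_offset, order);
    return 8;
}
```

Calling write_tiff_header(buf, 10, ORDER_MSB_FIRST) yields the same eight bytes that open the MSB dump in the description (M M 0x00 * 0x00 0x00 0x00 0x0a) on either kind of platform. Separately, libtiff's TIFFOpen documents the mode letters b and l for forcing big-endian or little-endian output when creating a file; whether GTIFMemBufFromWkt can pass such a mode through is not established in this ticket.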

Bradford Castalia

comment:5 by warmerdam, 16 years ago

Resolution: worksforme
Status: assigned → closed

Bradford,

There are indeed many platform specific definitions in the gdal_config.h include file (generated by configure), not just the endian ones. So GDAL include files should not be considered platform independent.

The libtiff default is to generate TIFFs in the native platform byte order, and I don't see a compelling need to offer an override to that. As noted, your local patch should handle your special requirement.

I'm closing this ticket, but please reopen if there is a specific bug still needing to be fixed.

comment:6 by castalia, 16 years ago

Resolution: worksforme
Status: closed → reopened
Type: defect → enhancement

The cross-platform dependencies are due to installed include files; gdal_config.h is not an installed config file (it is only used during the GDAL build process). The problem arises under these circumstances: GDAL is built on MSB platform A with include files installed in /pub/include, where all open source include files are installed. Then GDAL is built on LSB platform B using the compiler option -I/pub/include to pick up include files needed for GDAL dependencies. The latter build will get GDAL architecture-specific configuration information for platform A, not the local platform B. The case that I find most obvious (there may be others) is the cpl_port.h file, which is included by the key gdal.h file. This contains platform data order configuration information such as the section starting at line 304:

/*---------------------------------------------------------------------
 *                         CPL_LSB and CPL_MSB
 * Only one of these 2 macros should be defined and specifies the byte 
 * ordering for the current platform.  
 * This should be defined in the Makefile, but if it is not then
 * the default is CPL_LSB (Intel ordering, LSB first).
 *--------------------------------------------------------------------*/
#if defined(WORDS_BIGENDIAN) && !defined(CPL_MSB) && !defined(CPL_LSB)
#  define CPL_MSB
#endif

#if ! ( defined(CPL_LSB) || defined(CPL_MSB) )
#define CPL_LSB
#endif

#if defined(CPL_LSB)
#  define CPL_IS_LSB 1
#else
#  define CPL_IS_LSB 0
#endif

Building a test stub that includes cpl_port.h and reports the value of CPL_IS_LSB shows that it has the value 1 on all of our platforms regardless of native data order. Is the implication of this section that the requirement of specifying the platform data order has been left to the user to handle in their Makefile and if they don't (because they don't know about this) they get LSB by default? And how is this reconciled with the libgdal operations that may have been compiled with a different CPL_IS_LSB value? My experience with this issue shows that it silently results in the GeoTIFF data being scrambled.
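
The test stub is not included in the ticket. A minimal sketch of the idea follows, assuming the fallback logic of the excerpt quoted above is copied into the file so that it compiles without GDAL. It compares the compile-time macro with a runtime probe of the host. The helper names host_is_lsb, byte_order_macro_matches_host and require_matching_byte_order are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same fallback logic as the cpl_port.h excerpt quoted above, copied here
 * so the sketch compiles without GDAL. When none of WORDS_BIGENDIAN,
 * CPL_MSB or CPL_LSB is supplied, it silently selects CPL_LSB. */
#if defined(WORDS_BIGENDIAN) && !defined(CPL_MSB) && !defined(CPL_LSB)
#  define CPL_MSB
#endif

#if ! ( defined(CPL_LSB) || defined(CPL_MSB) )
#  define CPL_LSB
#endif

#if defined(CPL_LSB)
#  define CPL_IS_LSB 1
#else
#  define CPL_IS_LSB 0
#endif

/* Runtime probe: returns 1 if the host stores the least significant byte
 * of a multi-byte integer first, 0 otherwise. */
static int host_is_lsb(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);  /* look at the lowest-addressed byte */
    return first == 1;
}

/* Returns 1 if the compile-time macro agrees with the actual host. */
static int byte_order_macro_matches_host(void)
{
    return CPL_IS_LSB == host_is_lsb();
}

/* Fails fast (in builds without NDEBUG) when headers configured for a
 * different platform were picked up at compile time. */
static void require_matching_byte_order(void)
{
    assert(byte_order_macro_matches_host());
}
```

Compiled with no byte-order macro defined, CPL_IS_LSB evaluates to 1 even on an MSB host, which matches the behavior reported above; on such a host byte_order_macro_matches_host() would return 0.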

The compelling need to control the GeoTIFF data order is for data producers for whom a specified data order is a critical requirement.

Thnx,

Bradford Castalia
Sr. Systems Analyst
Planetary Image Research Laboratory
HiRISE Operations Center
University of Arizona
Tucson, Arizona

comment:7 by warmerdam, 16 years ago

Bradford,

I misspoke earlier when referring to gdal_config.h. The file is actually cpl_config.h, and this *is* installed with the other include files even though it is not platform independent. Yes, I realize this is not typical for configure based packages that produce a config.h file.

cpl_port.h gets the byte order from cpl_config.h. It is not intended that applications linking against GDAL need to define this anywhere.

Note that it doesn't usually matter if you get include files with the wrong endianness when building an application against GDAL since it does not affect interfaces which is all that applications generally use. It can however make compilation difficult as cpl_config.h is used to select include files to pull in.

So, in fact, unless application code is doing some funky stuff based on the cpl definitions for byte order, I don't see why it would matter if you get the right or wrong endianness of GDAL include files.

comment:8 by castalia, 16 years ago

Frank,

It's certainly quite a challenge to manage a software build system with as many build dependencies as GDAL, especially when these dependencies include platform-specific conditions. Nevertheless, that's clearly the source of the problem that was causing the GeoTIFF data to be unrecognizable by applications that do recognize valid GeoTIFF data. As proof of this hypothesis I tested GeoTIFF data generated on an LSB system using the identical application code (employing a call to GTIFMemBufFromWkt to get the data) and source file under two conditions. In the first, GDAL was built and installed on the LSB system using a configuration that included the shared open-source software include files directory where an immediately preceding GDAL build and install on an MSB platform had placed its header files. In the second, GDAL was built and installed on the LSB system without including the shared include directory containing the files from the MSB install in the configuration. In the former case the resulting file was not recognized by ENVI, but in the latter case the resulting file was recognized by ENVI. So clearly the problem was not an inability of ENVI to recognize LSB GeoTIFF data vs MSB data; the problem was due to the conditions of the GDAL build. I ran several other matrix-of-conditions tests to confirm that the source of the problem was cross-platform configuration confusion during the GDAL build.

Bottom line: It does in fact matter that the correct platform-specific information is in the installed GDAL include files. At the very least this affects subsequent GDAL builds in heterogeneous environments, as I encountered here at PIRL, and can easily be the source of hidden gotchas.

My recommendation, from having wrestled with this issue myself, is to limit platform-specific configuration include files to private files that are used only at package build time and are not installed in the public area. If user applications employing GDAL do not need the platform-specific information of cpl_port.h/cpl_config.h to successfully compile, then this information should not be in these files. As you say, the publicly installed include files should only define the public software interfaces, not the private workings of their implementations.

Bradford Castalia
Sr. Systems Analyst
Planetary Image Research Laboratory
HiRISE Operations Center
University of Arizona
Tucson, Arizona

comment:9 by warmerdam, 16 years ago

Resolution: wontfix
Status: reopened → closed

Understood, thanks.
