Opened 12 years ago
Closed 12 years ago
#1971 closed defect (fixed)
64bit LFS support for G_ftell() and PRI_OFF_T gis.h
Reported by: | hamish | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 6.4.3 |
Component: | LibGIS | Version: | svn-develbranch6 |
Keywords: | r.in.bin, LFS, G_ftell(), PRI_OFF_T | Cc: | |
CPU: | x86-64 | Platform: | Linux |
Description
Hi,
as reported on the users' ML, r.in.bin is failing for input files bigger than 2GB. thread: "ERROR: Bytes do not match file size with r.in.bin (but file size is correct!!)"
It works in trunk, but not 6.x. Just after 6.4.1 was released r.in.bin got upgraded to off_t, but PRI_OFF_T in GRASS 6 set by gis.h remains as "ld" because
LFS_CFLAGS = -D_FILE_OFFSET_BITS=64
is missing from Grass.make (it is there in G7). That may just be the overflow in the error message though, I think the real problem is G_ftell() is always returning int, even when ftello() should be returning off_t.
fwiw in G6 only the flags for wxWidgets get D_FILE_OFFSET'd. In G7 PRI_OFF_T is "lld" and r.in.bin works on a >2GB test file.
here's a little Matlab/Octave code to make one:
%%%% make a >2GB binary data file fd1 = fopen('lfs_test.bin', 'w'); rows=19450 cols=29404 for i = 1:rows row_content = [1:cols] * sqrt(i); fwrite(fd1, row_content, 'real*4'); end fclose(fd1) %%%%
and import command:
$ ls -l lfs_test.bin -rw-r--r-- 1 hamish hamish 2287631200 May 10 12:43 lfs_test.bin GRASS> r.in.bin -f input=lfs_test.bin output=outputmap bytes=4 \ n=51:05:20.4N s=41:21:50.4N w=5:08:31.2W e=9:33:36E \ r=19450 c=29404 anull=-9999.0
the error in GRASS 6 is:
WARNING: File Size -2007336096 ... Total Bytes 2287631200 ERROR: Bytes do not match file size
This is a 64bit system, and GRASS6> g.version -b | tr ' ' '\n'
has the --enable-64bit option, but I'm not really sure why that's needed, as autoconf already knows the platform. (for cross-compiling from 32bit?)
thanks, Hamish
Change History (14)
follow-up: 2 comment:1 by , 12 years ago
follow-up: 3 comment:2 by , 12 years ago
Replying to hamish:
AFAICT r.in.bin is the only thing in GRASS 6 which uses G_ftell(), and nothing in the grass6-addons repo uses it. It was there in the 6.4.2 release though, so someone might be using it in a personal addon program. Could it harm backwards compatibility to change the function def'n, or would it be harmless?
The worst that can happen is that off_t gets truncated to int when the result is assigned (which doesn't apply to r.in.bin, which assigns the result of G_ftell() to an off_t), rather than within G_ftell().
The truncation to int was done in r45468, but I don't know why. Possibly to avoid issues with off_t having different sizes in different files (didn't LFS have to be enabled on a module-by-module basis in 6.x?).
The irony is that having G_fseek() return int is even worse than using plain fseek(), which returns a long. This imposes a 32-bit limit even on "real" 64-bit platforms (i.e. not Windows, where even the 64-bit versions have a 32-bit long).
At the very least, G_fseek() should return a long, if not an off_t.
Seeing that it is only r.in.bin, a less invasive fix to get the 6.4.3 release out the door might be to add the
#ifdef HAVE_FSEEKO
test into r.in.bin directly and avoid using G_ftell() altogether.
7.0 has 31 uses of G_ftell() and 48 uses of G_fseek() (although a bunch of those are in r.viewshed and r.terraflow, which all come from the same AMI_STREAM template). The fact that 6.x only has one use indicate that everything else is using plain ftell(), which means that it will fail on files larger than 2GiB.
follow-up: 5 comment:3 by , 12 years ago
Replying to glynn:
(didn't LFS have to be enabled on a module-by-module basis in 6.x?).
not any more:
r40620 | glynn | 2010-01-23 03:19:24 +1300 (Sat, 23 Jan 2010) | 4 lines Enable LFS globally Define, use HAVE_FSEEKO Makefile cleanup
7.0 has 31 uses of G_ftell() and 48 uses of G_fseek() [...]. The fact that 6.x only has one use indicate that everything else is using plain ftell(), which means that it will fail on files larger than 2GiB.
outside of libgis, the module priorities would seem to me to be:
- r.in.bin
- r.in.ascii
- r.in.poly
- r.in.xyz (although there it's just used for a G_percent() estimate)
r3.in.ascii
- r.out.mat
- v.in.dxf
Hamish
follow-up: 6 comment:4 by , 12 years ago
fwiw my glibc docs show vanilla fseek() returning int, it's ftell() that returns long:
int fseek(FILE *stream, long offset, int whence); long ftell(FILE *stream);
comment:5 by , 12 years ago
Replying to hamish:
Replying to glynn:
(didn't LFS have to be enabled on a module-by-module basis in 6.x?).
not any more:
r40620 | glynn | 2010-01-23 03:19:24 +1300 (Sat, 23 Jan 2010) | 4 lines
That's trunk.
7.0 has 31 uses of G_ftell() and 48 uses of G_fseek() [...]. The fact that 6.x only has one use indicate that everything else is using plain ftell(), which means that it will fail on files larger than 2GiB.
outside of libgis, the module priorities would seem to me to be:
FWIW, the modules and libraries which use G_fseek() and G_ftell() in 7.x are:
general/g.rename imagery/i.find include/iostream lib/dspf lib/gis lib/iostream lib/raster lib/raster3d lib/rst/interp_float lib/sites lib/vector/Vlib lib/vector/diglib ps/ps.map raster/r.coin raster/r.cut.mat raster/r.in.arc raster/r.in.ascii raster/r.in.bin raster/r.in.mat raster/r.in.poly raster/r.in.xyz raster3d/r3.in.bin vector/v.in.dxf vector/v.support vector/v.vol.rst
follow-up: 7 comment:6 by , 12 years ago
Replying to hamish:
fwiw my glibc docs show vanilla fseek() returning int, it's ftell() that returns long:
Right. ftell() returns an offset, fseek() accepts one (and returns a status). The G_* versions should use at least a long for offsets.
comment:7 by , 12 years ago
Glynn wrote:
That's trunk.
ah, the history was imported as the file was recently copied from trunk. ok, so 6.x is still LFS-on-demand. Added to devbr6 r.in.bin Makefile in r56207. formal audit of all modules which might need it can come later.
Replying to glynn:
The G_* versions should use at least a long for offsets.
if I undo r45468 and add
#include <sys/types.h> /* for off_t */
to include/gisdefs.h, then devbr6 builds ok and r.in.bin + the >2GB file imports ok for me on 64bit linux.
Glynn:
The truncation to int was done in r45468, but I don't know why. Possibly to avoid issues with off_t having different sizes in different files (didn't LFS have to be enabled on a module-by-module basis in 6.x?).
Martin, do you remember why the change to int was needed? I can't find anything in the archives about it.
thanks, Hamish
comment:8 by , 12 years ago
in devbr6 I've committed r56422 which changes it back from int to off_t for testing.
Hamish
follow-up: 10 comment:9 by , 12 years ago
Hi,
it would help testing if the nightly wingrass builds had the --enable-largefile ./configure switch enabled in the 6.5 build.
thanks, Hamish
follow-up: 11 comment:10 by , 12 years ago
Replying to hamish:
it would help testing if the nightly wingrass builds had the --enable-largefile ./configure switch enabled in the 6.5 build.
changing there?:
http://trac.osgeo.org/grass/browser/grass/branches/develbranch_6/mswindows/osgeo4w/package.sh#L116
follow-up: 12 comment:11 by , 12 years ago
follow-up: 13 comment:12 by , 12 years ago
Replying to hamish:
Replying to hellik:
changing there?: [...]/develbranch_6/mswindows/osgeo4w/package.sh#L116
seems like it, thanks; change made in r56545. (see also grass-addons/tools/wingrass-packager/grass_compile.sh which calls that)
Note that --enable-largefile has no effect with wingrass 6.x. Only trunk has LFS on windows. See also #1131 and the LFS-related changes to include/config.h.in [0].
[0] https://trac.osgeo.org/grass/browser/grass/trunk/include/config.h.in#L282
comment:13 by , 12 years ago
Replying to mmetz:
Note that --enable-largefile has no effect with wingrass 6.x.
I guess the thing to do then is to delay worrying about it for wingrass, and just focus on get it working at least for 32bit Linux/Mac. I'll try to get to that tomorrow after a bit more testing, if all goes well I'll backport r56422 to relbr64 and then we should be ok to move the milestone to 6.4.4. (LFS audit for other modules as part of that)
Hamish
comment:14 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
AFAICT r.in.bin is the only thing in GRASS 6 which uses G_ftell(), and nothing in the grass6-addons repo uses it. It was there in the 6.4.2 release though, so someone might be using it in a personal addon program. Could it harm backwards compatibility to change the function def'n, or would it be harmless?
Seeing that it is only r.in.bin, a less invasive fix to get the 6.4.3 release out the door might be to add the
#ifdef HAVE_FSEEKO
test into r.in.bin directly and avoid using G_ftell() altogether.Hamish