Opened 14 years ago

Closed 11 years ago

#1131 closed defect (fixed)

Global LFS for wingrass

Reported by: mmetz Owned by: grass-dev@…
Priority: critical Milestone: 7.0.0
Component: LibGIS Version: svn-trunk
Keywords: LFS, wingrass Cc:
CPU: All Platform: MSWindows 7

Description

Large File Support (LFS) is a must for any modern GIS package. In GRASS 7, it is globally enabled, but not for wingrass. This is not a blocker considering that Linux and Mac are the OS's of choice for power users, but for the numerous wingrass users it should be available.

Change History (22)

comment:1 by mmetz, 14 years ago

Following up previous discussions, I have removed struct stat where possible. The changes needed for LFS with wingrass are in config.h.in but still inactive.

Glynn, could you please check if I messed up something somewhere (all the struct stat related changes and config.h.in)?

Thanks,

Markus M

in reply to:  1 ; comment:2 by glynn, 14 years ago

Replying to mmetz:

Following up previous discussions, I have removed struct stat where possible. The changes needed for LFS with wingrass are in config.h.in but still inactive.

Glynn, could you please check if I messed up something somewhere (all the struct stat related changes and config.h.in)?

The config.h.in changes seem okay. Although, I would omit the aliases for seek() and tell(), as we don't use those (does seek() even exist? I can't find any evidence for it).

I don't see any "struct stat" changes. Currently, 7.0 has references to "struct stat" in:

display/d.font/main.c
general/g.access/get_perms.c
general/g.mkfontcap/freetype_fonts.c
include/gisdefs.h
include/iostream/ami_stream.h
lib/db/dbmi_base/isdir.c
lib/gis/copy_dir.c
lib/gis/mapset_msc.c
lib/gis/mapset_nme.c
lib/gis/paths.c
lib/gis/remove.c
lib/gis/user_config.c
lib/init/clean_temp.c
lib/vector/Vlib/open.c
lib/vector/dglib/examples/parse.c
lib/vector/diglib/file.c
raster/r.li/r.li.daemon/daemon.c
vector/v.mapcalc/map.c

in reply to:  2 ; comment:3 by mmetz, 14 years ago

Replying to glynn:

Replying to mmetz:

Following up previous discussions, I have removed struct stat where possible. The changes needed for LFS with wingrass are in config.h.in but still inactive.

Glynn, could you please check if I messed up something somewhere (all the struct stat related changes and config.h.in)?

The config.h.in changes seem okay.

They were not. STRUCT_STAT needs the be always defined, fixed.

Although, I would omit the aliases for seek() and tell()

Done.

I don't see any "struct stat" changes. Currently, 7.0 has references to "struct stat" in:

> display/d.font/main.c
> general/g.access/get_perms.c
> general/g.mkfontcap/freetype_fonts.c
> include/gisdefs.h
> include/iostream/ami_stream.h
> lib/db/dbmi_base/isdir.c
> lib/gis/copy_dir.c
> lib/gis/mapset_msc.c
> lib/gis/mapset_nme.c
> lib/gis/paths.c
> lib/gis/remove.c
> lib/gis/user_config.c
> lib/init/clean_temp.c
> lib/vector/Vlib/open.c
> lib/vector/dglib/examples/parse.c
> lib/vector/diglib/file.c
> raster/r.li/r.li.daemon/daemon.c
> vector/v.mapcalc/map.c

That was the first step, to get some feedback without actually changing anything. In theory, LFS for wingrass is now working as of r43115 in trunk. In practice, global LFS in trunk is working as before on Linux.

Some remarks:

Everything related to fonts: the stat'ed file might be opened by another program?

general/g.mkfontcap/freetype_fonts.c
display/d.font/main.c

Shouldn't the check here also check if path is a dir?

lib/gis/mapset_nme.c

Needs some cleaning up, unused functions (coor file is never loaded to memory).

lib/vector/diglib/file.c

There is a temporary check for LFS hidden in v.info, visible with DEBUG=1.

Markus M

in reply to:  3 ; comment:4 by hellik, 14 years ago

Replying to mmetz:

[...] In theory, LFS for wingrass is now working as of r43115 in trunk.

Is this now a working for WinVista-32/Win7-32 and/or WinVista-64/Win7-64? Or only for 64-bit versions of the Win operating systems?

And is there maybe a testcase for testing this?

So can I test now --enable-largefile on my WinVista32?

thanks for the work! best regards Helmut

in reply to:  4 ; comment:5 by mmetz, 14 years ago

Replying to hellik:

Replying to mmetz:

[...] In theory, LFS for wingrass is now working as of r43115 in trunk.

Is this now a working for WinVista-32/Win7-32 and/or WinVista-64/Win7-64? Or only for 64-bit versions of the Win operating systems?

It's supposed to be working on all MS systems, irrespective of whether it's 32bit or 64 bit.

And is there maybe a testcase for testing this?

No standard test case. But try e.g. r.terraflow or r.watershed -m with a DEM with more than 100 million cells.

So can I test now --enable-largefile on my WinVista32?

Now (r43115) enabled by default in mswindows/osgeo4w/package.sh

thanks for the work!

Please test!

Markus M

in reply to:  3 comment:6 by glynn, 14 years ago

Replying to mmetz:

Some remarks:

Everything related to fonts: the stat'ed file might be opened by another program?

general/g.mkfontcap/freetype_fonts.c
display/d.font/main.c

This only tests for existence and type. Ideally, we should have e.g. G_file_exists(), G_file_is_file() and G_file_is_directory() and use those (the last two need a flag to indicate whether to use stat() or lstat()). That would probably eliminate many of the stat() calls. Also, G_file_size(). It's better to keep as much of this in the library as possible.

Shouldn't the check here also check if path is a dir?

lib/gis/mapset_nme.c

It should check that WIND is a file, rather than checking that it isn't a directory.

in reply to:  5 ; comment:7 by hellik, 14 years ago

Replying to mmetz:

Please test!

Markus M

a quick compiling test with r43116

[...]
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
checking for _LARGE_FILES value needed for large files... no
checking for _LARGEFILE_SOURCE value needed for large files... no
checking for _LARGEFILE_SOURCE value needed for large files... no
checking for fseeko... no
checking if system supports Large Files at all... yes
[...]
GRASS is now configured for:  i686-pc-mingw32

  Source directory:            /c/osgeo4w/usr/src/grass_trunk
  Build directory:             /c/osgeo4w/usr/src/grass_trunk
  Installation directory:      ${prefix}/grass-7.0.svn
  Startup script in directory: ${exec_prefix}/bin
  C compiler:                  gcc -g -O2 
  C++ compiler:                c++ -g -O2
  Building shared libraries:   yes
  64bit support:               
  OpenGL platform:             Windows

  MacOSX application:         no
  MacOSX architectures:       
  MacOSX SDK:                 

  NVIZ:                       yes

  BLAS support:               no
  C++ support:                yes
  Cairo support:              no
  DWG support:                no
  FFMPEG support:             no
  FFTW support:               yes
  FreeType support:           yes
  GDAL support:               yes
  GEOS support:               no
  JPEG support:               yes
  LAPACK support:             no
  Large File support (LFS):   yes
  MySQL support:              no
  NLS support:                yes
  ODBC support:               yes
  OGR support:                yes
  OpenGL support:             yes
  PNG support:                yes
  PostgreSQL support:         yes
  Python support:             no
  Readline support:           no
  SQLite support:             yes
  Tcl/Tk support:             yes
  wxWidgets support:          no
  TIFF support:               yes
  X11 support:                no
  Regex support:              yes
  POSIX thread support:       no
-------------------------
Started compilation: Sat Aug 14 19:14:40 GMT 2010
--
Errors in:
/c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib
/c/osgeo4w/usr/src/grass_trunk/lib/python/ctypes
syringia@NADA /c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib
$ make
if [ "" != "" -a -f "".html ] ; then make html ; fi
==============TEST=============
make test
make[1]: Entering directory `/c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib'
diff OBJ.i686-pc-mingw32/test.tmp test64.ok
Files OBJ.i686-pc-mingw32/test.tmp and test64.ok differ
make[1]: *** [test] Error 2
make[1]: Leaving directory `/c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib'
make: *** [default] Error 2

Helmut

in reply to:  7 ; comment:8 by glynn, 14 years ago

Replying to hellik:

Please test!

a quick compiling test with r43116

syringia@NADA /c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib
$ make

Files OBJ.i686-pc-mingw32/test.tmp and test64.ok differ

Ah.

  1. The config.h.in changes are conditionalised upon
    #ifdef USE_LARGEFILES
    

which isn't defined. USE_LARGEFILES is a make variable (AC_SUBST), not a preprocessor macro (AC_DEFINE).

  1. The config.h.in changes are also conditionalised upon
    #if defined(__MINGW32__) && defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64
    

The configure checks will determine that _FILE_OFFSET_BITS don't have any effect (upon the test cases) on Windows, and so won't define it.

in reply to:  8 ; comment:9 by mmetz, 14 years ago

Replying to glynn:

a quick compiling test with r43116

> syringia@NADA /c/osgeo4w/usr/src/grass_trunk/lib/vector/diglib
> $ make
> 
> Files OBJ.i686-pc-mingw32/test.tmp and test64.ok differ

The diglib test is passed in r43119

  1. The config.h.in changes are conditionalised upon
> #ifdef USE_LARGEFILES

Hmph. We discussed that in May, I forgot, sorry! Fixed in r43119.

  1. The config.h.in changes are also conditionalised upon
    #if defined(__MINGW32__) && defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64
    

The configure checks will determine that _FILE_OFFSET_BITS don't have any effect (upon the test cases) on Windows, and so won't define it.

On Windows, the configure checks do define _FILE_OFFSET_BITS=64, at least something defines _FILE_OFFSET_BITS=64 on Windows, that's why the diglib test did not pass previously.

I had to add some more conditionalised defines to config.h.in because we redefine some internal types and functions. I could not get grass7 to compile when using old names, so I defined _NO_OLD_NAMES_, but pulled those in that grass needs. This way I did not have to modify the source code for compilation on Linux and Mac, everything is conditionalised for mingw32 in config.h.in.

Needs more testing, I can compile grass7, errors are ctypes, but I can't get it running for various reasons, a bit arbitrary. Grass7 can't start the shell using /C/OSGeo4w/apps/msys/bin/sh.exe although that file exists??? Sometimes grass7 can't find the .gislock file in the mapset I want to use. Mysterious. Configuration and compilation with cairo worked fine BTW.

Markus M

in reply to:  9 comment:10 by hellik, 14 years ago

Replying to mmetz:

Needs more testing, I can compile grass7, errors are ctypes, but I can't get it running for various reasons, a bit arbitrary. Grass7 can't start the shell using /C/OSGeo4w/apps/msys/bin/sh.exe although that file exists??? Sometimes grass7 can't find the .gislock file in the mapset I want to use. Mysterious. Configuration and compilation with cairo worked fine BTW.

Markus M

I've never managed running grass7 in C/OSGeo4w/apps/grass/bin or in c/osgeo4w/usr/src/grass_trunk/bin.i686-pc-mingw32 after changing the startup script from a shell to a python script.

so I build my own WinGrass7-installer and install wingrass7 in c/Program files/Grass-7.0.svn and from there WinGrass7 is starting.

Helmut

in reply to:  9 ; comment:11 by glynn, 14 years ago

Replying to mmetz:

On Windows, the configure checks do define _FILE_OFFSET_BITS=64, at least something defines _FILE_OFFSET_BITS=64 on Windows, that's why the diglib test did not pass previously.

The reason the diglib tests failed was that configure set USE_LARGEFILES to 1 in Platform.make, which causes lib/vector/diglib/Makefile to use the 64-bit test:

ifneq ($(USE_LARGEFILES),)
	TESTFILE = test64.ok
else
	TESTFILE = test32.ok
endif

Makefile substitutions and config.h macro definitions are completely separate.

in reply to:  11 comment:12 by mmetz, 14 years ago

Replying to glynn:

Makefile substitutions and config.h macro definitions are completely separate.

But should be in sync in this case, i.e. _FILE_OFFSET_BITS must be set according to USE_LARGEFILES.

There are issues with wingrass, with raster processing. Using nc_spm_08,

g.region -p rast=elev_state_500m@PERMANENT res=100
r.resample input=elev_state_500m@PERMANENT output=elev_state_100m method=bicubic

produces different results on Windows than on Linux. On Windows, the results for grass64 and grass7 with LFS are identical, but unfortunately different from Linux, where again results from grass64 and grass7 are identical. A difference map has (rounded) min = -7.73e-12 max = 5.91e-1, not much but not zero.

One of the ctypes errors, the one about float('inf'), is caused by missing python support for inf and nan as valid string literals on some platforms. This seems to be independent of the python version. Python doesn't really know inf and nan, e.g. 0.0/0.0 produces nan in C but an error in python. A workaround I found is replacing float('inf') with e.g. float(1e1000) which should become DBL_MAX. Mathematically this is not correct, but in this case (lib/python/ctypes/ctypesgencore/expressions.py) it might do the job?

The good news is (apart from the passed diglib test) that r.watershed in disk swap mode produced a >2GB temp file and completed successfully (only possible with LFS).

Markus M

comment:13 by mmetz, 14 years ago

Hmm. This avoiding old names with

#define	_NO_OLDNAMES

is causing more troubles than expected, too many old names are required. Trunk compiles on Windows without LFS-related errors, but there are now too many warnings about implicit declarations.

#define lseek lseek64

does not work with old names in use, compiler error because <grass/config.h> needs to be included before all others.

Alternatively, we could not redefine existing functions/types, but define functions/types for grass, e.g.

#define g_off_t off64_t
#define g_fseeko fseeko64
#define g_ftello ftello64
#define g_lseek lseek64
/* use _stati64 compatible with MSVCRT < 6.1 */
#define g_stat _stati64
#define g_fstat _fstati64

but then these new names must always be defined, also on other platforms, and the codebase needs to be modified, each occurrence of lseek would need to be replaced with the new name, same for all other new definitions (there is a workaround for off_t, that could maybe stay). This is too much modification for my taste. I would like to postpone LFS support for wingrass in trunk a bit.

Markus M

in reply to:  13 ; comment:14 by glynn, 14 years ago

Replying to mmetz:

#define lseek lseek64

does not work with old names in use, compiler error because <grass/config.h> needs to be included before all others.

Can you provide details?

Alternatively, we could not redefine existing functions/types, but define functions/types for grass, e.g.

#define g_off_t off64_t
#define g_fseeko fseeko64
#define g_ftello ftello64
#define g_lseek lseek64
/* use _stati64 compatible with MSVCRT < 6.1 */
#define g_stat _stati64
#define g_fstat _fstati64

If you take this route, these should be functions rather than macros (IMHO, g_off_t is a non-starter).

in reply to:  14 comment:15 by mmetz, 14 years ago

Replying to glynn:

Replying to mmetz:

#define lseek lseek64

does not work with old names in use, compiler error because <grass/config.h> needs to be included before all others.

Can you provide details?

go to lib/gis, make

In file included from c:/OSGeo4W/include/fcntl.h:20,
                 from copy_dir.c:24:
c:/OSGeo4W/include/io.h:299: error: conflicting types for 'lseek64'
c:/OSGeo4W/include/io.h:164: error: previous definition of 'lseek64' was here
c:/OSGeo4W/include/io.h:299: error: conflicting types for 'lseek64'
c:/OSGeo4W/include/io.h:164: error: previous definition of 'lseek64' was here

same for stat:

In file included from copy_dir.c:28:
c:/OSGeo4W/include/sys/stat.h:122: error: redefinition of `struct _stati64'
make: *** [OBJ.i686-pc-mingw32/copy_dir.o] Error 1

This error disappears if

#define stat _stati64

is changed and stat is given a unique name (I used _g_stat).

typedef off64_t _off_t;
typedef off64_t off_t;

(_off_t is also needed) gives

c:/OSGeo4W/usr/src/grass-7.0.svn/dist.i686-pc-mingw32/include/grass/config.h:298: error: syntax error before "_off_t"
c:/OSGeo4W/usr/src/grass-7.0.svn/dist.i686-pc-mingw32/include/grass/config.h:298: warning: type defaults to `int' in declaration of `_off_t'

because <sys/types.h> is not available, but if it would be available, there would be conflicts with off_t.

#define _OFF_T_
#define _off_t off64_t 
#define off_t off64_t 

works. A defined _OFF_T_ tells <sys/types.h> to not typedef off_t

Markus M

in reply to:  14 ; comment:16 by mmetz, 14 years ago

Replying to glynn:

Alternatively, we could not redefine existing functions/types, but define functions/types for grass, e.g.

#define g_off_t off64_t
#define g_fseeko fseeko64
#define g_ftello ftello64
#define g_lseek lseek64
/* use _stati64 compatible with MSVCRT < 6.1 */
#define g_stat _stati64
#define g_fstat _fstati64

If you take this route, these should be functions rather than macros (IMHO, g_off_t is a non-starter).

I think I understand. How about in config.h.in

/* use own off_t definition */
#define _OFF_T_
#define _off_t off64_t
#define off_t off64_t
/* fseeko and ftello are safe because not defined by MINGW */
#define HAVE_FSEEKO
#define fseeko fseeko64
#define ftello ftello64
/* lseek is not safe, defined in io.h */
#define HAVE_LSEEK64
/* use _stati64 compatible with MSVCRT < 6.1 */
/* stat and fstat are not safe, defined in stat.h */
#define HAVE__STATI64
#define HAVE__FSTATI64
#define _STRUCT_STAT_
typedef struct _stati64 STRUCT_STAT;

in lib/gis/seek.c new function G_lseek()

off_t G_lseek(int fd, off_t offset, int whence)
{
#ifdef HAVE_LSEEK64
    return lseek64(fd, offset, whence);
#else
    return lseek(fd, offset, whence);
#endif     
}

in lib/gis/paths.c

int G_stat(const char *file_name, STRUCT_STAT *buf)
{
#ifdef HAVE__STATI64
    return _stati64(file_name, buf);
#else
    return stat(file_name, buf);
#endif
}

int G_lstat(const char *file_name, STRUCT_STAT *buf)
{
#ifdef __MINGW32__
    return G_stat(file_name, buf);
#else
    return lstat(file_name, buf);
#endif
}

?

Then update other libs and modules accordingly...

Markus M

in reply to:  16 comment:17 by glynn, 14 years ago

Replying to mmetz:

I think I understand. How about in config.h.in

in lib/gis/seek.c new function G_lseek()

We don't need to go this far. We should be able to count on using off_t, lseek() etc without having to wrap them. The reasoning behind G_fseek() is different, as we don't want to completely break GRASS on systems which lack fseeko().

I suggest first checking whether it's sufficient to include <io.h> and <stdio.h> before defining the macros. Most files will end up including one or both of those anyhow, as <grass/gis.h> includes <stdio.h>, and <unistd.h> includes <io.h>.

comment:18 by mmetz, 11 years ago

[Extending LFS support to other platforms]

Apparently there is a limited number of switches needed to get LFS. 64 bit off_t seems to be activated with either -n32 (IRIX), -D_FILE_OFFSET_BITS=64 (most *NIX systems), or -D_LARGE_FILES (AIX-style hosts). Additionally, -D_LARGEFILE_SOURCE might be needed to expose the 64 bit versions of fseeko etc.

The current LFS test in configure tests for all these switches, but the result is forgotten, thus the crude hack in Grass.make for global LFS.

I have modified the configure tests such that any switches needed for LFS are stored as LFS_CFLAGS. Tests show that on Linux 64 bit, nothing is needed. On Solaris 32 bit, -D_FILE_OFFSET_BITS=64 is needed and -D_LARGEFILE_SOURCE recommended. On FreeBSD 64 bit, nothing is needed.

in reply to:  18 ; comment:19 by glynn, 11 years ago

Replying to mmetz:

I have modified the configure tests such that any switches needed for LFS are stored as LFS_CFLAGS. Tests show that on Linux 64 bit, nothing is needed.

That's probably correct; fseeko and ftello are guarded by:

#if defined __USE_LARGEFILE || defined __USE_XOPEN2K

__USE_LARGEFILE is implied by _XOPEN_SOURCE >= 500, while __USE_XOPEN2K is implied by _POSIX_C_SOURCE >= 200112. If you don't have one of those two, you'll probably have other issues besides fseeko/ftello (e.g. M_PI, SA_RESTART).

Aside: if you're testing for compatibility, start by adding -ansi or -std=c89 to CFLAGS, then add the minimum set of feature macros. The default is -std=gnu89, which enables practically all of the extensions (POSIX, X/Open, SUS, SVID, BSD, GNU).

About the only feature which isn't enabled by default is _BSD_SOURCE, which not only enables BSD extensions but favours them over the SysV versions (e.g. both BSD and SysV define getpgrp() and setpgrp(), but the BSD versions take arguments while the SysV versions don't).

in reply to:  19 ; comment:20 by mmetz, 11 years ago

Replying to glynn:

Replying to mmetz:

I have modified the configure tests such that any switches needed for LFS are stored as LFS_CFLAGS. Tests show that on Linux 64 bit, nothing is needed.

That's probably correct; fseeko and ftello are guarded by:

#if defined __USE_LARGEFILE || defined __USE_XOPEN2K

__USE_LARGEFILE is implied by _XOPEN_SOURCE >= 500, while __USE_XOPEN2K is implied by _POSIX_C_SOURCE >= 200112. If you don't have one of those two, you'll probably have other issues besides fseeko/ftello (e.g. M_PI, SA_RESTART).

From the autoconf comments, same in our aclocal.m4: "We used to try defining _XOPEN_SOURCE=500 too, to work around a bug in glibc 2.1.3, but that breaks too many other things. If you want fseeko and ftello with glibc, upgrade to a fixed glibc."

Aside: if you're testing for compatibility, start by adding -ansi or -std=c89 to CFLAGS, then add the minimum set of feature macros. The default is -std=gnu89, which enables practically all of the extensions (POSIX, X/Open, SUS, SVID, BSD, GNU).

About the only feature which isn't enabled by default is _BSD_SOURCE, which not only enables BSD extensions but favours them over the SysV versions (e.g. both BSD and SysV define getpgrp() and setpgrp(), but the BSD versions take arguments while the SysV versions don't).

So far everything is working fine for me on the systems available to me (Redhat-based Linux 32 and 64 bit, Solaris 10, 11 32 bit, *BSD 32 and 64 bit), therefore I would not add new features unless they are required. For example, if you define _BSD_SOURCE, "you must give the option ‘-lbsd-compat’ to the compiler or linker when linking the program, to tell it to find functions in this special compatibility library before looking for them in the normal C library", says the glibc documentation.

AIX is a problem, also with regard to 64 bit AIX, it insists on using a 32 bit off_t and corresponding functions. Setting -D_LARGE_FILES is apparently not enough, even though the official documentation says it should suffice. I need to analyse the errors in more detail before I can give useful information on the particulars for LFS on AIX.

in reply to:  20 comment:21 by glynn, 11 years ago

Replying to mmetz:

therefore I would not add new features unless they are required.

Note that I wasn't suggesting changes to the configure checks per se. Rather, that if you want to test the checks, configure should be run with -ansi in CFLAGS so that tests don't pass simply by virtue of extensions having being enabled by default.

comment:22 by mmetz, 11 years ago

Resolution: fixed
Status: newclosed

Closing as fixed.

Note: See TracTickets for help on using tickets.