Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#2036 closed defect (fixed)

Failed watershed analysis on Grass

Reported by: mehmeto
Owned by: grass-dev@…
Priority: normal
Milestone: 6.4.4
Component: Raster
Version: 6.4.2
Keywords: LFS, r.watershed
Cc:
CPU: x86-64
Platform: MSWindows 7

Description

This is my first day with Grass GIS (version 6.4.2 for Windows). As I tried this tutorial the watershed analysis failed.

I carefully put in the parameters as specified on the tutorial. The error message is:

(Thu Jul 18 14:10:00 2013)                                                      
r.watershed elevation=elevation@PERMANENT basin=elev.basins stream=elev.streams threshold=10000
SECTION 1a (of 5): Initiating Memory.
SECTION 1b (of 5): Determining Offmap Flow.
SECTION 2: A * Search.
SECTION 3: Accumulating Surface Flow with SFD.
ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91
Subprocess failed with exit code 1
category information for [elev.basins] in [user1] missing or invalid
category information for [elev.streams] in [user1] missing or invalid

I will be glad to receive some help with this as my aim is to do hydrological analysis with Grass. Please consider in your reply that I am a novice with this program and I am not a technical person.

Change History (21)

in reply to:  description ; comment:1 by hellik, 11 years ago

Replying to mehmeto:

This is my first day with Grass GIS (version 6.4.2 for Windows).

would it be possible to test it with the latest 6.4.3RC4?

http://grass.osgeo.org/grass64/binary/mswindows/native/ or http://trac.osgeo.org/osgeo4w/wiki/pkg-grass

(Thu Jul 18 14:10:00 2013)                                                      
r.watershed elevation=elevation@PERMANENT basin=elev.basins stream=elev.streams threshold=10000

does a smaller threshold work?

SECTION 1a (of 5): Initiating Memory.
SECTION 1b (of 5): Determining Offmap Flow.
SECTION 2: A * Search.
SECTION 3: Accumulating Surface Flow with SFD.
ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91

what does g.region -p say?

in reply to:  1 comment:2 by mehmeto, 11 years ago

Replying to hellik:

Thanks for your reply, hellik.

Replying to mehmeto:

This is my first day with Grass GIS (version 6.4.2 for Windows).

would it be possible to test it with the latest 6.4.3RC4?

Done. Got the same error message.

http://grass.osgeo.org/grass64/binary/mswindows/native/ or http://trac.osgeo.org/osgeo4w/wiki/pkg-grass

(Thu Jul 18 14:10:00 2013)                                                      
r.watershed elevation=elevation@PERMANENT basin=elev.basins stream=elev.streams threshold=10000

does a smaller threshold work?

A threshold as small as 100 yielded the same message. Grass Wiki tutorial suggested 10,000 as a threshold, anyway.

SECTION 1a (of 5): Initiating Memory.
SECTION 1b (of 5): Determining Offmap Flow.
SECTION 2: A * Search.
SECTION 3: Accumulating Surface Flow with SFD.
ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91

what does g.region -p say?

Sorry but being a beginner with this program I do not understand what you mean by this, could you please be more specific?

-Mehmeto

comment:3 by mehmeto, 11 years ago

Further to my last message, I realized that there is a command console, and typing g.region -p there (I guess that is what you meant :)) yielded the following message:

(Thu Jul 18 17:48:17 2013)                                                      
g.region -p                                                                     
projection: 99 (Lambert Conformal Conic)
zone:       0
datum:      nad83
ellipsoid:  a=6378137 es=0.006694380022900787
north:      258500
south:      185000
west:       596670
east:       678330
nsres:      10
ewres:      10
rows:       7350
cols:       8166
cells:      60020100
(Thu Jul 18 17:48:17 2013) Command finished (0 sec)    

Is that relevant?

-Mehmeto
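The row, column, and cell counts in the g.region output above follow directly from the region bounds and resolution. A minimal Python sketch of that arithmetic, using the values printed in the output; the function name `region_dimensions` is made up for illustration and is not a GRASS API:

```python
def region_dimensions(north, south, east, west, nsres, ewres):
    """Return (rows, cols, cells) for a region, as g.region -p reports them."""
    rows = round((north - south) / nsres)
    cols = round((east - west) / ewres)
    return rows, cols, rows * cols


# Values copied from the g.region -p output in this ticket.
rows, cols, cells = region_dimensions(
    north=258500, south=185000, east=678330, west=596670, nsres=10, ewres=10
)
print(rows, cols, cells)  # 7350 8166 60020100
```

A single array with 4 bytes per cell over this region is therefore roughly 240 MB, which is the same order as the allocation that failed.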

in reply to:  description ; comment:4 by neteler, 11 years ago

Keywords: LFS added; grass removed

Replying to mehmeto:

This is my first day with Grass GIS (version 6.4.2 for Windows).

...

ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91
Subprocess failed with exit code 1

If you calculate 4 * 60175190 bytes = 240700760 which is > 2^31. It seems that you have hit the 2GB barrier on 32bit, which means that either Large File Support (LFS) is not enabled in your copy of winGRASS or that r.watershed lacks LFS support on GRASS 6, which I don't believe.

(maybe related to ticket #1903)

in reply to:  4 comment:5 by mehmeto, 11 years ago

Replying to neteler:

If you calculate 4 * 60175190 bytes = 240700760 which is > 2^31. It seems that you have hit the 2GB barrier on 32bit, which means that either Large File Support (LFS) is not enabled in your copy of winGRASS or that r.watershed lacks LFS support on GRASS 6, which I don't believe. (maybe related to ticket #1903)

With all due respect, I cannot understand why this may be the problem. I am working with a standard demo file (North Carolina) that comes with the installation package, and I followed the instructions in a tutorial for beginners located in the Grass Wiki. The layer I work on looks to be a rather small one, as suggested by the tutorial. I downloaded the latest version of Grass today (v. 6.4.3RC4), it seems they enable LFS on all new installations by default, and I work with a 64-bit Windows machine. Would it not be strange to suggest such a problematic task in a beginners' tutorial?

I looked for a way to enable LFS in case it was disabled, but the Large File Support wiki article sounds rather technical and I failed to find a procedure elsewhere. I also tried to do the same on a different computer (again Windows 7) and got the same error message.

-Mehmeto

comment:6 by mehmeto, 11 years ago

Resolution: fixed
Status: new → closed

Tried to do the same analysis with Spearfish data that also is in the installation package and it worked. It should be a memory problem indeed. (although Spearfish map looks larger) Thanks for your replies.

comment:7 by hamish, 11 years ago

this isn't really that large of a region, you usually hit 2 GB files at ~ 45000x45000 rows x cols. this is much smaller,

rows:       7350
cols:       8166

ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91

(devs: should we strip the wingrass binaries in the stable release builds? or keep the line number code there for better debugging?)

If you calculate 4 * 60175190 bytes = 240700760 which is > 2^31.

missed a 0, actually it's 9 times smaller than 2^31. It only wants to allocate 241mb RAM, so probably not a large-file problem. How much memory does your computer have?
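The comparison can be checked in a few lines. This Python sketch only restates the numbers from the error message; the variable names are illustrative:

```python
requested = 4 * 60175190   # bytes requested by the failed G_calloc call
limit = 2 ** 31            # the signed 32-bit boundary (2 GiB)

print(requested)                     # 240700760
print(limit)                         # 2147483648
print(requested < limit)             # True
print(round(limit / requested, 1))   # 8.9
```

So the single request is about 241 MB, roughly nine times below 2^31 bytes.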

for what it's worth, some LFS notes: In GRASS 6.4 it is enabled module by module for those that need it. We know the list of modules which might need it, and intend to fully audit that for 6.4.4. WinGrass is built using the OSGeo4W stack and the MSys software, both of which are currently 32bit with 64bit versions in current development. Once we have those 64 bit build platforms in place a 64 bit version of WinGrass will be available as it is on MacOS and Linux already.

Hamish

in reply to:  6 comment:8 by hamish, 11 years ago

Replying to mehmeto:

Tried to do the same analysis with Spearfish data that also is in the installation package and it worked. It should be a memory problem indeed. (although Spearfish map looks larger)

Hi,

I think there was a missing step at the start where you need to right click on the elevation map name in the layer manager and choose "Set computational region from selected map(s)".
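The same step can be done from the command console. The Python sketch below only builds and prints the command line; it assumes the GRASS 6 spelling of the g.region option (`rast=`), and the helper name `set_region_command` is hypothetical:

```python
import shlex


def set_region_command(raster_map, print_region=True):
    """Build a g.region call that matches the computational region to a raster map."""
    cmd = ["g.region", "rast=" + raster_map]  # assumption: GRASS 6 option name
    if print_region:
        cmd.append("-p")  # print the resulting region, as used earlier in this ticket
    return cmd


cmd = set_region_command("elevation@PERMANENT")
print(shlex.join(cmd))
```

Running the printed command before r.watershed should have the same effect as the right-click menu entry.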

I've done a number of updates to that tutorial, but I still need to merge them into the GRASS wiki,

https://trac.osgeo.org/osgeo/changeset?new=10348%40livedvd%2Fgisvm%2Ftrunk%2Fdoc%2Fen%2Fquickstart%2Fgrass_quickstart.rst&old=9076%40livedvd%2Fgisvm%2Ftrunk%2Fdoc%2Fen%2Fquickstart%2Fgrass_quickstart.rst

Note I had to go full-Spearfish for the live disc tutorial (http://live.osgeo.org) since we didn't have enough room for both the NC grass location and the semi-cloned NC shapefile/geotiff shared dataset. In practice my idea to make the tutorial generic for both sample datasets didn't work well anyway. Also the watershed threshold size given for Spearfish should be different for NC. Still todo for the live disc is a script to re-create the GRASS NC location by importing those source files, but I think first to remake the geotiffs for Helena with the newer r.out.gdal magic for correct data types and better metadata. We are also working on a "big data" directory for people with the live disc booting from e.g. 8gb USB sticks, with an extra directory in the top FAT32 partition with extra data, so could store nc_08.tgz there. (rc.local will decide at boot time if to symlink to that data (if it is present) or place an index.html in that dir with links to download from online)

Hamish

in reply to:  7 ; comment:9 by glynn, 11 years ago

Replying to hamish:

ERROR: G_calloc: unable to allocate 4 * 60175190 bytes at main.c:91

(devs: should we strip the wingrass binaries in the stable release builds? or keep the line number code there for better debugging?)

Stripping just removes debug symbols; it won't affect the error message, which relies upon the G_calloc() etc macros passing __FILE__ and __LINE__ to the underlying function.

If you calculate 4 * 60175190 bytes = 240700760 which is > 2^31.

missed a 0, actually it's 9 times smaller than 2^31. It only wants to allocate 241mb RAM

That's the allocation that fails. But main() allocates 2 such arrays, and before that it calls init_vars(), which allocates 2 or 3 CELL arrays and between 1 and 4 DCELL arrays. So the actual requirements are N*241 MB where N is between 6 and 13. The upper bound comes out at over 3 GB.

And that's only the allocations which use size_array().
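A rough Python sketch of how the range N = 6 to 13 can arise. It assumes that a CELL array costs one 241 MB unit (4 bytes per cell) and a DCELL array costs two (8 bytes per cell); that accounting is a reading of the comment above, not something taken from the r.watershed source, and the function name is illustrative:

```python
CELL_BYTES = 4    # GRASS CELL is a 32-bit integer
DCELL_BYTES = 8   # GRASS DCELL is a double

UNIT = 4 * 60175190  # bytes of the allocation that failed (one 4-byte-per-cell array)


def estimate(cell_arrays, dcell_arrays, main_arrays=2):
    """Return (N, total bytes) for the arrays sized via size_array()."""
    n_units = main_arrays + cell_arrays + dcell_arrays * (DCELL_BYTES // CELL_BYTES)
    return n_units, n_units * UNIT


low = estimate(cell_arrays=2, dcell_arrays=1)
high = estimate(cell_arrays=3, dcell_arrays=4)
print(low)   # (6, 1444204560)
print(high)  # (13, 3129109880)
```

Under these assumptions the requirement is between about 1.4 GB and 3.1 GB for this region.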

in reply to:  9 ; comment:10 by hamish, 11 years ago

Replying to glynn:

Replying to hamish:

(devs: should we strip the wingrass binaries in the stable release builds? or keep the line number code there for better debugging?)

Stripping just removes debug symbols; it won't affect the error message, which relies upon the G_calloc() etc macros passing __FILE__ and __LINE__ to the underlying function.

ok, I mostly meant if it was desired to make the binaries a bit smaller or not since it looks like mswindows/osgeo4w/package.sh is not stripping them.

Hamish

in reply to:  6 ; comment:11 by mmetz, 11 years ago

Replying to mehmeto:

Tried to do the same analysis with Spearfish data that also is in the installation package and it worked. It should be a memory problem indeed. (although Spearfish map looks larger) Thanks for your replies.

You need to run r.watershed in disk swap mode with the -m flag. Also be aware that even for smaller regions, the size of the temporary files can be quite large, and LFS for Windows is only available in GRASS 7.
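For illustration, the original command with the -m flag added. The Python sketch only assembles the command line; the option names are those from the command in the ticket description, and the helper name `watershed_command` is hypothetical:

```python
import shlex


def watershed_command(elevation, basin, stream, threshold, disk_swap=False):
    """Build an r.watershed call, optionally in disk swap mode (-m)."""
    cmd = ["r.watershed"]
    if disk_swap:
        cmd.append("-m")  # disk swap mode, as suggested in the comment above
    cmd += [
        "elevation=" + elevation,
        "basin=" + basin,
        "stream=" + stream,
        "threshold=" + str(threshold),
    ]
    return cmd


swap_cmd = watershed_command(
    "elevation@PERMANENT", "elev.basins", "elev.streams", 10000, disk_swap=True
)
print(shlex.join(swap_cmd))
```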

in reply to:  10 comment:12 by glynn, 11 years ago

Replying to hamish:

ok, I mostly meant if it was desired to make the binaries a bit smaller or not since it looks like mswindows/osgeo4w/package.sh is not stripping them.

Distributed binaries ought to be stripped (or compiled without -g in the first place), regardless of platform. If you don't specify CFLAGS when running configure, it defaults to "-g -O2".

"objdump -h <filename>" will confirm whether the binaries contain debug information (if there are sections whose names start with ".debug_", the file isn't stripped).

in reply to:  7 comment:13 by mehmeto, 11 years ago

Replying to hamish:

this isn't really that large of a region, you usually hit 2 GB files at ~ 45000x45000 rows x cols. this is much smaller,

If you calculate 4 * 60175190 bytes = 240700760 which is > 2^31.

missed a 0, actually it's 9 times smaller than 2^31. It only wants to allocate 241mb RAM, so probably not a large-file problem. How much memory does your computer have?

It has 8GB.

Also thank you for updating the tutorial.

Now I can do watershed analysis at least on a demo map but my real data will be larger. I will need to work on an area of about 5,000 square km. and the map scale will be about 1:25,000 or even lower. I will also need to transform contours to DEM before start using r.watershed. Since I will work on a PC, memory problems seem inevitable.

I know one suggested method to overcome memory limitations is dividing a large map into smaller pieces, but in a watershed analysis that will mean false or missing results. Or maybe I will need to divide the map into overlapping sections to make sure no basins are missed and then manually stitch the sections together, which sounds like a very laborious procedure to me now. Do you have any ideas how to do that, or know any forums where I can ask?

in reply to:  11 ; comment:14 by mehmeto, 11 years ago

Replying to mmetz:

You need to run r.watershed in disk swap mode with the -m flag. Also be aware that even for smaller regions, the size of the temporary files can be quite large, and LFS for Windows is only available in GRASS 7.

Thank you. So, I understand that the current stable version of Grass for Windows is not a convenient program for doing such an analysis on a large map. Should I either wait for the Grass 7 release, find a Linux / Mac machine, or use another GIS?

in reply to:  7 comment:15 by mmetz, 11 years ago

Replying to hamish:

It only wants to allocate 241mb RAM, so probably not a large-file problem. How much memory does your computer have?

If your machine has, say 8 GB, and 8 GB are already allocated, trying to allocate further 241 MB will fail with an out of memory error.

for what it's worth, some LFS notes:

LFS for GRASS on Windows is only available in trunk, and there it is enabled by default. By default, Windows, also a 64 bit Windows, does not have LFS when using the standard API, you need to use the LFS API explicitly.

in reply to:  14 ; comment:16 by mmetz, 11 years ago

Replying to mehmeto:

Replying to mmetz:

You need to run r.watershed in disk swap mode with the -m flag. Also be aware that even for smaller regions, the size of the temporary files can be quite large, and LFS for Windows is only available in GRASS 7.

Thank you. So, I understand that the current stable version of Grass for Windows is not a convenient program for doing such an analysis on a large map. Should I either wait for the Grass 7 release, find a Linux / Mac machine, or use another GIS?

For Windows, you can use the GRASS 7 installer available at

http://wingrass.fsv.cvut.cz/grass70/

Most GRASS 7 modules are at least as robust as in GRASS 6, and r.watershed in GRASS 7 is suitable for production work. In GRASS 6.4.2, r.watershed is IMHO not suitable for production work.

in reply to:  16 ; comment:17 by hamish, 11 years ago

Replying to mmetz:

Most GRASS 7 modules are at least as robust as in GRASS 6, and r.watershed in GRASS 7 is suitable for production work.

Although I'd generally caution anyone using any new/development code for production work that there are no guarantees, and the usual "bleeding edge" situation applies. The choice is between old, well tested, and trusted versus new & improved (faster, better, stronger) but lightly tested with a short track record. The choice is yours..

In GRASS 6.4.2, r.watershed is IMHO not suitable for production work.

How about if you use the "-f" MDF flag there? Or do you mean more than that? (I take it you mean a methodological improvement, not bug fixes)

By default, Windows, also a 64 bit Windows, does not have LFS when using the standard API, you need to use the LFS API explicitly.

But AFAIU it can be explicitly enabled for libgis, so the programmer only needs to worry about it if the module uses ftell() and fseek() type operations. (and many of the modules already do)

Is there a reason that --enable-largefile is not included in relbr_6_4/mswindows/osgeo4w/package.sh's ./configure?

mehmeto wrote:

So, I understand that the current stable version of Grass for Windows is not a convenient program for doing such an analysis on a large map.

use the r.watershed "-m" flag with GRASS 6, then it might be ok. (just take a bit longer)

But getting back to the original bug report: if I understand correctly, this happened with the North Carolina sample dataset's 'elevation' raster map due to a missing step in the quick-start tutorial: right click on the map name to set the computational region bounds and resolution to match the map before running a processing module. That map's region is much smaller than the roughly 7000 rows and 8000 cols reported, so it should run ok without any memory issues in 32bit WinGrass.

regards, Hamish

in reply to:  17 comment:18 by hamish, 11 years ago

Replying to hamish:

Is there a reason that --enable-largefile is not included in relbr_6_4/mswindows/osgeo4w/package.sh's ./configure?

(fwiw it is recently enabled in 6.5 for testing)

Hamish

in reply to:  17 comment:19 by mmetz, 11 years ago

Replying to hamish:

Replying to mmetz:

In GRASS 6.4.2, r.watershed is IMHO not suitable for production work.

How about if you use the "-f" MDF flag there? Or do you mean more than that? (I take it you mean a methodological improvement, not bug fixes)

I mean bug fixes. Some bugs have been fixed only in 6.4.3 and later. Also, r.watershed in G7 uses less memory and is faster.

By default, Windows, also a 64 bit Windows, does not have LFS when using the standard API, you need to use the LFS API explicitly.

But AFAIU it can be explicitly enabled for libgis, so the programmer only needs to worry about it if the module uses ftell() and fseek() type operations. (and many of the modules already do)

It's not so easy for Windows. You need to use the LFS API explicitly, i.e. off64_t, fseeko64(), ftello64(), lseek64(), _stati64, etc.
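Why a 64-bit offset type matters can be illustrated without any GRASS code. The Python sketch below uses `struct` to test whether a file offset fits into a signed 32-bit or a signed 64-bit integer; mapping these to a non-LFS and an LFS offset type is an assumption made for illustration only:

```python
import struct


def fits_in_offset(offset, fmt):
    """Return True if offset can be stored in the given struct integer format."""
    try:
        struct.pack(fmt, offset)
        return True
    except struct.error:
        return False


TWO_GIB = 2 ** 31

print(fits_in_offset(TWO_GIB - 1, "<i"))  # True: last position a signed 32-bit offset can hold
print(fits_in_offset(TWO_GIB, "<i"))      # False: one byte further no longer fits
print(fits_in_offset(TWO_GIB, "<q"))      # True: a signed 64-bit offset has room
```

Functions that take or return a 32-bit offset cannot seek past that boundary, which is why the 64-bit variants have to be called explicitly.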

Is there a reason that --enable-largefile is not included in relbr_6_4/mswindows/osgeo4w/package.sh's ./configure?

Yes, because --enable-largefile has no effect for wingrass 6.x. LFS on Windows is only available for G7, not for G6, independent of any configuration options.

mehmeto wrote:

So, I understand that the current stable version of Grass for Windows is not a convenient program for doing such an analysis on a large map.

use the r.watershed "-m" flag with GRASS 6, then it might be ok. (just take a bit longer)

In this case, it is not a matter of processing time, but of whether the module finishes at all. For trunk, the -m flag could become the default, in order to avoid tickets like this.

comment:20 by hamish, 11 years ago

Hi,

I tried in 6.4.3svn on Windows7 with Spearfish's elevation.10m with g.region at 2m, with r.watershed run using the '-m' flag and 4 output maps. It seemed to work fine that way. (region size ~ 7000x9500)

mmetz:

It's not so easy for Windows. You need to use the LFS API explicitly, i.e. off64_t, fseeko64(), ftello64(), lseek64(), _stati64, etc.

so more #ifdefs are needed in G_ftell() and G_fseek(), then more modules in g6 need to use those functions? (right now only r.in.bin does) If so it doesn't seem so hard. As a future maintenance goal, full 64bit file support on Windows for 6.4.x seems to me a rather important thing to work towards. (From both the grass code, msys, and osgeo4w ends)

For trunk, the -m flag could become the default, in order to avoid tickets like this.

mmph, I'm not a fan of that, rather just document the memory issues in the man page and have it go fast in the typical cases. (how much RAM will the average computer have when g7 is middle aged?)

Hamish

in reply to:  20 comment:21 by mmetz, 11 years ago

Replying to hamish:

mmetz:

It's not so easy for Windows. You need to use the LFS API explicitly, i.e. off64_t, fseeko64(), ftello64(), lseek64(), _stati64, etc.

so more #ifdefs are needed in G_ftell() and G_fseek(), then more modules in g6 need to use those functions? (right now only r.in.bin does) If so it doesn't seem so hard.

In g6, G_ftell() and G_fseek() do not have LFS on Windows because they use fseeko and ftello, not fseeko64 and ftello64. Besides, it does not make sense to enable LFS at module level if the libraries do not have LFS (which they do not have on wingrass 6). This is why LFS is globally enabled/disabled in g7 by configure, which in turn creates config.h and Platform.make. Libraries and modules no longer need any additional switches.

As a future maintenance goal, full 64bit file support on Windows for 6.4.x seems to me a rather important thing to work towards. (From both the grass code, msys, and osgeo4w ends)

Considering that ticket #1131 (Global LFS for wingrass) was opened 3 years ago and closed 10 days ago, which triggered you to ask

"it depends: is LFS working in MS Windows? i.e. has it been tested with the latest trunk code and passed? We shouldn't close tickets on assumptions."

I suggest to 1) release the GRASS GIS 7 tech-preview asap, 2) leave wingrass LFS for g7. LFS is not so easy for Windows.

Interesting that for a change you ask for low-level modification of g 6.4.x which I regard as too invasive.

For trunk, the -m flag could become the default, in order to avoid tickets like this.

mmph, I'm not a fan of that, rather just document the memory issues in the man page and have it go fast in the typical cases.

I prefer modules to finish successfully, even if it takes some time, instead of first freezing the machine by going into swap, then failing with an out-of-memory error.

(how much RAM will the average computer have when g7 is middle aged?)

Replacing "g7" with "currently cutting edge software under development", this is an easy question. The answer has been the same for the last decades and will be the same for the foreseeable future: not enough, even for your average HPC. This is the reason for the design of the GRASS raster library since its inception and the reason for the existence of the segment and rowio libraries.

I assume that the data available will continue to prevent processing everything in RAM. For example, the SRTM data were collected in February 2000, and as of today only a few hardware + software combos exist that can process the whole SRTM dataset.

I guess the reason why an all-in-RAM r.watershed version exists at all is that the all-in-RAM version was really slow and the disk-swap version was even slower. This has changed: the current disk-swap version is orders of magnitude faster than the pre g-6.4 all-in-RAM version.

Anyway, as long as wingrass is compiled as a 32 bit application, it can only access the amount of RAM available to a 32 bit application with correspondingly sized pointers. Another reason to use the low-memory module versions by default and release the g7 tech preview asap.
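To put a number on the pointer-size limit, here is a small Python sketch. The function name is illustrative, and the estimate of 13 arrays of 4 * 60175190 bytes is the upper bound quoted earlier in this ticket; how much of the address space a process can actually use depends on the operating system and is typically less than the theoretical figure:

```python
GIB = 2 ** 30


def addressable_bytes(pointer_bits):
    """Theoretical number of bytes addressable with pointers of the given width."""
    return 2 ** pointer_bits


upper_estimate = 13 * 4 * 60175190  # upper bound from earlier in this ticket (N = 13)

print(addressable_bytes(32) // GIB)                         # 4
print(round(upper_estimate / addressable_bytes(32), 2))     # 0.73
```

The worst-case estimate alone would take up roughly three quarters of the theoretical 32-bit address space, before counting anything else the process needs.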
