Opened 15 years ago

Closed 14 years ago

#788 closed defect (fixed)

r.out.gdal and nodata

Reported by: frankie Owned by: grass-dev@…
Priority: normal Milestone: 6.4.1
Component: Raster Version: 6.4.0 RCs
Keywords: r.out.gdal, nodata Cc:
CPU: Unspecified Platform: Linux

Description

The new version of r.out.gdal shows a regression in exporting to float type. In 6.2 its default for nodata value was NaN, now it is an explicit _valid_ floating point (min float?). This is not acceptable for instance in ENVI output format, where the correct value is NaN, to avoid trashing lookups and statistics in external programs.

Attachments (1)

r.out.gdal.diff (46.4 KB ) - added by neteler 14 years ago.
nodata logic update patch from GRASS 6.5 (by Markus Metz)

Download all attachments as: .zip

Change History (14)

comment:1 by martinl, 15 years ago

Component: defaultRaster
Keywords: r.in.gdal nodata added

comment:2 by hamish, 14 years ago

This is in part a duplicate of bug #73, or at least a continuation of it. Please read through that for background. As that bug is long and things have evolved from then I don't mind starting afresh.

Frankie:

The new version of r.out.gdal shows a regression in exporting to float type. In 6.2 its default for nodata value was NaN, now it is an explicit _valid_ floating point (min float?). This is not acceptable for instance in ENVI output format, where the correct value is NaN, to avoid trashing lookups and statistics in external programs.

Using 6.5svn with elevation.10m DCELL map from the Spearfish dataset the default nodata value is nan (type defaults to Float64), and also I can set nodata=nan on the command line and it works. fwiw nodata=inf does not work.

in 6.4 it defaults to -1e+37 but setting nodata=nan explicitly on the command line works as expected. (I'm using gdalinfo on the output geotiff to confirm).

so this one is already "fixed", but waiting-for-backport from the 6.5 branch.

as we should try to not change default behaviour during a stable branch maybe we should aim to get this change into the 6.4 branch before 6.4.0 ... ?

one thing before that happens:


I notice another problem. (latest 6.5svn) If I export a palleted raster (with cats 0-7 + NULL) as type=Byte

G65> r.out.gdal in=map out=map_LL_WGS84.tif type=Byte creat="COMPRESS=DEFLATE,INTERLEAVE=BAND"

Exporting to GDAL data type: Byte
 100%
Input raster map contains cells with NULL-value (no-data). The value 0 was
used to represent no-data values in the input map. You can specify a nodata
value with the nodata option.
WARNING: The default nodata value is present in rasterband <map> and would
         lead to data loss. Please specify a different nodata value with
         the nodata parameter.
ERROR: Raster export aborted.

that's all fine, but it has left an incomplete map in the filesystem. The data is there (or at least some of it) but the georef & metadata is not. Just before that "ERROR: Raster export aborted." it should unlink() the output file. It doesn't know until after it starts to write if the default nodata value will also be present in the map, so aborting and deleting the file has to come after the fact. I assume any createopt="TFW=YES" files etc get written after, so we don't have to worry about deleting them too. (??)

Hamish

comment:3 by mmetz, 14 years ago

Keywords: r.out.gdal added; r.in.gdal removed

Replying to hamish:

[...]

in 6.4 it defaults to -1e+37 but setting nodata=nan explicitly on the command line works as expected. (I'm using gdalinfo on the output geotiff to confirm).

so this one is already "fixed", but waiting-for-backport from the 6.5 branch.

the r.out.gdal version in 6.5 is not the proper fix, r.out.gdal in 7.0 is currently the safest version for raster GDAL export, waiting for backport to 6.5, then to 6.4 if it passed all testing.


I notice another problem. (latest 6.5svn) If I export a palleted raster (with cats 0-7 + NULL) as type=Byte

> G65> r.out.gdal in=map out=map_LL_WGS84.tif type=Byte creat="COMPRESS=DEFLATE,INTERLEAVE=BAND"
> 
> Exporting to GDAL data type: Byte
>  100%
> Input raster map contains cells with NULL-value (no-data). The value 0 was
> used to represent no-data values in the input map. You can specify a nodata
> value with the nodata option.
> WARNING: The default nodata value is present in rasterband <map> and would
>          lead to data loss. Please specify a different nodata value with
>          the nodata parameter.
> ERROR: Raster export aborted.

that's all fine, but it has left an incomplete map in the filesystem. The data is there (or at least some of it) but the georef & metadata is not. Just before that "ERROR: Raster export aborted." it should unlink() the output file.

The output map is not always in a single file, may also be in a database (Oracle Spatial, Rasterlite in SQLite). How to handle e.g. Erdas Imagine .img with external spill files?

It doesn't know until after it starts to write if the default nodata value will also be present in the map, so aborting and deleting the file has to come after the fact. I assume any createopt="TFW=YES" files etc get written after, so we don't have to worry about deleting them too. (??)

As I see it, there are three possibilities if the nodata value is present in the rasterband to be exported:

  1. Current behaviour, abort with error message, leaves a broken raster.
  2. Continue with warning, leaves corrupted raster.
  3. Introduce a first pass to check if the nodata value is present in the rasterband to be exported, abort with error if necessary before actual export. No output is written, but it will double the time needed for export.

Anyway, the above example would successfully export in 7.0, in this case 255 would be selected as default nodata value because 0 is present in the rasterband.

What parts of the 7.0 version of r.out.gdal qualify as bugfixes and what parts qualify as changed default behaviour compared to 6.4?

BTW, the INTERLEAVE option has no effect if only one raster band is exported.

Markus M

in reply to:  3 ; comment:4 by hamish, 14 years ago

Replying to mmetz:

the r.out.gdal version in 6.5 is not the proper fix,

ok. for my needs yesterday it did the trick though, but I had to run it twice to learn + get there.

r.out.gdal in 7.0 is currently the safest version for raster GDAL export, waiting for backport to 6.5, then to 6.4 if it passed all testing.

could you quickly highlight the differences between them now? I've sort of lost track.

it has left an incomplete map in the filesystem.

...

The output map is not always in a single file, may also be in a database (Oracle Spatial, Rasterlite in SQLite). How to handle e.g. Erdas Imagine .img with external spill files?

don't know. perhaps we can't catch all cases. perhaps we are helped by the fact that some of the extra files get written along with the metadata after the job is done. We would need GDAL-dev's advice on this one I think.

As I see it, there are three possibilities if the nodata value is present in the rasterband to be exported:

  1. Current behaviour, abort with error message, leaves a broken raster.

1.5. Current behaviour, abort with error message, but try to delete what we know can. A possible way to do this is to write the output to $MAPSET/.tmp/ and then move it into the destination dir once successfully complete. (??)

  1. Continue with warning, leaves corrupted raster.

no thanks. unknowingly trusting corrupted data is the worst kind of bug. if it's in a script you are likely to miss the warning message in all the noise.

  1. Introduce a first pass to check if the nodata value is

present in the rasterband to be exported, abort with error if necessary before actual export. No output is written, but it will double the time needed for export.

actually it'll be a bit faster than double. After the initial read from disk most of it will be cached in memory and not need to be read again, and also the scanning step is probably less i/o and cpu intensive than the step that writes to disk. Personally I'm willing to trade a little time to guarantee pristine results, but that may not fit everyone's use case. Those really concerned with the time could use the -force flag to bypass those checks...

Anyway, the above example would successfully export in 7.0, in this case 255 would be selected as default nodata value because 0 is present in the rasterband.

ok.

What parts of the 7.0 version of r.out.gdal qualify as bugfixes and what parts qualify as changed default behaviour compared to 6.4?

again, I've sort of forgotten how far things got and am not sure what's on that list.

BTW, the INTERLEAVE option has no effect if only one raster band is exported.

yeah, I noticed that after posting. I had cut&pasted it out of my template collection (aka the man pages) and it went along for the ride.

Hamish

in reply to:  4 ; comment:5 by mmetz, 14 years ago

Replying to hamish:

Replying to mmetz:

the r.out.gdal version in 6.5 is not the proper fix,

ok. for my needs yesterday it did the trick though, but I had to run it twice to learn + get there.

Not good, it shouldn't be that complicated. The module should export a correct raster map with default settings only, also if GDAL format is given but nothing else. It gets a bit more complicated with nodata value and/or data type set.

r.out.gdal in 7.0 is currently the safest version for raster GDAL export, waiting for backport to 6.5, then to 6.4 if it passed all testing.

could you quickly highlight the differences between them now? I've sort of lost track.

This is what g7 r.out.gdal does:

Before actual export:

  1. Range check: abort with error if the selected GDAL data type can not hold any of the values in the raster, e.g. using GDAL data type Byte with a CELL raster with all values > 255 and/or < 0: Raster export would result in complete data loss, aborting.
  1. Precision test: precision loss occurs if DCELL is exported as some integer or FLOAT32 or if Float32 is exported as some integer or if CELL is exported as Float32 and the min max values are outside the range of what can be represented in Float32 (see [!1]). Override with force flag.
  1. Check if any given nodata value is within the range of the selected GDAL data type. Report what would happen and abort raster export if outside range (mismatch between metadata nodata value and the value replacing NULLs). Effects can be drastic, try g7 r.out.gdal with type=Byte and nodata=-1 and nodata=-9999. Can't override.
  1. Get default nodata value if none specified: for any floating point this is NaN, for integer types, it figures out a default type. Based on discussion in [!1].

During actual export:

  1. Real nodata value check: abort if the selected nodata value is present in the part of the raster map to be exported. Override with force flag.
  1. Real range check: abort if any values in the part of the raster map to be exported are outside the selected GDAL data type -> data loss, near unpredictable results, can't override.

The checks during the actual export could be moved to a separate pass over all raster bands to be exported, I think this is the safest solution, then r.out.gdal does not have to bother with cleaning up and can leave all format-specific error handling to GDAL. Looking at the ever increasing number of GDAL-supported formats, the introduction of format specific code to r.out.gdal would open a box of Pandora, and IMHO the purpose of r.out.gdal is that all that format-specific stuff can be left to the GDAL lib.

What is missing is a region check. At the very least, r.out.gdal should make sure that at least some parts of the raster maps to be exported fall within the current region, if not, abort with error: raster map outside current region, nothing to export. I wouldn't make it too complicated, so no resolution checks and also export if parts of the current region are not covered by the raster map to be exported.

[!1] http://lists.osgeo.org/pipermail/grass-dev/2009-April/043464.html and following

Markus M

in reply to:  5 ; comment:6 by hamish, 14 years ago

Ok sounds good. Lets go ahead and backport what's in the gr7 version to gr6.5. No point in widely testing dead-end code in 6.5...

Replying to mmetz:

The checks during the actual export could be moved to a separate pass over all raster bands to be exported, I think this is the safest solution, then r.out.gdal does not have to bother with cleaning up and can leave all format-specific error handling to GDAL.

sounds ok to me/I could live with that. anyone else have a POV?

What is missing is a region check. At the very least, r.out.gdal should make sure that at least some parts of the raster maps to be exported fall within the current region, if not, abort with error: raster map outside current region, nothing to export.

hmmm. I would not do that, it is straying away from checking for errors into checking for intention. I would say that's a good idea but change it to a G_warning() or G_message() instead of G_fatal_error() and just write a map full of NULLs regardless. No other GRASS raster module refuses to go on if you have the region (ie WINDow) set beyond its bounds.

e.g. I have a script which tiles a raster into a series of small geotiffs. say I had a region with a full raster map in the top left and another in the bottom right with overlapping the corners in the middle and empty space in the opposite corners. I'd want those all-NULL tiles instead of a 404 error.

|-----------------|
|                 |
|  +-----+        |
|  |     |        |
|  |   +-+-----+  |
|  +---|-+     |  |
|      |       |  |
|      +-------+  |
|-----------------|

Hamish

in reply to:  6 comment:7 by mmetz, 14 years ago

Replying to hamish:

Ok sounds good. Lets go ahead and backport what's in the gr7 version to gr6.5. No point in widely testing dead-end code in 6.5...

OK.

Replying to mmetz:

The checks during the actual export could be moved to a separate pass over all raster bands to be exported, I think this is the safest solution, then r.out.gdal does not have to bother with cleaning up and can leave all format-specific error handling to GDAL.

sounds ok to me/I could live with that. anyone else have a POV?

Let's implement it and get it tested. This is a trade-off between speed and leaving behind a mess, now going for the no-mess alternative.

What is missing is a region check. At the very least, r.out.gdal should make sure that at least some parts of the raster maps to be exported fall within the current region, if not, abort with error: raster map outside current region, nothing to export.

hmmm. I would not do that, it is straying away from checking for errors into checking for intention. I would say that's a good idea but change it to a G_warning() or G_message() instead of G_fatal_error() and just write a map full of NULLs regardless. No other GRASS raster module refuses to go on if you have the region (ie WINDow) set beyond its bounds.

OK. Most if not all raster modules respect the current region, at least its bounds, so that's nothing new. Change of mind, I would think about such a test again if users request it, but leave it out for now.

Markus M

comment:8 by mmetz, 14 years ago

Checks are now performed before actual export, in case of aborted export, nothing was exported and no files need to be removed. Done in trunk r40119 and devbr6 r40120.

I noticed that the relbr64 manual is quite different, a lot of IMO clarifying information has been added to the updated manuals in devbr6 and trunk. Backport the manual to relbr64?

comment:9 by neteler, 14 years ago

Milestone: 6.4.06.4.1

by neteler, 14 years ago

Attachment: r.out.gdal.diff added

nodata logic update patch from GRASS 6.5 (by Markus Metz)

comment:10 by neteler, 14 years ago

Attached the 6.4.svn backport of the r.out.gdal fixes from 6.5. This fixes the nodata problem when exporting (tested with UInt16). The present logic in 6.4.svn is unusable.

comment:11 by hamish, 14 years ago

has this been backported to 6.4.0 already or do 6.4 and 6.5 still differ?

in reply to:  11 comment:12 by mmetz, 14 years ago

Replying to hamish:

has this been backported to 6.4.0 already or do 6.4 and 6.5 still differ?

Backported some months ago (05/05/10) in r42117, all branches including 7.0 behave identical.

Markus M

comment:13 by hamish, 14 years ago

Resolution: fixed
Status: newclosed

thanks; in that case closing the bug.

Note: See TracTickets for help on using tickets.