Opened 16 years ago
Closed 16 years ago
#2107 closed defect (fixed)
ASCII grid disregards nodatavalue when choosing data type
Reported by: | asgerpetersen | Owned by: | Mateusz Łoskot |
---|---|---|---|
Priority: | normal | Milestone: | 1.5.1 |
Component: | GDAL_Raster | Version: | svn-trunk |
Severity: | normal | Keywords: | AAIGRID type statistics |
Cc: | warmerdam |
Description
When choosing the data type for an ASCII grid, the driver ignores the nodata value, which can lead to very wrong results.
Short example:
The grid
ncols 2
nrows 2
xllcorner 500000.00
yllcorner 6100000.00
cellsize 1
nodata_value -999999
39 13
-999999 22
returns the following statistics from gdalinfo -stats:
Band 1 Block=2x1 Type=Int16, ColorInterp=Undefined
  Minimum=-32145.000, Maximum=39.000, Mean=-12263.000, StdDev=13410.683
  NoData Value=-999999
  Metadata:
    STATISTICS_MINIMUM=-32145
    STATISTICS_MAXIMUM=39
    STATISTICS_MEAN=-12263
I think it should be possible to use a nodata value that lies well outside the domain of the data values. In the case above, the driver should choose GDT_Float32 from the nodata value alone. As a nice side effect, this would make the complete file scan unnecessary in cases like this.
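As a rough illustration of deciding from the header alone, the Python sketch below parses the header of the example grid and checks whether the declared nodata value fits in a signed 16-bit integer. The names `parse_aaigrid_header` and `fits_int16` are hypothetical and not part of GDAL, whose driver is written in C++; this only illustrates the idea.

```python
# Hypothetical sketch, not GDAL code: decide from the header alone
# whether Int16 can represent the declared nodata value.

INT16_MIN, INT16_MAX = -32768, 32767

HEADER_KEYS = {"ncols", "nrows", "xllcorner", "yllcorner",
               "xllcenter", "yllcenter", "cellsize", "nodata_value"}


def parse_aaigrid_header(text):
    """Collect 'key value' header pairs until the first data row."""
    header = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].lower() in HEADER_KEYS:
            header[parts[0].lower()] = parts[1]
        elif parts:
            break  # first non-blank line that is not a header pair starts the data
    return header


def fits_int16(value):
    """True if value is integral and inside the Int16 range."""
    number = float(value)
    return number.is_integer() and INT16_MIN <= number <= INT16_MAX


grid = """ncols 2
nrows 2
xllcorner 500000.00
yllcorner 6100000.00
cellsize 1
nodata_value -999999
39 13
-999999 22
"""

header = parse_aaigrid_header(grid)
nodata = header["nodata_value"]
needs_wider_type = not fits_int16(nodata)
print(nodata, needs_wider_type)
```

For the example grid, `needs_wider_type` is true because -999999 lies outside the Int16 range, so no scan of the cell values is needed to rule out Int16.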
Change History (8)
comment:1 by , 16 years ago
Cc: | added |
---|---|
Component: | default → GDAL_Raster |
Keywords: | AAIGRID added |
Owner: | changed from | to
comment:2 by , 16 years ago
Keywords: | type statistics added |
---|---|
Status: | new → assigned |
Version: | → svn-trunk |
If I understand this report correctly, there are actually two issues:
- Use the nodata value to determine the data type of a grid, according to the following algorithm:

    if nodata is a floating-point number
    {
        set grid data type to GDT_Float32
    }
    else // nodata is an integral number
    {
        for (pixels in sample chunk)
        {
            if floating-point numbers found
                set data type to GDT_Float32
            else
                set data type to GDT_Int16 (default type)
        }
    }
- When calculating statistics, the nodata value should also be compared, so that nodata does not fall outside the min/max range:

    min = MIN( calculated_min, nodata_value )
    max = MAX( calculated_max, nodata_value )
The 1st issue can be solved directly in the AAIGrid driver. The 2nd issue seems to be solvable only in the GDALComputeRasterMinMax function, so the fix would apply to all GDAL drivers. I think it is generally correct that the nodata value is within the range of all values:
min <= nodata < max or min < nodata <= max
Could you confirm if I've caught the problem correctly?
comment:3 by , 16 years ago
Mateusz,
I believe point 2 is not right. nodata values should not be included in min/max statistics if possible. I think you can just focus on point 1.
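A minimal sketch of statistics that skip nodata cells, using the four cell values from the example grid. The helper `stats_excluding_nodata` is hypothetical and is not a GDAL function; it only illustrates the behaviour described here.

```python
# Hypothetical sketch, not GDAL code: min/max/mean that skip nodata cells.

def stats_excluding_nodata(values, nodata):
    """Return (min, max, mean) over valid cells, or None if all are nodata."""
    valid = [v for v in values if v != nodata]
    if not valid:
        return None
    return min(valid), max(valid), sum(valid) / len(valid)


cells = [39, 13, -999999, 22]
result = stats_excluding_nodata(cells, nodata=-999999)
print(result)
```

With the nodata cell excluded, the minimum is 13 and the maximum is 39, so the nodata value does not distort the reported range.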
comment:4 by , 16 years ago
Frank,
I'm an idiot. I've no idea why I wanted to use nodata in the statistics computation. It is a completely stupid idea that the nodata value is in the range of all values. Sorry for that.
comment:7 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Actually that is not what I meant. My problem was overflow. It doesn't matter if the nodatavalue is float if there are no nodatavalues in the data or if the nodatavalues in data are represented as ints.
I'm not sure it is a good idea to make the data type float just because the nodatavalue is float. Some drivers (I think GDAL included) always write the nodata value as a float regardless of the "real" data type.
What I actually meant was something like:

    if nodata < -32768 or nodata > 32767
    {
        set grid data type to GDT_Float32
    }
    else
    {
        for (pixels in sample chunk)
        {
            if floating-point numbers found
                set data type to GDT_Float32
            else
                set data type to GDT_Int16 (default type)
        }
    }
I'm sorry I didn't come back to this sooner. You two are just too fast for me :-)
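The range check proposed in this comment could be sketched in Python as follows. This is an illustration under stated assumptions, not the fix that was committed to GDAL: `choose_data_type` and `looks_like_float` are hypothetical names, the type names are returned as plain strings, and cell values are assumed to arrive as text tokens from a sample chunk.

```python
# Hypothetical sketch of the proposal above, not the GDAL implementation.

INT16_MIN, INT16_MAX = -32768, 32767


def looks_like_float(token):
    """Treat a token as floating point if it does not parse as an integer."""
    try:
        int(token)
        return False
    except ValueError:
        return True


def choose_data_type(nodata, sample_tokens):
    """Return 'Float32' or 'Int16' for a grid.

    nodata: numeric nodata value from the header, or None if absent.
    sample_tokens: cell values from a sample chunk, as strings.
    """
    # A nodata value outside the Int16 range would overflow, so go wider.
    if nodata is not None and not (INT16_MIN <= nodata <= INT16_MAX):
        return "Float32"
    # Otherwise scan the sample chunk for floating-point cell values.
    if any(looks_like_float(tok) for tok in sample_tokens):
        return "Float32"
    return "Int16"


print(choose_data_type(-999999, ["39", "13", "-999999", "22"]))
print(choose_data_type(-9999.0, ["39", "13", "-9999", "22"]))
```

In this sketch a nodata value written as a float but lying inside the Int16 range, such as -9999.0, does not by itself force Float32, which matches the concern about drivers that always write nodata as a float.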
comment:8 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
I agree that the nodata value should be considered in the data range for ascii files.