Opened 10 years ago

Closed 5 years ago

#5291 closed defect (wontfix)

netcdf driver does not detect .grd files in netcdf-4 format / netcdf driver does not support netcdf-4 files with default chunking and bottom-up format / hdf5 driver crashes when H5Sget_simple_extent_ndims() returns negative value

Reported by: jluis Owned by: warmerdam
Priority: normal Milestone: closed_because_of_github_migration
Component: GDAL_Raster Version: svn-trunk
Severity: normal Keywords: netcdf
Cc: etourigny, Even Rouault

Description

This works fine

gdal_translate zz_non_deflated.nc lixo.tiff

Input file size is 104, 53
0...10...20...30...40...50...60...70...80...90...100 - done.

Now, the same file saved with a deflation level of 3 (with Mirone), complains

gdal_translate zz_deflated.nc lixo.tiff
Input file size is 104, 53
0ERROR 1: nBlockYSize = 53, only 1 supported when reading bottom-up dataset
ERROR 1: zz_deflated.nc, band 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

However, with a larger file (also deflated) I get a crash.

(Note: I can convert these same files nicely with GMT)

C:\j\bat\Cesar>gdal_translate zz_deflation_3.grd lixo.tiff
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: ..\..\src\H5Ddeprec.c line 231 in H5Dopen1(): not found
    major: Dataset
    minor: Object not found
  #001: ..\..\src\H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: ..\..\src\H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: ..\..\src\H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: ..\..\src\H5Gloc.c line 385 in H5G_loc_find_cb(): object 'zz' doesn't exist
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: ..\..\src\H5D.c line 391 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: ..\..\src\H5Ddeprec.c line 231 in H5Dopen1(): not found
    major: Dataset
    minor: Object not found
  #001: ..\..\src\H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: ..\..\src\H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: ..\..\src\H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: ..\..\src\H5Gloc.c line 385 in H5G_loc_find_cb(): object 'zz' doesn't exist
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: ..\..\src\H5D.c line 437 in H5Dget_space(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: ..\..\src\H5S.c line 794 in H5Sget_simple_extent_ndims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
ERROR 1: CPLMalloc(-8): Silly size requested.

Attachments (4)

zz_deflated.nc (28.3 KB ) - added by jluis 10 years ago.
small file (deflated)
zz_non_deflated.nc (23.5 KB ) - added by jluis 10 years ago.
small file (non deflated)
zz_deflation_3.grd (828.7 KB ) - added by jluis 10 years ago.
larger file (deflated)
ticket5291.patch (3.9 KB ) - added by demaria 9 years ago.
Patch including fix, unit test and a small test file


Change History (33)

by jluis, 10 years ago

Attachment: zz_deflated.nc added

small file (deflated)

by jluis, 10 years ago

Attachment: zz_non_deflated.nc added

small file (non deflated)

by jluis, 10 years ago

Attachment: zz_deflation_3.grd added

larger file (deflated)

comment:1 by Even Rouault, 10 years ago

Cc: etourigny added
Component: default → GDAL_Raster
Keywords: netcdf added

comment:2 by etourigny, 10 years ago

Cc: Even Rouault added

I'm afraid I can't replicate your problem with the deflated file (zz_deflated.nc), using gdal 1.10.0 from ubuntugis-unstable.

$ gdal_translate zz_deflated.nc lixo2.tiff
Input file size is 104, 53
0...10...20...30...40...50...60...70...80...90...100 - done.

Regarding zz_deflation_3.grd, it looks more like a driver detection issue: the hdf5 driver picks it up before the netcdf driver and fails, as the file is a netcdf-4 file (with hdf5 storage). This is not great. Perhaps the driver should be fixed for .grd files, but this is a tough issue, because the file is also an hdf5 file.

The real fix would be to make sure that the netcdf driver is checked before the hdf5 driver. Even, can you see how we can force the netcdf driver to be tried before the hdf5 driver?

$ gdal_translate NETCDF:zz_deflation_3.grd lixo.tiff
Input file size is 512, 512
0...10...20...30...40...50...60...70...80...90...100 - done.

$ gdal_translate zz_deflation_3.grd lixo.tiff
HDF5-DIAG: Error detected in HDF5 (1.8.9) thread 0:
  #000: H5Ddeprec.c line 231 in H5Dopen1(): not found
    major: Dataset
    minor: Object not found
  #001: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'zz' doesn't exist
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.9) thread 0:
  #000: H5D.c line 391 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.9) thread 0:
  #000: H5Ddeprec.c line 231 in H5Dopen1(): not found
    major: Dataset
    minor: Object not found
  #001: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'zz' doesn't exist
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.9) thread 0:
  #000: H5D.c line 437 in H5Dget_space(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.9) thread 0:
  #000: H5S.c line 755 in H5Sget_simple_extent_ndims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
ERROR 1: CPLMalloc(-8): Silly size requested.

Segmentation fault

comment:3 by Even Rouault, 10 years ago

Etienne, the driver order is defined in frmts/gdalallregister.cpp, and currently the netCDF driver is tested before the HDF5 driver, so this is not the issue

With my netcdf 3.6, neither "gdalinfo zz_deflation_3.grd" nor "gdalinfo NETCDF:zz_deflation_3.grd" manage to open the file. In your case, the code that autodetects netcdf when the NETCDF: prefix is not specified must fail, and thus the HDF5 driver is then triggered. So the autodetection in netCDFDataset::IdentifyFormat() shall likely be enhanced to accept the ".grd" extension in netCDF-4/HDF5 case ?

For the segfault in the HDF5 driver, I've committed the following fix : trunk r26601, branches/1.10 r26602 : "HDF5: avoid segmentation fault when H5Sget_simple_extent_ndims() returns negative value (#5291)"
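The `CPLMalloc(-8)` message in the logs above is consistent with a dimension count of -1 being multiplied by an 8-byte element size. The following is a minimal Python sketch of the kind of guard the commit message describes; the helper name `dims_buffer_size` and the 8-byte element size are assumptions for illustration, not the actual GDAL code.

```python
def dims_buffer_size(ndims, elem_size=8):
    """Return the byte size of a buffer holding `ndims` extents.

    Hypothetical helper: `elem_size=8` assumes an 8-byte extent type.
    A negative `ndims` models the error return of
    H5Sget_simple_extent_ndims() and is rejected instead of being
    turned into a negative allocation size.
    """
    if ndims < 0:
        return None  # caller should skip the dataset rather than allocate
    return ndims * elem_size


print(dims_buffer_size(2))    # 16
print(dims_buffer_size(-1))   # None, instead of -1 * 8 = -8
```

The point of the sketch is only that the error return is checked before it is used as a size.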

in reply to:  3 comment:4 by etourigny, 10 years ago

Replying to rouault:

Etienne, the driver order is defined in frmts/gdalallregister.cpp, and currently the netCDF driver is tested before the HDF5 driver, so this is not the issue

ok, thanks

With my netcdf 3.6, neither "gdalinfo zz_deflation_3.grd" nor "gdalinfo NETCDF:zz_deflation_3.grd" manage to open the file.

nor should it, because it is a netcdf-4 file supported by netcdf 4.x.y, not 3.x.y

In your case, the code that autodetects netcdf when the NETCDF: prefix is not specified must fail, and thus the HDF5 driver is then triggered. So the autodetection in netCDFDataset::IdentifyFormat() shall likely be enhanced to accept the ".grd" extension in netCDF-4/HDF5 case ?

Yes it could be enhanced but is not the ideal solution. I believe it only checks when the file has .nc/.nc2/.nc3/.nc4 extension. This is fragile, as it should ideally allow to test all files. But the problem with that is that hdf5 files (not netcdf-4) would be picked up. Perhaps solution is to NOT check .h5/.hdf5 files, instead of checking only .nc* files?

For the segfault in the HDF5 driver, I've committed the following fix : trunk r26601, branches/1.10 r26602 : "HDF5: avoid segmentation fault when H5Sget_simple_extent_ndims() returns negative value (#5291)"

cool thanks.

comment:5 by etourigny, 10 years ago

Summary: Deflated netCDF may crash → netcdf driver does not detect .grd files in netcdf-4 format and hdf5 driver crashes when H5Sget_simple_extent_ndims() returns negative value

comment:6 by etourigny, 10 years ago

fixed .grd (and .nc3) file detection in trunk (r26605) and 1.10 (r26606).

However, I think it would be better in trunk (but not in 1.10) to reverse the logic: instead of trying to open only .nc/.nc2/.nc3/.nc4/.cdf/.grd files, do NOT try to open .hdf/.h5/.he5 files, so they are read by the hdf5 driver. However, this might be disruptive so I won't go ahead without input from others.
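The two detection strategies discussed here can be sketched as follows. This is an illustrative Python model only: the function names are hypothetical, the extension sets are the ones listed in this ticket, and the real check lives in the netCDF driver's C++ code.

```python
import os

# Extensions named in this ticket; illustrative sets, not GDAL's own tables.
NETCDF_EXTS = {".nc", ".nc2", ".nc3", ".nc4", ".cdf", ".grd"}
HDF5_EXTS = {".hdf", ".h5", ".he5"}


def try_netcdf_allowlist(filename):
    """Allow-list approach: only try netCDF for known netCDF extensions."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in NETCDF_EXTS


def try_netcdf_denylist(filename):
    """Reversed approach: try netCDF unless the extension is an HDF one."""
    ext = os.path.splitext(filename)[1].lower()
    return ext not in HDF5_EXTS


for name in ("zz_deflation_3.grd", "data.h5", "grid.dat"):
    print(name, try_netcdf_allowlist(name), try_netcdf_denylist(name))
```

The two approaches differ only for extensions in neither set (such as the made-up `grid.dat`), which is where the reversed logic could be disruptive.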

comment:7 by jluis, 10 years ago

Guys, thanks for your work, and sorry to put you through this because I'm stubborn and keep using .grd for nc files, but I like to distinguish pure simple grids from the whole host of other possibilities that can go inside a netcdf file.

However, sorry to say that I still get the same error (now with the larger file that used to crash). Could this be related to the relatively recent issue with hdf 1.8.11? But this is not hdf related, is it?

gdal_translate zz_deflation_3.grd lixo.tiff
Input file size is 512, 512
0ERROR 1: nBlockYSize = 512, only 1 supported when reading bottom-up dataset
ERROR 1: zz_deflation_3.grd, band 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

comment:8 by etourigny, 10 years ago

Joaquim - the remaining error you experience is due to limitations in the netcdf driver, sorry. Workaround is to create or copy the file without chunking, or chunking that is supported by the driver (e.g. 512x1 or heightx1) - or have it written in top-down order. How you do this depends on the software used to create the deflated file in the first place. I'm afraid I cannot fix this in the netcdf driver, it is too involved and I don't have the time.

I was able to do this with recent cdo (1.6.1)

$ cdo -v sinfon zz_deflation_3.grd
   File format: netCDF4 classic ZIP
    -1 : Institut Source   Ttype    Levels Num  Gridsize Num Dtype : Parameter name : Extra
     1 : unknown  unknown  constant      1   1    262144   1  F32z : z              : chunks=512x512 
   Grid coordinates :
     1 : lonlat       > size      : dim = 262144  nx = 512  ny = 512
                        longitude : first = -12  last = -2  inc = 0.0195694716  degrees_east
                        latitude  : first = 35  last = 45  inc = 0.0195694716  degrees_north
   Vertical coordinates :
     1 : surface                  : 0 
cdo sinfon: Processed 1 variable ( 0.00s )
$ cdo -f nc4c -z zip -k lines copy zz_deflation_3.grd zz_deflation_3.nc
cdo copy: Processed 262144 values from 1 variable over 1 timestep ( 0.06s )

$ cdo -v sinfon zz_deflation_3.nc
   File format: netCDF4 classic ZIP
    -1 : Institut Source   Ttype    Levels Num  Gridsize Num Dtype : Parameter name : Extra
     1 : unknown  unknown  constant      1   1    262144   1  F32z : z              : chunks=512x1 
   Grid coordinates :
     1 : lonlat       > size      : dim = 262144  nx = 512  ny = 512
                        longitude : first = -12  last = -2  inc = 0.0195694716  degrees_east
                        latitude  : first = 35  last = 45  inc = 0.0195694716  degrees_north
   Vertical coordinates :
     1 : surface                  : 0 
cdo sinfon: Processed 1 variable ( 0.01s )

$ gdal_translate zz_deflation_3.nc lixo.tiff 
Input file size is 512, 512
0...10...20...30...40...50...60...70...80...90...100 - done.

Now can you work with the other files without problem?

comment:9 by jluis, 10 years ago

Etienne, thanks for the effort. This is not an important issue to me as I can read the nc files with GMT. I just want to report this so it gets known. Not sure what you mean by "other files". So far I had problems only with deflated files.

in reply to:  9 ; comment:10 by etourigny, 10 years ago

Replying to jluis:

Etienne, thanks for the effort. This is not an important issue to me as I can read the nc files with GMT. I just want to report this so it gets known. Not sure what you mean by "other files". So far I had problems only with deflated files.

I meant the other files in this ticket, actually just zz_deflated.nc (you reported the same problem with that file, but I can't reproduce it).

in reply to:  10 comment:11 by jluis, 10 years ago

I meant the other files in this ticket, actually just zz_deflated.nc (you reported the same problem with that file, but I can't reproduce it).

Oh that. Yes, I still get the same error with the small zz_deflated.nc file.

comment:12 by etourigny, 10 years ago

Summary: netcdf driver does not detect .grd files in netcdf-4 format and hdf5 driver crashes when H5Sget_simple_extent_ndims() returns negative value → netcdf driver does not detect .grd files in netcdf-4 format / netcdf driver does not support netcdf-4 files with default chunking and bottom-up format / hdf5 driver crashes when H5Sget_simple_extent_ndims() returns negative value

you are right, same problem with that file! Updated summary to reflect this.

comment:13 by jluis, 9 years ago

Ah, setting

GDAL_NETCDF_BOTTOMUP=NO

as mentioned by Julien in

http://osgeo-org.1560.x6.nabble.com/Problems-with-GDAL-1-10-1-converting-from-NetCDF-to-Geotiff-td5180197.html

works around the problem. Now

gdal_translate zz_deflation_3.grd lixo.tiff

works fine
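As an alternative to exporting the environment variable, the same option can be passed per invocation through GDAL's general `--config KEY VALUE` switch. A small Python sketch that builds such a command line (the helper name `translate_command` is hypothetical; the file names are the ones used in this ticket):

```python
def translate_command(src, dst):
    """Build a gdal_translate call with GDAL_NETCDF_BOTTOMUP set to NO."""
    return ["gdal_translate", "--config", "GDAL_NETCDF_BOTTOMUP", "NO", src, dst]


cmd = translate_command("zz_deflation_3.grd", "lixo.tiff")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where `gdal_translate` is installed. The orientation of the output should still be checked, since the option disables the bottom-up handling.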

comment:14 by Jukka Rahkonen, 9 years ago

What do you suggest to do with this ticket? Can it be closed? Is there something that should be improved in the documentation before closing? Or are there some issues left which could perhaps be separated and transferred to a new ticket?

comment:15 by jluis, 9 years ago

No, definitely not closing as is. Having to set an ENV variable to make it work is just a workaround, not an acceptable final solution. The poster (Julien) showed a patch in his second mail in the thread. Maybe implement it?

comment:16 by jluis, 9 years ago

In fact, setting GDAL_NETCDF_BOTTOMUP=NO does not actually solve this issue. True, the file is read and converted, but it comes out upside down.
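To make the "upside down" result concrete: a bottom-up file stores the last image row first, so a reader that ignores the orientation emits the rows in reverse. A minimal Python sketch of the row mapping involved (the helper names are hypothetical and this is not the driver's code):

```python
def flip_rows(rows):
    """Return the rows in reverse order (bottom-up -> top-down)."""
    return rows[::-1]


def source_row(out_row, height):
    """Row index in a bottom-up file that supplies output row `out_row`."""
    return height - 1 - out_row


bottom_up = [[7, 8, 9], [4, 5, 6], [1, 2, 3]]  # last image row stored first
print(flip_rows(bottom_up))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(source_row(0, 53))     # 52
```

Because each output row maps to a single source row, this correction is straightforward when blocks are one scanline high, which matches the "only 1 supported" wording of the error message.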

comment:17 by Jukka Rahkonen, 9 years ago

I wonder if there is some more general issue with the bottom-up images.

http://trac.osgeo.org/gdal/ticket/4977 (about bottom-up GeoTIFF)

http://gis.stackexchange.com/questions/133054/gdal-translate-creates-images-that-are-mirrored (about bottom-up XYZ).

in reply to:  17 comment:18 by Even Rouault, 9 years ago

Replying to jratike80:

I wonder if there is some more general issue with the bottom-up images.

http://trac.osgeo.org/gdal/ticket/4977 (about bottom-up GeoTIFF)

http://gis.stackexchange.com/questions/133054/gdal-translate-creates-images-that-are-mirrored (about bottom-up XYZ).

No, those are all driver specific

comment:19 by demaria, 9 years ago

Hi, I've just attached a patch to fix the issue, tested on the 3 files. Julien

comment:20 by Even Rouault, 9 years ago

A small (a few kB at most) dataset illustrating the need for the patch + associated test case in autotest/gdrivers/netcdf.py would be helpful.

comment:21 by demaria, 9 years ago

Ok I will do that.

by demaria, 9 years ago

Attachment: ticket5291.patch added

Patch including fix, unit test and a small test file

comment:22 by demaria, 9 years ago

Done, patch updated. I've included the small test file in the patch, let me know if it's problematic. The test file is a copy of zz_deflated.nc with all data set to fillvalue to reduce size.

comment:23 by jluis, 9 years ago

Fine thanks. It worked for me as well.

comment:24 by Even Rouault, 9 years ago

Milestone: 2.0
Resolution: fixed
Status: new → closed

Applied in trunk r28458 (please remember to mention the ticket number with ' (#XXXX)' at the end of your commit message)

comment:25 by jluis, 9 years ago

Resolution: fixed (cleared)
Status: closed → reopened

I have to reopen this because although r28458 makes it work I now realize that it has become extremely slow. For example, it takes 45 seconds to do this (accessing the file via GMT)

grdinfo algarve50.grd=gd

running gdalinfo on that file is fast but gdal_translate is also very slow.

in reply to:  25 comment:26 by demaria, 9 years ago

Replying to jluis:

I have to reopen this because although r28458 makes it work I now realize that it has become extremely slow. For example, it takes 45 seconds to do this (accessing the file via GMT)

grdinfo algarve50.grd=gd

running gdalinfo on that file is fast but gdal_translate is also very slow.

Joaquim,

I have reproduced and analyzed your performance issue, and it seems this is a problem with the default NetCDF4 cache size (which uses the HDF5 cache), which is too small. This default cache size is defined when compiling the NetCDF4 library, so the performance issue depends on your local build of the NetCDF library.

Your file size is 3701x1341 with a chunk size of 1234x447 using internal deflate compression, but my fix forces the GDAL block size to one scanline (3701x1) to be able to do the bottom-up correction.

So when reading the file, GDAL reads one scanline at a time, but for each scanline NetCDF internally re-reads and decompresses the corresponding chunks, which is a big overhead. This should not happen if the NetCDF cache is big enough to hold one row of chunks, but on my own configuration (GNU/Linux Ubuntu 14.04 64-bit, with the latest NetCDF/HDF5 libraries compiled by hand) the default cache is only 4 MB and a row of chunks is more than 6 MB. When I set the cache to 10 MB there is no more performance issue.
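The numbers above can be checked with a short calculation. The Python sketch below assumes 4-byte (32-bit float) values, which the comment does not state, and models the chunk cache as a simple LRU of whole chunks; the real HDF5 chunk cache uses a different policy, so the counts are only indicative.

```python
from collections import OrderedDict
from math import ceil

WIDTH, HEIGHT = 3701, 1341        # raster size from the comment above
CHUNK_W, CHUNK_H = 1234, 447      # chunk size from the comment above
BYTES_PER_VALUE = 4               # assumption: 32-bit floats

chunk_bytes = CHUNK_W * CHUNK_H * BYTES_PER_VALUE
chunks_per_row = ceil(WIDTH / CHUNK_W)
row_of_chunks_bytes = chunks_per_row * chunk_bytes


def count_decompressions(cache_bytes):
    """Count chunk loads when reading scanline by scanline (toy LRU)."""
    capacity = cache_bytes // chunk_bytes   # whole chunks that fit
    cache = OrderedDict()
    loads = 0
    for y in range(HEIGHT):
        cy = y // CHUNK_H
        for cx in range(chunks_per_row):
            key = (cx, cy)
            if key in cache:
                cache.move_to_end(key)      # mark as most recently used
                continue
            loads += 1                      # chunk must be read and inflated
            if capacity > 0:
                cache[key] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return loads


print(row_of_chunks_bytes)                       # 6619176
print(count_decompressions(4 * 1024 * 1024))     # 4023
print(count_decompressions(10 * 1024 * 1024))    # 9
```

Under these assumptions a row of chunks needs about 6.6 million bytes. A 4 MiB cache holds a single chunk, so every scanline reloads all three chunks it crosses, while a 10 MiB cache holds the whole row and each of the nine chunks is loaded once.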

I don't understand why my default NetCDF cache is 4 MB, because the NetCDF documentation says that it is 64 MB: https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Chunk-Cache.html and it seems you have the same problem. Note that the only way to set/get the NetCDF cache is to call specific functions in the source code: https://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/nc_005fset_005fchunk_005fcache.html#nc_005fset_005fchunk_005fcache

For the moment I'm not sure what the best way to fix this issue is, because we must handle all cases. For cases without bottom-up correction, the GDAL block size is aligned on NetCDF chunks, so we should never have performance issues because the GDAL cache will do the work; so maybe we could set the NetCDF cache size only in the bottom-up correction case. And in this case, it is not easy to set a good cache size because it depends on how you read files; ideally it should be set the same way you set the GDAL cache...


comment:27 by jluis, 9 years ago

Confirmed, using the nc_get_chunk_cache() function I also get that the cache size is 4 MB. So I rebuilt netCDF with

-DCHUNK_CACHE_SIZE=67108864 -DDEFAULT_CHUNK_SIZE=67108864

Note that only the latter (DEFAULT_CHUNK_SIZE) effectively changes the cache size, but I didn't want to take risks and changed both. With this change, the above "grdinfo algarve50.grd=gd" runs fast again.

So, as we discussed in private emails, we may be looking at a netCDF bug.

comment:28 by Even Rouault, 9 years ago

Milestone: 2.0 (removed)

Removing obsolete milestone

comment:29 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: reopened → closed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
