Opened 7 years ago

Closed 6 years ago

#3890 closed defect (fixed)

netCDF file reading error

Reported by: angelospanagiotakis
Owned by: etourigny
Priority: normal
Milestone:
Component: GDAL_Raster
Version: svn-trunk
Severity: normal
Keywords: netcdf
Cc: warmerdam

Description

I am trying to load some netCDF files (CF compatible) from this site: http://data.smhi.se/met/scenariodata/echam5-r3/rca3/a1b/monthly/50km/ . GDAL does not seem to recognise these files. They are in CF format, but gdalinfo returns an error when opening them. For some reason the netCDF driver does not recognise the organization of these particular netCDF files. Any ideas why?

Attachments (1)

netcdf.py (16.1 KB) - added by etourigny 6 years ago.


Change History (27)

comment:1 Changed 7 years ago by Kyle Shannon

I do not know why. I will look into this. At first glance from

kyle@Lucky13:~/gdal-tickets/3980$ ncdump -h ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_prhmax.nc

I don't see anything out of the ordinary. I will take a look into this soon and post what I find.

kss

comment:2 Changed 7 years ago by warmerdam

Cc: warmerdam added; Kyle Shannon removed
Keywords: netcdf added
Owner: changed from warmerdam to Kyle Shannon

Thanks Kyle!

comment:3 Changed 7 years ago by Kyle Shannon

Status: changed from new to assigned

This one didn't take long. The GDALOpenInfo for the files on the site contains:

(gdb) p poOpenInfo->nHeaderBytes 
$1 = 1024
(gdb) p poOpenInfo->pabyHeader 
$2 = (GByte *) 0x61a7a0 "CDF\002"

which fails the test:

if( !EQUALN(poOpenInfo->pszFilename,"NETCDF:",7)
        && ( poOpenInfo->nHeaderBytes < 5 
             || !EQUALN((const char *) (poOpenInfo->pabyHeader),"CDF\001",5)))
        return NULL;

because of CDF\001 versus CDF\002. I imagine this is a versioning issue for netcdf. I will look into the documentation to see what the difference is and whether it is a valid header. If it is valid and doesn't need to be handled specifically somehow, I will fix the issue and commit the change. If it is bigger than that, I will work on it. In the meantime, if you use the:

NETCDF:Filename.nc:variable 

syntax, you can access the data through gdalinfo. Also, could you report the version of gdal you are using?
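The header test discussed in this comment can be sketched in Python. This is only an illustration of the magic-byte comparison, not the driver's code: the function name and the returned labels are assumptions made for the example. It relies on the netCDF classic format starting with CDF\001, the 64-bit offset format with CDF\002, and netCDF-4 files carrying the standard 8-byte HDF5 signature.

```python
def classify_header(header):
    """Label a file from its first bytes (hypothetical helper)."""
    if header[:4] == b"CDF\x01":
        return "classic"          # netCDF classic format
    if header[:4] == b"CDF\x02":
        return "64-bit offset"    # the variant the old check rejected
    if header[:8] == b"\x89HDF\r\n\x1a\n":
        return "hdf5"             # HDF5 container, possibly netCDF-4
    return "unknown"


def read_header(path, size=1024):
    """Read the first bytes of a file, like the 1024-byte header shown in gdb."""
    with open(path, "rb") as handle:
        return handle.read(size)
```

A call such as `classify_header(read_header("in.nc"))` would then report which variant a given file uses; `in.nc` is a placeholder name.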

comment:4 Changed 7 years ago by angelospanagiotakis

gdalinfo --version

GDAL 1.7.0b2, FWTools 2.4.7, released 2010/01/19

gdalinfo "ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc"

ERROR 4: `ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc' not recognised as a supported file format.

gdalinfo failed - unable to open 'ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc'.

comment:5 Changed 7 years ago by Kyle Shannon

I don't have 1.7.0 on my machine as that release was retracted. Could you try:

gdalinfo NETCDF:ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc:lit

I will build 1.7.0 in the meantime. I am not sure why FWTools is using 1.7.0; I have asked on the dev list.

As for your issue, GDAL doesn't support large files in the netcdf format. The flag "CDF\002" represents a 64-bit offset file. There is no catch for this in the driver right now (it can be circumvented by using the NETCDF:file:variable syntax). I will continue to look into it.

kss

comment:6 Changed 7 years ago by angelospanagiotakis

i get the same error message

gdalinfo NETCDF:ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc:lit

ERROR 4: `NETCDF:ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc:lit' does not exist in the file system, and is not recognised as a supported dataset name.

gdalinfo failed - unable to open 'NETCDF:ENSEMBLES_SMHIRCA30_A1B_ECHAM5-r3_MM_50km_1961-2100_lit.nc:lit'.

comment:7 Changed 7 years ago by Kyle Shannon

Version: changed from unspecified to 1.7.0

I may have the answer; it just came to me. I am guessing your OS is win32, and the netcdf.dll is 32 bit. I don't have a machine with win32 set up for debugging, so I may be wrong. I am sure that lib doesn't support the 64 bit offset. It will get past the gdal check, but not the nc_open check. I will set up a 32 bit linux virtual machine and test it. That would explain why I can get to the data by using the NETCDF: tag: I have a netcdf lib built for 64 bit. If I am correct, I need to beef up the check in the driver for this behavior. Thanks for being patient.

comment:8 Changed 7 years ago by angelospanagiotakis

Thanks a lot for your efforts. You are correct: my OS is win32 and the netcdf is 32 bit.

comment:9 Changed 7 years ago by angelospanagiotakis

Can anyone convert these files to 32 bit?

comment:10 Changed 6 years ago by etiennesky

I would recommend the great CDO tools available at https://code.zmaw.de/projects/cdo .

The following command will do it:

cdo -f nc copy in.nc out.nc

comment:11 Changed 6 years ago by etiennesky

Version: changed from 1.7.0 to svn-trunk

Please see proposed fix in #2379. If the ncdump utility can read the file, then the GDAL NetCDF driver should also with this fix.

comment:12 Changed 6 years ago by etourigny

Owner: changed from Kyle Shannon to etourigny
Status: changed from assigned to new

Fixed in trunk.

r23081 adds support for nc2 and nc4 in netcdfdataset.cpp.

r23080 adds autotests and 3 test files to see if nc2 and nc4 are supported, and also makes sure that hdf5 files (that are not netcdf) are opened by the hdf5 driver.

If someone can test this on windows, I will close the ticket. My autotests show that the issue is resolved in linux 64 bits.

BTW, netcdf2 (64 bits) is supposed to be supported on 32-bit platforms, according to the netcdf FAQ.

comment:13 Changed 6 years ago by etourigny

Status: changed from new to assigned

comment:14 Changed 6 years ago by Even Rouault

Etienne,

I've looked at r23081 and your implementation of Identify() (as the method being assigned to the pfnIdentify field of the driver) is a bit unusual, although probably still conformant with http://trac.osgeo.org/gdal/wiki/rfc11_fastidentify because your NCDF_FILETYPE_NONE constant evaluates to 0 (FALSE) (but that's fragile). All the implementations I'm aware of in other drivers explicitly return TRUE or FALSE, not an enumeration. I think it would be nice to stick to that convention to keep things homogeneous. It is also safer if someone does "if ( TRUE == Identify() ) {}". Pedantic minds would say "eh, the solution would be to use bool instead of int as the return type of Identify()" ;-)

Your current Identify() implementation could be renamed IdentifiedProductType(), kept internal to netCDFDataset, and you'd implement a standard Identify() that returns TRUE if EQUALN(poOpenInfo->pszFilename,"NETCDF:",7) or if IdentifiedProductType() != NCDF_FILETYPE_NONE. Not sure I've described the exact implementation, but you've got the idea. You'd probably need to migrate into Identify() the logic you added in Open() that tests the extension in the HDF5/netCDF4 case.

On a more personal note (this could be argued), I believe there is an additional implicit assumption for Identify(): that it should return FALSE if the driver cannot open the file with Open() (in other words, Identify() should be at least as "optimistic" as Open()). At least, that's what most drivers do. But RFC11 leaves the door open to implementations where Identify() could fail and Open() could succeed. If that's the case, I think either Identify() should be improved, or just left unimplemented. Anyway, the whole point of this paragraph is to justify adding the test of EQUALN(poOpenInfo->pszFilename,"NETCDF:",7) to Identify().

(Ah, FYI, I'll be cut off from the Internet from 09/08 to 09/22)

comment:15 Changed 6 years ago by etourigny

I've implemented your suggestion in r23084.

New private function IdentifyFileType() returns the filetype, and Identify() uses that function. Added a new type NCDF_FILETYPE_HDF5, which corresponds to an HDF5 file that should be read by the HDF5 driver (if netcdf is not version 4 or if the extension is not .nc or .nc4). As there is no formal and easy way of differentiating between standard hdf5 and netCDF-4, this check is sufficient IMHO. If the user really wants to open any HDF5 file as a netcdf-4 file, they can rename it to .nc or .nc4, or use the NETCDF: syntax.
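The split between IdentifyFileType() and Identify() described here can be sketched in Python for readers who want to see the shape of the logic. This is a rough analogue under stated assumptions, not the C++ code committed in r23084: only NCDF_FILETYPE_NONE and NCDF_FILETYPE_HDF5 are named in this ticket, so the other constants, their numeric values, the `has_nc4` parameter, and the choice to return False for the HDF5 case are all hypothetical.

```python
import os

# Only NONE and HDF5 are named in the ticket; the rest are hypothetical.
NCDF_FILETYPE_NONE = 0
NCDF_FILETYPE_NC = 1
NCDF_FILETYPE_NC2 = 2
NCDF_FILETYPE_NC4 = 3
NCDF_FILETYPE_HDF5 = 4

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def identify_file_type(filename, header, has_nc4=True):
    """Return one of the NCDF_FILETYPE_* values for the given header bytes."""
    if header.startswith(b"CDF\x01"):
        return NCDF_FILETYPE_NC
    if header.startswith(b"CDF\x02"):
        return NCDF_FILETYPE_NC2
    if header.startswith(HDF5_SIGNATURE):
        extension = os.path.splitext(filename)[1].lower()
        # Treat HDF5 content as netCDF-4 only with library support
        # and a .nc or .nc4 extension, as described in the comment.
        if has_nc4 and extension in (".nc", ".nc4"):
            return NCDF_FILETYPE_NC4
        return NCDF_FILETYPE_HDF5
    return NCDF_FILETYPE_NONE


def identify(filename, header, has_nc4=True):
    """Return a strict True/False, never an enumeration value."""
    # Case-insensitive prefix test, mirroring EQUALN(..., "NETCDF:", 7).
    if filename.upper().startswith("NETCDF:"):
        return True
    file_type = identify_file_type(filename, header, has_nc4)
    # Assumption: plain HDF5 files are left to the HDF5 driver.
    return file_type not in (NCDF_FILETYPE_NONE, NCDF_FILETYPE_HDF5)
```

Keeping the enumeration internal and exposing only a boolean avoids the fragility of relying on NCDF_FILETYPE_NONE being 0.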

Possible bug: the Identify() function doesn't get called outside of netcdfdataset (when I use gdalinfo or Python gdal.Open()) even though it has been registered for the driver. This is contrary to what I would expect from wiki:rfc11_fastidentify. Is this a bug, or am I misunderstanding the utility of the Identify() function outside of Open() (i.e. a fast way to find the appropriate driver without actually calling Open())? Or perhaps it just hasn't been implemented fully yet?

comment:16 Changed 6 years ago by Even Rouault

Identify() is used by GDALIdentifyDriver(), not by GDALOpen().

comment:17 Changed 6 years ago by Even Rouault

Etienne,

I get the following errors on netcdf.py. According to http://docs.python.org/library/subprocess.html , subprocess.check_output() has only been available since python 2.7, and my default python is 2.6.

  TEST: netcdf_15 ... fail (blowup)
Traceback (most recent call last):
  File "../pymod/gdaltest_python2.py", line 37, in run_func
    result = func()
  File "netcdf.py", line 395, in netcdf_15
    proc = subprocess.check_output( [ 'nc-config', '--has-nc2' ] )
AttributeError: 'module' object has no attribute 'check_output'
  TEST: netcdf_16 ... fail (blowup)
Traceback (most recent call last):
  File "../pymod/gdaltest_python2.py", line 37, in run_func
    result = func()
  File "netcdf.py", line 426, in netcdf_16
    proc = subprocess.check_output( [ 'nc-config', '--has-nc4' ] )
AttributeError: 'module' object has no attribute 'check_output'
  TEST: netcdf_17 ... skip

I've introduced some "abstraction" of subprocess in pymod subdirectory (gdaltest_python2.py and gdaltest_python3.py) with implementations for python 2.X and others with python 3.X. Perhaps you could use them, in particular gdaltest.runexternal().

Other note so that you can have a better idea of my current environment : I have the following packages installed, but they do not ship with nc-config

ii  libnetcdf-dev                        1:3.6.3-1                                       Development kit for NetCDF
ii  libnetcdf4                           1:3.6.3-1                                       An interface for scientific data access to l
ii  netcdf-bin                           1:3.6.3-1                                       Programs for reading and writing NetCDF file

This is Ubuntu 10.04 with ubuntugis-unstable PPA.

comment:18 Changed 6 years ago by etourigny

Added support for Identify() in the autotest in trunk (r23086).

Added support for IdentifyFileType() for NETCDF: file syntax in trunk (r23087).

I just read your last comment now. I will look more into the test and submit a solution for the process call and also for netcdf-3 and nc-config.

I am using Ubuntu 11.04 and the latest libs for everything related to netcdf/hdf; I will set up something similar to your setup.

Sorry for so many commits...

comment:19 Changed 6 years ago by Even Rouault

If my configuration is too painful to support, you could just catch the exception and return 'skip'.
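A minimal sketch of that "catch the exception and return 'skip'" idea, assuming a hypothetical helper named `run_external` (this is not gdaltest.runexternal() and does not reproduce its behaviour). It uses subprocess.Popen rather than subprocess.check_output(), since Popen also exists in python 2.6, and treats a tool that cannot be started as a reason to skip.

```python
import subprocess
import sys


def run_external(cmd):
    """Return the stripped stdout of cmd, or None if it cannot be started."""
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError:
        # Raised when the executable (e.g. nc-config) is not installed.
        return None
    out, _ = proc.communicate()
    return out.decode("ascii", "replace").strip()


def feature_test_status(cmd):
    """Return 'skip' when the tool is missing, otherwise 'success'."""
    if run_external(cmd) is None:
        return "skip"
    return "success"


# The current interpreter stands in for an external tool in this demo.
print(feature_test_status([sys.executable, "-c", "print('yes')"]))
```

In an autotest, a missing helper program then leads to a skipped test instead of a blowup.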

comment:20 Changed 6 years ago by etourigny

Even, please test the attached netcdf.py script.

I use gdaltest.runexternal() and gdaltest.runexternal_out_and_err() instead of subprocess.check_output.

I check for the netcdf version using ncdump which is always available. If library is version 3, assume that netcdf-4 files are not supported.
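The version rule stated above can be illustrated with a short Python sketch. It only parses a version banner that has already been captured as text; how the attached netcdf.py obtains that text from ncdump is not shown in this ticket, and the function names and the sample banner strings in the usage note are assumptions for illustration.

```python
import re


def netcdf_major_version(version_text):
    """Return the major version found after the word 'version', or None."""
    match = re.search(r'version\s+"?(\d+)\.', version_text)
    if match is None:
        return None
    return int(match.group(1))


def assume_netcdf4_support(version_text):
    """Apply the rule from the comment: version 3 means no netcdf-4 files."""
    major = netcdf_major_version(version_text)
    return major is not None and major >= 4
```

For example, a banner such as `netcdf library version "3.6.3" of ...` would lead the netcdf-4 tests to be skipped, while one such as `netcdf library version 4.1.1 of ...` would let them run.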

When I try it with python-2.6, I get the following message, but not with python-2.7:

TEST: netcdf_15 ... ../pymod/gdaltest_python2.py:152: DeprecationWarning: os.popen3 is deprecated. Use the subprocess module.
  (ret_stdin, ret_stdout, ret_stderr) = os.popen3(cmd)
success
TEST: netcdf_16 ... success

Changed 6 years ago by etourigny

Attachment: netcdf.py added

comment:21 Changed 6 years ago by Even Rouault

The attached file works for me (netcdf_15 succeeds, and netcdf_16 and netcdf_17 are properly skipped) with all python versions >= 2.4 (including 3.X). So you can apply.

You can safely ignore the os.popen3 deprecation warning. The python folks decided in 2.6 to add deprecation warnings for APIs that would no longer be available in 3.X, and when they realized that python 2.X would remain in use for quite a long time, they decided to turn them off by default in 2.7.

comment:22 Changed 6 years ago by etourigny

ok Even, merci beaucoup et bon voyage!

I will try to remember to test against different python versions when I implement new features in the autotests.

However, I cannot test against 3.1 and 3.2:

tourigny@supernova: /home/src/gdal-svn/gdal-autotest/gdrivers $ /usr/bin/python3.1 ./netcdf.py
Traceback (most recent call last):
  File "./netcdf.py", line 33, in <module>
    import gdal
  File "/home/softdev/lib/python2.7/site-packages/GDAL-1.9.0-py2.7-linux-x86_64.egg/gdal.py", line 2, in <module>
    from osgeo.gdal import deprecation_warn
  File "/home/softdev/lib/python2.7/site-packages/GDAL-1.9.0-py2.7-linux-x86_64.egg/osgeo/__init__.py", line 21, in <module>
    _gdal = swig_import_helper()
  File "/home/softdev/lib/python2.7/site-packages/GDAL-1.9.0-py2.7-linux-x86_64.egg/osgeo/__init__.py", line 17, in swig_import_helper
    _mod = imp.load_module('_gdal', fp, pathname, description)
ImportError: /home/softdev/lib/python2.7/site-packages/GDAL-1.9.0-py2.7-linux-x86_64.egg/osgeo/_gdal.so: undefined symbol: _Py_ZeroStruct

I understand that I have to re-build gdal for python 3.x. If there is an easy way to do that, please point me to it, because I can't get it working.

comment:23 Changed 6 years ago by Even Rouault

It is just as simple as doing :

cd swig/python
python3.2 setup.py build
python3.2 setup.py install (or if you don't have the rights to install into the python3.2 site-packages directory, you can define PYTHONPATH=/path/to/gdal/swig/python/build/lib.linux-x86_64-3.2  )

comment:24 Changed 6 years ago by etourigny

Thanks for the suggestion. I had always built the python bindings within the main gdal build.

However that didn't work for me (trying to install to /usr/local... even though I set PYTHONPATH).

The following worked fine though, using virtualenv:

cd swig/python
python3.2 /home/softdev/bin/virtualenv.py /home/softdev/
python3.2 setup.py build
python3.2 setup.py install

comment:25 Changed 6 years ago by etourigny

Fixed netcdf-3 and Python-2.6 compatibility in trunk (r23088).

comment:26 Changed 6 years ago by etourigny

Resolution: fixed
Status: changed from assigned to closed

Closing this ticket. Should someone find problems with netcdf2 (64-bit) files, please re-open.
