Opened 7 years ago

Closed 6 years ago

#6997 closed defect (fixed)

Reading NetCDF/HDF5 files directly from S3

Reported by: lachlan Owned by: warmerdam
Priority: normal Milestone:
Component: GDAL_Raster Version: 2.2.1
Severity: normal Keywords: VSIS3 AWS S3
Cc:

Description

I've been looking into GDAL's AWS S3 support. The intention is to use a statically compiled version of GDAL in a Lambda function to process NetCDF/HDF5 files. Unfortunately, I've run into some issues.

Testing with gdalinfo

$ gdalinfo /vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23_2.nc
HDF5-DIAG: Error detected in HDF5 (1.8.19) thread 0:
  #000: H5F.c line 602 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 990 in H5F_open(): unable to open file: time = Thu Aug 10 07:12:22 2017
, name = '/vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23_2.nc', tent_flags = 0
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 991 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: H5FDsec2.c line 337 in H5FD_sec2_open(): unable to open file: name = '/vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23_2.nc', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
    major: File accessibilty
    minor: Unable to open file
ERROR 4: `/vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23_2.nc' not recognized as a supported file format.
gdalinfo failed - unable to open '/vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23_2.nc'.

However, if I download the file locally, it works.

$ aws s3 cp s3://lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23.nc .
$ gdalinfo RoadNetwork_tile_x9_y23.nc
Driver: netCDF/Network Common Data Format
Files: RoadNetwork_tile_x9_y23.nc
Size is 2000, 2000
Coordinate System is:
PROJCS["GDA94 / NSW Lambert",
. . .

Also, if I translate the file to a GeoTIFF and upload it to S3, it works.

$ gdal_translate RoadNetwork_tile_x9_y23.nc RoadNetwork_tile_x9_y23.tif
$ aws s3 cp RoadNetwork_tile_x9_y23.tif s3://lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23.tif
$ gdalinfo /vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23.tif
Driver: GTiff/GeoTIFF
Files: /vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23.tif
Size is 2000, 2000
Coordinate System is:
PROJCS["GDA94 / NSW Lambert",
. . .

I'm trying to understand whether this is a bug, an issue with the way we've compiled GDAL, or simply that GDAL's S3 support is currently limited to reading GeoTIFF files. Thanks.

Attachments (1)

RoadNetwork_tile_x7_y3.nc (723.3 KB) - added by lachlan 7 years ago.
Sample NetCDF/HDF5 file


Change History (6)

by lachlan, 7 years ago

Attachment: RoadNetwork_tile_x7_y3.nc added

Sample NetCDF/HDF5 file

comment:1 by Even Rouault, 7 years ago

This is a limitation of the GDAL HDF5/netCDF drivers, or more exactly of libhdf5 / libnetcdf themselves. Last time I looked (and I guess this is still valid), those libraries perform the regular file I/O operations (open, seek, read, etc.) directly and offer no way to plug in alternate I/O routines, which are necessary to make virtual file systems like /vsis3/ work.

That said, a few years ago I created the https://trac.osgeo.org/gdal/browser/trunk/gdal/port/vsipreload.cpp shared library, which can be LD_PRELOAD'ed and offers a workaround. I haven't tried it recently with netCDF and it is rather rough around the edges, but you can try following the instructions at the top of the file.
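
For readers unfamiliar with LD_PRELOAD, here is a minimal sketch of how a preload shim such as vsipreload is typically applied to a single command. The wrapper name `run_with_vsipreload` and the library path are hypothetical and not part of GDAL. The build step is deliberately left out: the header comment of vsipreload.cpp is the authoritative source for it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of GDAL): run one command with the
# vsipreload shim preloaded.
# Usage: run_with_vsipreload /path/to/vsipreload.so gdalinfo /vsis3/bucket/key.nc
run_with_vsipreload() {
    lib=$1
    shift
    # Refuse to run if the shim has not been built yet.
    if [ ! -f "$lib" ]; then
        echo "vsipreload library not found: $lib" >&2
        return 1
    fi
    # env sets LD_PRELOAD for this one command only, not for the calling shell.
    env LD_PRELOAD="$lib" "$@"
}
```

Note that LD_PRELOAD only affects dynamically linked executables, so this approach may not apply to a fully static build.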

comment:2 by lachlan, 7 years ago

Thanks Even. I did have a look through the HDF5 driver and it seemed that this was the case. I will have a look at your workaround when I get back to the office. There are also some simpler approaches, such as using TIFF files or copying the .nc file locally for processing.

I'm happy for this ticket to be closed.
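
The "copy locally" approach can be sketched as a small shell helper, reusing the `aws s3 cp` and `gdalinfo` commands from the description. The function names `vsis3_to_s3_uri` and `gdalinfo_via_local_copy` are hypothetical, and the sketch assumes the AWS CLI is configured and `mktemp -d` is available.

```shell
#!/bin/sh
# Convert a /vsis3/ path into the s3:// URI expected by the AWS CLI.
vsis3_to_s3_uri() {
    case $1 in
        /vsis3/*) printf 's3://%s\n' "${1#/vsis3/}" ;;
        *) echo "not a /vsis3/ path: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: download the object to a temporary directory,
# run gdalinfo on the local copy, then remove the copy.
gdalinfo_via_local_copy() {
    uri=$(vsis3_to_s3_uri "$1") || return 1
    tmpdir=$(mktemp -d) || return 1
    local_file=$tmpdir/$(basename "$1")
    aws s3 cp "$uri" "$local_file" && gdalinfo "$local_file"
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

For example, `gdalinfo_via_local_copy /vsis3/lambda-geospatial-test-data/road-network/RoadNetwork_tile_x9_y23.nc` would download the file and run gdalinfo on the local copy, which is the case reported as working in the description.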

comment:5 by Even Rouault, 6 years ago

Resolution: fixed
Status: new → closed