Update NetCDF driver for greater CF-1 convention compatibility (orientation, datums, map projections)
In August 2011 Etienne Tourigny initiated a discussion on the GDAL mailing list proposing and calling for interest in improvements to GDAL's NetCDF driver (Subject: discussion on improvements to the NetCDF driver and CF-1 convention), and collected ideas at wiki:NetCDF_Improvements.
This page summarises several of the key changes proposed on that wiki page that are being actively developed, with the intention of upgrading the NetCDF driver code in the GDAL trunk. The changes affect the orientation of data, the handling of datum parameters, the writing of coordinate variables, and the handling of different NetCDF file formats; each is described below. A common thread amongst the changes is improved interoperability with the NetCDF Climate and Forecast (CF-1) conventions, a widely used standard for recording projection information in NetCDF raster datasets.
The changes are listed here for comment, noting where applicable when they may affect backwards compatibility with existing data exported by GDAL into NetCDF format.
The common rationale underlying these changes is:
- The NetCDF view of rasters created by GDAL should conform to CF-1 conventions and work for key operations with common NetCDF tools, such as creating a WMS or WCS display from a NetCDF raster (e.g. displaying a WMS using ncWMS, or a WCS using Thredds Data Server).
- Given the first point, the driver should use the CF-1 conventions for storing the needed metadata wherever possible, but where necessary add extra metadata so the file can be read back into GDAL with no loss of information or translation error. A specific example is named datums and EPSG codes, which the current CF convention (CF-1.5) does not include a standard for recording. GDAL should therefore supplement CF-1 with this information by default.
- Users who do not wish to supplement the files with GDAL metadata should explicitly disable this using the appropriate '-co' option to the gdal command used, as specified in the sub-sections below.
The autotests have been updated (autotest/gdrivers/netcdf_cf.py) to check CF-1 compliance of exported datasets, and also that round-trip conversion from another format (e.g. GeoTIFF) to NetCDF and back is successful.
The GDAL driver summary page (http://www.gdal.org/frmt_netcdf.html) will be updated with a description of the new behaviour and options.
Development and Testing
- The gdal-dev mailing list should serve as the central point for initial discussions
- Ticket #4294 should serve for discussion of technical aspects and as the repository for patches and test files.
- Code and history of changes is in a git repository at https://github.com/etiennesky/gdal-netcdf/ (netcdf-dev branch)
- Anyone who wishes to contribute can post patches here or be given write access to the repository.
- Since the modifications above involved a lot of related changes to significant parts of the driver, we managed the actual development process using a Github repository for integration and testing.
- The autotest suite has been modified (in svn trunk) to test the improvements, including using an online cf-checker to test conformance when available.
- We also manually tested that NetCDF files created by the updated driver in each supported CF-1.5 projection could be opened via NetCDF-Java, in addition to the cf-checker used in the new autotests. This has allowed verification of the majority of projections, with a few remaining issues - see link below.
- We will ask users of the driver to test the new version with their important files and report any regressions.
- The code will be committed to svn trunk, and if possible (dependent on feedback) will be back-ported to the 1.8 series (should a 1.8.2 release occur after implementation and sufficient testing).
- The commit will happen as soon as we get approval from members of the gdal-dev list or RFC vote (if needed)
- wiki:NetCDF_ProjectionTestingStatus: page with information on tested capabilities to export from/to CF-1.5 projections.
- wiki:NetCDF_Improvements: community planning page for consolidating NetCDF driver upgrade suggestions that preceded the specific work described here.
1) Changes to metadata items including update to CF-1.5
As is common with many programs which manipulate netcdf data, a new GDAL metadata item (with version information) is added. Also, the netcdf history attribute will be updated, and the Conventions attribute will be set to CF-1.5 to reflect support of the latest CF conventions and projections (See http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/).
    :Conventions = "CF-1.5" ;
    :GDAL = "GDAL 1.9dev, released 2011/01/18" ;
    :history = "Mon Oct 10 21:12:19 2011: GDAL NCDFCreateCopy( trmm4.nc, ... )\n",
Has back-compat impact?: No
2) Change to a default south-up orientation for NetCDF data
Related fix: #4284
New ConfigOptions: GDAL_NETCDF_BOTTOMUP=YES/NO (default YES) for default export and import
New -co options: WRITE_BOTTOMUP=YES/NO (default YES) for default export
Has back-compat impact?: Only for GDAL versions prior to 1.6 (r18152). Since then, a check is made to ensure the driver recognizes bottom-up grids (using the CF tags) and flips the y-axis accordingly. An exception is when there is no georeferencing data, in which case the data may be read in the wrong direction. The export options can be used to force top-down if needed.
When GDAL exports to NetCDF, the data is translated into "south up" orientation, i.e. the rows are inverted relative to the GDAL default. Both import and export assume bottom-up by default, but the import and export options can change this. Import of files created by earlier versions of the driver assumes top-down, for backwards compatibility. When the CF Y axis values provide information on the orientation, they override the defaults.
CF-1's default orientation is south-up, and all known current CF-1-compliant applications expect this.
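The import-side rule described above (trust the CF Y axis values when they exist, otherwise fall back to a configurable default) can be sketched in a few lines of Python. This is only an illustration of the logic, not the driver's code: the function names and the `default_bottom_up` parameter are hypothetical stand-ins for the GDAL_NETCDF_BOTTOMUP setting.

```python
def is_bottom_up(y_values, default=True):
    """Return True when the y axis ascends (south-up storage).

    If the axis has fewer than two values the orientation is unknown,
    so the configured default is used instead (hypothetical stand-in
    for GDAL_NETCDF_BOTTOMUP).
    """
    if y_values is None or len(y_values) < 2:
        return default
    return y_values[0] < y_values[-1]


def to_gdal_order(rows, y_values=None, default_bottom_up=True):
    """Return the raster rows in GDAL's north-up (top-down) order."""
    if is_bottom_up(y_values, default_bottom_up):
        return rows[::-1]  # stored south-up: flip for GDAL
    return list(rows)      # already top-down: keep as stored


stored = [[5, 6], [3, 4], [1, 2]]           # southernmost row first
north_up = to_gdal_order(stored, [205.0, 215.0, 225.0])
```

With ascending y values the rows are flipped; with descending y values, or with no y values and `default_bottom_up=False`, they are returned unchanged.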
3) Changes to improve SRS Interoperability between GDAL and NetCDF CF-1 for Map Projected data
See GDAL ticket: #2893
While GDAL uses the OGC Well Known Text (WKT) format internally to store datasets' Spatial Reference Systems (SRS), the CF-1 conventions use a custom system based on a special 'grid_mapping' NetCDF variable with a set of projection attributes, and also Coordinate variables for both the projected coordinates and latitude and longitude. See CF-1.5 section 5.6 (http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/ch05s06.html) for an explanation and examples, and CF-1.5 Appendix F (http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/apf.html) for a list of supported projections.
Unfortunately, the CF-1 conventions do not currently allow saving of all the data needed for a fully georeferenced SRS, such as named datums and EPSG codes. For background on this, see CF convention trac tickets #9 (https://cf-pcmdi.llnl.gov/trac/ticket/9), #18 (https://cf-pcmdi.llnl.gov/trac/ticket/18), #27 (https://cf-pcmdi.llnl.gov/trac/ticket/27) and the more recent #69 (https://cf-pcmdi.llnl.gov/trac/ticket/69), as well as recent mailing list threads such as http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2011/012937.html.
The proposed changes in this section update the NetCDF driver when exporting to NetCDF to:
- ensure the projection parameters are saved correctly according to their CF-1 definitions;
- save extra information so datasets are CF-1 compliant and can be viewed as gridded data in CF-1 compliant NetCDF applications built on NetCDF Java (X/Y projected coordinate arrays, and optionally full Lat/Lon arrays);
- save as much of the SRS information in CF-1 compliant format as possible (including ellipsoid parameters);
- continue exporting, by default, the extra GDAL information that allows full re-import, while adding a driver creation option so the user can choose not to export it.
The following subsections describe these changes specifically.
3.1) Ensure projection parameters are saved and read from correctly for all projections specified in CF-1.5
A thorough analysis of the CF-1.5 map projections supported (see again CF-1.5 Appendix F, http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/apf.html) showed that the mapping of OGC WKT projection parameter names to CF-1.5 equivalents needed to be done per-projection, whereas the existing code tried to use a single global mapping of WKT attribute names to CF-1 when exporting.
To overcome this issue we have defined per-projection mappings of OGC WKT parameter names to CF-1.5 equivalents using simple data structures, and re-written the NetCDF driver export code (in the CreateCopy() function) to reference these mappings when saving a map-projected raster. This approach is similar to that used for other drivers such as HFA and when exporting to ESRI PS strings.
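To see why a single global mapping cannot work, consider the WKT parameter `central_meridian`: CF-1.5 Appendix F names it `longitude_of_central_meridian` for Transverse Mercator but `longitude_of_projection_origin` for Mercator. The Python sketch below illustrates the per-projection idea with just these two projections. The table name, function name and data layout are hypothetical and are not taken from the driver, which is written in C++.

```python
# Per-projection mapping: WKT projection name ->
#   (CF grid_mapping_name, {WKT parameter name: CF attribute name}).
# Only two projections are shown; the real tables cover many more.
WKT_TO_CF = {
    "Transverse_Mercator": ("transverse_mercator", {
        "latitude_of_origin": "latitude_of_projection_origin",
        "central_meridian": "longitude_of_central_meridian",
        "scale_factor": "scale_factor_at_central_meridian",
        "false_easting": "false_easting",
        "false_northing": "false_northing",
    }),
    "Mercator_1SP": ("mercator", {
        "central_meridian": "longitude_of_projection_origin",
        "scale_factor": "scale_factor_at_projection_origin",
        "false_easting": "false_easting",
        "false_northing": "false_northing",
    }),
}


def wkt_params_to_cf(projection, params):
    """Translate WKT projection parameters into CF grid_mapping attributes."""
    if projection not in WKT_TO_CF:
        raise KeyError(f"no CF mapping defined for {projection!r}")
    grid_mapping_name, names = WKT_TO_CF[projection]
    attrs = {"grid_mapping_name": grid_mapping_name}
    for wkt_name, value in params.items():
        attrs[names[wkt_name]] = value
    return attrs


tm = wkt_params_to_cf("Transverse_Mercator", {"central_meridian": 9.0})
merc = wkt_params_to_cf("Mercator_1SP", {"central_meridian": 9.0})
```

The same input parameter ends up under a different CF attribute name in `tm` and `merc`, which is exactly the case a global mapping gets wrong.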
The updated code's projection capabilities were tested for all projections listed as part of CF-1.5 by translating input files in another format (e.g. GeoTIFF) into NetCDF, and then ensuring the resulting files were CF-compliant and could be opened as gridded data by the NetCDF Java API.
We also tested that these files could have their projections correctly re-imported back into GDAL, and in the process updated the NetCDF driver's projection import code (in SetProjection()) as well.
Note: while our tests show the updated driver has much better coverage of CF-1.5 projection parameter handling than previously, there are still certain unusual CF-1.5 projection cases for which it is proving challenging to confirm an OGC WKT equivalent. We have documented this list at wiki:NetCDF_ProjectionTestingStatus, and welcome advice and testing on the remaining challenges.
3.2) When exporting to NetCDF map projections via GDAL, save projection coordinate variables. Optionally, also save lat and lon mapping variables.
Related fix: #2893
New -co options:
- WRITE_LONLAT=YES/NO/IF_NEEDED (default: YES for geographic, IF_NEEDED for projected)
- TYPE_LONLAT=float/double (default: double for geographic, float for projected)
Has back-compat impact?: No
When exporting rasters to NetCDF, save Coordinate Variables for the projected SRS, as required by the CF-1 convention, normally saved with names "x" and "y" (these are the names for every CF-1 projection except Rotated Pole - see CF-1 App F linked above for details).
We have also added the '-co' driver option listed above, which allows the 2D 'lat' and 'lon' arrays to be written as well. These are specified as part of the CF-1 convention for map projections, but are not actually required by the Unidata-maintained NetCDF Java API or the applications built upon it (e.g. ncWMS, Thredds Data Server).
The WRITE_LONLAT=IF_NEEDED option is useful in the case of exporting a projection which is not supported by the CF-1.5 standard (e.g. sinusoidal), so any CF-1.5 compliant software can locate the points on a lon/lat map.
The CF-1 conventions specify the X and Y projected coordinate variables as necessary to define projected grids, and tests with common NetCDF CF-1 reading software indeed show that rasters with projected grids will not be correctly interpreted without them (the NetCDF-Java API determines this, since it's used by ncWMS, ToolsUI and Thredds Data Server).
While the CF-1 convention also formally requires 2D "lat" and "lon" variables containing the latitude and longitude at each data point, these are redundant for software that can read the encoded projection and convert to geographic coordinates on the fly. They also add very significantly to the file size of the saved raster, since they are 2D arrays saved as double/float. Thus the NetCDF Java API chose not to require their presence, and can correctly project NetCDF files that contain just the X and Y coordinate variables (see http://firstname.lastname@example.org/msg00759.html, a mailing list post by NetCDF Java explaining this decision).
Thus we've chosen to make it optional to write these variables, but provide the 'WRITE_LONLAT' -co option for users who wish to make sure their data is fully self-describing and CF-1 compliant, without relying on on-the-fly reprojection.
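The trade-off between 1D projected coordinate variables and 2D lat/lon arrays can be made concrete with a small Python sketch. It assumes an unrotated GDAL geotransform and pixel-centre coordinates; the function names are hypothetical, and the y values are produced in GDAL's top-down order (a bottom-up export would write them reversed).

```python
def coordinate_variables(geotransform, n_cols, n_rows):
    """Return 1D x and y pixel-centre coordinates for an unrotated grid."""
    origin_x, pixel_w, _, origin_y, _, pixel_h = geotransform
    x = [origin_x + (col + 0.5) * pixel_w for col in range(n_cols)]
    y = [origin_y + (row + 0.5) * pixel_h for row in range(n_rows)]
    return x, y


def lonlat_overhead_bytes(n_cols, n_rows, bytes_per_value=4):
    """Size of the two optional 2D lat/lon arrays (4 bytes = float)."""
    return 2 * n_cols * n_rows * bytes_per_value


def xy_overhead_bytes(n_cols, n_rows, bytes_per_value=8):
    """Size of the two 1D x/y coordinate variables (8 bytes = double)."""
    return (n_cols + n_rows) * bytes_per_value


gt = (500000.0, 30.0, 0.0, 4000000.0, 0.0, -30.0)  # 30 m pixels, north-up
x, y = coordinate_variables(gt, 3, 2)
```

For a 1000 x 1000 raster, the x/y variables cost 16,000 bytes as doubles, while float lat/lon arrays cost 8,000,000 bytes, which is why writing them is left as an option.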
3.3) Continue saving GDAL custom attribute tags in addition to CRS definition, but add option not to save
New -co options: -co WRITE_GDAL_TAGS=YES/NO (default: YES)
Has back-compat impact?: No
In addition to following the CF-1 conventions to save projection metadata, the current NetCDF driver writes metadata in several custom attributes:
- A "spatial_ref" tag containing full OGC WKT string, including datum and EPSG codes if known.
- A geotransform array saved as a string (GeoTransform),
- and individual corner coordinates (NN, SS, WE and WW).
The new implementation does not save the GeoTransform array, except when the GeoTransform cannot be retrieved from the netcdf file (i.e. only for a geographic CRS, in the specific case that lon/lat values are not written).
This has the effect that in some cases the computed pixel width and height (GT(1) and GT(5)) differ by a very small amount from the original values in the GDAL raster file (e.g. for 30 m pixels, less than 10^-11, about a picometer), due to various floating-point operations and half-pixel shifts. We feel that this difference is marginal. However, should a strong argument be presented, we could save the GeoTransform as before and compare it with the values stored in the CF dimension variables to detect conflicts.
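The half-pixel shift mentioned above comes from the fact that coordinate variables hold pixel centres, whereas the GeoTransform origin is the outer corner of the top-left pixel. A minimal Python sketch of recovering a GeoTransform from regularly spaced centres is shown below. It is an assumption about the general approach rather than a transcription of the driver, and it expects the y values in top-down order.

```python
def geotransform_from_centres(x, y):
    """Rebuild an unrotated GDAL geotransform from pixel-centre coordinates.

    x and y must be regularly spaced, with y in top-down order.
    """
    if len(x) < 2 or len(y) < 2:
        raise ValueError("need at least two values per axis")
    # Average spacing over the whole axis; this division is one source
    # of tiny floating-point differences from the original pixel size.
    pixel_w = (x[-1] - x[0]) / (len(x) - 1)
    pixel_h = (y[-1] - y[0]) / (len(y) - 1)
    # Shift by half a pixel from the first centre to the outer corner.
    origin_x = x[0] - pixel_w / 2.0
    origin_y = y[0] - pixel_h / 2.0
    return (origin_x, pixel_w, 0.0, origin_y, 0.0, pixel_h)


gt = geotransform_from_centres(
    [500015.0, 500045.0, 500075.0], [3999985.0, 3999955.0]
)
```

Comparing `gt[1]` and `gt[5]` against the original pixel size with a small tolerance, rather than for exact equality, is the safer check in round-trip tests.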
The corner coordinates are never saved as they are redundant. When importing a CF file, if GeoTransform cannot be computed from dimension variables, GeoTransform and corner coordinates are consulted if available. We propose adding a new WRITE_GDAL_TAGS driver creation option to allow the user to choose not to write these tags. However it would be set to 'YES' by default.
When importing files in NetCDF format, the driver checks that the GDAL-written WKT (including the datum) matches the projection information stored in the CF-compliant attributes. If there is a conflict, the CF-compliant CRS is considered authoritative.
We assume by default that if users export to a NetCDF file from GDAL, then they may wish to import into GDAL again and retain full information. As stated above, the CF-1 convention doesn't allow saving all the projection information that GDAL can manipulate via a WKT, so we recommend keeping current behaviour by default to allow lossless re-import into GDAL. The 'extra' attributes do not interfere with the ability of CF-1 compliant tools such as NetCDF-Java to open and update rasters.
The extra option to not save GDAL tags has been added in recognition of the use-case of exporting a NetCDF file purely for use by CF-1 applications, with no intention of accessing it again via GDAL, in which case the user may wish to have no additional metadata present.
The GeoTransform array is not written to file unless there is no other means to recover that information (i.e. in the case when lon/lat values are not written to file for a geographic CRS).
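As a usage illustration of the CF-only export case, the Python snippet below assembles a gdal_translate command line that disables the GDAL tags and requests lat/lon arrays. The creation option names are the ones proposed on this page; the helper function and the file names are hypothetical, and the snippet only builds the command without running it.

```python
import shlex


def netcdf_export_command(src, dst, creation_options):
    """Build a gdal_translate argument list for a NetCDF export."""
    args = ["gdal_translate", "-of", "netCDF"]
    for name, value in creation_options.items():
        args += ["-co", f"{name}={value}"]  # one -co flag per option
    args += [src, dst]
    return args


cmd = netcdf_export_command(
    "input.tif",   # hypothetical source raster
    "output.nc",   # hypothetical destination file
    {"WRITE_GDAL_TAGS": "NO", "WRITE_LONLAT": "YES"},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a system where a GDAL build with these options is installed.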
3.4) Add saving of reference Ellipsoid parameters in CF-1 compliant format used by dataset
New -co options: None, though related to the proposed WRITE_GDAL_TAGS option described in 3.3.
Has back-compat impact?: No
When exporting to NetCDF, the new driver will utilise the CF-1 attributes that allow specification of the reference ellipsoid used by the dataset's CRS (e.g. 'semi_major_axis', 'inverse_flattening', 'longitude_of_prime_meridian').
However, on importing data the driver will continue to use the full GDAL WKT specification in the 'spatial_ref' attribute where present (as described in 3.3). This is because the CF-1.5 conventions still don't include provision for saving full datum information, such as the name and authority of the datum, which is needed for precise transformations. The CF-1 ellipsoid attributes will only be used as a fallback when the full WKT isn't present.
An open question: if the CF metadata includes datum information (but no named datum), should we try to match the spheroid to well known GeogCS (such as WGS84)? This can cause datum errors in some cases (one spheroid can be used in many named datums), as discussed in this thread http://lists.osgeo.org/pipermail/gdal-dev/2011-October/030374.html.
Currently the CF-1 convention's only support for datums is the specification of a reference ellipsoid, using parameters such as 'semi_major_axis' and 'inverse_flattening'. While these attributes alone do not fully specify the datum (which requires an EPSG code so control points can be referenced), they at least allow fairly accurate map projection of the created NetCDF file using tools such as ncWMS.
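The ellipsoid attributes named above are enough to reconstruct the shape of the spheroid, though not the datum it belongs to. The Python sketch below builds the CF attribute set and derives the semi-minor axis from the standard relation b = a(1 - 1/inverse_flattening). The function names are hypothetical, and treating an inverse flattening of 0 as a sphere follows the WKT convention.

```python
def semi_minor_axis(semi_major, inverse_flattening):
    """Derive the semi-minor axis; inverse_flattening == 0 means a sphere."""
    if inverse_flattening == 0:
        return semi_major
    return semi_major * (1.0 - 1.0 / inverse_flattening)


def cf_ellipsoid_attributes(semi_major, inverse_flattening,
                            prime_meridian=0.0):
    """Return the CF-1 grid_mapping attributes describing the ellipsoid."""
    return {
        "semi_major_axis": semi_major,
        "inverse_flattening": inverse_flattening,
        "longitude_of_prime_meridian": prime_meridian,
    }


# WGS 84 spheroid parameters.
attrs = cf_ellipsoid_attributes(6378137.0, 298.257223563)
b = semi_minor_axis(attrs["semi_major_axis"], attrs["inverse_flattening"])
```

Note that nothing in `attrs` says "WGS 84": the same values would be written for any datum that uses this spheroid, which is the ambiguity behind the open question above.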
In the longer-term it would be beneficial to work with the CF-1 community to allow saving of full datum information in a format readable by CF-1, and we've made preliminary enquiries here and would welcome cooperation with the CF-1 community, but updating the CF-1 standard is outside the scope of this current GDAL NetCDF work.
4) Add support for different netcdf file types and compression
Related fix: N/A
New -co options:
- COMPRESS=NONE/DEFLATE/PACKED (default: NONE)
- ZLEVEL=[1-9] (default: 1) note: higher ZLEVEL values do not decrease size significantly
Has back-compat impact?: The NC2 and NC4 file types, as well as DEFLATE and PACKED compression, may not be supported by older netcdf and GDAL versions. The NC4 file type and DEFLATE compression require netcdf-4 and HDF5, and DEFLATE additionally requires zlib. Support for PACKED compression also has to be added to the driver for import. However, these are all optional, so the default behaviour causes no backwards-compatibility issues.
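This page does not spell out how PACKED compression works. Assuming it refers to the CF packing convention, in which floating-point values are stored as small integers together with 'scale_factor' and 'add_offset' attributes, the idea can be sketched in Python as follows. The function names and the choice of 16-bit signed integers are assumptions made for illustration only.

```python
def pack_parameters(vmin, vmax, n_bits=16):
    """Choose scale_factor and add_offset to span a signed n_bits range."""
    scale = (vmax - vmin) / (2 ** n_bits - 1)
    if scale == 0:
        scale = 1.0  # constant field: any non-zero scale works
    offset = vmin + 2 ** (n_bits - 1) * scale
    return scale, offset


def pack(values, scale, offset):
    """Convert floats to packed integers (lossy)."""
    return [int(round((v - offset) / scale)) for v in values]


def unpack(packed, scale, offset):
    """Apply the CF rule: unpacked = packed * scale_factor + add_offset."""
    return [p * scale + offset for p in packed]


data = [0.0, 12.5, 100.0]
scale_factor, add_offset = pack_parameters(min(data), max(data))
stored = pack(data, scale_factor, add_offset)
restored = unpack(stored, scale_factor, add_offset)
```

Packing is lossy: each restored value is within about half a `scale_factor` of the original, so this suits data whose precision is coarser than the packing step. The reader must know the two attributes to unpack, which is why import support is needed too.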