
Version 42 (modified by etourigny, 13 years ago)

Update NetCDF driver for greater CF-1 convention compatibility (orientation, datums, map projections)

Status: Development

Summary

In August 2011 Etienne Tourigny initiated a discussion on the GDAL mailing list proposing and calling for interest in improvements to GDAL's NetCDF driver (Subject: discussion on improvements to the NetCDF driver and CF-1 convention), and collected ideas at wiki:NetCDF_Improvements.

This page summarises several of the key changes proposed on that wiki page that are being actively developed, with the intention of upgrading the NetCDF driver code in the GDAL trunk. The changes affect the orientation of data, the handling of datum parameters, the writing of coordinate variables, and the handling of different NetCDF file formats; each is described below. A common thread is improving interoperability with the NetCDF Climate and Forecast (CF-1) conventions, a widely used format for recording projection information in raster datasets.

The changes are listed here for comment, noting where applicable whether they may affect backwards compatibility with existing data exported by GDAL into NetCDF format.

Overall Rationale

The common rationale underlying these changes is:

  1. The NetCDF view of rasters created by GDAL should conform to CF-1 conventions and work for key operations with common NetCDF tools, such as creating a WMS or WCS display from a NetCDF raster (e.g. displaying a WMS using ncWMS, or a WCS using Thredds Data Server).

  2. Given (1), the driver should use the CF-1 conventions for storing needed metadata wherever possible, but where necessary add extra metadata so the file can be read back into GDAL with no loss of information or translation error. A specific example is named datums and EPSG codes, which the current CF convention (CF-1.5) provides no standard way of recording. GDAL should therefore supplement CF-1 with this information by default.

  • If the user doesn't wish to supplement the files with GDAL metadata, they should specifically disable this using an appropriate '-co' option to the gdal command used.

Planned Changes

1) Changes to metadata items including update to CF-1.5

As is common with many programs which manipulate netcdf data, a new GDAL metadata item (with version information) is added. Also, the netcdf history attribute will be updated, and the Conventions attribute will be set to CF-1.5 to reflect support of the latest CF conventions and projections (See http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/).

		:Conventions = "CF-1.5" ;
		:GDAL = "GDAL 1.9dev, released 2011/01/18" ;
		:history = "Mon Oct 10 21:12:19 2011: GDAL NCDFCreateCopy( trmm4.nc, ... )\n",

Has back-compat impact?: No

2) Change to a default south-up orientation for NetCDF data

Related fix: #4284

New ConfigOptions: GDAL_NETCDF_BOTTOMUP=YES/NO (default YES) for default export and import

New -co options: WRITE_BOTTOMUP=YES/NO (default YES) for default export

Has back-compat impact?: Only for GDAL versions prior to 1.6 (r18152). Since then, the driver checks for bottom-up grids (using the CF tags) and flips the y-axis accordingly. The exception is a file with no georeferencing data, which may be read in the wrong direction. The export options can be used to force top-down orientation if needed.

Change:

When GDAL exports to NetCDF, the data is written in "south up" (bottom-up) orientation, i.e. inverted relative to the GDAL default. Import and export assume bottom-up by default, and the import and export options listed above can change that default. Import of files created by earlier versions of the driver assumes top-down for backwards compatibility. When the CF Y axis values give information on the orientation, that information overrides the defaults.

Rationale:

CF-1's default orientation is south-up, and all known current CF-1-compliant applications expect this.
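The inversion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the driver's code: `to_bottom_up` and `is_bottom_up` are hypothetical helper names, the raster is a list of lines in GDAL order (northernmost first), and the geotransform is the usual GDAL 6-tuple with a negative pixel height.

```python
def to_bottom_up(rows, geotransform):
    """Return (flipped_rows, y_centres) for a north-up raster.

    rows: raster lines in GDAL order (first line = northernmost).
    geotransform: GDAL-style tuple
        (x_origin, pixel_width, 0, y_origin, 0, pixel_height),
    where pixel_height is negative for north-up rasters.
    """
    _, _, _, y_origin, _, pixel_height = geotransform
    # Pixel-centre y coordinate of each line, in GDAL (top-down) order.
    y_top_down = [y_origin + (i + 0.5) * pixel_height for i in range(len(rows))]
    # Reverse both so that index 0 is the southernmost line.
    return rows[::-1], y_top_down[::-1]


def is_bottom_up(y_values):
    """Guess orientation from Y axis values: increasing y means bottom-up."""
    return y_values[-1] > y_values[0]


rows = [[1, 2], [3, 4], [5, 6]]
gt = (100.0, 10.0, 0.0, 50.0, 0.0, -10.0)
flipped, y = to_bottom_up(rows, gt)
print(flipped)          # [[5, 6], [3, 4], [1, 2]]
print(y)                # [25.0, 35.0, 45.0]
print(is_bottom_up(y))  # True
```

Checking the Y axis values in this way is one plausible reading of how orientation information from the file could override the configured default on import.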

3) Changes to improve SRS Interoperability between GDAL and NetCDF CF-1 for Map Projected data

See GDAL ticket: #2893

While GDAL uses the OGC Well Known Text (WKT) format internally to store datasets' Spatial Reference Systems (SRS), the CF-1 conventions use a custom system based on a special 'grid_mapping' NetCDF variable with a set of projection attributes, and also Coordinate variables for both the projected coordinates and latitude and longitude. See http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/ch05s06.html CF1.5 ch5.6 for an explanation and examples, and http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/apf.html CF1.5 App F for a list of supported projections.

Unfortunately, the CF-1 conventions do not currently allow saving of all data needed for a fully georeferenced SRS, such as Named datums and EPSG codes. For background on this, see CF convention trac tickets https://cf-pcmdi.llnl.gov/trac/ticket/9 #9, https://cf-pcmdi.llnl.gov/trac/ticket/18 #18, https://cf-pcmdi.llnl.gov/trac/ticket/27 #27 and more recent https://cf-pcmdi.llnl.gov/trac/ticket/69 #69.

The proposed changes in this section update the NetCDF driver when exporting to NetCDF to:

  • save extra information so datasets are CF-1 compliant and can be viewed as gridded data in CF-1 compliant NetCDF applications built on NetCDF Java (X/Y projected coordinate arrays, and optionally full Lat/Lon arrays);
  • save as much of the SRS information in CF-1 compliant format as possible (including ellipsoid parameters);
  • continue exporting, by default, the extra GDAL information that allows full re-import, but add a creation option to the driver so the user can choose not to export it.

The following subsections describe these changes specifically.

3.1) Proj. SRS Handling: When exporting to NetCDF map projections via GDAL, save projection coordinate variables. Optionally, also save lat and lon mapping variables.

Related fix: #2893
New -co options:

  • WRITE_LONLAT=YES/NO/IF_NEEDED (default: YES for geographic, IF_NEEDED for projected)
  • TYPE_LONLAT=float/double (default: double for geographic, float for projected)

Has back-compat impact?: No

Change:

When exporting rasters to NetCDF, save Coordinate Variables for the projected SRS, as required by the CF-1 convention, normally saved with names "x" and "y" (these are the names for every CF-1 projection except Rotated Pole - see CF-1 App F linked above for details).
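As a rough sketch of what the "x" and "y" coordinate variables contain, the pixel-centre coordinates can be derived from a GDAL geotransform. The function name `projected_coordinate_variables` and the dictionary layout are illustrative assumptions; the `standard_name` values are the CF standard names for projected coordinates, and the unit "m" is assumed.

```python
def projected_coordinate_variables(geotransform, width, height, units="m"):
    """Build 1D x/y coordinate variables (pixel centres) for a projected grid.

    geotransform: GDAL-style tuple
        (x_origin, pixel_width, rot_x, y_origin, rot_y, pixel_height).
    Values are returned in GDAL line order; a bottom-up writer would
    reverse the y values before storing them.
    """
    x0, dx, rot_x, y0, rot_y, dy = geotransform
    if rot_x != 0 or rot_y != 0:
        # A rotated grid cannot be described by two 1D axes.
        raise ValueError("1D coordinate variables need an unrotated grid")
    x = [x0 + (col + 0.5) * dx for col in range(width)]
    y = [y0 + (row + 0.5) * dy for row in range(height)]
    return {
        "x": {"standard_name": "projection_x_coordinate", "units": units, "values": x},
        "y": {"standard_name": "projection_y_coordinate", "units": units, "values": y},
    }


coords = projected_coordinate_variables((500000.0, 30.0, 0.0, 4100000.0, 0.0, -30.0), 3, 2)
print(coords["x"]["values"])  # [500015.0, 500045.0, 500075.0]
print(coords["y"]["values"])  # [4099985.0, 4099955.0]
```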

We have also added the '-co' driver options listed above, which allow the user to request that 2D 'lat' and 'lon' arrays also be written. These arrays are specified as part of the CF-1 convention for map projections, but are not actually required by the Unidata-maintained NetCDF Java API and the applications built upon it (e.g. ncWMS, Thredds Data Server).

The WRITE_LONLAT=IF_NEEDED option is useful in the case of exporting a projection which is not supported by the CF-1.5 standard (e.g. sinusoidal), so any CF-1.5 compliant software can locate the points on a lon/lat map.
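The defaults for WRITE_LONLAT can be read as a small decision rule. The sketch below is an interpretation, not the driver's logic: `should_write_lonlat` is a hypothetical name, and treating IF_NEEDED as "write only when the projection has no CF-1.5 grid mapping" is an assumption drawn from the sinusoidal example above.

```python
def should_write_lonlat(option, is_geographic, projection_in_cf):
    """Decide whether lon/lat arrays would be written.

    option: "YES", "NO", "IF_NEEDED", or None to use the documented default
            (YES for geographic, IF_NEEDED for projected).
    projection_in_cf: True if CF-1.5 defines a grid mapping for the projection.
    """
    if option is None:
        option = "YES" if is_geographic else "IF_NEEDED"
    option = option.upper()
    if option == "YES":
        return True
    if option == "NO":
        return False
    if option == "IF_NEEDED":
        # Assumption: lon/lat are "needed" only when CF cannot describe
        # the projection, so readers can still locate each point.
        return not projection_in_cf
    raise ValueError("WRITE_LONLAT must be YES, NO or IF_NEEDED")


print(should_write_lonlat(None, is_geographic=True, projection_in_cf=True))    # True
print(should_write_lonlat(None, is_geographic=False, projection_in_cf=True))   # False
print(should_write_lonlat(None, is_geographic=False, projection_in_cf=False))  # True
```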

Rationale:

The CF-1 conventions specify the X and Y projected coordinate variables as necessary to define projected grids, and tests with common NetCDF CF-1 reading software indeed show that rasters with projected grids will not be correctly interpreted without them (the NetCDF-Java API determines this, since it's used by ncWMS, ToolsUI and Thredds Data Server).

While the CF-1 convention also formally requires 2D "lat" and "lon" Variables containing the latitude and longitude at each data point, these are redundant for software that can read the encoded projection and convert to geographic coordinates on the fly. They also add very significantly to the file size of the saved raster, since they are 2D arrays saved as double/float. Thus the NetCDF Java API chose not to require their presence and can correctly project NetCDF files that contain just the X and Y coordinate variables (see http://www.mail-archive.com/cf-metadata@cgd.ucar.edu/msg00759.html this mailing list post by NetCDF Java explaining this decision).

Thus we've chosen to make it optional to write these Variables - but provide the 'WRITE_LONLAT' co option for users who wish to make sure their data is fully self-describing and CF-1 compliant, without relying on on-the-fly reprojection.
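To put the file-size argument in numbers, the following back-of-the-envelope calculation compares the uncompressed storage of the 1D x/y coordinate variables with that of the 2D lat/lon arrays. The function name `coordinate_bytes` is illustrative, and storing x/y as 8-byte doubles is an assumption.

```python
def coordinate_bytes(width, height, lonlat_type="float"):
    """Return (bytes for 1D x/y, bytes for 2D lat/lon), uncompressed."""
    bytes_per_value = {"float": 4, "double": 8}[lonlat_type]
    # x has `width` values and y has `height` values; assumed stored as doubles.
    xy_1d = (width + height) * 8
    # lat and lon each hold one value per pixel.
    lonlat_2d = 2 * width * height * bytes_per_value
    return xy_1d, lonlat_2d


xy, lonlat = coordinate_bytes(10000, 10000, "float")
print(xy)      # 160000
print(lonlat)  # 800000000
```

For a 10000 x 10000 raster, the 1D variables take about 160 kB, while float lat/lon arrays take about 800 MB before any compression.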

3.2) Proj. SRS Handling: Continue saving GDAL custom attribute tags in addition to CRS definition, but add option not to save

New -co options: -co WRITE_GDAL_TAGS=YES/NO (default: YES)
Has back-compat impact?: No

Change:

In addition to following the CF-1 conventions to save projection metadata, the current NetCDF driver writes metadata in several custom attributes:

  • A "spatial_ref" tag containing full OGC WKT string, including datum and EPSG codes if known.
  • A geotransform array saved as a string (GeoTransform),
  • and individual corner coordinates (NN, SS, WE and WW).

The new implementation saves the GeoTransform array only for a geographic CRS, in the specific case that lon/lat values are not written to the file. The corner coordinates are never saved as they are redundant.

We propose adding a new WRITE_GDAL_TAGS driver creation option to allow the user to choose not to write these tags. However it would be set to 'YES' by default.

When importing files in NetCDF format, the driver checks the GDAL-written WKT (including Datum) matches the projection information stored in the CF-compliant attributes. If there is a conflict, the CF-compliant CRS will be considered as authoritative.

Rationale:

We assume by default that if users export to a NetCDF file from GDAL, then they may wish to import into GDAL again and retain full information. As stated above, the CF-1 convention doesn't allow saving all the projection information that GDAL can manipulate via a WKT, so we recommend keeping current behaviour by default to allow lossless re-import into GDAL. The 'extra' attributes do not interfere with the ability of CF-1 compliant tools such as NetCDF-Java to open and update rasters.

The extra option to not save GDAL tags has been added in recognition of the use case where a NetCDF file is exported purely for use by CF-1 applications, with no intention of accessing it again via GDAL. In that case the user may wish to have no additional metadata present.

The GeoTransform array is not written to file unless there is no other means to recover that information (i.e. when lon/lat values are not written to file for a geographic CRS).

3.3) Proj. SRS Handling: Add saving of reference Ellipsoid parameters in CF-1 compliant format used by dataset

New -co options: None, though this is related to the proposed WRITE_GDAL_TAGS option described in 3.2.

Has back-compat impact?: No

Change:

When exporting to NetCDF, the new driver will utilise the CF-1 attributes that allow specification of the reference ellipsoid used by the dataset's CRS (e.g. 'semi_major_axis', 'inverse_flattening', 'longitude_of_prime_meridian').

However, on importing data the driver will continue to use the full GDAL WKT specification in the 'spatial_ref' attribute where present (as described in 3.2). This is because the CF-1.5 conventions still don't include provision for saving full Datum information, such as the name and authority of the datum, which is needed for precise transformations. The CF-1 ellipsoid attributes will only be used as a fallback when the full WKT isn't present.
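A minimal sketch of the export and import behaviour described above, using plain dictionaries in place of NetCDF attributes. The function names are hypothetical; the attribute names are those quoted in this section plus the 'spatial_ref' tag from 3.2, and the semi-minor axis is derived with the standard relation b = a(1 - 1/(1/f)).

```python
def semi_minor_axis(semi_major, inverse_flattening):
    """Derive the semi-minor axis b from a and 1/f: b = a * (1 - f)."""
    return semi_major * (1.0 - 1.0 / inverse_flattening)


def cf_ellipsoid_attributes(semi_major, inverse_flattening, prime_meridian=0.0):
    """Attributes a grid_mapping variable could carry for the ellipsoid."""
    return {
        "semi_major_axis": semi_major,
        "inverse_flattening": inverse_flattening,
        "longitude_of_prime_meridian": prime_meridian,
    }


def srs_source_on_import(attrs):
    """Prefer the full WKT; fall back to the CF ellipsoid attributes."""
    if attrs.get("spatial_ref"):
        return "spatial_ref"
    if "semi_major_axis" in attrs:
        return "cf_ellipsoid"
    return "none"


wgs84 = cf_ellipsoid_attributes(6378137.0, 298.257223563)
print(round(semi_minor_axis(6378137.0, 298.257223563), 3))  # 6356752.314
print(srs_source_on_import(wgs84))                           # cf_ellipsoid
print(srs_source_on_import({**wgs84, "spatial_ref": "GEOGCS[...]"}))  # spatial_ref
```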

An open question: if the CF metadata includes ellipsoid parameters (but no named datum), should we try to match the spheroid to a well-known GeogCS (such as WGS84)? This can cause datum errors in some cases (one spheroid can be used by many named datums), as discussed in this thread: http://lists.osgeo.org/pipermail/gdal-dev/2011-October/030374.html.
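The ambiguity behind this question can be illustrated with a small lookup. The table below is a deliberately tiny, hand-picked sample (not GDAL's datum database), and `candidate_datums` is a hypothetical helper: the GRS 1980 ellipsoid parameters match several named datums, so the ellipsoid alone does not identify one.

```python
# (semi_major_axis, inverse_flattening) for a few named datums.
# NAD83, ETRS89 and GDA94 all use the GRS 1980 ellipsoid.
KNOWN_DATUMS = {
    "WGS84": (6378137.0, 298.257223563),
    "NAD83": (6378137.0, 298.257222101),
    "ETRS89": (6378137.0, 298.257222101),
    "GDA94": (6378137.0, 298.257222101),
}


def candidate_datums(semi_major, inverse_flattening, tol=1e-7):
    """Return every known datum whose ellipsoid matches the given parameters."""
    return sorted(
        name
        for name, (a, inv_f) in KNOWN_DATUMS.items()
        if abs(a - semi_major) <= tol and abs(inv_f - inverse_flattening) <= tol
    )


print(candidate_datums(6378137.0, 298.257223563))  # ['WGS84']
print(candidate_datums(6378137.0, 298.257222101))  # ['ETRS89', 'GDA94', 'NAD83']
```

With a sample this small the WGS84 match looks unique, but only because the table is short; a fuller database would show the same kind of ambiguity for other ellipsoids.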

Rationale:

Currently, the CF-1 convention's only support for datums is the specification of a reference ellipsoid using parameters such as 'semi_major_axis' and 'inverse_flattening'. While these attributes alone do not fully specify the datum (which requires an EPSG code so control points can be referenced), they at least allow fairly accurate map projection of the created NetCDF file using tools such as ncWMS.

In the longer-term it would be beneficial to work with the CF-1 community to allow saving of full datum information in a format readable by CF-1, and we've made preliminary enquiries here and would welcome cooperation with the CF-1 community, but updating the CF-1 standard is outside the scope of this current GDAL NetCDF work.

4) Add support for different netcdf file types and compression

Related fix: N/A
New -co options:

  • FILETYPE=NC/NC2/NC4
  • COMPRESS=NONE/DEFLATE/PACKED (default: NONE)
  • ZLEVEL=[1-9] (default: 1) note: higher ZLEVEL values do not decrease size significantly

Has back-compat impact?: NC2 and NC4 filetypes, as well as DEFLATE and PACKED compression, may not be supported by older netcdf and GDAL versions. The NC4 filetype and DEFLATE compression require netcdf-4 and HDF5, and DEFLATE also requires zlib. Support for PACKED compression also has to be added to the driver for import. However, these options are all optional, so the default behaviour causes no backwards-compatibility issues.

Change: TBD
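As an illustration of how the creation options listed in this section might be passed on the command line, the sketch below builds a `gdal_translate` argument list and validates the values against the lists given above. The option names and allowed values are taken from this proposal and may differ in a released driver; `netcdf_translate_command` is a hypothetical helper, and the file names are placeholders.

```python
# Allowed values as listed in this section of the proposal.
ALLOWED = {
    "FILETYPE": {"NC", "NC2", "NC4"},
    "COMPRESS": {"NONE", "DEFLATE", "PACKED"},
    "ZLEVEL": {str(level) for level in range(1, 10)},
}


def netcdf_translate_command(src, dst, **creation_options):
    """Build an argument list for exporting `src` to NetCDF with -co options."""
    cmd = ["gdal_translate", "-of", "netCDF"]
    for key in sorted(creation_options):
        value = str(creation_options[key]).upper()
        if key not in ALLOWED:
            raise ValueError("unknown creation option: " + key)
        if value not in ALLOWED[key]:
            raise ValueError("invalid value for " + key + ": " + value)
        cmd += ["-co", key + "=" + value]
    return cmd + [src, dst]


cmd = netcdf_translate_command(
    "in.tif", "out.nc", FILETYPE="NC4", COMPRESS="DEFLATE", ZLEVEL=6
)
print(" ".join(cmd))
# gdal_translate -of netCDF -co COMPRESS=DEFLATE -co FILETYPE=NC4 -co ZLEVEL=6 in.tif out.nc
```

The resulting list could be handed to `subprocess.run` on a system where a GDAL build with these options is installed.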

Compatibility Issues

As specified above in the sub-sections.

Test Suite

The autotests have been updated (autotest/gdrivers/netcdf_cf.py) to check CF-1 compliance of exported datasets, and also that round-trip conversion from another format (eg GeoTiff) to NetCDF and back is successful.

Documentation

The GDAL driver summary page (http://www.gdal.org/frmt_netcdf.html) will be updated with a description of the new behaviour and options.

Release

  • The code will be committed to the trunk, and if possible will be back-ported to the 1.8 series (should a 1.8.2 release occur after implementation and sufficient testing).
    • Since the modifications above involved a lot of related changes to significant parts of the driver, we managed the actual development process using a Github repository for integration and testing.
    • The address of the Github repo, if you wish to view the code or do testing, is at https://github.com/etiennesky/gdal-netcdf/
  • We will ask users of the driver to test the new version with their important files and report any regressions.

Addenda

  • We also intend to improve the interface for writing to NetCDF 4 (but will handle this separately; it may not need an RFC).
