Opened 5 years ago

Last modified 3 years ago

#4513 reopened enhancement

netcdf driver does not support irregular grids / GDAL GEOLOCATION arrays

Reported by: etourigny Owned by: etourigny
Priority: normal Milestone:
Component: default Version: svn-trunk
Severity: normal Keywords: netcdf, geolocation
Cc: bronaugh, Kyle Shannon

Description

The CF standard allows for irregular 2D grids with "Two-dimensional coordinate variables" specified in the "coordinates" attribute (see CF sections 5.2 and 5.6).

CF projections currently unsupported by GDAL (#4285) can be supported via this mechanism, and be represented in GDAL via GEOLOCATION arrays. Current code does not support reading and writing GEOLOCATION arrays in netcdf files.

Attaching examples netcdf files in projected SRS with lat/lon values in "coordinates" attribute.

Attachments (1)

orog_RCM3.nc (276.3 KB) - added by etourigny 5 years ago.

Download all attachments as: .zip

Change History (25)

Changed 5 years ago by etourigny

Attachment: orog_RCM3.nc added

comment:1 Changed 5 years ago by etourigny

Owner: changed from warmerdam to etourigny
Status: newassigned

In trunk (r23971) : support 2D GEOLOCATION arrays when a projected variable has coordinates attribute and supporting lon/at arrays .

This supports the coordinates "lon lat" attribute as described in CF section 5.6, only when longitude and latitude are present (as per cf-4.1 and 4.2) and the variable has a supporting grid_mapping (projected SRS) definition. If the variables are present, they are used as GEOLOCATION arrays (see example below).

The WRITE_LONLAT handling has changed slightly: default is NO, except if has geolocation ; with IF_NEEDED : write if has geoloc or is not CF projection. The latter is to ensure that if GDAL does not support the projection by mapping the CF attributes to OGC WKT, then the GEOLOCATION arrays are written to the CF file (and possibly read from input dataset).

If GEOLOCATION values are present in an input CF file, these will be used in export, instead of calculating with transform as previously.

Example (attached):

$ ncdump -h orog_RCM3.nc
netcdf orog_RCM3 {
dimensions:
	yc = 104 ;
	xc = 134 ;
variables:
	char Transverse_Mercator ;
		Transverse_Mercator:grid_mapping_name = "transverse_mercator" ;
		Transverse_Mercator:longitude_of_central_meridian = -97. ;
		Transverse_Mercator:latitude_of_projection_origin = 47.5 ;
		Transverse_Mercator:scale_factor_at_central_meridian = 1. ;
		Transverse_Mercator:false_easting = 3925000. ;
		Transverse_Mercator:false_northing = 3175000. ;
	double yc(yc) ;
		yc:long_name = "y-coordinate in Cartesian system" ;
		yc:standard_name = "projection_y_coordinate" ;
		yc:units = "m" ;
		yc:axis = "Y" ;
	double xc(xc) ;
		xc:long_name = "x-coordinate in Cartesian system" ;
		xc:standard_name = "projection_x_coordinate" ;
		xc:units = "m" ;
		xc:axis = "X" ;
	double lon(yc, xc) ;
		lon:units = "degrees_east" ;
		lon:long_name = "longitude" ;
		lon:standard_name = "longitude" ;
		lon:axis = "X" ;
	double lat(yc, xc) ;
		lat:units = "degrees_north" ;
		lat:long_name = "latitude" ;
		lat:standard_name = "latitude" ;
		lat:axis = "Y" ;
	float orog(yc, xc) ;
		orog:grid_mapping = "Transverse_Mercator" ;
		orog:standard_name = "surface_altitude" ;
		orog:long_name = "Surface Altitude" ;
		orog:units = "m" ;
		orog:coordinates = "lon lat" ;

// global attributes:
		:Conventions = "CF-1.0" ;
}

$ gdalinfo NETCDF:"orog_RCM3.nc":orog
Driver: netCDF/Network Common Data Format
Files: orog_RCM3.nc
Size is 134, 104
Coordinate System is:
PROJCS["unnamed",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            TOWGS84[0,0,0,0,0,0,0],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9108"]],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",47.5],
    PARAMETER["central_meridian",-97],
    PARAMETER["scale_factor",1],
    PARAMETER["false_easting",3925000],
    PARAMETER["false_northing",3175000]]
Origin = (575000.000000000000000,5775000.000000000000000)
Pixel Size = (50000.000000000000000,-50000.000000000000000)
Metadata:
  NC_GLOBAL#Conventions=CF-1.0
[...]
  yc#units=m
Geolocation:
  LINE_OFFSET=0
  LINE_STEP=1
  PIXEL_OFFSET=0
  PIXEL_STEP=1
  SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9108"]],AUTHORITY["EPSG","4326"]]
  X_BAND=1
  X_DATASET=NETCDF:"orog_RCM3.nc":lon
  Y_BAND=1
  Y_DATASET=NETCDF:"orog_RCM3.nc":lat
Corner Coordinates:
Upper Left  (  575000.000, 5775000.000) ( 33d52' 4.42"W, 73d 9'59.02"N)
Lower Left  (  575000.000,  575000.000) (128d 1'21.79"W, 20d55'33.01"N)
Upper Right ( 7275000.000, 5775000.000) (160d 7'55.58"W, 73d 9'59.02"N)
Lower Right ( 7275000.000,  575000.000) ( 65d58'38.21"W, 20d55'33.01"N)
Center      ( 3925000.000, 3175000.000) ( 97d 0' 0.00"W, 47d30' 0.00"N)
Band 1 Block=134x1 Type=Float32, ColorInterp=Undefined
[...]

$ gdalwarp -geoloc -of netcdf NETCDF:"orog_RCM3.nc":orog  tmp2.nc
Creating output file that is 501P x 121L.
Processing input file NETCDF:orog_RCM3.nc:orog.
Using internal nodata values (eg. 1e+20) for image NETCDF:orog_RCM3.nc:orog.
0...10...20...30...40...50...60...70...80...90...100 - done.

Resulting lon/lat file is "similar" to original (TM re-projected to WGS84 lon/lat) although obviously not identical.

Due to bug in gdalwarp the extents are incorrect (resulting raster goes too far east), but the non-missing data pixels are correct.

Further testing with other datasets is needed.

comment:2 Changed 5 years ago by bronaugh

Cc: bronaugh added

comment:3 Changed 4 years ago by Kyle Shannon

Cc: Kyle Shannon added

comment:4 Changed 4 years ago by etourigny

see bug #5118 - any warp command fails because the metadata is stored in the GEOLOCATION domain, fix is to use a new domain GEOLOCATION2

added autotest in r26107 and a fix in r26109

comment:5 Changed 4 years ago by etourigny

fixed in 1.10 in r26111

comment:6 Changed 3 years ago by neo

Are geolocation arrays now fully supported for netcdf? Can this issue be closed?

comment:7 Changed 3 years ago by etourigny

Resolution: fixed
Status: assignedclosed

It works but is not very well tested. But it doesn't write geolocation arrays to a resulting netcdf file, only supports netcdf-cf "geolocation" support. But I guess I can close it.

comment:8 Changed 3 years ago by neo

Ok, I guess on http://www.gdal.org/frmt_netcdf.html it should mention the term "two-dimensional coordinate variables" and its relation to gdal's geolocation arrays and the fact that it will read but not write them to netcdf files (what happens instead? an error message?). Being able to read geolocation arrays and then getting rid of them using gdalwarp is a very nice use case I think.

I'm new to this field and it was quite hard for me to connect all these pieces, mainly because most documentation pages/tickets gave the impression that it's still unsupported, e.g. also https://trac.osgeo.org/gdal/wiki/rfc4_geolocate There it says netcdf "No timetable for update".

comment:9 Changed 3 years ago by etourigny

updated the docs in r27422 , if you have a better wording please add it here. I have also updated the rfc4 wiki page.

comment:10 Changed 3 years ago by neo

Docs sound good, I would add that currently gdal only supports such coordinate arrays if they have an attached projection (through the grid_mapping attribute). It says at line 3342 of netcdfdataset.cpp "2D GEOLOCATION arrays only supported for projected SRS". This is very important, as in fact in my case I have an undefined projection and *only* irregular 2D lat,lon arrays. So essentially I can't use gdal for that (btw could this be supported in the future somehow?).

In the rfc4 page: "which are represented as geolocation arrays when they are irregular." Is that really the case? Are the netcdf coordinate arrays ignored if they are regular and not irregular? Also, does irregular include any non-regular grid types like rectilinear?

comment:11 in reply to:  10 Changed 3 years ago by etourigny

Resolution: fixed
Status: closedreopened

Replying to neo:

Docs sound good, I would add that currently gdal only supports such coordinate arrays if they have an attached projection (through the grid_mapping attribute). It says at line 3342 of netcdfdataset.cpp "2D GEOLOCATION arrays only supported for projected SRS". This is very important, as in fact in my case I have an undefined projection and *only* irregular 2D lat,lon arrays. So essentially I can't use gdal for that (btw could this be supported in the future somehow?).

I'm not sure why I added that constraint, perhaps because actual examples have grid_mapping attributes (projected SRS).

I guess this should work, if you remove that check by commenting out the "else if ( ! bIsProjected )" block. Are you able to compile gdal? Best using latest release 1.11.

If you can attach that dataset and expected output it would be nice. I am reopening this issue.

In the rfc4 page: "which are represented as geolocation arrays when they are irregular." Is that really the case? Are the netcdf coordinate arrays ignored if they are regular and not irregular? Also, does irregular include any non-regular grid types like rectilinear?

They are considered when gdalwarp -geoloc is used, if not the basic SRS info is used.

In the case of 1 irregular dimension (gaussian latitude), then there is a 1D GEOLOCATION array (special Y_VALUES metadata). However, if both are irregular, geolocation arrays are used only if there is a coordinates variables with x and y variables for lon/lat. So if the information is encoded in a way compatible with CF conventions (section 5.2) then they are supported.

If you have specific examples (esp. about rectilinear grids) I could give a better answer.

I guess there could be an enhancement to use geolocation for irregular grids (with no coordinates variable), but that would require creating a means to encode X_DATASET and X_BAND to be able to read the x/y values.

comment:12 Changed 3 years ago by etourigny

oh, I updated the rfc4 information

comment:13 Changed 3 years ago by neo

I can compile it, but I have some other problems I think I need to solve first.

This is what ncdump -h gives:

netcdf ISS030-E-102170 {
dimensions:
    time = 1 ;
    y = 2832 ;
    x = 4256 ;
    vertex = 4 ;
    channel = 3 ;
variables:
    double time(time) ;
        time:units = "seconds since 1970-01-01 00:00:00" ;
        time:calendar = "gregorian" ;
        time:standard_name = "time" ;
        time:long_name =  ;
        time:comment =  ;
    double lat(time, y, x) ;
        lat:units = "degrees_north" ;
        lat:valid_min = -90. ;
        lat:valid_max = 90. ;
        lat:standard_name = "latitude" ;
        lat:long_name = "Latitude" ;
        lat:comment = "Geodetic latitude" ;
        lat:bounds = "lat_bounds" ;
    double lon(time, y, x) ;
        lon:units = "degrees_east" ;
        lon:valid_min = -180. ;
        lon:valid_max = 180. ;
        lon:standard_name = "longitude" ;
        lon:long_name = "Longitude" ;
        lon:comment = "Geodetic longitude" ;
        lon:bounds = "lon_bounds" ;
    double lat_bounds(time, y, x, vertex) ;
    double lon_bounds(time, y, x, vertex) ;
    short img(time, y, x, channel) ;
        img:_FillValue = -32768s ;
        img:units = "unitless" ;
        img:valid_min = 0L ;
        img:valid_max = 255L ;
        img:long_name =  ;
        img:comment = "channel order: RGB" ;
        img:coordinates = "lat lon" ;
        img:grid_mapping = "crs" ;
    byte crs ;
        crs:grid_mapping_name = "latitude_longitude" ;
        crs:semi_major_axis = 6378137. ;
        crs:inverse_flattening = 298.257223563 ;
        crs:comment = "Geographic Coordinate System, WGS 84" ;

// global attributes:
        :Conventions = "CF-1.6" ;
}

When I run gdalinfo NETCDF:ISS030-E-102170.nc:img on it, I get:

$ gdalinfo NETCDF:ISS030-E-102170.nc:img
Warning 1: dimension #3 (channel) is not a Longitude/X dimension.
Warning 1: dimension #2 (x) is not a Latitude/Y dimension.
Warning 1: dimension #1 (y) is not a Time  dimension.
Driver: netCDF/Network Common Data Format
Files: ISS030-E-102170.nc
Size is 3, 4256
Coordinate System is `'
Metadata:
  crs#comment=Geographic Coordinate System, WGS 84
  crs#grid_mapping_name=latitude_longitude
  crs#inverse_flattening=298.257223563
  crs#semi_major_axis=6378137
  img#_FillValue=-32768
  img#comment=channel order: RGB
  img#coordinates=lat lon
  img#grid_mapping=crs
  img#long_name=
  img#units=unitless
  NC_GLOBAL#Conventions=CF-1.6
  NETCDF_DIM_EXTRA={time,y}
  NETCDF_DIM_time_DEF={1,6}
  NETCDF_DIM_time_VALUES=1327483628
  NETCDF_DIM_y_DEF={2832,6}
  NETCDF_DIM_y_VALUES=1327483628
  time#calendar=gregorian
  time#comment=
  time#long_name=
  time#standard_name=time
  time#units=seconds since 1970-01-01 00:00:00
Geolocation:
  LINE_OFFSET=0
  LINE_STEP=1
  PIXEL_OFFSET=0
  PIXEL_STEP=1
  SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["E
PSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9108"]],AUTHORITY["EPSG","43
26"]]
  X_BAND=1
  X_DATASET=NETCDF:"ISS030-E-102170.nc":lon
  Y_BAND=1
  Y_DATASET=NETCDF:"ISS030-E-102170.nc":lat
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0, 4256.0)
Upper Right (    3.0,    0.0)
Lower Right (    3.0, 4256.0)
Center      (    1.5, 2128.0)
Band 1 Block=1x2089 Type=Int16, ColorInterp=Undefined
  NoData Value=-32768
  Metadata:
    _FillValue=-32768
    comment=channel order: RGB
    coordinates=lat lon
    grid_mapping=crs
    long_name=
    NETCDF_DIM_time=1327483628
    NETCDF_DIM_Y=1327483628
    NETCDF_VARNAME=img
    units=unitless
Band 2 Block=1x2089 Type=Int16, ColorInterp=Undefined
  NoData Value=-32768
  Metadata:
    _FillValue=-32768
    comment=channel order: RGB
    coordinates=lat lon
    grid_mapping=crs
    long_name=
    NETCDF_DIM_time=1327483628
    NETCDF_DIM_Y=1327483628
    NETCDF_VARNAME=img
    units=unitless

...and going on for every image line. Note that this is with unchanged 1.11 code.

Did I model my data wrongly somehow? Those three warnings at the beginning seem to tell me that. What do I have to do so that GDAL recognizes 3 bands and also recognizes the time dimension right?

comment:14 Changed 3 years ago by etourigny

Hi

first thing I see is that the order of the dimensions is wrong. The netcdf driver assumes that the "recommended" CF dimensions order is used, so in this case it should be (channel, time, y, x)

from the cf conventions document, section 2.4

If any or all of the dimensions of a variable have the interpretations of "date or time" ( T ), "height or depth" ( Z ), "latitude" ( Y ), or "longitude" ( X ) then we recommend, but do not require (see Section 1.4, “Relationship to the COARDS Conventions” ), those dimensions to appear in the relative order T , then Z , then Y , then X in the CDL definition corresponding to the file. All other dimensions should, whenever possible, be placed to the left of the spatiotemporal dimensions. 

So as shown in the output, it first looks for a lon dimension, then lat and time, in reverse order of the defined dimensions (as recommended in the CF conventions). It looks for the time dimension just so it can save the "unrolled" band in case you do a gdal_translate.

Admittedly this is restrictive, but it would be too hard to guess all possible permutations. If you have a patch for this I would be glad to review it.

Also, your lat and lon dimensions should not have a time coordinate, I would remove it to make sure it doesn't have any adverse effects.

comment:15 Changed 3 years ago by neo

Ah thanks. I fixed the dim order. Now it looks right, but it still says: " Warning 1: dimension #1 (time) is not a Time dimension. Warning 1: dimension #0 (channel) is not a Time dimension." The first sentence is a little funny, does that mean it can't recognize my time dimension as being Time?

comment:16 Changed 3 years ago by etourigny

Those warnings are informative only.

I just saw that there is a small bug in the code, it is complaining that it is not a valid time coordinate, but in reality it not a valid vertical coordinate.

If you change the order like this (time,channel,y,x) it should just complain about the channel dim not being a vertical coordinate, and if you add a channel variable with axis="Z" it will not complain. It should be fixed to allow (channel,time,y,x) but again, it's only a warning.

I'll add a fix for this.

Let me know if it needs removing the bIsProjected test to support this file (aside from the warnings).

comment:17 Changed 3 years ago by neo

I tried to use gdalwarp (unchanged 1.11 code):

gdalwarp -t_srs '+proj=utm +zone=11 +datum=WGS84' NETCDF:ISS030-E-102170.nc:img test.tif
Warning 1: dimension #1 (time) is not a Time  dimension.
Warning 1: dimension #0 (channel) is not a Time  dimension.
ERROR 1: nBlockYSize = 591, only 1 supported when reading bottom-up dataset
ERROR 1: NETCDF:"ISS030-E-102170.nc":lon, band 1: IReadBlock failed at X offset 0, Y offset 0
ERROR 1: GetBlockRef failed at X block offset 0, Y block offset 0

The gdalinfo output of the first band is:

Size is 4256, 2832
...
Band 1 Block=2089x1390 Type=Int16, ColorInterp=Undefined
  NoData Value=-32768
  Metadata:
    _FillValue=-32768
    comment=channel order: RGB
    coordinates=lat lon
    grid_mapping=crs
    long_name=
    NETCDF_DIM_Channel=1
    NETCDF_DIM_time=1327483628
    NETCDF_VARNAME=img
    units=unitless

I used the defaults of the netCDF4 library (Python wrapper) to produce the file. Should I adapt something in the block size to make it compatible to gdal? It seems something isn't supported there, or maybe a bug?

comment:18 Changed 3 years ago by etourigny

yeah that's a limitation in the driver, as gdal reads scanlines from top-down and netcdf files are usually bottom-up. I fixed the case when block size is a single line (nBlockYSize=1). Workaround is to change the chunk size (4256 x 1) or do not write as a compressed netcdf-4 file.

If you can post a link to an example file I can try it out.

comment:19 Changed 3 years ago by neo

Ok I changed the chunk size but now it's complaining about "ERROR 1: Too many points (441 out of 441) failed to transform".

The data is not public yet, therefore I sent you an email (on your google .dev address) with a download link. I hope that's ok.

comment:20 Changed 3 years ago by etourigny

the error is because there is no geotransform detected. The code does not support geolocation array with grid_mapping of type "latitude_longitude" without real lon/lat variable, which is probably why I had restricted the geoloation stuff to projected crs (although that test fails in this case).

You would have to trick gdalwarp to know the actual extents, or change the x/y definitions (by adding variables with the lon/lat values) so that a GT is detected. Or use a projection that is supported by the driver.

Attaching a file that works using gdalwarp -geoloc. You can see that x and y are actual dimensions with an associated variable.

comment:21 Changed 3 years ago by etourigny

actually see the already attached orog_RCM3.nc

comment:22 Changed 3 years ago by neo

Hmm, I'm confused. If it only works when there is a known projection, what advantage does the geolocation support for netcdf bring then? I mean, with a known projection the lat/lon coordinates can be re-calculated anyway, right? On the RFC4 page it says that "[it] is common in AVHRR, Envisat, HDF and netCDF data products to distribute geolocation for raw or projected data". So does that mean, at the moment, gdal can only handle "projected" data and not "raw" data?

comment:23 Changed 3 years ago by etourigny

Yes it's a limitation of the driver and also a limitation of gdalwarp, it cannot automatically guess the extents of this input dataset from geolocation only. That is why the attached file works, by computing the extents from the included projection information.

to illustrate this, the following is output when warping is done in debug mode: WARP: Recompute out extent with CHECK_WITH_INVERT_PROJ=TRUE

If you need to apply geolocation information you need to set the extents of the file in gdalwarp arguments.

So how would you suggest computing the extents and resolution from the 2D longitude/latitude variables? I guess that the warping must be fixed for this case, by computing the extents from the geolocation information.

comment:24 Changed 3 years ago by neo

Ah now I understand the actual problem, thanks for clarifying.

I assume extent == bounding box (latSouth,latNorth,lonWest,lonEast)?

It's true that it's not trivial, but I think this problem can be solved. Basically I do similar things in my own code but that's restricted to a certain problem domain and doesn't cover all cases.

In general I do the following to obtain a bounding box from such arrays:

  1. Determine data-outline of array (basically form the convex hull in array-index space and then map to the actual lat,lon coordinates). This is for the case that there are no-data values, otherwise it would just be the top/bottom row and left/right column. It also assumes that latitude and longitude arrays have no-data at the same places.
  1. Calculate latMin, latMax, lonMin, lonMax of the outline using standard min, max functions.
  1. Check if the outline contains (or crosses) one of the poles. I used the method detailed here: http://www.element84.com/follow-up-to-determining-if-a-spherical-polygon-contains-a-pole.html
  1. If the outline contains a pole, then lonWest=-180, lonEast=180. If latMax < 0, it's the south pole, so latSouth=-90 and latNorth=latMax. Else, it's the north pole, so latNorth=90 and latSouth=latMin. This assumes that the latitude extent is not greater than 90 deg if one of the poles is within the extent.
  1. If the outline doesn't contain a pole, then the main difficulty is the discontinuity. In my case I assume that the extent of a dataset is smaller than 180 degrees of longitude. If lonMax-lonMin<180 then the extent doesn't cover the discontinuity and lonWest=lonMin and lonEast=lonMax. Otherwise (>=180), lonWest=min(longitudes of outline > 0) and lonEast=max(longitudes of outline <= 0).

This doesn't cover the following cases:

  • Both poles are contained. But then the bounding box would be fully degenerate and contain everything, right?
  • An extent covering more than 180 deg longitude.
  • An extent covering more than 90 deg latitude if a pole is covered.
  • The latitude,longitude grids are not "continuous". E.g. if random non-adjacent patches of the earth are just concatenated together in one array. Then the outline is not an outline in the above sense and cannot be used for pole testing. I think such data is very unlikely to exist, but who knows.
  • Using the convex hull as outline may lead to a smaller than expected extent if the projection is somehow strange, as the outline doesn't contain all real outline points. A better way would be to take the exact outline (without simplifying lines or anything) which then also could be concave. In OpenCV you would use findContours(lats, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE). But I wouldn't go that far and use the simplest way possible for the beginning.

Does that make sense?

Note: See TracTickets for help on using tickets.