Opened 17 years ago

Closed 5 years ago

#1900 closed defect (wontfix)

Get hdf band, inverse data set dimensions

Reported by: cermak Owned by: dron
Priority: normal Milestone: closed_because_of_github_migration
Component: GDAL_Raster Version: unspecified
Severity: normal Keywords: hdf4
Cc: ilucena, dnadeau, dron

Description (last modified by warmerdam)

Hi!

In the MODIS cloud product HDF files (MOD06, MYD06) some data sets are stored with the band dimension last instead of first.

gdalinfo myfile.hdf (or gdal.Dataset.GetSubDatasets) yields:
[snip]
SUBDATASET_52_NAME=HDF4_EOS:EOS_SWATH:'myfile.hdf':mod06:Cloud_Mask_1km
SUBDATASET_52_DESC=[2040x1354x2] Cloud_Mask_1km mod06 (8-bit integer)
[snip]

So obviously there are two bands of size 2040x1354 in the subdataset. However, in order for gdal to recognize them as two bands the dimensions would have to be [2x2040x1354].

Accordingly, when I do gdalinfo HDF4_EOS:EOS_SWATH:'myfile.hdf':mod06:Cloud_Mask_1km I get information on 2040 different bands, each of size 1354x2

It would be useful to have a feature in gdal that allows explicit selection of the dimension that holds the band count or even let gdal decide on this.
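To make the mismatch concrete, here is a minimal Python sketch of the two possible readings of the reported shape. The function names are hypothetical and are not part of GDAL; the band-first rule only mirrors the behaviour described in this report.

```python
def read_band_first(shape):
    """Read a 3-D shape as (bands, rows, cols), the order the report says GDAL assumes."""
    bands, rows, cols = shape
    return {"bands": bands, "rows": rows, "cols": cols}


def read_band_last(shape):
    """Read a 3-D shape as (rows, cols, bands), the order used by Cloud_Mask_1km."""
    rows, cols, bands = shape
    return {"bands": bands, "rows": rows, "cols": cols}


shape = (2040, 1354, 2)  # taken from SUBDATASET_52_DESC above
print(read_band_first(shape))  # 2040 bands, each 1354x2
print(read_band_last(shape))   # 2 bands, each 2040x1354
```

Only the second reading matches what the product actually stores.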

A sample file is at: ftp://ladsweb.nascom.nasa.gov/allData/5/MYD06_L2/2006/220/MYD06_L2.A2006220.1315.005.2006222153137.hdf

Thanks and all best, Jan

PS: hdfview can read these files.

Attachments (1)

input.c (18.2 KB ) - added by ilucena 17 years ago.
Source code from Modis Swath Reprojection Tool


Change History (14)

comment:1 by warmerdam, 17 years ago

Cc: ilucena dnadeau dron added
Component: changed from default to GDAL_Raster
Description: modified (diff)
Keywords: hdf4 added
Status: changed from new to assigned

Added a few people knowledgeable about HDF4 to the cc list.

comment:2 by ilucena, 17 years ago

I downloaded the MODIS Swath Reprojection Tool with its source code and found, in input.c, the function FindInputDim() with this comment:

/* Figure out which are the line/sample dimensions and which are the extra dimensions. The line and sample dimensions are expected to be greater than MIN_LS_DIM_SIZE. If too many dimensions are "line/sample" dimensions, then it is an error. If not enough dimensions are available for "line/sample" dimensions, then it is also an error. */

I guess that this is the only way to solve the problem, since calls to SDgetdimid() and SDdiminfo() cannot guarantee exact information about which entry in the dimension array is which.
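The rule in that comment can be sketched in Python as follows. This is not a translation of input.c: the function name, the threshold value of 250, and the choice to treat the first large dimension as the line dimension are all assumptions made for illustration.

```python
MIN_LS_DIM_SIZE = 250  # placeholder threshold, not taken from input.c


def find_line_sample_dims(shape, min_size=MIN_LS_DIM_SIZE):
    """Return (line_index, sample_index, extra_indices) for a dimension array.

    Dimensions longer than min_size are treated as line/sample candidates;
    exactly two such candidates are required, otherwise it is an error.
    """
    large = [i for i, length in enumerate(shape) if length > min_size]
    if len(large) > 2:
        raise ValueError("too many dimensions qualify as line/sample")
    if len(large) < 2:
        raise ValueError("not enough dimensions qualify as line/sample")
    extra = [i for i in range(len(shape)) if i not in large]
    line, sample = large  # assumption: line (row) comes before sample (column)
    return line, sample, extra


print(find_line_sample_dims((2040, 1354, 2)))
```

A shape such as (430, 500, 600) raises an error under this rule, because all three dimensions exceed the threshold.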

comment:3 by dron, 17 years ago

Owner: changed from warmerdam to dron
Status: changed from assigned to new

I have no idea what MIN_LS_DIM_SIZE should be. If we have a hyperspectral HDF file (and I have such samples), it is possible that the number of bands will be greater than the image width. We could take the liberty of assuming that the number of bands can't be greater than 256, but that is still risky.

Jan,

Regarding the referenced sample, I noticed that its dimensions [2040x1354x2] are named "Cell_Along_Swath_1km,Cell_Across_Swath_1km,Cloud_Mask_1km_Num_Bytes".

Note that the third dimension is called "Cloud_Mask_1km_Num_Bytes". Is it possible that this has nothing to do with the number of bands? Maybe we should interpret the whole thing in a different way? The product specification could help here.

Best regards, Andrey

comment:4 by dron, 17 years ago

Status: changed from new to assigned

by ilucena, 17 years ago

Attachment: input.c added

Source code from Modis Swath Reprojection Tool

comment:5 by ilucena, 17 years ago

Andrey,

Sorry, I forgot to attach the input.c source code from MRTSwath.

I faced this problem before, and at that time I contacted several people at NASA/NCSA to sort it out. The conclusion was that not all HDF4 products are well documented.

The proof of that is that their own (NASA/NCSA) software, MRT/MRTSwath, tries to guess which dimension is which. That is way I mentioned their source code.

The solution that I implemented in Idrisi was something like this: I select the dimension with closed length (assuming that columns always come after rows):

[2x430x500] = 2 bands of 430x500 [430x500x2] = 2 bands of 430x500 [430x4x500] = 4 bands of 430x500 /*crazy but possible*/ [430x500x600] = 600 bands of 430x500 /*almost got it wrong*/

That is not a perfect solution, but we can use it when there is not sufficient information in the file itself or not enough knowledge about the product specification.

I hope that helps.

Best regards,

Ivan

comment:6 by ilucena, 17 years ago

Corrections:

  • That is way I mentioned their source code.

+ That is why I mentioned their source code.

  • closed length (assuming that columns always come after rows):

+ closest length (assuming that columns always come after rows):

Reformatting:

[2x430x500] = 2 bands of 430x500

[430x500x2] = 2 bands of 430x500

[430x4x500] = 4 bands of 430x500 /*crazy but possible*/

[430x500x600] = 600 bands of 430x500 /*almost got it wrong*/
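The closest-length rule and the four cases listed here can be sketched in Python. This is an illustration only, not the Idrisi code: the function name is hypothetical, and measuring closeness as the absolute difference of the two lengths is an assumption.

```python
from itertools import combinations


def guess_dims_closest_length(shape):
    """Return (row_index, col_index, band_index) for a 3-D dimension array.

    The two dimensions whose lengths are closest are taken as rows and
    columns (rows first); the remaining dimension is taken as the bands.
    """
    if len(shape) != 3:
        raise ValueError("this sketch only handles three dimensions")
    # combinations() yields index pairs in increasing order, so row < col.
    row, col = min(
        combinations(range(3), 2),
        key=lambda pair: abs(shape[pair[0]] - shape[pair[1]]),
    )
    band = ({0, 1, 2} - {row, col}).pop()
    return row, col, band


for shape in [(2, 430, 500), (430, 500, 2), (430, 4, 500), (430, 500, 600)]:
    row, col, band = guess_dims_closest_length(shape)
    print(shape, "=", shape[band], "bands of", f"{shape[row]}x{shape[col]}")
```

For [430x500x600] the margin is small (a difference of 70 against 100), which is why that case is flagged as almost wrong.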

comment:7 by dron, 17 years ago

GDAL is a general raster processing tool, and I hate to add more intelligence to our already overintelligent driver. I am thinking about open options like XDIM, YDIM, BANDDIM to specify the exact dimension numbers. That will probably be the best and most general solution.

Best regards,

Andrey
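As a rough sketch of what the proposed XDIM, YDIM and BANDDIM options would express, the Python function below maps explicit dimension indices to raster sizes. The options are only a proposal in this ticket, so the function name, the zero-based indexing and the validation are assumptions, not an existing GDAL interface.

```python
def resolve_dims(shape, xdim, ydim, banddim):
    """Map explicit zero-based dimension indices to width, height and band count."""
    if len(shape) != 3:
        raise ValueError("this sketch only handles three dimensions")
    if sorted((xdim, ydim, banddim)) != [0, 1, 2]:
        raise ValueError("XDIM, YDIM and BANDDIM must each name a different dimension")
    return {"width": shape[xdim], "height": shape[ydim], "bands": shape[banddim]}


# The Cloud_Mask_1km layout from the report: [2040x1354x2], bands last.
print(resolve_dims((2040, 1354, 2), xdim=1, ydim=0, banddim=2))
```

With explicit indices there is nothing to guess, at the cost of requiring the user to know the layout of each product.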

comment:8 by ilucena, 17 years ago

Andrey,

As a GDAL user, what I need is to run MapServer (gdaltindex) and/or ArcGIS ImageServer over thousands of HDF4 files and write the footprints of all datasets/bands into a shapefile. There is no way to add a user option to that process.

As a GDAL programmer, what I can do is write the dimension-guessing logic in the driver and send it to you as a patch, so you can evaluate it and commit it if you want.

Best regards, Ivan

comment:9 by dron, 17 years ago

Ivan,

Actually, it is not a problem to set MIN_LS_DIM_SIZE to some number (in your case it is 250). The problem is to generalize the whole process. We should develop a way to deduce the dimension map of an arbitrary dataset. There are datasets that have (width < num_bands), so I still do not understand how we can guess the image dimensions from the dimension array alone.

Folks, I would like to take a day to think this over again. A maximum band count constant is the first solution; the second is introducing the global variables XDIM, YDIM and BANDDIM. I do not think we will be able to implement the general case here. We will need control from the user side, because HDF is a user-oriented format.

Best regards, Andrey

comment:10 by cermak, 17 years ago (in reply to comment:3)

Replying to dron:

Regarding the referenced sample, I noticed that its dimensions [2040x1354x2] are named "Cell_Along_Swath_1km,Cell_Across_Swath_1km,Cloud_Mask_1km_Num_Bytes".

Note that the third dimension is called "Cloud_Mask_1km_Num_Bytes". Is it possible that this has nothing to do with the number of bands? Maybe we should interpret the whole thing in a different way? The product specification could help here.

Hi Andrey,

That depends on your definition of a band. These are not 'bands' in the sense of different spectral channels in a radiometer. They are, however, bands in that they contain distinct subsets of information relating to the 'Cloud_Mask_1km' product. (Each band contains 1 byte of bit-coded information, hence 'Num_Bytes'.)

Product format info is at http://modis-atmos.gsfc.nasa.gov/MOD06_L2/format.html

Best, Jan
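To illustrate what "bit-coded" means for a single byte of such a band, here is a generic Python helper that pulls a bit field out of a byte. The helper name and the bit positions in the example are purely illustrative and are not taken from the MOD06 format description linked above.

```python
def extract_bits(byte_value, start, width):
    """Return `width` bits of `byte_value`, starting at bit `start` (bit 0 is least significant)."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("byte_value must fit in one unsigned byte")
    if start < 0 or width < 1 or start + width > 8:
        raise ValueError("bit field must lie within bits 0..7")
    return (byte_value >> start) & ((1 << width) - 1)


value = 0b00000110  # made-up byte, not real cloud mask data
print(extract_bits(value, 0, 1))  # single-bit flag at bit 0
print(extract_bits(value, 1, 2))  # two-bit field at bits 1-2
```

Each byte therefore carries several independent fields, which is why the third dimension counts bytes and not spectral channels.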

comment:11 by Jukka Rahkonen, 9 years ago

Perhaps something like what is suggested in NetCDF patch #2540 would suit here as well: give the user the possibility to select the bands.

comment:12 by ilucena, 9 years ago

Agreed. But it could be a little more complex than that. Let's say a dataset dimension array is [7,6001,3001,12]. We can pass ":3" to identify bands as the 4th dimension but which one is row and which one is column? And what to do with more than 3 dimensions?
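The remaining ambiguity can be made explicit with a small Python sketch that lists the row/column assignments still possible once the band dimension is fixed. The function name is hypothetical, and it assumes, as in comment 5, that rows always come before columns.

```python
from itertools import combinations


def candidate_layouts(shape, band_index):
    """List the (row, col) index pairs still possible once the band dimension is known."""
    remaining = [i for i in range(len(shape)) if i != band_index]
    layouts = []
    for row, col in combinations(remaining, 2):  # row index is always below col index
        extra = [i for i in remaining if i not in (row, col)]
        layouts.append({"row": row, "col": col, "extra": extra})
    return layouts


# ":3" marks the 4th dimension (length 12) as the band dimension.
for layout in candidate_layouts((7, 6001, 3001, 12), band_index=3):
    print(layout)
```

Three candidates remain for this array, and each leaves one extra dimension unaccounted for, so a band selector alone does not settle the layout.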

comment:13 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: changed from assigned to closed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
