Opened 17 years ago
Closed 5 years ago
#1900 closed defect (wontfix)
Get hdf band, inverse data set dimensions
Reported by: | cermak | Owned by: | dron |
---|---|---|---|
Priority: | normal | Milestone: | closed_because_of_github_migration |
Component: | GDAL_Raster | Version: | unspecified |
Severity: | normal | Keywords: | hdf4 |
Cc: | ilucena, dnadeau, dron |
Description (last modified by )
Hi!
In the MODIS cloud product hdf files (MOD06, MYD06) some data sets are stored with the band dimension last instead of first.
gdalinfo myfile.hdf (or gdal.Dataset.GetSubDatasets) yields:
[snip]
SUBDATASET_52_NAME=HDF4_EOS:EOS_SWATH:'myfile.hdf':mod06:Cloud_Mask_1km
SUBDATASET_52_DESC=[2040x1354x2] Cloud_Mask_1km mod06 (8-bit integer)
[snip]
So obviously there are two bands of size 2040x1354 in the subdataset. However, in order for gdal to recognize them as two bands the dimensions would have to be [2x2040x1354].
Accordingly, when I run gdalinfo HDF4_EOS:EOS_SWATH:'myfile.hdf':mod06:Cloud_Mask_1km I get information on 2040 different bands, each of size 1354x2.
It would be useful to have a feature in gdal that allows explicit selection of the dimension that holds the band count or even let gdal decide on this.
A sample file is at: ftp://ladsweb.nascom.nasa.gov/allData/5/MYD06_L2/2006/220/MYD06_L2.A2006220.1315.005.2006222153137.hdf
Thanks and all best, Jan
PS: hdfview can read these files.
Attachments (1)
Change History (14)
comment:1 by , 17 years ago
Cc: | added |
---|---|
Component: | default → GDAL_Raster |
Description: | modified (diff) |
Keywords: | hdf4 added |
Status: | new → assigned |
comment:2 by , 17 years ago
I downloaded the MODIS Swath Reprojection Tool with its source code, and in input.c I found the function FindInputDim() with this comment:
/*Figure out which are the line/sample dimensions and which are the
extra dimensions. The line and sample dimensions are expected to be greater than MIN_LS_DIM_SIZE. If too many dimensions are "line/sample" dimensions, then it is an error. If not enough dimensions are available for "line/sample" dimensions, then it is also an error.*/
I guess that this is the only way to solve the problem, since calls to SDgetdimid() and SDdiminfo() cannot guarantee exact information about which entry in the dimension array is which.
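The rule described in that comment can be sketched roughly as follows. This is not the MRTSwath code, only an illustration of the idea in Python; the function name and the threshold value of 250 are assumptions, since the real MIN_LS_DIM_SIZE is not quoted here.

```python
# Hypothetical threshold; the value used by MRTSwath is not quoted in this ticket.
MIN_LS_DIM_SIZE = 250


def find_line_sample_dims(dims, min_size=MIN_LS_DIM_SIZE):
    """Split dimension indices into (line, sample, extras) by size alone."""
    # Every dimension larger than the threshold is a line/sample candidate.
    large = [i for i, n in enumerate(dims) if n > min_size]
    if len(large) > 2:
        raise ValueError(f"too many line/sample candidates in {dims}")
    if len(large) < 2:
        raise ValueError(f"not enough line/sample candidates in {dims}")
    line, sample = large  # the earlier index is taken as line, the later as sample
    extras = [i for i in range(len(dims)) if i not in large]
    return line, sample, extras


print(find_line_sample_dims([2040, 1354, 2]))  # (0, 1, [2])
```

With this threshold a data set such as [430, 500, 600] is rejected as having too many candidates, so the rule cannot resolve every case from the sizes alone.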
follow-up: 10 comment:3 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
I have no idea what MIN_LS_DIM_SIZE should be. If we have a hyperspectral HDF (and I have such samples), the number of bands may well be greater than the image width. We could take the liberty of assuming that the number of bands cannot exceed 256, but that is still risky.
Jan,
Regarding the referenced sample, I noticed that its dimensions [2040x1354x2] are named "Cell_Along_Swath_1km,Cell_Across_Swath_1km,Cloud_Mask_1km_Num_Bytes".
Note that the third dimension is called "Cloud_Mask_1km_Num_Bytes". Is it possible that it has nothing to do with the number of bands? Maybe we should interpret the whole thing in a different way? The product specification could help here.
Best regards, Andrey
comment:4 by , 17 years ago
Status: | new → assigned |
---|
comment:5 by , 17 years ago
Andrey,
Sorry I forgot to attach the input.c source code from MRTSwath.
I faced that problem before, and at that time I contacted several people at NASA/NCSA to sort it out. The conclusion was that not all HDF4 products are well documented.
The proof of that is that their own (NASA/NCSA) software MRT/MRTSwath tries to guess which dimension is which. That is way I mentioned their source code.
The solution that I implemented on Idrisi was something like that: I select the dimension with closed length (assuming that columns always come after rows):
[2x430x500] = 2 bands of 430x500 [430x500x2] = 2 bands of 430x500 [430x4x500 = 4 bands of 430x500 /*crazy but possible*/ [430x500x600] = 600 bands of 430x500 /*almost got it wrong*/
That is not a perfect solution but we can use it in case there is not sufficient information on the file itself or there is not enough knowledge about the product specification.
I hope that would help.
Best regards,
Ivan
comment:6 by , 17 years ago
Corrections:
- That is way I mentioned their source code.
+ That is why I mentioned their source code.
- closed length (assuming that columns always come after rows):
+ closest length (assuming that columns always come after rows):
Reformatting:
[2x430x500] = 2 bands of 430x500
[430x500x2] = 2 bands of 430x500
[430x4x500] = 4 bands of 430x500 /*crazy but possible*/
[430x500x600] = 600 bands of 430x500 /*almost got it wrong*/
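The closest-length heuristic from comment:5 can be sketched in a few lines of Python. This is an illustration only, not the Idrisi implementation; the function name is hypothetical, and the sketch assumes exactly three dimensions with rows stored before columns.

```python
from itertools import combinations


def guess_dims(dims):
    """Return (band_index, row_index, col_index) for a 3-D dimension list."""
    if len(dims) != 3:
        raise ValueError("this sketch only handles exactly three dimensions")
    # Pick the pair of dimensions whose lengths are closest to each other.
    # combinations() yields index pairs in increasing order, so the row
    # index always comes before the column index.
    row, col = min(combinations(range(3), 2),
                   key=lambda pair: abs(dims[pair[0]] - dims[pair[1]]))
    band = ({0, 1, 2} - {row, col}).pop()
    return band, row, col


for dims in ([2, 430, 500], [430, 500, 2], [430, 4, 500], [430, 500, 600]):
    band, row, col = guess_dims(dims)
    print(dims, "->", dims[band], "bands of", f"{dims[row]}x{dims[col]}")
```

The four examples above come out as listed. The guess still fails when the band count is closer to one image dimension than the two image dimensions are to each other: [430, 500, 510] would be read as 430 bands of 500x510.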
comment:7 by , 17 years ago
GDAL is a general raster processing tool, and I hate to add more intelligence to our already overintelligent driver. I am thinking about open options like XDIM, YDIM, BANDDIM to specify the exact dimension indices. That will probably be the best and most general solution.
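To make the idea concrete, here is a rough Python sketch of how such options could map a dimension array to a raster layout. XDIM, YDIM and BANDDIM are only the names proposed in this comment, not existing GDAL options; the function name and the zero-based indexing are assumptions as well.

```python
def apply_dim_options(dims, xdim, ydim, banddim):
    """Return (bands, rows, cols) from explicit, zero-based dimension indices."""
    chosen = (xdim, ydim, banddim)
    # Each of the three dimensions must be named exactly once.
    if len(dims) != 3 or sorted(chosen) != [0, 1, 2]:
        raise ValueError("XDIM, YDIM and BANDDIM must each name one of 3 dimensions")
    return dims[banddim], dims[ydim], dims[xdim]


# Cloud_Mask_1km from the sample file: [2040x1354x2]
print(apply_dim_options([2040, 1354, 2], xdim=1, ydim=0, banddim=2))  # (2, 2040, 1354)
```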
Best regards,
Andrey
comment:8 by , 17 years ago
Andrey,
As a GDAL user, what I need is to run MapServer (gdaltindex) and/or ArcGIS ImageServer over thousands of HDF4 files and write the footprints of all datasets/bands to a shapefile. There is no way to pass a user option in that process.
As a GDAL programmer, what I can do is write the dimension-guessing logic in the driver and send it to you as a patch, so you can evaluate it and commit it if you want.
Best regards, Ivan
comment:9 by , 17 years ago
Ivan,
Actually, assigning some number to MIN_LS_DIM_SIZE is not the problem (in your case it is 250). The problem is generalizing the whole process. We would need a way to deduce the dimension map of an arbitrary dataset. There are datasets that have (width < num_bands), so I still do not understand how we can guess the image dimensions based on the dimension array only.
Folks, I would like to take a day to think this over again. A maximum band count constant is the first solution; the second is to introduce the global variables XDIM, YDIM and BANDDIM. I do not think we will be able to implement the general case here. We will need control from the user side, because HDF is a user-oriented format.
Best regards, Andrey
comment:10 by , 17 years ago
Replying to dron:
Regarding the referenced sample, I noticed that its dimensions [2040x1354x2] are named "Cell_Along_Swath_1km,Cell_Across_Swath_1km,Cloud_Mask_1km_Num_Bytes".
Note that the third dimension is called "Cloud_Mask_1km_Num_Bytes". Is it possible that it has nothing to do with the number of bands? Maybe we should interpret the whole thing in a different way? The product specification could help here.
Hi Andrey,
That depends on your definition of a band. These are not 'bands' in the sense of different spectral channels in a radiometer. They are, however, bands in that they contain distinct subsets of information relating to the 'Cloud_Mask_1km' product. (Each band contains 1 byte of bit-coded information, hence 'Num_Bytes'.)
Product format info is on http://modis-atmos.gsfc.nasa.gov/MOD06_L2/format.html
Best, Jan
comment:11 by , 9 years ago
Perhaps something like what is suggested in the NetCDF patch in #2540 would suit here as well: give the user the ability to select the bands.
comment:12 by , 9 years ago
Agreed. But it could be a little more complex than that. Let's say a dataset dimension array is [7,6001,3001,12]. We can pass ":3" to identify bands as the 4th dimension but which one is row and which one is column? And what to do with more than 3 dimensions?
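One possible way to handle more than three dimensions, sketched here as an assumption and not as anything the HDF4 driver does: let the user name the row and column dimensions and expose every combination of the remaining indices as a separate band. The function name and the zero-based indices are hypothetical.

```python
from itertools import product


def enumerate_bands(dims, ydim, xdim):
    """List the index tuples of the non-spatial dimensions, one per 2-D band."""
    if ydim == xdim or not all(0 <= d < len(dims) for d in (ydim, xdim)):
        raise ValueError("ydim and xdim must be two distinct valid indices")
    # Everything that is not a row or column dimension becomes part of the band key.
    extra = [i for i in range(len(dims)) if i not in (ydim, xdim)]
    return extra, list(product(*(range(dims[i]) for i in extra)))


extra, bands = enumerate_bands([7, 6001, 3001, 12], ydim=1, xdim=2)
print(extra, len(bands), bands[0], bands[-1])  # [0, 3] 84 (0, 0) (6, 11)
```

In this reading the array [7,6001,3001,12] yields 7 x 12 = 84 bands of 6001x3001, each identified by its pair of indices in dimensions 0 and 3. It still relies on the user to say which dimension is the row and which is the column.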
comment:13 by , 5 years ago
Milestone: | → closed_because_of_github_migration |
---|---|
Resolution: | → wontfix |
Status: | assigned → closed |
This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
Added a few people knowledgeable about HDF4 to cc list.