Opened 6 years ago

Last modified 2 years ago

#4227 new defect

[""PATCH""] HDF5 : cannot read/convert some multi-band multi-image dataset

Reported by: bugbuster Owned by: warmerdam
Priority: normal Milestone:
Component: GDAL_Raster Version: svn-trunk
Severity: normal Keywords: hdf5
Cc: antonio

Description

I cannot convert a multi-image multi-band HDF5 dataset. First, I get the dataset structure :

asterix > gdalinfo CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5
...
Subdatasets:
  SUBDATASET_1_NAME=HDF5:"CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5"://S01/QLK
  SUBDATASET_1_DESC=[707x165] //S01/QLK (8-bit unsigned character)
  SUBDATASET_2_NAME=HDF5:"CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5"://S01/SBI
  SUBDATASET_2_DESC=[17676x4139x2] //S01/SBI (16-bit integer)
  SUBDATASET_3_NAME=HDF5:"CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5"://S02/SBI
  SUBDATASET_3_DESC=[17676x4139x2] //S02/SBI (16-bit integer)
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  512.0)
Upper Right (  512.0,    0.0)
Lower Right (  512.0,  512.0)
Center      (  256.0,  256.0)
asterix >

As we can see, the dataset contains :

o S01/QLK : a 707x165 quicklook image

o S01/SBI : a 17676x4139 2-band 16-bit image

o S02/SBI : idem

There is a bug when I try to get S01/SBI (or S02/SBI) information :

asterix > gdalinfo HDF5:"CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5"://S01/SBI
Driver: HDF5Image/HDF5 Dataset
Files: CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5
Size is 2, 4139
....
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0, 4139.0)
Upper Right (    2.0,    0.0)
Lower Right (    2.0, 4139.0)
Center      (    1.0, 2069.5)
Band 1 Block=2x128 Type=Int16, ColorInterp=Undefined
  Metadata:
    SBI:Zero Doppler Azimuth First Time=40178 
...
Band 2 Block=2x128 Type=Int16, ColorInterp=Undefined
  Metadata:
    SBI:Zero Doppler Azimuth First Time=40178 
...
Band 17676 Block=2x128 Type=Int16, ColorInterp=Undefined
  Metadata:
    SBI:Zero Doppler Azimuth First Time=40178 
...
asterix >

It seems we have a 2X4139 17676-band image. I guess there is a mis-interpretation of image dimensions. After some tedious debug, I made a patch to convert this dataset correctly, concerning hdf5imagedataset.cpp and bagdataset.cpp (see attached files).

N.B. : sorry, but I cannot give awway the dataset.

Attachments (6)

hdf5imagedataset.cpp (23.2 KB) - added by bugbuster 6 years ago.
bagdataset.cpp (23.3 KB) - added by bugbuster 6 years ago.
bagdataset.patch (233 bytes) - added by bugbuster 6 years ago.
hdf5imagedataset.patch (2.0 KB) - added by bugbuster 6 years ago.
patchCSK-L1A.diff (9.0 KB) - added by msavinaud 4 years ago.
Upgrade of previous patchs to handle CSK-L1A products only
csk_l1a_as_complex.patch (4.1 KB) - added by vf 2 years ago.
Patch to handle correctly CSK L1A imagery

Download all attachments as: .zip

Change History (25)

Changed 6 years ago by bugbuster

Attachment: hdf5imagedataset.cpp added

Changed 6 years ago by bugbuster

Attachment: bagdataset.cpp added

comment:1 Changed 6 years ago by antonio

Cc: antonio added

comment:2 Changed 6 years ago by Even Rouault

Summary: HDF5 : cannot read/convert some multi-band multi-image dataset[""PATCH""] HDF5 : cannot read/convert some multi-band multi-image dataset

bugbuster, it would be good if you could rebase your work on the SVN trunk version, rather than 1.7.2. There have been many changes since. And attach it as a patch too rather than full files. It makes review process easier.

comment:3 in reply to:  2 Changed 6 years ago by bugbuster

Replying to rouault:

bugbuster, it would be good if you could rebase your work on the SVN trunk version, rather than 1.7.2. There have been many changes since.

I've already tried to download the SVN trunk version, with no success : the firewall at my office does not allow that. I can try to work out the patch onto a newer version (1.8.2) if it makes things easier for you.

And attach it as a patch too rather than full files. It makes review process easier.

Sorry, I'm a newby. Do you mean I should send the output of something like :

diff old/bagdataset.cpp new/bagdataset.cpp > bagdataset.patch

comment:4 Changed 6 years ago by Even Rouault

You can go to http://trac.osgeo.org/gdal/browser/trunk and click on the link at the bottom of the page "Download in other formats". That way you will grab a snapshot of the SVN version.

The diff -u format is preferable.

Changed 6 years ago by bugbuster

Attachment: bagdataset.patch added

Changed 6 years ago by bugbuster

Attachment: hdf5imagedataset.patch added

comment:5 Changed 6 years ago by bugbuster

I just attached two patches. Original files are from trunk r-23022.

comment:6 in reply to:  5 ; Changed 6 years ago by antonio

Replying to bugbuster:

I just attached two patches. Original files are from trunk r-23022.

Hi bugbuster, after a quick review it seems to me that your patch dones't preserve the driver logic that, presumably, worked for other kind of products.

I doubt it can be included as is.

Also it is probably better to represent CSK SLC data as a single complex band rather than 2 real bands.

comment:7 in reply to:  6 Changed 6 years ago by bugbuster

Replying to antonio:

Replying to bugbuster:

I just attached two patches. Original files are from trunk r-23022.

Hi Antonio,

Hi bugbuster, after a quick review it seems to me that your patch dones't preserve the driver logic that, presumably, worked for other kind of products.

I doubt it can be included as is.

I only have two datasets containing complex images. And both cannot be converted with Gdal V1.7.2 (or V1.8.0) : it's indeed not a proof :=(

However, the dimension vector is initialized by an HDF5 library function : H5Sget_simple_extent_dims(). The I/F of this function is not described in details, but I found an example in the HDF5 distribution, which makes use of this function : h2ls.c.
As far as I understand the source code correctly, the size vector is in the following order : {Y ; X ; Nbands}
I check out with the dataset mentionned above :

asterix > h5ls CSKS1_SCS_B_PP_07_CO_RA_SF_20081212110938_20081212110944.h5/S01/SBI
SBI        Dataset {17676, 4139, 2}
asterix >

Also it is probably better to represent CSK SLC data as a single complex band rather than 2 real bands.

comment:8 Changed 6 years ago by antonio

Replying to bugbuster: the GDAL HDF5 driver currently assumes that in case a dataset has 3 dimensions the first is the number of bands and the others are the band size:

dims = [nBands, nRows, nCols]

This makes perfectly sense and I assume there is some product (non CSK) that stores data in this way.

Unfortunately the above assumption is not correct for CSK SCS products. In that case the SBI dataset contains a singled band of complex. Since there is non native complex data type in HDF5, data are stores in a 3D dataset having

dims = [nRows, nCols, 2]

where the last dimension "2" is not the number of bands but the tho channels of a complex band (real, imag).

Interpreting 3D dataset dimensions as you suggest

dims = [nRows, nCols, nBands]

would surely break compatibility with non CSk products and also would not address correctly CSK SCS data that IMHO should be interpreted as a single band of complex numbers.

comment:9 Changed 5 years ago by warmerdam

Milestone: 1.7.4

I don't anticipate a 1.7.4 release so dropping this milestone.

comment:10 Changed 4 years ago by msavinaud

Milestone: 1.10.2
Version: 1.7.2svn-trunk

Hi, I have upgraded the patch to made the swap dimension only for CSK L1A data with three dimensions. I think it didn't broke anything for other hdf5 files. I have tested with 5 CSK L1A dataset gdal_translate and gdalinfo seems to work perfectly.

The patch has been made on the svn trunk. I think it could be done also on the 1.9.3 and 1.10.2 branches.

For comparison, I have used HDFView which return a array with 2 components so I think is not necessary to change the type to complex one.

Changed 4 years ago by msavinaud

Attachment: patchCSK-L1A.diff added

Upgrade of previous patchs to handle CSK-L1A products only

comment:11 Changed 4 years ago by Even Rouault

Are there links to public CSK-L1A datasets ?

comment:12 Changed 4 years ago by msavinaud

Yes, you can see the current problem with the two L1A samples provided by e-geos: ftp://ftp.e-geos.it/StripMap%20HIMAGE%20Interferometric%20pair%20-%20L%27Aquila/

The patch works on this two images.

comment:13 Changed 4 years ago by Even Rouault

I don't manage to access the FTP site. It requires login & password

comment:14 Changed 4 years ago by msavinaud

Though Firefox when I come from http://www.e-geos.it/products/cosmo.html I have only a message which indicate that the connection will use the login = cosmodemo and no authentication

comment:15 Changed 4 years ago by Even Rouault

Milestone: 1.10.2

I've integrated a variation around msavinaud's patch in r26503, with hopefully identical behaviour but reduced if else logic.

But, as suggested by Antonio and as far as I understand http://www.e-geos.it/products/pdf/csk-product%20handbook.pdf (page 30), it looks like the 2 bands are the I and Q components of the signal, so they should better be represented by one of GDAL complex data type. The above changeset is hopefully a first step in that direction.

comment:16 Changed 4 years ago by msavinaud

Thanks Even,

I agree with you about the fact that in this case, the type should be modified to a complex one. If you can point me an example about how to do that in a another driver, I can try to reproduce it in hdf one.

comment:17 Changed 4 years ago by Even Rouault

http://trac.osgeo.org/gdal/browser/trunk/gdal/frmts/tsx/tsxdataset.cpp uses CInt16 for example. There are CInt16, CInt32, CFloat32 and CFloat64 available in GDAL. The layout of a single value is real part and imaginary part packed. So for a CInt16, 16bits for the real part followed by 16bits for the imaginary part.

Changed 2 years ago by vf

Attachment: csk_l1a_as_complex.patch added

Patch to handle correctly CSK L1A imagery

comment:18 Changed 2 years ago by Even Rouault

vf,

I've tried the patch and it seems incomplete. gdalinfo on the .h5 file itself reports

  SUBDATASET_2_NAME=HDF5:"/home/even/gdal/data/hdf5/CSKS2_SCS_B_HI_09_HH_RA_SF_20090412050638_20090412050645.h5"://S01/SBI
  SUBDATASET_2_DESC=[21470x22380x2] //S01/SBI (16-bit integer)

I would have expected "[21470x22380x1] S01/SBI Complex 16-bit integer" to be displayed (1 band and Complex). I presume some of the logic to detect complex CSK L1A should be migrated in hdf5dataset.cpp itself, which is a bit annoying.

comment:19 in reply to:  18 Changed 2 years ago by vf

Even,

Yes, I understand your point. For me, the report of "SUBDATASET_2_DESC=[21470x22380x2] S01/SBI (16-bit integer)" is OK, due to when I want to work with a specified subset I call directly :

gdalinfo HDF5:"CSKS1_SCS_B_HI_24_VV_RD_SF_20130509205142_20130509205150.h5"://S01/SBI -nomd

and it reports

Driver: HDF5Image/HDF5 Dataset
Files: CSKS1_SCS_B_HI_24_VV_RD_SF_20130509205142_20130509205150.h5
Size is 13874, 19838
Coordinate System is `'
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,19838.0)
Upper Right (13874.0,    0.0)
Lower Right (13874.0,19838.0)
Center      ( 6937.0, 9919.0)
Band 1 Block=128x128 Type=CInt16, ColorInterp=Undefined

For me a proper way to display the subset description, changing the hdf5dataset.cpp logic, will be without 3rd dimension :

SUBDATASET_2_DESC=[21470x22380] //S01/SBI (Complex 16-bit integer)

Well, I will see what can be done.

Note: See TracTickets for help on using tickets.