HDF5 Identify does not handle files with user data at the beginning

In an HDF5 file, the superblock need not start at the beginning of the file; it has to be searched for at specific offsets. HDF5Dataset::Identify should probably use the H5Fis_hdf5 function to identify an HDF5 file rather than look for the signature in poOpenInfo->pabyHeader.

We have some NOAA data that does not work with GDAL because the superblock starts at offset 1024 (it is preceded by an XML block).

The issue is that Identify() is supposed to run fast and not open any file, which is what H5Fis_hdf5() would do. H5Fis_hdf5() calls H5F_locate_signature(), whose header comment has this interesting info:

 * Function:	H5F_locate_signature
 * Purpose:	Finds the HDF5 superblock signature in a file.	The signature
 *		can appear at address 0, or any power of two beginning with
 *		512.
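
For illustration, the offsets named in that comment can be probed without going through the HDF5 library. The following is only a minimal sketch under stated assumptions: FindHdf5Signature and its maxOffset parameter are hypothetical names, not part of GDAL or HDF5, and the loop simply reads 8 bytes at offset 0, then 512, 1024, 2048 and so on.

```cpp
#include <cstdio>
#include <cstring>

// The 8-byte HDF5 superblock signature: "\211HDF\r\n\032\n".
static const unsigned char kHdf5Signature[8] =
    { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };

// Hypothetical helper (not a GDAL or HDF5 function): returns the offset of
// the superblock signature, or -1 if it is not found at any probed offset
// up to and including maxOffset.
static long long FindHdf5Signature(const char *pszFilename, long long maxOffset)
{
    std::FILE *fp = std::fopen(pszFilename, "rb");
    if (fp == nullptr)
        return -1;

    long long result = -1;
    // Probe offset 0, then every power of two starting at 512.
    for (long long offset = 0; offset <= maxOffset;
         offset = (offset == 0) ? 512 : offset * 2)
    {
        unsigned char buf[8];
        // Note: fseek() takes a long, so very large offsets are not
        // portable with this simple sketch.
        if (std::fseek(fp, static_cast<long>(offset), SEEK_SET) != 0)
            break;
        if (std::fread(buf, 1, sizeof(buf), fp) != sizeof(buf))
            break;  // fewer than 8 bytes left: past the end of the file
        if (std::memcmp(buf, kHdf5Signature, sizeof(buf)) == 0)
        {
            result = offset;
            break;
        }
    }
    std::fclose(fp);
    return result;
}
```

This still opens the file, so it does not by itself address the concern that Identify() should be cheap. A variant restricted to the bytes already available in poOpenInfo->pabyHeader might avoid the extra I/O, assuming the header buffer is large enough to cover the user block.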

Do you have a link to that NOAA data? Perhaps we could call H5Fis_hdf5() only if the extension is .h5, if those files do use it?

I have checked in the change made to support the NOAA files in 1.8-esri (r22971). These HDF5 files start with an XML block, "<HDF_UserBlock>"; the Identify function looks for this and then calls H5Fis_hdf5. We use H5Fis_hdf5 to search for the superblock in this case to avoid making assumptions about the size of the XML block.

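    if( poOpenInfo->pabyHeader )
    {
        // Usual case: the superblock signature is at offset 0.
        if( memcmp(poOpenInfo->pabyHeader,achSignature,8) == 0 )
            return TRUE;

        // XML user block: let HDF5 locate the superblock (negative means error).
        if( memcmp(poOpenInfo->pabyHeader,"<HDF_UserBlock>",15) == 0 )
        {
            if( H5Fis_hdf5(poOpenInfo->pszFilename) > 0 )
                return TRUE;
        }
    }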

I will try to attach a sample to this ticket (checking whether we have permission to distribute samples).

r22973 /trunk/gdal/frmts/hdf5/hdf5dataset.cpp: HDF5: Fix to identify datasets whose header doesn't start with the typical signature but with some XML content (#4196)

