Opened 12 years ago

Closed 8 years ago

#2159 closed defect (worksforme)

HDF5 file causes an infinite loop

Reported by: mpd
Owned by: warmerdam
Priority: normal
Milestone:
Component: GDAL_Raster
Version: 1.5.0
Severity: normal
Keywords: HDF5
Cc: dnadeau, antonio, Mateusz Łoskot

Description

The HDF5 file at ftp://ftp.hdfgroup.uiuc.edu/hdf_files/hdf5/eos-from-hdf4/1km1720.h5 makes gdal_translate produce the following output:

F:\>apps\gdal_translate HDF5:"G:\BUGS\GDAL\HDF5\1km1720.h5"://Change_in_relative_responses_of_thermal_detectors out.tif
szFilenname G:\BUGS\GDAL\HDF5\1km1720.h5
HDF5: infinite loop closing library
      D,G,A,S,T,D,G,S,F,G,A,S,T,F,FD,P,D,F,FD,P,FD,P,FL,FL,FL,FL,FL,FL,FL,FL,FL,
FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL
,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,F
L,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,
FL,FL

GDAL svn revision: 13473

Platform: Windows

Compiler: Visual Studio 2005

HDF5 version: 1.6.6

Change History (9)

comment:1 Changed 12 years ago by warmerdam

Unfortunately the file is rather large for me to download for a quick check (350MB or so).

Does gdal_translate actually hang? Or do things work OK, with only the "infinite loop" message being generated?

It *seems* like this is a problem in the source file, and the HDF5 library is coping with it but reporting the message as a warning. Is that a likely analysis?

comment:2 Changed 12 years ago by Even Rouault

I've downloaded this dataset and get a crash at the opening of the dataset, using HDF5 1.6.6 too (under Linux 32bit)

valgrind gdalinfo HDF5:"1km1720.h5"://Change_in_relative_responses_of_thermal_detectors

==31110== Memcheck, a memory error detector.
==31110== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==31110== Using LibVEX rev 1732, a library for dynamic binary translation.
==31110== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==31110== Using valgrind-3.2.3-Debian, a dynamic binary instrumentation framework.
==31110== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==31110== For more details, rerun with: -v
==31110==
==31110== Invalid write of size 1
==31110==    at 0x4B831AC: H5D_select_mscat (H5Dselect.c:291)
==31110==    by 0x4B72503: H5D_contig_read (H5Dio.c:1301)
==31110==    by 0x4B70A26: H5D_read (H5Dio.c:843)
==31110==    by 0x4B6F8EE: H5Dread (H5Dio.c:592)
==31110==    by 0x4147E3A: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:544)
==31110==    by 0x41497C6: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:445)
==31110==    by 0x4263303: GDALOpen (gdaldataset.cpp:1774)
==31110==    by 0x8049D5C: main (gdalinfo.c:129)
==31110==  Address 0x6328460 is 0 bytes after a block of size 640 alloc'd
==31110==    at 0x4021AA4: calloc (vg_replace_malloc.c:279)
==31110==    by 0x42911E3: VSICalloc (cpl_vsisimple.cpp:297)
==31110==    by 0x428088B: CPLCalloc (cpl_conv.cpp:80)
==31110==    by 0x4147D90: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:533)
==31110==    by 0x41497C6: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:445)
==31110==    by 0x4263303: GDALOpen (gdaldataset.cpp:1774)
==31110==    by 0x8049D5C: main (gdalinfo.c:129)
==31110==
==31110== Invalid read of size 4
==31110==    at 0x4C5091D: H5T_cmp (H5T.c:3423)
==31110==    by 0x4C520D4: H5T_path_find (H5T.c:3889)
==31110==    by 0x4B70847: H5D_read (H5Dio.c:831)
==31110==    by 0x4B6F8EE: H5Dread (H5Dio.c:592)
==31110==    by 0x4147E75: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:551)
==31110==    by 0x41497C6: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:445)
==31110==    by 0x4263303: GDALOpen (gdaldataset.cpp:1774)
==31110==    by 0x8049D5C: main (gdalinfo.c:129)
==31110==  Address 0x41024296 is not stack'd, malloc'd or (recently) free'd
==31110==
==31110== Process terminating with default action of signal 11 (SIGSEGV)
==31110==  Access not within mapped region at address 0x41024296
==31110==    at 0x4C5091D: H5T_cmp (H5T.c:3423)
==31110==    by 0x4C520D4: H5T_path_find (H5T.c:3889)
==31110==    by 0x4B70847: H5D_read (H5Dio.c:831)
==31110==    by 0x4B6F8EE: H5Dread (H5Dio.c:592)
==31110==    by 0x4147E75: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:551)
==31110==    by 0x41497C6: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:445)
==31110==    by 0x4263303: GDALOpen (gdaldataset.cpp:1774)
==31110==    by 0x8049D5C: main (gdalinfo.c:129)

While testing the HDF5 driver under valgrind, I also found an incorrect memory usage (a strcat on an uninitialized buffer, which I replaced with a strcpy) and fixed a few minor memory leaks. Committed in r13583.
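To illustrate that class of bug, here is a minimal sketch. It is not the code changed in r13583; `join_name` is a hypothetical helper invented for this example. The point is only that the first write into a freshly allocated buffer must be a strcpy (or the buffer must be NUL-terminated first), because strcat searches the existing contents for a terminator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the GDAL source): returns a newly
 * allocated string "<prefix>_<name>", or NULL on allocation failure.
 * The caller frees the result. */
char *join_name(const char *prefix, const char *name)
{
    /* prefix + '_' + name + terminating NUL */
    size_t len = strlen(prefix) + 1 + strlen(name) + 1;
    char *buf = malloc(len);    /* contents are indeterminate here */
    if (buf == NULL)
        return NULL;

    /* Wrong: strcat(buf, prefix) would scan uninitialized memory
     * looking for a NUL and could write past the allocation. */
    strcpy(buf, prefix);        /* first write ignores old contents */
    strcat(buf, "_");           /* buf is now a valid C string */
    strcat(buf, name);
    return buf;
}
```

Calling `join_name("HDF5", "band")` yields the string `HDF5_band`; using calloc instead of malloc would also make the initial strcat safe, at the cost of zeroing the whole buffer.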

comment:3 Changed 12 years ago by warmerdam

Keywords: HDF5 added

Even,

What is the conclusion on this? Is there still an outstanding bug?

comment:4 Changed 12 years ago by Even Rouault

I think there's still a bug, but probably more in the HDF5 library than in GDAL.

comment:5 Changed 11 years ago by Even Rouault

Cc: dnadeau added

Or maybe not... I can still reproduce the crash with HDF5 1.8.1. Here's the stack trace I get:

==20169== 
==20169== Source and destination overlap in memcpy(0x6F799A0, 0x6F7A060, 440104)
==20169==    at 0x4024B12: memcpy (mc_replace_strmem.c:402)
==20169==    by 0x509B838: H5D_scatter_mem (H5Dscatgath.c:336)
==20169==    by 0x509C86F: H5D_scatgath_read (H5Dscatgath.c:551)
==20169==    by 0x506C82F: H5D_contig_read (H5Dcontig.c:412)
==20169==    by 0x5088AC6: H5D_read (H5Dio.c:406)
==20169==    by 0x50873BE: H5Dread (H5Dio.c:174)
==20169==    by 0x420B28B: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:547)
==20169==    by 0x420C2CA: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:448)
==20169==    by 0x4350B1C: GDALOpen (gdaldataset.cpp:1922)
==20169==    by 0x8049F1E: main (gdalinfo.c:147)
==20169== 
==20169== Invalid write of size 1
==20169==    at 0x4024BA8: memcpy (mc_replace_strmem.c:402)
==20169==    by 0x509B838: H5D_scatter_mem (H5Dscatgath.c:336)
==20169==    by 0x509C86F: H5D_scatgath_read (H5Dscatgath.c:551)
==20169==    by 0x506C82F: H5D_contig_read (H5Dcontig.c:412)
==20169==    by 0x5088AC6: H5D_read (H5Dio.c:406)
==20169==    by 0x50873BE: H5Dread (H5Dio.c:174)
==20169==    by 0x420B28B: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:547)
==20169==    by 0x420C2CA: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:448)
==20169==    by 0x4350B1C: GDALOpen (gdaldataset.cpp:1922)
==20169==    by 0x8049F1E: main (gdalinfo.c:147)
==20169==  Address 0x6f79c20 is 0 bytes after a block of size 640 alloc'd
==20169==    at 0x4021BDE: calloc (vg_replace_malloc.c:397)
==20169==    by 0x4392433: VSICalloc (cpl_vsisimple.cpp:290)
==20169==    by 0x43753DB: CPLCalloc (cpl_conv.cpp:80)
==20169==    by 0x420B1EE: HDF5ImageDataset::CreateProjections() (hdf5imagedataset.cpp:536)
==20169==    by 0x420C2CA: HDF5ImageDataset::Open(GDALOpenInfo*) (hdf5imagedataset.cpp:448)
==20169==    by 0x4350B1C: GDALOpen (gdaldataset.cpp:1922)
==20169==    by 0x8049F1E: main (gdalinfo.c:147)

Stepping through with ddd shows that the HDF5 library tries to copy 440104 bytes into a 10x16 float array (the 640-byte block that valgrind reports above).
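A size check before the read would catch this kind of mismatch (440104 bytes requested against 640 bytes allocated). The following is a minimal sketch under stated assumptions, not the fix applied in GDAL: `read_fits_buffer` is a hypothetical helper, and the element count and element size are plain parameters here. In real code they would presumably come from calls such as `H5Sget_simple_extent_npoints` and `H5Tget_size` on the dataspace and datatype actually passed to `H5Dread`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard (not part of GDAL or HDF5): returns 1 if reading
 * n_points elements of elem_size bytes each fits into a destination
 * buffer of buf_bytes bytes, and 0 otherwise. */
int read_fits_buffer(size_t n_points, size_t elem_size, size_t buf_bytes)
{
    /* Reject requests whose byte count would overflow size_t. */
    if (elem_size != 0 && n_points > SIZE_MAX / elem_size)
        return 0;
    return n_points * elem_size <= buf_bytes;
}
```

With the numbers from this ticket, `read_fits_buffer(160, 4, 640)` accepts a 10x16 array of 4-byte values, while `read_fits_buffer(110026, 4, 640)` (440104 bytes) is rejected, so the caller could report an error instead of letting the library write past the allocation.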

comment:6 Changed 11 years ago by antonio

Cc: antonio added

comment:7 Changed 9 years ago by Mateusz Łoskot

Cc: Mateusz Łoskot added

comment:8 Changed 8 years ago by antonio

It seems to be OK now:

Ubuntu 11.04 amd64 HDF5 v1.8.4 GDAL source:trunk@22735

$ gdal_translate HDF5:1km1720.h5://Change_in_relative_responses_of_thermal_detectors out.tif
Input file size is 10, 16
0Warning 1: Lost metadata writing to GeoTIFF ... too large to fit in tag.
Warning 1: Lost metadata writing to GeoTIFF ... too large to fit in tag.
...10...20...30...40...50...60...70...80...90...100 - done.

Also, no particular problem with valgrind:

==10253== 
==10253== HEAP SUMMARY:
==10253==     in use at exit: 6,585 bytes in 25 blocks
==10253==   total heap usage: 13,129 allocs, 13,104 frees, 9,541,448 bytes allocated
==10253== 
==10253== LEAK SUMMARY:
==10253==    definitely lost: 0 bytes in 0 blocks
==10253==    indirectly lost: 0 bytes in 0 blocks
==10253==      possibly lost: 0 bytes in 0 blocks
==10253==    still reachable: 6,585 bytes in 25 blocks
==10253==         suppressed: 0 bytes in 0 blocks
==10253== Reachable blocks (those to which a pointer was found) are not shown.
==10253== To see them, rerun with: --leak-check=full --show-reachable=yes
==10253== 
==10253== For counts of detected and suppressed errors, rerun with: -v
==10253== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 14 from 6)

comment:9 Changed 8 years ago by Even Rouault

Resolution: worksforme
Status: new → closed