Opened 15 years ago
Closed 15 years ago
#3023 closed defect (invalid)
memory leak in hdf driver
Reported by: | vincentschut | Owned by: | warmerdam |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | GDAL_Raster | Version: | svn-trunk |
Severity: | normal | Keywords: | HDF4 |
Cc: | Kyle Shannon |
Description
I'm afraid I've encountered a memory leak in the hdf driver. Context: because of massive parallel (cluster) processing, I'm reusing a python instance for lots of jobs. Some of these jobs use gdal to read and/or save data. In some cases, I saw the memory use of my python worker processes grow with each job, until the whole lot got killed. I've got 8 gig on that server :-) . I've been able to track down the leak to the use of gdal.Open on a hdf (modis) dataset. I guess this means that either the gdal hdf driver or libhdf4 has the leak.
Here is some python code (linux-only, due to the way memory usage is read) that demonstrates a leak when opening a hdf file but none when opening a tif. Run it in a folder with 'data.hdf' and 'data.tif' present, or change those names to an existing hdf and tif file.
Using 64bit linux, gdal svn rev. 17228 (today), libhdf4.2r2, python 2.6.2
========= python code =========
import os
from osgeo import gdal

def getmemory():
    proc_status = '/proc/%d/status' % os.getpid()
    scale = {'kB': 1024.0, 'mB': 1024.0*1024.0,
             'KB': 1024.0, 'MB': 1024.0*1024.0}
    v = open(proc_status).read()
    i = v.index('VmSize:')
    v = v[i:].split(None, 3)
    return (int(v[1]) * scale[v[2]])

nFiles = 100
m0 = getmemory()
print 'memory usage before:', m0
print
print nFiles, 'times the same hdf file'
for i in range(nFiles):
    gdal.OpenShared('data.hdf')
m1 = getmemory()
print 'memory usage now:', m1, ' difference:', m1-m0
print
print nFiles, 'times the same tif file'
for i in range(nFiles):
    gdal.OpenShared('data.tif')
m2 = getmemory()
print 'memory usage now:', m2, ' difference:', m2-m1
Attachments (1)
Change History (9)
comment:1 by , 15 years ago
Cc: | added |
---|
comment:2 by , 15 years ago
Component: | default → GDAL_Raster |
---|---|
Description: | modified (diff) |
Keywords: | HDF4 added |
Status: | new → assigned |
comment:3 by , 15 years ago
The leak is in opening the main hdf file, not in opening an SDS (try changing the above python snippet to open an SDS and you'll see no leak; open the main hdf file and there is a leak). For info on my os, library versions, etc., see the original report. It's there, really.
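For reference, a minimal sketch of how an SDS can be opened directly instead of via the container file. The helper name is my own, and the numeric-index form shown is one of the subdataset naming schemes GDAL's HDF4 driver uses; in practice the exact names are best taken from GetSubDatasets() on the container dataset.

```python
def hdf4_sds_name(path, index):
    # GDAL's HDF4 driver exposes plain SDS subdatasets under names
    # of the form HDF4_SDS:UNKNOWN:"<file>":<index>. This helper only
    # builds that string; listing names via GetSubDatasets() is safer.
    return 'HDF4_SDS:UNKNOWN:"%s":%d' % (path, index)

# Opening the SDS directly (bypassing the container) would then be:
#   gdal.OpenShared(hdf4_sds_name('data.hdf', 0))
```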
I ran 'valgrind --leak-check=full --show-reachable=yes gdalinfo data.hdf', here's the output (gdalinfo's output omitted):
==6686== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==6686== malloc/free: in use at exit: 6,801 bytes in 393 blocks.
==6686== malloc/free: 42,613 allocs, 42,220 frees, 15,364,923 bytes allocated.
==6686== For counts of detected errors, rerun with: -v
==6686== searching for pointers to 393 not-freed blocks.
==6686== checked 1,591,120 bytes.
==6686==
==6686== 200 bytes in 5 blocks are still reachable in loss record 1 of 3
==6686==    at 0x4C2391E: malloc (vg_replace_malloc.c:207)
==6686==    by 0x526860E: CPLCreateMutex (cpl_multiproc.cpp:722)
==6686==    by 0x52686E4: CPLCreateOrAcquireMutex (cpl_multiproc.cpp:119)
==6686==    by 0x5268742: CPLMutexHolder::CPLMutexHolder(void, double, char const*, int) (cpl_multiproc.cpp:63)
==6686==    by 0x5235AF2: GetGDALDriverManager (gdaldrivermanager.cpp:73)
==6686==    by 0x509C418: GDALAllRegister (gdalallregister.cpp:76)
==6686==    by 0x4025E5: main (gdalinfo.c:89)
==6686==
==6686== 256 bytes in 1 blocks are still reachable in loss record 2 of 3
==6686==    at 0x4C2391E: malloc (vg_replace_malloc.c:207)
==6686==    by 0x53DD02E: NC_reset_maxopenfiles (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x53DD096: NC_open (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x53CC2A6: SDstart (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x50E32EE: HDF4Dataset::Open(GDALOpenInfo*) (hdf4dataset.cpp:679)
==6686==    by 0x52303DA: GDALOpen (gdaldataset.cpp:2120)
==6686==    by 0x402702: main (gdalinfo.c:149)
==6686==
==6686== 6,345 bytes in 387 blocks are definitely lost in loss record 3 of 3
==6686==    at 0x4C2391E: malloc (vg_replace_malloc.c:207)
==6686==    by 0x541118B: VPgetinfo (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x54114D5: Vinitialize (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x53DB15E: NC_new_cdf (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x53DD134: NC_open (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x53CC2A6: SDstart (in /usr/local/lib/libgdal.so.1.13.0)
==6686==    by 0x5078B6A: EHopen (EHapi.c:374)
==6686==    by 0x50E3396: HDF4Dataset::Open(GDALOpenInfo*) (hdf4dataset.cpp:767)
==6686==    by 0x52303DA: GDALOpen (gdaldataset.cpp:2120)
==6686==    by 0x402702: main (gdalinfo.c:149)
==6686==
==6686== LEAK SUMMARY:
==6686==    definitely lost: 6,345 bytes in 387 blocks.
==6686==    possibly lost: 0 bytes in 0 blocks.
==6686==    still reachable: 456 bytes in 6 blocks.
==6686==    suppressed: 0 bytes in 0 blocks.
By the way, I *do* get 1410 bytes 'definitely lost' according to valgrind when running gdalinfo on an SDS instead of on a hdf file. However, in the real world (read: my processing chain) opening subdatasets does not increase my memory usage, while opening hdf files does.
comment:4 by , 15 years ago
I see something mangled my valgrind output... I'll attach it, because I don't know how to edit my previous comment to repair it.
comment:5 by , 15 years ago
From the valgrind trace, it would seem that the leak comes from SWopen() in the HDF-EOS code path. But I see a call to SWclose() a few lines below, so they look properly paired.
(Opening the HDF file itself and opening one of its subdatasets don't involve the same source file, so it is not so surprising that you get different results w.r.t. memory leaks.)
I've run: "valgrind --leak-check=full gdalinfo MISR_AM1_CGAS_MAY_2007_F15_0031.hdf", where this file is a HDF-EOS file coming from ftp://l4ftl01.larc.nasa.gov/MISR/MIL3MAE.004/2007.05.01/
--> No leak.
So, it might come from your particular file, the libhdf4 version, a 32/64 bit issue, ... My environment : Ubuntu 8.04, 32 bit, libhdf4g 4.1r4-21
If you could attach a (small) dataset with which you can reproduce the issue, or provide a link to it, that might help. Please also include the output of gdalinfo on it.
comment:6 by , 15 years ago
I've been able to test the same file(s) on a different system, also linux 64 bit but with a different hdflib release, which gives no leak. I'll investigate further and try to get the same hdf lib working on the faulty system, and report back. Thanks for investigating.
comment:7 by , 15 years ago
Compiled and installed hdf4.2r4 (the leaking version was 4.2r2) and rebuilt gdal. The leak seems gone; valgrind output:
==16306== LEAK SUMMARY:
==16306==    definitely lost: 0 bytes in 0 blocks.
==16306==    possibly lost: 0 bytes in 0 blocks.
==16306==    still reachable: 200 bytes in 5 blocks.
==16306==    suppressed: 0 bytes in 0 blocks.
In python I still see some memory growth, but only once, when opening the first file. Consecutive gdal.Open calls don't leak memory, so I hope this is just some initialization cost.
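One way to tell a one-time initialization cost apart from a genuine leak is to ignore the first interval between memory samples and check whether memory keeps growing afterwards; a minimal sketch (the helper name, return labels, and tolerance parameter are my own):

```python
def classify_growth(samples, tolerance=0):
    # samples: memory readings (e.g. VmSize) taken after each
    # gdal.Open call. Growth confined to the first interval suggests
    # one-time initialization; growth in later intervals suggests a leak.
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    if all(d <= tolerance for d in deltas):
        return 'flat'
    if any(d > tolerance for d in deltas[1:]):
        return 'leak'
    return 'one-time'
```

Feeding it the getmemory() values from the snippet in the original report, sampled once per loop iteration, would distinguish the two cases.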
Thanks for working on this with me. Suppose we can flag it solved for now. If I do still encounter leaks in my real-world processing I'll come back.
I have tried running the command:
under valgrind, and there are no apparent memory leaks according to valgrind. Are you working on linux? Can you try using valgrind to identify the leaks you are encountering?