Opened 15 years ago

Last modified 15 years ago

#3023 closed defect

memory leak in hdf driver — at Initial Version

Reported by: vincentschut
Owned by: warmerdam
Priority: normal
Milestone:
Component: GDAL_Raster
Version: svn-trunk
Severity: normal
Keywords: HDF4
Cc: Kyle Shannon

Description

I'm afraid I've encountered a memory leak in the HDF driver. Context: because of massively parallel (cluster) processing, I'm reusing a Python instance for lots of jobs. Some of these jobs use GDAL to read and/or save data. In some cases, I saw the memory use of my Python worker processes grow with each job until the whole lot got killed, and that server has 8 GB of RAM :-). I've been able to track the leak down to the use of gdal.Open on an HDF (MODIS) dataset. I guess this means that either the GDAL HDF driver or libhdf4 has the leak.

Here is some Python code (it runs on Linux only, because of the way it reads the memory usage) that demonstrates a leak when opening an HDF file and no leak when opening a TIFF file. Run it in a folder containing 'data.hdf' and 'data.tif', or change those names to an existing HDF and TIFF file.

Using 64-bit Linux, GDAL svn rev. 17228 (today), libhdf4.2r2, Python 2.6.2

========= python code =========

import os
from osgeo import gdal

def getmemory():
    proc_status = '/proc/%d/status' % os.getpid()
    scale = {'kB': 1024.0, 'mB': 1024.0*1024.0,
             'KB': 1024.0, 'MB': 1024.0*1024.0}
    v = open(proc_status).read()
    i = v.index('VmSize:')
    v = v[i:].split(None, 3)
    return (int(v[1]) * scale[v[2]])

nFiles = 100

m0 = getmemory()
print 'memory usage before:', m0
print
print nFiles, 'times the same hdf file'
for i in range(nFiles):
    gdal.OpenShared('data.hdf')

m1 = getmemory()
print 'memory usage now:', m1, ' difference:', m1-m0
print
print nFiles, 'times the same tif file'
for i in range(nFiles):
    gdal.OpenShared('data.tif')

m2 = getmemory()
print 'memory usage now:', m2, ' difference:', m2-m1
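For anyone who wants to repeat the measurement on a newer Python, the same /proc-based check can be wrapped in a small helper. This is only a sketch in Python 3 syntax, not part of the original test: the names parse_status_bytes, current_memory and memory_growth are made up for illustration, and the option to read a field other than VmSize (for example VmRSS) is an assumption added here. Like the script above, it works on Linux only.

```python
import os

# Unit suffixes as in the original script; the Linux kernel reports "kB".
SCALE = {"kB": 1024, "KB": 1024, "mB": 1024 ** 2, "MB": 1024 ** 2}


def parse_status_bytes(status_text, field="VmSize"):
    """Return one field of a /proc/<pid>/status dump, converted to bytes."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            # Lines look like "VmSize:    204800 kB"
            _, value, unit = line.split()[:3]
            return int(value) * SCALE[unit]
    raise KeyError(field)


def current_memory(field="VmSize"):
    """Read the given memory field for the current process (Linux only)."""
    with open("/proc/%d/status" % os.getpid()) as f:
        return parse_status_bytes(f.read(), field)


def memory_growth(action, repeats=100, field="VmSize"):
    """Call action() `repeats` times and return the memory growth in bytes."""
    before = current_memory(field)
    for _ in range(repeats):
        action()
    return current_memory(field) - before


# Hypothetical usage, assuming GDAL is installed and the files exist:
#   from osgeo import gdal
#   print(memory_growth(lambda: gdal.Open("data.hdf")))
#   print(memory_growth(lambda: gdal.Open("data.tif")))
```

Comparing the two numbers for the HDF and the TIFF file should show the same pattern as the script above if the leak is present; a growth that keeps scaling with the repeats value points at a per-open leak rather than a one-time allocation.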

Change History (0)
