Opened 12 years ago

Closed 8 years ago

#4777 closed enhancement (wontfix)

Ruby Bindings need a proper close() method

Reported by: tylerjohnst Owned by: jimk
Priority: normal Milestone:
Component: RubyBindings Version: 1.9.1
Severity: normal Keywords: ruby
Cc:

Description

Currently there is no way to close an open data source other than setting it to nil and running the garbage collector manually. This is not very Ruby-like. The SWIG bindings should expose a close() method to dereference the open data source.

Attachments (1)

gdal_close.patch (436 bytes) - added by jakob 11 years ago.
Patch that adds a close() method to the Dataset


Change History (9)

by jakob, 11 years ago

Attachment: gdal_close.patch added

Patch that adds a close() method to the Dataset

comment:1 by jakob, 11 years ago

I can definitely second that! I am not even sure that running the GC will deterministically close the objects.

comment:2 by jimk, 11 years ago

First off, this patch impacts all SWIG bindings, not just ruby.

I ran a somewhat complicated case that loops over opening hundreds of TIFFs. The dataset is defined inside the loop, so it goes out of scope after each iteration. Inside the loop it gets band 1 and prints out XSize, YSize, etc. from the band. I observed that the memory usage of the Ruby process goes up initially as each image is opened, but then levels off at about 130MB, indicating that the garbage collector is doing its job.

Putting a close (from the patch) at the end of the loop causes memory use to increase more slowly (approximately 1MB per 16 images vs. 1MB per 4 images). But eventually I get a segmentation fault when the garbage collector tries to collect the already closed object. Putting a GC.start instead of the close at the end of the loop keeps memory at 58MB and causes no crashes.

A second test case just opens a dataset over and over and does nothing else. It crashes once it runs out of file handles, because the GC never runs (not enough memory is used to trigger it). However, inserting a GC.start into that loop eliminates the problem of running out of file handles. This isn't ideal, but the GC isn't aware of file handle pressure.

require 'gdal/gdal'

while true
  ds = Gdal::Gdal.open('some.tif')
  # Drop the only reference to the dataset.
  ds = nil
  # Force a collection so the file handle is released.
  GC.start
end

Another case where the patch causes a segmentation fault is when the Ruby code uses the dataset (or a related band) after it has been closed.

To work correctly, I think the "Close" function would have to mark the object as closed, and then every other SWIG call into GDAL would need to verify that the underlying dataset(s) are not closed before calling into GDAL. That includes calls on a band or on other, more complicated objects (gdalwarp comes to mind). This seems like a lot of effort to try to outsmart the garbage collector.
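
The guard described above can be sketched in plain Ruby. This is only an illustration of the idea, not the attached patch and not part of the GDAL bindings: FakeHandle, GuardedDataset, ClosedError, and raster_x_size are hypothetical stand-ins, and a real fix would have to live in the SWIG-generated wrappers and also cover bands and other objects derived from the dataset.

```ruby
# Hypothetical stand-in for a native dataset handle; not the real GDAL binding.
class FakeHandle
  attr_reader :released

  def initialize
    @released = false
  end

  def x_size
    512
  end

  # Simulates freeing the underlying native resource.
  def release
    @released = true
  end
end

# Raised when a closed dataset is used (hypothetical error class).
class ClosedError < StandardError; end

# Wrapper that records the closed state and checks it before every call.
class GuardedDataset
  def initialize(handle)
    @handle = handle
    @closed = false
  end

  def closed?
    @closed
  end

  # Idempotent: a second close (or a later finalizer) must not free twice.
  def close
    return if @closed

    @handle.release
    @closed = true
  end

  def raster_x_size
    ensure_open!
    @handle.x_size
  end

  private

  def ensure_open!
    raise ClosedError, 'dataset is closed' if @closed
  end
end

handle = FakeHandle.new
ds = GuardedDataset.new(handle)
width = ds.raster_x_size
ds.close
ds.close # safe no-op

# Use after close raises a Ruby exception instead of crashing the process.
use_after_close =
  begin
    ds.raster_x_size
    :no_error
  rescue ClosedError
    :closed_error
  end
```

In this sketch the two failure modes reported above (collecting an already closed object, and using a dataset after close) turn into a no-op and a Ruby exception respectively, at the cost of a check in every wrapped method.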

comment:3 by tylerjohnst, 11 years ago

The main problem I had was that GC.start is asynchronous, and with Ruby you have little to no control over the garbage collector (you can manually start and stop it). If you are using a long-running process (a Ruby on Rails server or worker process) and creating files with GDAL, you end up with things still referenced. So if you manipulate a data source and then serve it to the destination, it won't be written to disk until the GC has completed. Poking around the Python code, I noticed that they do in fact have a working close method. I think that might just need to be ported to the Ruby implementation.

comment:4 by jimk, 11 years ago

Huh... that is tricky.

I am not seeing a close method on dataset in python in the code. Also, the docs say to simply set dataset = None. What am I missing?

comment:5 by tylerjohnst, 11 years ago

My apologies, it's been a while since I looked through the Python tests. I guess it didn't have it. There may be something in the Ruby C API that allows for easy dereferencing of objects, but I don't know that end of it well enough.

comment:6 by jimk, 11 years ago

There is a C API to run the GC, but I don't think that would be any better than GC.start.

comment:7 by fxthomas, 10 years ago

Unlike Ruby, Python uses reference counting, so objects are destroyed as soon as you set them to None (provided no other references remain).

In practice this means that creating datasets with the Ruby bindings is unusable. This is an example of what I am trying to do:

   # Create the output dataset
   driver = Gdal::Ogr::get_driver_by_name 'KML'
   ds = driver.create_data_source output_path
   ly = ds.create_layer(...)
   ly.create_feature(...)
   ly.create_feature(...)
   ds = nil
   ly = nil
   GC.start

   # Trying to read from the newly-created file randomly prints truncated content
   File.open output_path do |f|
      puts f.read
   end
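
If the bindings had a working close, the create-then-read sequence above could be wrapped in a block so that the data source is always closed before the file is read back. The sketch below shows that pattern only; BufferedSource is a hypothetical class standing in for an OGR data source (it buffers writes in memory and writes them out on close), and with_source and add_feature are likewise invented names, not GDAL API.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a data source that buffers output until closed.
class BufferedSource
  def initialize(path)
    @path = path
    @buffer = []
    @closed = false
  end

  def add_feature(text)
    raise IOError, 'source is closed' if @closed

    @buffer << text
  end

  # Writes everything to disk; calling it twice is harmless.
  def close
    return if @closed

    File.write(@path, @buffer.join("\n"))
    @closed = true
  end
end

# Yields an open source and guarantees close, even if the block raises.
def with_source(path)
  source = BufferedSource.new(path)
  yield source
ensure
  source&.close
end

content = nil
Dir.mktmpdir do |dir|
  output_path = File.join(dir, 'out.txt')
  with_source(output_path) do |src|
    src.add_feature('feature 1')
    src.add_feature('feature 2')
  end
  # The file is complete here without involving the garbage collector.
  content = File.read(output_path)
end
```

This mirrors how File.open with a block behaves in Ruby: the resource is released when the block exits, not when the collector eventually runs.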

comment:8 by Even Rouault, 8 years ago

Resolution: wontfix
Status: new → closed

The Ruby bindings have been disabled in GDAL 2.0 due to the lack of a maintainer. Closing as wontfix.
