Opened 12 years ago

Closed 5 years ago

#4362 closed defect (wontfix)

Speed up gdal_retile.py with multithreading

Reported by: rfw
Owned by: warmerdam
Priority: normal
Milestone: closed_because_of_github_migration
Component: default
Version: unspecified
Severity: normal
Keywords:
Cc:

Description

Currently, gdal_retile.py runs with only one thread. This patch adds a -multi option to the script so that it can run with multiple threads. It also adds DataSetCache autosizing and removes some unnecessary clutter, namely the del statements and __del__ methods (the garbage collector can handle all the instances that I found in gdal_retile.py).

Attachments (3)

retile.2.diff (10.4 KB) - added by rfw 12 years ago.
retile.diff (10.4 KB) - added by rfw 12 years ago.
Patch for gdal_retile.py for multithreading
gdal_retile_r29478.patch (15.0 KB) - added by Even Rouault 9 years ago.
Updated patch based on r29478


Change History (9)

by rfw, 12 years ago

Attachment: retile.2.diff added

by rfw, 12 years ago

Attachment: retile.diff added

Patch for gdal_retile.py for multithreading

comment:1 by warmerdam, 12 years ago

Does the patch handle things gracefully on systems without multiprocessing? (Assuming such exist) What about Python versions? How long has this been part of the core?

comment:2 by rfw, 12 years ago

The patch will use dummy implementations of Pool and Lock if multiprocessing doesn't exist on the system. As for Python versions, I am relatively sure it will work with Python 2.6+, but I am unsure whether it will work on Python 3.0.
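For readers following along, the fallback described here could look roughly like the sketch below. It is not taken from the attached patch: the class names SerialPool and NoOpLock and the helper double are hypothetical, and the choice to import from multiprocessing.dummy is an assumption based on the later discussion in this ticket.

```python
# A minimal sketch (not the patch's code): fall back to serial stand-ins
# when the multiprocessing package cannot be imported.
# SerialPool, NoOpLock and double are hypothetical names used for illustration.


class NoOpLock(object):
    """Stand-in for Lock that does nothing; safe only in single-threaded runs."""

    def acquire(self, *args, **kwargs):
        return True

    def release(self):
        pass

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False


class SerialPool(object):
    """Stand-in for Pool that runs every task in the calling thread."""

    def __init__(self, processes=None):
        pass

    def map(self, func, iterable):
        return [func(item) for item in iterable]

    def close(self):
        pass

    def join(self):
        pass


try:
    # Thread-backed Pool and Lock with the multiprocessing API.
    from multiprocessing.dummy import Pool, Lock
except ImportError:
    Pool, Lock = SerialPool, NoOpLock


def double(value):
    return 2 * value


pool = Pool(2)
lock = Lock()
results = pool.map(double, [1, 2, 3])
pool.close()
pool.join()
with lock:
    total = sum(results)
```

Because both branches expose the same map/close/join and acquire/release surface, the calling code does not need to know which implementation it received.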

in reply to:  2 comment:3 by lpinner, 12 years ago

Replying to rfw:

but unsure if it will work on Python 3.0.

It won't (as is); see the note on Queue in the Python docs:

The Queue module has been renamed to queue in Python 3.0.
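A common way to cope with the rename is a guarded import, sketched below. This is a generic compatibility idiom rather than something taken from the attached patches, and the tile file names and the drain helper are made up for illustration.

```python
# A minimal sketch of a Python 2/3 compatible import for the queue module.
try:
    import queue  # Python 3 name
except ImportError:
    import Queue as queue  # Python 2 name


def drain(work_queue):
    """Remove and return every item currently in the queue, in FIFO order."""
    items = []
    while not work_queue.empty():
        items.append(work_queue.get())
    return items


tile_queue = queue.Queue()
for tile_name in ("tile_1_1.tif", "tile_1_2.tif"):  # hypothetical tile names
    tile_queue.put(tile_name)

drained = drain(tile_queue)
```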

comment:4 by ctl101, 9 years ago

Hi, I'm a novice to Linux, but on a good learning curve.

I really want to get gdal_retile.py working with multithreading using OSGeo4W (GDAL) on Windows.

I downloaded the retile files posted above (1 and 2; what is the difference between them?), learnt how to read the diff, and then patched my original gdal_retile.py on Linux. However, I can't get the modified versions (using either retile or retile.2) to recognise the -multi option. For example, if you have 4 cores, how should the option be written: "-multi 4", "-multi=4", or something else?

Also, given that I may have patched incorrectly (I get an error stating that the file has already been patched (the -R reversal one), and a hunk error under various conditions), might it be possible for someone to post a working, fully patched, multithread-enabled gdal_retile.py file?

Best regards

comment:5 by Even Rouault, 9 years ago

I've spent some time looking at and experimenting with this, and my conclusion is that the patch, in its current state, cannot bring the performance improvements one would expect. The main reason is that it uses multiprocessing.dummy, which is a thin wrapper around Python's threading module. In Python, because of the GIL (Global Interpreter Lock), only one thread can execute Python code at a time. As a result, only one thread currently does work at any moment, because the GIL is still held while GDAL code runs. One would need to tweak the SWIG bindings to release the lock when entering GDAL code and re-acquire it on return; http://matt.eifelle.com/2007/11/23/enabling-thread-support-in-swig-and-python/ gives some idea of how. A mutex was also missing in mosaic_info.getDataSet(), and I can't rule out that it is missing in other places too. So I'm uploading an updated version of the patch with this tiny improvement.

by Even Rouault, 9 years ago

Attachment: gdal_retile_r29478.patch added

Updated patch based on r29478

comment:6 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: new → closed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
