Opened 8 years ago

Closed 6 years ago

#6269 closed defect (duplicate)

Threads and the VRT driver on top of GeoTIFF files

Reported by: mdione
Owned by: warmerdam
Priority: normal
Milestone:
Component: default
Version: 2.0.1
Severity: normal
Keywords: threads VRT virtual GeoTIFF
Cc:

Description (last modified by mdione)

I'm using a stack that includes Python2, mapnik3 and gdal2 (the last two compiled by hand from the latest releases, plus (lib)tiff, libgeotiff and python-mapnik) for rendering a map that uses 3 layers for creating the background: terrain coloring, hillshade and slopeshade.

The original data comes from the recently released SRTM1 from here:

http://e4ftl01.cr.usgs.gov//MODV6_Dal_D/SRTM/SRTMGL1.003/2000.02.11/

I'm only using 934 files covering most of Europe (except those from the last batch released, which include the south of Italy, Greece and Turkey). Because of the number of files and their resolution, instead of creating a huge GeoTIFF file I decided to use a VRT file on top of them, one for each layer.

The three layers are generated with gdaldem. Initially I used these options:

-co BIGTIFF=YES -co TILED=YES -co COMPRESS=LZW

The VRT files seem to work fine in single-threaded mode: I also use gdalwarp to generate 3 files from them at 8x lower resolution:

gdalwarp -co BIGTIFF=YES -co TILED=YES -co COMPRESS=LZW -tr 0.002222222222224 -0.002222222222224 big.vrt small.tif
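As a sanity check on the -tr value above, here is a minimal sketch of the arithmetic. It assumes the native pixel size of the SRTM1 tiles is one arc-second, reported in rounded form as 0.000277777777778 degrees; that figure is an assumption, not something stated in this ticket.

```python
# Assumption: the native pixel size is one arc-second (1/3600 degree),
# in the rounded form 0.000277777777778. This value is not from the ticket.
NATIVE_RES_DEG = 0.000277777777778
FACTOR = 8  # "reducing the resolution 8x"


def reduced_resolution(native=NATIVE_RES_DEG, factor=FACTOR):
    """Return the target pixel size, in degrees, for the reduced raster."""
    return native * factor


target = reduced_resolution()
# Build the -tr arguments in the same shape as the gdalwarp call above.
print("-tr %.15f -%.15f" % (target, target))
```

Under that assumption the result agrees with the 0.002222222222224 used in the command.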

Attached you will find the Makefile I use to generate them.

That is where my problems start. The Python2 script (a heavily modified version of OSM's generate-tiles.py, https://github.com/StyXman/elevation/blob/3b3f530c050f57b0fdf40a975805a96a6c376288/generate_tiles.py) uses threads for rendering, and I'm using 4 on my 4-core machine.

When the script starts using the VRT files (from zoom level 7; before that, the small versions are used), it starts complaining about problems reading individual files. The files it complains about are not always the same, and I have regenerated them just in case. The complaints started out like this:

ERROR 1: LZWDecode:Wrong length of decoded string: data probably corrupted at scanline 1024
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: /home/mdione/src/projects/osm/data/height/datafiles/N37W004-terrain.tif, band 1: IReadBlock failed at X offset 9, Y offset 1
[...]
Segmentation fault

Because the complaint included LZWDecode, I decided to regenerate the files of the region I'm testing on (the Iberian peninsula) without -co COMPRESS=LZW, but the errors persisted:

ERROR 1: TIFFFillTile:Seek error at row 1280, col 1280, tile 51
ERROR 1: TIFFReadEncodedTile() failed.
ERROR 1: /home/mdione/src/projects/osm/data/height/datafiles/N40W004-terrain.tif, band 1: IReadBlock failed at X offset 6, Y offset 3
[...]
Segmentation fault

I also regenerated them without -co TILED=YES, just in case, but now the error has somewhat mutated:

Bus error

I ran the script under gdb and got this stack trace:

Program received signal SIGBUS, Bus error.
[Switching to Thread 0x7fffb4c89700 (LWP 941)]
_int_malloc (av=av@entry=0x7fffb0000020, bytes=bytes@entry=80) at malloc.c:3483
3483    malloc.c: No such file or directory.
(gdb) bt
#0  _int_malloc (av=av@entry=0x7fffb0000020, bytes=bytes@entry=80) at malloc.c:3483
#1  0x00007ffff6f6da3e in __GI___libc_malloc (bytes=80) at malloc.c:2895
#2  0x00007ffff2b4bae8 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007fffe93f10e6 in GDALRasterBand::GetLockedBlockRef(int, int, int) () from /home/mdione/local/lib/libgdal.so.20
#4  0x00007fffe94122e2 in GDALRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#5  0x00007fffe90ec03a in GTiffRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#6  0x00007fffe93eb09f in GDALProxyRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#7  0x00007fffe93ef9c3 in GDALRasterBand::RasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#8  0x00007fffe936775a in VRTSimpleSource::RasterIO(int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#9  0x00007fffe9361a81 in VRTSourcedRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#10 0x00007fffe93ef9c3 in GDALRasterBand::RasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) () from /home/mdione/local/lib/libgdal.so.20
#11 0x00007fffe9ff7ffa in gdal_featureset::get_feature(mapnik::query const&) () from /home/mdione/local/lib/mapnik/3.0/input/gdal.input
#12 0x00007fffe9ffa0f1 in gdal_featureset::next() () from /home/mdione/local/lib/mapnik/3.0/input/gdal.input
#13 0x00007ffff3c42743 in mapnik::feature_style_processor<mapnik::agg_renderer<mapnik::image<mapnik::rgba8_t>, mapnik::label_collision_detector4> >::render_style(mapnik::agg_renderer<mapnik::image<mapnik::rgba8_t>, mapnik::label_collision_detector4>&, mapnik::feature_type_style const*, mapnik::rule_cache const&, std::shared_ptr<mapnik::Featureset>, mapnik::proj_transform const&) ()
   from /home/mdione/local/lib/libmapnik.so.3.0
#14 0x00007ffff3c43101 in mapnik::feature_style_processor<mapnik::agg_renderer<mapnik::image<mapnik::rgba8_t>, mapnik::label_collision_detector4> >::render_material(mapnik::layer_rendering_material const&, mapnik::agg_renderer<mapnik::image<mapnik::rgba8_t>, mapnik::label_collision_detector4>&) () from /home/mdione/local/lib/libmapnik.so.3.0
#15 0x00007ffff3c44dbc in mapnik::feature_style_processor<mapnik::agg_renderer<mapnik::image<mapnik::rgba8_t>, mapnik::label_collision_detector4> >::apply(double) () from /home/mdione/local/lib/libmapnik.so.3.0
#16 0x00007ffff4db08c5 in void agg_renderer_visitor_1::operator()<mapnik::image<mapnik::rgba8_t> >(mapnik::image<mapnik::rgba8_t>&) ()
   from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#17 0x00007ffff4db0ba7 in render(mapnik::Map const&, mapnik::image_any&, double, unsigned int, unsigned int) ()
   from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#18 0x00007ffff4dbd764 in boost::python::detail::caller_arity<2u>::impl<void (*)(mapnik::Map const&, mapnik::image_any&), boost::python::default_call_policies, boost::mpl::vector3<void, mapnik::Map const&, mapnik::image_any&> >::operator()(_object*, _object*) () from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#19 0x00007ffff3496c9d in boost::python::objects::function::call(_object*, _object*) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#20 0x00007ffff3496e88 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#21 0x00007ffff349eef3 in boost::python::detail::exception_handler::operator()(boost::function0<void> const&) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#22 0x00007ffff4dba413 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::python::detail::translate_exception<std::runtime_error, void (*)(std::runtime_error const&)>, boost::_bi::list3<boost::arg<1>, boost::arg<2>, boost::_bi::value<void (*)(std::runtime_error const&)> > >, bool, boost::python::detail::exception_handler const&, boost::function0<void> const&>::invoke(boost::detail::function::function_buffer&, boost::python::detail::exception_handler const&, boost::function0<void> const&) () from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#23 0x00007ffff349eec8 in boost::python::detail::exception_handler::operator()(boost::function0<void> const&) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#24 0x00007ffff4dba343 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::python::detail::translate_exception<mapnik::value_error, void (*)(mapnik::value_error const&)>, boost::_bi::list3<boost::arg<1>, boost::arg<2>, boost::_bi::value<void (*)(mapnik::value_error const&)> > >, bool, boost::python::detail::exception_handler const&, boost::function0<void> const&>::invoke(boost::detail::function::function_buffer&, boost::python::detail::exception_handler const&, boost::function0<void> const&) () from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#25 0x00007ffff349eec8 in boost::python::detail::exception_handler::operator()(boost::function0<void> const&) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#26 0x00007ffff4dba273 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::python::detail::translate_exception<std::out_of_range, void (*)(std::out_of_range const&)>, boost::_bi::list3<boost::arg<1>, boost::arg<2>, boost::_bi::value<void (*)(std::out_of_range const&)> > >, bool, boost::python::detail::exception_handler const&, boost::function0<void> const&>::invoke(boost::detail::function::function_buffer&, boost::python::detail::exception_handler const&, boost::function0<void> const&) () from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#27 0x00007ffff349eec8 in boost::python::detail::exception_handler::operator()(boost::function0<void> const&) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#28 0x00007ffff4dba1a3 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::python::detail::translate_exception<std::exception, void (*)(std::exception const&)>, boost::_bi::list3<boost::arg<1>, boost::arg<2>, boost::_bi::value<void (*)(std::exception const&)> > >, bool, boost::python::detail::exception_handler const&, boost::function0<void> const&>::invoke(boost::detail::function::function_buffer&, boost::python::detail::exception_handler const&, boost::function0<void> const&) () from /home/mdione/.local/lib/python2.7/site-packages/mapnik-0.1-py2.7-linux-x86_64.egg/mapnik/_mapnik.so
#29 0x00007ffff349ecad in boost::python::handle_exception_impl(boost::function0<void>) () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0
#30 0x00007ffff3494079 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py27.so.1.58.0

(I removed the rest of it because it just goes into Python2's code).

Then I ran my script with a single thread and it didn't complain at all. This, together with the facts that gdalwarp could read the whole VRT and generate the small version, that the files it complains about are random on each run, and that the files have been regenerated several times with different options, tells me that this is a problem related to threads.
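One cheap way to test the thread hypothesis without dropping to a single thread is to keep all 4 workers but serialize the rendering call behind a lock. The sketch below is only an illustration: render_tile is a hypothetical stand-in for the real mapnik render call, and the counters exist only to show that at most one thread is inside the guarded section at a time.

```python
import threading

render_lock = threading.Lock()
active = 0       # threads currently inside the guarded section
max_active = 0   # highest concurrency observed inside it


def render_tile(tile):
    """Hypothetical stand-in for the real per-tile render call."""
    global active, max_active
    with render_lock:  # only one thread at a time reaches the raster reads
        active += 1
        max_active = max(max_active, active)
        result = ("rendered", tile)  # placeholder for the real work
        active -= 1
    return result


def worker(tiles, out):
    for tile in tiles:
        out.append(render_tile(tile))


def run(n_threads=4, n_tiles=20):
    out = []
    # Split the tile list between the worker threads.
    chunks = [list(range(i, n_tiles, n_threads)) for i in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(chunk, out))
               for chunk in chunks]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return out


results = run()
```

If the errors disappear with the lock in place and return without it, that points at concurrent access to shared state rather than at the data files.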

I'm open to digging deeper into this bug, but I need suggestions on how to instrument gdal's code to track it down; running it under gdb or valgrind is quite time-consuming.
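A lighter first step than gdb or valgrind might be GDAL's own debug output. The sketch below assumes the CPL_DEBUG and CPL_LOG configuration options behave in this build as they do in the GDAL documentation, and that they are picked up from the environment; the log file name is a hypothetical placeholder.

```python
import os


def enable_gdal_debug(log_path="gdal-debug.log", env=os.environ):
    """Ask GDAL for debug messages, written to log_path.

    Assumptions: CPL_DEBUG=ON turns on debug messages and CPL_LOG redirects
    them to a file; log_path is a made-up name for illustration.
    """
    env["CPL_DEBUG"] = "ON"
    env["CPL_LOG"] = log_path
    return env


# Call this before the rendering threads are started.
enable_gdal_debug()
```

The debug messages should at least show which datasets are being opened and closed around the time of the first error.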

Attachments (1)

Makefile (4.8 KB ) - added by mdione 8 years ago.
Makefile for generating the different layers.


Change History (7)

by mdione, 8 years ago

Attachment: Makefile added

Makefile for generating the different layers.

comment:1 by mdione, 8 years ago

I forgot to mention: you can find me in #gdal as StyXman if you want to help me in a more interactive way.

comment:2 by Even Rouault, 8 years ago

See "Multi-threading issues" at the bottom of http://gdal.org/gdal_vrttut.html. In particular, you should try setting VRT_SHARED_SOURCE=0.
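For a script that reaches GDAL only through mapnik, one way to try this is to set the option as an environment variable, since GDAL falls back to the environment for configuration options that were not set through its API. This is a minimal sketch, assuming the GDAL build in use honours VRT_SHARED_SOURCE; the helper name is made up for illustration.

```python
import os


def disable_vrt_shared_sources(env=os.environ):
    """Set VRT_SHARED_SOURCE=0 so VRT sources are not opened in shared mode.

    Assumption: the GDAL build in use honours this option.
    """
    env["VRT_SHARED_SOURCE"] = "0"
    return env


# Run this at the top of the script, before any VRT file is opened and
# before the rendering threads are started.
disable_vrt_shared_sources()
```

Exporting the variable in the shell before launching the script (VRT_SHARED_SOURCE=0 followed by the usual command line) should have the same effect.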

comment:3 by mdione, 8 years ago

Description: modified (diff)

comment:4 by Even Rouault, 8 years ago

@mdione Did you try VRT_SHARED_SOURCE=0 ?

comment:5 by Jukka Rahkonen, 6 years ago

@mdione, do you still have problems with this?

comment:6 by Even Rouault, 6 years ago

Resolution: duplicate
Status: new → closed