Opened 14 years ago

Closed 9 years ago

Last modified 9 years ago

#3225 closed defect (fixed)

Potential - small - race condition in gdalrasterband.cpp ?

Reported by: Even Rouault Owned by: warmerdam
Priority: normal Milestone: 2.0.0
Component: default Version: unspecified
Severity: normal Keywords:
Cc:

Description

Frank,

While reviewing #3224, I ended up reviewing the current implementation in gdalrasterband.cpp and (I think) I've discovered a (probably very hard-to-hit) race condition. The scenario is Thread 1 flushing a block (from FlushCacheBlock() for example) while Thread 2 calling TryGetLockedBlockRef() on the same block.

Below a potential executation flow that could lead to crashes :

Thread 1                                                                Thread 2

FlushBlock()                                                            TryGetLockedBlockRef()

1000:  GDALRasterBlock::SafeLockBlock( papoBlocks + nBlockIndex );
                                                                        1117: GDALRasterBlock::SafeLockBlock( papoBlocks + nBlockIndex );
                                                                        1119: return papoBlocks[nBlockIndex];
1002:  poBlock = papoBlocks[nBlockIndex];                               
1003:  papoBlocks[nBlockIndex] = NULL;
.....
1049:  delete poBlock;
                                                                        XXXX : uses non value returned by line 1119, which is now destroyed

The scope of the mutex in SafeLockBlock() is not sufficient. The same applies for the sub-blocking case, so there should be a mutex protecting lines 1000-1003, 1014-1029, 1117-1119 and 1125-1142

Attachments (2)

gdalrasterband.cpp.with_sleeps (2.4 KB ) - added by Even Rouault 14 years ago.
Hacked version that adds 2 sleeps
hit_bug_3225.c (1.1 KB ) - added by Even Rouault 14 years ago.

Download all attachments as: .zip

Change History (5)

comment:1 by Even Rouault, 14 years ago

Yes, I've confirmed that there's an issue with the attached version of gdalrasterband.cpp that is patched to add sleeps at strategic places.

Here's the output of a run of hit_bug_3225.c :

$  ./hit_bug_3225.exe
cache fill
waiting at line 1211
end of waiting at line 1213
using 00000000
end of cache fill
buf[0]=107

begin thread
waiting at line 1010
re read block
waiting at line 1211
end of waiting at line 1012
deleting 01076F88
ERROR 7: Assertion `nLockCount == 0' failed
in file `gdalrasterblock.cpp', line 282

by Even Rouault, 14 years ago

Hacked version that adds 2 sleeps

by Even Rouault, 14 years ago

Attachment: hit_bug_3225.c added

comment:2 by Even Rouault, 9 years ago

Milestone: 2.0
Resolution: fixed
Status: newclosed

I believe this hole has been closed by the work done in GDAL 2.0. GDALRasterBand::FlushBlock() is no longer called from GDALRasterBlock. Instead the new GDALRasterBand::UnreferenceBlock() is called by GDALRasterBlock and always with the RB mutex taken.

comment:3 by Even Rouault, 9 years ago

Milestone: 2.02.0.0

Milestone renamed

Note: See TracTickets for help on using tickets.