Opened 3 years ago

Last modified 22 months ago

#5556 reopened defect

VSIFFlushL is potential bottleneck on Windows

Reported by: Mateusz Łoskot Owned by: warmerdam
Priority: normal Milestone:
Component: default Version: svn-trunk
Severity: normal Keywords: performance
Cc: warmerdam, mateusz@…

Description

(It's not a real defect, but performance issue.)

VSIFFlushL on Windows is implemented in terms of FlushFileBuffers? which is an expansive routine

From How to Programmatically Improve File System Throughput:

FlushFileBuffers?(): This function will flush everything in the write-back cache, as it does not know what part of the cache belongs to your file. This can take a lot of time, depending on the cache size and the speed of the media. How necessary is it? There is a thread which goes through and writes out dirty pages, so it is likely not very necessary.

FlushFileBuffers? requests OS buffers to be written to file. fflush requests C run-time buffers to be written to OS, but actual commit to file may be delayed. So, fflush tends to be order of magnitude faster,

These differences may lead to significant performance differences in case a driver will have been updated from FILE* to VSILFILE*.

Change History (20)

comment:1 Changed 3 years ago by Even Rouault

Did you hit in practice such a performance problem ? From what I can see, there are not so many instances of calls to VSIFFlushL(), but one can be enough.

From the description, FlushFileBuffers?() would be a bit like fflush() + the costly fsync(). Any idea on an equivalent of fflush() for a handle gotten by CreateFile?(). Or perhaps a way of getting a FILE* from a HANDLE ? Or perhaps the implementation of VSIFFlushL() on Windows should be just empty, if the WriteFile?() API doesn't cache stuff on user-space ?

comment:2 Changed 3 years ago by Mateusz Łoskot

Yes, an experiment of patching MITAB to use VSI*L shows the problem very well. A simple conversion of SHP to TAB I tried gives: ~10 seconds (original) vs ~200 seconds (VSI*L).

Empty implementation may do. FILE* to HANDLE is possible as well, as per http://stackoverflow.com/a/5193795/151641

comment:3 Changed 3 years ago by Even Rouault

Hum, I'm not sure about the relevance of doing HANDLE -> int -> FILE* conversion and applying fflush() on that...

Perhaps the best would be to analyze if the fflush() are needed in MITAB at all. Presumably no.

comment:4 in reply to:  3 Changed 3 years ago by Mateusz Łoskot

Replying to rouault:

Perhaps the best would be to analyze if the fflush() are needed in MITAB at all.

Yes, that should be a general strategy to apply while updating a driver to VSI*L. So, with this ticket, I wanted to 1) point it out and archive 2) receive feedback about solutions I'm not aware of.

comment:5 Changed 3 years ago by warmerdam

Cc: warmerdam added

Mateusz, I'm curious why the mitab writing is even calling VSIFlushL()? Perhaps the core problem is that MITAB is abusing flushing?

comment:6 in reply to:  5 Changed 3 years ago by mpd

Replying to warmerdam:

Frank, mitab is calling fflush, but we have a patched version that we have converted to use VSI*L, hence the call to VSIFFlushL. (We have a patched version because Windows has a hard limit on the number of concurrent FILE* handles and we have a user that wants to open gazillions of TAB files at the same time.)

fflush would, of course, be needed after a write if the next operation is a read. If that is a possibility, then the call to VSIFFlushL needs to stay in case the FILE-based VSI*L class is in use (it never is for us). I don't know enough about mitab to know whether or not this is the case.

We felt that it was safer to leave the (modified) VSIFFlushL in place, but to make the Win32 HANDLE-based implementation a no-op (because there is no equivalent to fflush for HANDLEs). Even if the VSIFFlushL were removed from mitab, this change is probably still a good thing to do, for the sake of any other callers.

comment:7 Changed 3 years ago by Even Rouault

Actually if you alternate fwrite and fread, a fseek is sufficient to ensure that what needs to be "flushed" is (and remember a bug in the NITF driver if I remember well, it is actually needed). But this is in the POSIX world, not sure however in the Windows API world... So the no-op could potentially be a problem in some scenarios (Note: MITAB doesn't really yet support random-read-write scenarios, but this is work I should be tackling soon.)

comment:8 in reply to:  7 Changed 3 years ago by mpd

Replying to rouault:

Even,

fseek is indeed an alternative to flushing.

We could not find any reference to, or evidence that, either was needed with Win32 HANDLEs; presumably they do not have private buffers.

comment:9 Changed 3 years ago by warmerdam

Excellent, thanks for the investigation. Then the fix is MITAB moving to using seeks to ensure post write flushes before reading and no changes to VSI*L or the use of FlushFileBuffers? on windows for the flush implementation.

comment:10 in reply to:  9 Changed 3 years ago by mpd

Replying to warmerdam:

Excellent, thanks for the investigation. Then the fix is MITAB moving to using seeks to ensure post write flushes before reading and no changes to VSI*L or the use of FlushFileBuffers? on windows for the flush implementation.

I don't agree. Since mitab uses fflush directly at the moment everything is OK, so you could do nothing.

However, FlushFileBuffers? is not equivalent to fflush, and should only be used where fsync would be. I would recommend removing the call to FlushFileBuffers? as a defence against the unexpected side-effects (compared to fflush), should any driver start calling VSIFFlushL expecting fflush-like behaviour (and performance).

I will submit a mitab FILE->VSILFILE patch as a different case. It will replace fflush with VSIFFlushL to keep the desired behaviour in the POSIX world. On Windows, without the related change to remove FlushFileBuffers?, users will see a significant performance degradation.

As I said, I would fix that by simply removing the call to FlushFileBuffers? (this is what we have done locally). If you wanted to fix it by changing mitab to use seek(tell()), then that is fine too, but the FlushFileBuffers? call will still be there waiting to catch out the unwary.

comment:11 Changed 3 years ago by Even Rouault

The intent of VSIFFlushL() - I believe - is to make sure than anything that should be written is passed to the OS, so that potentially another process that would open the file would see the changes (maybe perhaps not immediately, I don't know). So you could potentially make a no-op implementation for Windows, but I think you would have to write some stress-test so that we can have some confidence in the fact that the WriteFile?() API offers that guarantee. If not, an alternative should be found.

In the MITAB case, I'm not sure why we want a flush to occur (except consistent write / read sequences). To be able to open the file with MapInfo? while still being opened by GDAL ? I doubt so. It is perhaps unneeded.

comment:12 in reply to:  11 ; Changed 3 years ago by mpd

Replying to rouault:

fwrite() on Windows uses WriteFile?() internally, but maintains its own buffer. fflush() on Windows simply forces the buffer to be passed to WriteFile?().

It is possible to force fflush() to call _commit(), either by adding 'c' to the fopen() to the mode parameter, or by adding COMMODE.OBJ to the linker input, but the default is not to call it. _commit() is where FlushFileBuffers?() is called.

Given all that, and assuming that I have analysed it correctly, I cannot see any circumstances in which FileFlushBuffers?() is an appropriate alternative to fflush().

comment:13 in reply to:  12 ; Changed 3 years ago by Even Rouault

Given all that, and assuming that I have analysed it correctly, I cannot see any circumstances in which FileFlushBuffers?() is an appropriate alternative to fflush().

Understood. The question is rather : when using WriteFile?(), do we need something that offers equivalent guarantees than fflush() provides, or is WriteFile?() alone sufficient ?

comment:14 in reply to:  13 Changed 3 years ago by mpd

Replying to rouault:

Understood. The question is rather : when using WriteFile?(), do we need something that offers equivalent guarantees than fflush() provides, or is WriteFile?() alone sufficient ?

I can't see anything in the fflush() implementation written using WriteFile?() to indicate that anything else is required.

comment:15 Changed 3 years ago by warmerdam

FWIW, I agree with Martin after all. VSIFlushL() only needs to offer equivelent to POSIX fflush() functionality, not a hard sync to disk. The windows FileFlushBuffers?() is more action than needed to meet the POSIX expectation.

Even, if you are ok removing it, then please go ahead.

comment:16 Changed 3 years ago by Mateusz Łoskot

Thanks Martin for delivering complete picture. Thanks Frank and Even for taking action.

comment:17 Changed 3 years ago by Even Rouault

Milestone: 2.0
Resolution: fixed
Status: newclosed

trunk r27491 "VSIWin32Handle::Flush(): no-op implementation is sufficient to offer same guarantee as POSIX fflush() (#5556)"

comment:18 Changed 2 years ago by Even Rouault

Resolution: fixed
Status: closedreopened

Martin, Mateusz, it turns out that removing FileFlushBuffers?() cause issues for the SyncToDisk?() implementation I added to the MITAB driver a few months ago. Basically this SyncToDisk?() implementation is supposed to sync the in-memory state to disk, so that opening another OGR datasource to the same dataset (including from the same process) sees a consistant and up-to-date state. With FileFlushBuffers?() disabled, this is not the case and cause the ogr_mitab_30 and _31 tests to fail.

I just noticed that since I run the tests on a real Windows OS (which shows that nobody tried to run the test suite during the 2.0beta and RC stages, or didn't report. sigh...). On Linux under Wine, which is my usual Windows test environment, this is not needed (not that surprising after all since WriteFile?() must be somehow implemented with write() or fwrite())

So for now I've committed r29476 "VSIWin32Handle::Flush(): add VSI_FLUSH config option that can be set to TRUE to force FlushFileBuffers?(). (hack related to #5556)" + r29477 "Improve latest commit to move VSI_FLUSH=TRUE inside TABFile::SyncToDisk?() itself", but a proper solution would be needed. And I'm not even sure that hack is really reliable since if TABRawBinBlock::CommitToFile?() is called before setting the env. variable, the file might not be properly flushed since SyncToDisk?() could be a no-op.

comment:19 Changed 2 years ago by Even Rouault

Cc: mateusz@… added

Re-adding Mateusz explicitely as CC related to above comment, as I'm not sure if you receive notifications anymore after Trac upgrade (might require filling email on https://trac.osgeo.org/gdal/prefs)

comment:20 Changed 22 months ago by Even Rouault

Milestone: 2.0

Removing obsolete milestone

Note: See TracTickets for help on using tickets.