Opened 10 years ago

Closed 5 years ago

#5556 closed defect (wontfix)

VSIFFlushL is potential bottleneck on Windows

Reported by: Mateusz Łoskot Owned by: warmerdam
Priority: normal Milestone: closed_because_of_github_migration
Component: default Version: svn-trunk
Severity: normal Keywords: performance
Cc: warmerdam, mateusz@…

Description

(It's not a real defect, but performance issue.)

VSIFFlushL on Windows is implemented in terms of FlushFileBuffers which is an expansive routine

From How to Programmatically Improve File System Throughput:

FlushFileBuffers(): This function will flush everything in the write-back cache, as it does not know what part of the cache belongs to your file. This can take a lot of time, depending on the cache size and the speed of the media. How necessary is it? There is a thread which goes through and writes out dirty pages, so it is likely not very necessary.

FlushFileBuffers requests OS buffers to be written to file. fflush requests C run-time buffers to be written to OS, but actual commit to file may be delayed. So, fflush tends to be order of magnitude faster,

These differences may lead to significant performance differences in case a driver will have been updated from FILE* to VSILFILE*.

Change History (21)

comment:1 by Even Rouault, 10 years ago

Did you hit in practice such a performance problem ? From what I can see, there are not so many instances of calls to VSIFFlushL(), but one can be enough.

From the description, FlushFileBuffers() would be a bit like fflush() + the costly fsync(). Any idea on an equivalent of fflush() for a handle gotten by CreateFile(). Or perhaps a way of getting a FILE* from a HANDLE ? Or perhaps the implementation of VSIFFlushL() on Windows should be just empty, if the WriteFile() API doesn't cache stuff on user-space ?

comment:2 by Mateusz Łoskot, 10 years ago

Yes, an experiment of patching MITAB to use VSI*L shows the problem very well. A simple conversion of SHP to TAB I tried gives: ~10 seconds (original) vs ~200 seconds (VSI*L).

Empty implementation may do. FILE* to HANDLE is possible as well, as per http://stackoverflow.com/a/5193795/151641

comment:3 by Even Rouault, 10 years ago

Hum, I'm not sure about the relevance of doing HANDLE -> int -> FILE* conversion and applying fflush() on that...

Perhaps the best would be to analyze if the fflush() are needed in MITAB at all. Presumably no.

in reply to:  3 comment:4 by Mateusz Łoskot, 10 years ago

Replying to rouault:

Perhaps the best would be to analyze if the fflush() are needed in MITAB at all.

Yes, that should be a general strategy to apply while updating a driver to VSI*L. So, with this ticket, I wanted to 1) point it out and archive 2) receive feedback about solutions I'm not aware of.

comment:5 by warmerdam, 10 years ago

Cc: warmerdam added

Mateusz, I'm curious why the mitab writing is even calling VSIFlushL()? Perhaps the core problem is that MITAB is abusing flushing?

in reply to:  5 comment:6 by mpd, 10 years ago

Replying to warmerdam:

Frank, mitab is calling fflush, but we have a patched version that we have converted to use VSI*L, hence the call to VSIFFlushL. (We have a patched version because Windows has a hard limit on the number of concurrent FILE* handles and we have a user that wants to open gazillions of TAB files at the same time.)

fflush would, of course, be needed after a write if the next operation is a read. If that is a possibility, then the call to VSIFFlushL needs to stay in case the FILE-based VSI*L class is in use (it never is for us). I don't know enough about mitab to know whether or not this is the case.

We felt that it was safer to leave the (modified) VSIFFlushL in place, but to make the Win32 HANDLE-based implementation a no-op (because there is no equivalent to fflush for HANDLEs). Even if the VSIFFlushL were removed from mitab, this change is probably still a good thing to do, for the sake of any other callers.

comment:7 by Even Rouault, 10 years ago

Actually if you alternate fwrite and fread, a fseek is sufficient to ensure that what needs to be "flushed" is (and remember a bug in the NITF driver if I remember well, it is actually needed). But this is in the POSIX world, not sure however in the Windows API world... So the no-op could potentially be a problem in some scenarios (Note: MITAB doesn't really yet support random-read-write scenarios, but this is work I should be tackling soon.)

in reply to:  7 comment:8 by mpd, 10 years ago

Replying to rouault:

Even,

fseek is indeed an alternative to flushing.

We could not find any reference to, or evidence that, either was needed with Win32 HANDLEs; presumably they do not have private buffers.

comment:9 by warmerdam, 10 years ago

Excellent, thanks for the investigation. Then the fix is MITAB moving to using seeks to ensure post write flushes before reading and no changes to VSI*L or the use of FlushFileBuffers on windows for the flush implementation.

in reply to:  9 comment:10 by mpd, 10 years ago

Replying to warmerdam:

Excellent, thanks for the investigation. Then the fix is MITAB moving to using seeks to ensure post write flushes before reading and no changes to VSI*L or the use of FlushFileBuffers on windows for the flush implementation.

I don't agree. Since mitab uses fflush directly at the moment everything is OK, so you could do nothing.

However, FlushFileBuffers is not equivalent to fflush, and should only be used where fsync would be. I would recommend removing the call to FlushFileBuffers as a defence against the unexpected side-effects (compared to fflush), should any driver start calling VSIFFlushL expecting fflush-like behaviour (and performance).

I will submit a mitab FILE->VSILFILE patch as a different case. It will replace fflush with VSIFFlushL to keep the desired behaviour in the POSIX world. On Windows, without the related change to remove FlushFileBuffers, users will see a significant performance degradation.

As I said, I would fix that by simply removing the call to FlushFileBuffers (this is what we have done locally). If you wanted to fix it by changing mitab to use seek(tell()), then that is fine too, but the FlushFileBuffers call will still be there waiting to catch out the unwary.

comment:11 by Even Rouault, 10 years ago

The intent of VSIFFlushL() - I believe - is to make sure than anything that should be written is passed to the OS, so that potentially another process that would open the file would see the changes (maybe perhaps not immediately, I don't know). So you could potentially make a no-op implementation for Windows, but I think you would have to write some stress-test so that we can have some confidence in the fact that the WriteFile() API offers that guarantee. If not, an alternative should be found.

In the MITAB case, I'm not sure why we want a flush to occur (except consistent write / read sequences). To be able to open the file with MapInfo while still being opened by GDAL ? I doubt so. It is perhaps unneeded.

in reply to:  11 ; comment:12 by mpd, 10 years ago

Replying to rouault:

fwrite() on Windows uses WriteFile() internally, but maintains its own buffer. fflush() on Windows simply forces the buffer to be passed to WriteFile().

It is possible to force fflush() to call _commit(), either by adding 'c' to the fopen() to the mode parameter, or by adding COMMODE.OBJ to the linker input, but the default is not to call it. _commit() is where FlushFileBuffers() is called.

Given all that, and assuming that I have analysed it correctly, I cannot see any circumstances in which FileFlushBuffers() is an appropriate alternative to fflush().

in reply to:  12 ; comment:13 by Even Rouault, 10 years ago

Given all that, and assuming that I have analysed it correctly, I cannot see any circumstances in which FileFlushBuffers() is an appropriate alternative to fflush().

Understood. The question is rather : when using WriteFile(), do we need something that offers equivalent guarantees than fflush() provides, or is WriteFile() alone sufficient ?

in reply to:  13 comment:14 by mpd, 10 years ago

Replying to rouault:

Understood. The question is rather : when using WriteFile(), do we need something that offers equivalent guarantees than fflush() provides, or is WriteFile() alone sufficient ?

I can't see anything in the fflush() implementation written using WriteFile() to indicate that anything else is required.

comment:15 by warmerdam, 10 years ago

FWIW, I agree with Martin after all. VSIFlushL() only needs to offer equivelent to POSIX fflush() functionality, not a hard sync to disk. The windows FileFlushBuffers() is more action than needed to meet the POSIX expectation.

Even, if you are ok removing it, then please go ahead.

comment:16 by Mateusz Łoskot, 10 years ago

Thanks Martin for delivering complete picture. Thanks Frank and Even for taking action.

comment:17 by Even Rouault, 10 years ago

Milestone: 2.0
Resolution: fixed
Status: newclosed

trunk r27491 "VSIWin32Handle::Flush(): no-op implementation is sufficient to offer same guarantee as POSIX fflush() (#5556)"

comment:18 by Even Rouault, 9 years ago

Resolution: fixed
Status: closedreopened

Martin, Mateusz, it turns out that removing FileFlushBuffers() cause issues for the SyncToDisk() implementation I added to the MITAB driver a few months ago. Basically this SyncToDisk() implementation is supposed to sync the in-memory state to disk, so that opening another OGR datasource to the same dataset (including from the same process) sees a consistant and up-to-date state. With FileFlushBuffers() disabled, this is not the case and cause the ogr_mitab_30 and _31 tests to fail.

I just noticed that since I run the tests on a real Windows OS (which shows that nobody tried to run the test suite during the 2.0beta and RC stages, or didn't report. sigh...). On Linux under Wine, which is my usual Windows test environment, this is not needed (not that surprising after all since WriteFile() must be somehow implemented with write() or fwrite())

So for now I've committed r29476 "VSIWin32Handle::Flush(): add VSI_FLUSH config option that can be set to TRUE to force FlushFileBuffers(). (hack related to #5556)" + r29477 "Improve latest commit to move VSI_FLUSH=TRUE inside TABFile::SyncToDisk() itself", but a proper solution would be needed. And I'm not even sure that hack is really reliable since if TABRawBinBlock::CommitToFile() is called before setting the env. variable, the file might not be properly flushed since SyncToDisk() could be a no-op.

comment:19 by Even Rouault, 9 years ago

Cc: mateusz@… added

Re-adding Mateusz explicitely as CC related to above comment, as I'm not sure if you receive notifications anymore after Trac upgrade (might require filling email on https://trac.osgeo.org/gdal/prefs)

comment:20 by Even Rouault, 9 years ago

Milestone: 2.0

Removing obsolete milestone

comment:21 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: reopenedclosed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.

Note: See TracTickets for help on using tickets.