#6937 closed defect (fixed)
/vsicurl/ caching causes issues in case of updates followed by read scenarios
Reported by: | Even Rouault | Owned by: | Even Rouault |
---|---|---|---|
Priority: | normal | Milestone: | 2.2.1 |
Component: | default | Version: | unspecified |
Severity: | normal | Keywords: | /vsicurl/ |
Cc: |
Description
From the mailing list
My actual problem is a bit more specific than being unable to open S3 files after upload. The actual problem is that within the same Python session, I can open a file off S3 with the vsis3 driver, but then if I upload a new file that previously did not exist (using boto3), gdal does not see it as a valid file.

What appears to be happening is that once an S3 file is read, the contents of that bucket are read into a cache. If a new file is uploaded in the meantime, trying to then read that file looks in the cache, doesn't see that file as existing, and throws an error. If I recall correctly, GDAL reads other contents of that bucket/key-prefix because it's looking for accompanying metadata files, so is this cached in some way? It seemed like a plausible explanation, but I've been unable to find a reference to such a cache other than potentially VSI_CACHE. Setting that to FALSE did nothing, and my understanding is that it applies to specific datasets, not bucket contents.

I've managed to replicate the problem in a very simple Python program below. While both files are uploaded without error (you can use gdalinfo remotely on both), the attempt to open the second file will throw:

ERROR 4: `/vsis3/pail-of-images/test2.tif' not recognized as a supported file format.

Calling the script a second time works, because (presumably) even though it uploads and overwrites both images again, they both exist from the beginning.

Either this is a bug or it's intended behavior, in which case there's hopefully some way to force GDAL to reread a bucket when trying to open a file. My current workaround is to change the behavior of my app to upload all images first before accessing them, but this seems unsatisfactory, not to mention it wreaks havoc with my tests, which don't assume such behavior.
########################
#!/usr/bin/env python3

from osgeo import gdal
import boto3

filenames = ['file1.tif', 'file2.tif']
bucket = 'pail-of-images'

s3 = boto3.resource('s3')

for f in filenames:
    print('Uploading %s to %s' % (f, bucket))
    s3.meta.client.upload_file(f, bucket, f)
    uri = '/vsis3/%s/%s' % (bucket, f)
    print('Opening %s' % uri)
    ds = gdal.Open(uri)
    print(ds.GetMetadata())
    ds = None
##########################
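A possible way around the stale listing, sketched under assumptions rather than taken from the report: GDAL builds from 2.2.1 onward expose `gdal.VSICurlClearCache()` in the Python bindings, which drops the cached state of the /vsicurl/-family handlers. The helper name `open_uncached` and its `gdal_module` parameter are hypothetical, introduced here only so the helper can be tried without GDAL installed.

```python
def open_uncached(uri, gdal_module=None):
    """Clear the /vsicurl/-family caches, then open `uri`.

    `gdal_module` is a hypothetical injection point for testing;
    when it is None, the real osgeo.gdal bindings are imported.
    """
    if gdal_module is None:
        # Requires the GDAL Python bindings to be installed.
        from osgeo import gdal as gdal_module
    # Assumption: this build provides VSICurlClearCache() (GDAL >= 2.2.1).
    # It discards cached directory listings and file properties, so a
    # file uploaded after an earlier read is looked up again.
    gdal_module.VSICurlClearCache()
    return gdal_module.Open(uri)
```

In the script above, replacing `gdal.Open(uri)` with `open_uncached(uri)` would clear the cache before each open. The trade-off is that everything previously cached has to be fetched again, so it is best called only after an upload.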
In 39223: