Opened 7 months ago

Closed 7 months ago

Last modified 7 months ago

#7136 closed defect (fixed)

VSIS3: A file with the same name as a 'folder' causes misbehaviour.

Reported by: tveastman          Owned by:  warmerdam
Priority:    normal             Milestone: 2.3.0
Component:   default            Version:   unspecified
Severity:    normal             Keywords:  vsi vsis3 readdir readdirrecursive
Cc:          robert.coup@…

Description

S3 'pretends' to have folders, but it is really just a key-value store whose keys can contain /, and any key prefix ending in / gets presented as a folder.

In S3 buckets, you can have a file and a folder with the same name.

Here's a listing of an S3 bucket:

$ aws s3 ls --recursive public-bucket-gdal-vsis3-tests
2017-11-06 11:30:48        237 alpha
2017-11-06 11:30:47        699 alpha/a-zip-file.zip
2017-11-03 15:17:30         30 alpha/delta.txt
2017-11-03 15:17:30         23 alpha/gamma.txt
2017-11-06 11:30:47        699 beta/a-zip-file.zip
2017-11-03 15:17:30          8 beta/eta.txt

Note that there is a file called alpha, but there are also several files in a 'folder' called alpha.
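
To make the collision concrete, here is a minimal pure-Python sketch of how a delimiter-based listing turns those flat keys into 'files' and 'folders'. It does not call S3 or GDAL; `list_prefix` and `KEYS` are hypothetical names used only for illustration, and the key list is copied from the listing above.

```python
def list_prefix(keys, prefix=""):
    """Emulate a '/'-delimited listing of flat keys directly under `prefix`.

    Returns (files, folders). A name can appear in both lists, which is
    exactly the situation described in this ticket.
    """
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the prefix itself is not one of its own children
        head, sep, _ = rest.partition("/")
        if sep:
            folders.add(head + "/")  # something lives below head/
        else:
            files.append(head)       # a plain object at this level
    return sorted(files), sorted(folders)


# Keys copied from the `aws s3 ls --recursive` output above.
KEYS = [
    "alpha",
    "alpha/a-zip-file.zip",
    "alpha/delta.txt",
    "alpha/gamma.txt",
    "beta/a-zip-file.zip",
    "beta/eta.txt",
]

files, folders = list_prefix(KEYS)
print(files)    # ['alpha']
print(folders)  # ['alpha/', 'beta/']
```

At the bucket root the name alpha shows up both as a file and as a folder, so any code that assumes a name is one or the other has to pick, and may pick wrongly.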

This causes bad behaviour in VSI:

ReadDirRecursive() fails to list the files in alpha/, and oddly lists the directory with two slashes.

   >>> gdal.ReadDirRecursive('/vsis3/public-bucket-gdal-vsis3-tests')
   ['alpha', 'alpha//', 'beta/', 'beta/a-zip-file.zip', 'beta/eta.txt']

ReadDir can list the files if you append the /:

   >>> gdal.ReadDir('/vsis3/public-bucket-gdal-vsis3-tests/alpha')
   None
   >>> gdal.ReadDir('/vsis3/public-bucket-gdal-vsis3-tests/alpha/')
   ['a-zip-file.zip', 'delta.txt', 'gamma.txt']

VSIStatL reports things as directories in a strange way:

   >>> gdal.VSIStatL('/vsis3/public-bucket-gdal-vsis3-tests/alpha/').IsDirectory()
   1   <-- correct
   >>> gdal.VSIStatL('/vsis3/public-bucket-gdal-vsis3-tests/alpha').IsDirectory()
   0  <-- correct
   >>> gdal.VSIStatL('/vsis3/public-bucket-gdal-vsis3-tests/beta').IsDirectory()
   1  <-- correct
   >>> gdal.VSIStatL('/vsis3/public-bucket-gdal-vsis3-tests/beta/')
   None <-- with the trailing slash it's not recognized as a directory even though alpha/ is.

Two issues, it seems:

  • ReadDirRecursive() doesn't return correct results.
  • VSIStatL seems to behave inconsistently when trying to distinguish between the file and the identically named directory.
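
For the first issue, the following pure-Python sketch shows one way a recursive listing could treat a name that is both a file and a folder. It is not GDAL's implementation; `read_dir_recursive` and `KEYS` are hypothetical names, and the output convention (directories listed with a trailing slash, followed by their contents) is an assumption taken from the beta/ entries in the output reported above.

```python
def read_dir_recursive(keys, prefix=""):
    """Recursively list flat '/'-separated keys below `prefix`.

    A name that is both an object and a key prefix is emitted twice:
    once as a file, once as a directory (with a trailing slash).
    """
    files, folders = set(), set()
    for key in keys:
        if key.startswith(prefix) and key != prefix:
            head, sep, _ = key[len(prefix):].partition("/")
            (folders if sep else files).add(head)

    entries = []
    for name in sorted(files | folders):
        if name in files:
            entries.append(prefix + name)
        if name in folders:
            subdir = prefix + name + "/"
            entries.append(subdir)
            entries.extend(read_dir_recursive(keys, subdir))
    return entries


# Keys copied from the bucket listing in the description.
KEYS = [
    "alpha",
    "alpha/a-zip-file.zip",
    "alpha/delta.txt",
    "alpha/gamma.txt",
    "beta/a-zip-file.zip",
    "beta/eta.txt",
]

for entry in read_dir_recursive(KEYS):
    print(entry)
```

Under these assumptions the walk yields alpha, then alpha/ and its three files, then beta/ and its two files, with no 'alpha//' entry.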

Change History (2)

comment:1 Changed 7 months ago by Even Rouault

Resolution: fixed
Status: new → closed

In 40649:

Fix CPLReadDirRecursive() to behave properly on /vsis3/ buckets that have foo (file) and foo/ (sub-directory) entries (fixes #7136)

comment:2 Changed 7 months ago by Even Rouault

Milestone: 2.3.0

Note: r40649 is dependent on previous trunk fixes in /vsis3/ to support foo (file) and foo/ (sub-directory) entries
