#7154 closed defect (fixed)
vsis3: certificate issue with bucket with dot in the bucket name
Reported by: | tveastman | Owned by: | warmerdam |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | default | Version: | unspecified |
Severity: | normal | Keywords: | vsis3 |
Cc: | robert.coup@… |
Description
Background: Buckets with a . in the name need to be accessed with AWS_VIRTUAL_HOSTING set to NO, otherwise an SSL error occurs when the client tries to connect, as demonstrated:
In [17]: gdal.SetConfigOption(b'AWS_VIRTUAL_HOSTING', b'YES')
In [18]: gdal.VSICurlClearCache()
In [19]: gdal.ReadDir('/vsis3/bucket.with.dots.in/')
* Couldn't find host bucket.with.dots.in.s3.amazonaws.com in the .netrc file; using defaults
* Hostname was NOT found in DNS cache
*   Trying 52.216.0.232...
* TCP_NODELAY set
* Connected to bucket.with.dots.in.s3.amazonaws.com (52.216.0.232) port 443 (#14)
* successfully set certificate verify locations:
*   CAfile: none
    CApath: /etc/ssl/certs
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*   subject: C=US; ST=Washington; L=Seattle; O=Amazon.com Inc.; CN=*.s3.amazonaws.com
*   start date: 2017-09-22 00:00:00 GMT
*   expire date: 2019-01-03 12:00:00 GMT
*   subjectAltName does not match bucket.with.dots.in.s3.amazonaws.com
* SSL: no alternative certificate subject name matches target host name 'bucket.with.dots.in.s3.amazonaws.com'
* Closing connection 14
That was expected behaviour, and you work around it by setting AWS_VIRTUAL_HOSTING to NO.
The trouble occurs when the bucket is in a non-standard region and the initial response is a redirect to another region:
In [20]: gdal.SetConfigOption(b'AWS_VIRTUAL_HOSTING', b'NO')
In [21]: gdal.VSICurlClearCache()
In [22]: gdal.ReadDir('/vsis3/bucket.with.dots.in/')
* Couldn't find host s3.amazonaws.com in the .netrc file; using defaults
* Hostname was NOT found in DNS cache
*   Trying 52.216.84.197...
* TCP_NODELAY set
* Connected to s3.amazonaws.com (52.216.84.197) port 443 (#15)
* successfully set certificate verify locations:
*   CAfile: none
    CApath: /etc/ssl/certs
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*   subject: C=US; ST=Washington; L=Seattle; O=Amazon.com Inc.; CN=s3.amazonaws.com
*   start date: 2017-09-26 00:00:00 GMT
*   expire date: 2018-09-20 12:00:00 GMT
*   subjectAltName: s3.amazonaws.com matched
*   issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert Baltimore CA-2 G2
*   SSL certificate verify ok.
> GET /bucket.with.dots.in/?delimiter=%2F HTTP/1.1
Host: s3.amazonaws.com
Accept: */*
x-amz-date: 20171120T214123Z
x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Authorization: AWS4-HMAC-SHA256 Credential=AKIAJJA4D44G5LMQYQOQ/20171120/ap-southeast-2/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=8b300ce747c78a0c44d4e8d05318ff57e7a12a2748e46a1abd2767e66c66479a
< HTTP/1.1 301 Moved Permanently
< x-amz-bucket-region: ap-southeast-2
< x-amz-request-id: 33E45A7527EC2159
< x-amz-id-2: xB6JqOQ29pqPJdQkmjnFwAWT0K+oPdtLVYPpjJ6FGJz92Y80Xf8qw8cWSGcvJvevQuZK05cc4uE=
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Mon, 20 Nov 2017 21:41:23 GMT
* Server AmazonS3 is not blacklisted
< Server: AmazonS3
<
* Connection #15 to host s3.amazonaws.com left intact
* Couldn't find host bucket.with.dots.in.s3.amazonaws.com in the .netrc file; using defaults
* Hostname was NOT found in DNS cache
*   Trying 54.231.82.162...
* TCP_NODELAY set
* Connected to bucket.with.dots.in.s3.amazonaws.com (54.231.82.162) port 443 (#16)
* successfully set certificate verify locations:
*   CAfile: none
    CApath: /etc/ssl/certs
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*   subject: C=US; ST=Washington; L=Seattle; O=Amazon.com Inc.; CN=*.s3.amazonaws.com
*   start date: 2017-09-22 00:00:00 GMT
*   expire date: 2019-01-03 12:00:00 GMT
*   subjectAltName does not match bucket.with.dots.in.s3.amazonaws.com
* SSL: no alternative certificate subject name matches target host name 'bucket.with.dots.in.s3.amazonaws.com'
* Closing connection 16
The second request fails; it looks just like the original example at the top: a DNS-based 'virtual hosted' request made against the wrong region (it should be against the ap-southeast-2 endpoint).
Amazon is odd about this: the 301 redirect doesn't include a Location: header, but you can infer where the request should go from the x-amz-bucket-region response header.
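The region inference described above can be sketched as a small helper; the function name and the case-insensitive header lookup are illustrative, not GDAL code:

```python
def region_from_redirect(headers):
    """Infer the bucket's region from a 301 response that carries no
    Location: header, using the x-amz-bucket-region header instead.
    `headers` is any mapping of response header names to values."""
    for name, value in headers.items():
        if name.lower() == "x-amz-bucket-region":
            return value
    return None  # no region hint in the response

# Headers taken from the 301 response in the curl trace above.
print(region_from_redirect({
    "x-amz-bucket-region": "ap-southeast-2",
    "Content-Type": "application/xml",
}))  # → ap-southeast-2
```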
Change History (6)
comment:1 by , 6 years ago
Summary: | vsis3: region redirect causes AWS_VIRTUAL_HOSTING to not be honoured. → vsis3: certificate issue with bucket with dot in the bucket name |
---|
comment:2 by , 6 years ago
comment:3 by , 6 years ago
The AWS certificate issue is what GDAL needs to work around. It is a known issue that making an HTTPS call to a virtual-hosted S3 bucket with a . in the name results in a mismatched SSL certificate.
For security, plain-HTTP calls and GDAL_HTTP_UNSAFESSL=1 are both insufficient workarounds.
The workaround required, in order to preserve a verifiable SSL call, is:
- Determine the bucket's region from the x-amz-bucket-region response header.
- Redirect the request to https://s3.<REGION>.amazonaws.com/<BUCKET>/ (or s3.amazonaws.com if the region is us-east-1).
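The second step amounts to building a path-style URL, sketched here with an illustrative helper (the function name and the us-east-1 special case follow the comment above; this is not GDAL's implementation):

```python
def path_style_endpoint(bucket, region):
    """Build a path-style S3 URL: the bucket goes in the path, so the
    hostname contains no dots from the bucket name and the regional
    endpoint's certificate verifies normally."""
    if region == "us-east-1":
        host = "s3.amazonaws.com"  # legacy global endpoint
    else:
        host = "s3.%s.amazonaws.com" % region
    return "https://%s/%s/" % (host, bucket)

print(path_style_endpoint("bucket.with.dots.in", "ap-southeast-2"))
# → https://s3.ap-southeast-2.amazonaws.com/bucket.with.dots.in/
```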
At the moment, a GDAL user who needs to interact securely with a bucket that's not in us-east-1 and has a . in the name must send a curl HEAD request to s3.amazonaws.com to determine the region, and then set both AWS_S3_ENDPOINT=https://s3.REGION.amazonaws.com and AWS_VIRTUAL_HOSTING=no.
If GDAL is set for HTTPS and is trying to access a bucket with a . in the name, it makes sense for GDAL to perform this region lookup and endpoint switch itself.
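In the same IPython style as the traces above, the manual workaround looks roughly like this (the endpoint value follows the form given in this comment, and the region is the one from this ticket; adapt both to the bucket's actual region):

```python
from osgeo import gdal

# Region determined beforehand, e.g. from a HEAD request's
# x-amz-bucket-region response header (ap-southeast-2 here).
gdal.SetConfigOption('AWS_S3_ENDPOINT', 'https://s3.ap-southeast-2.amazonaws.com')
gdal.SetConfigOption('AWS_VIRTUAL_HOSTING', 'NO')
gdal.VSICurlClearCache()
gdal.ReadDir('/vsis3/bucket.with.dots.in/')
```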
comment:5 by , 6 years ago
@tveastman Thanks for the latest explanation. The AWS error message is rather misleading, with an inappropriate endpoint suggested...
comment:6 by , 6 years ago
It isn't that S3 SSL is broken, it's more that it's not currently valid to issue wildcard certificates spanning multiple levels – a *.example.com certificate (like S3 uses) is valid for x.example.com but not for y.x.example.com. There's no way to issue a **.example.com or anything similar to cover multiple levels without manually specifying them all in the certificate.
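The single-level wildcard rule can be illustrated with a minimal matcher. This is a deliberately simplified check that ignores most of RFC 6125 hostname matching; it only demonstrates why the extra labels from a dotted bucket name break the match:

```python
def wildcard_matches(pattern, hostname):
    """Return True if `hostname` matches `pattern`, where a '*' label
    matches exactly one DNS label (as TLS certificates require)."""
    p_labels = pattern.split(".")
    h_labels = hostname.split(".")
    if len(p_labels) != len(h_labels):
        return False  # '*' cannot absorb extra labels
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

# The S3 certificate covers exactly one label under s3.amazonaws.com...
print(wildcard_matches("*.s3.amazonaws.com", "plainbucket.s3.amazonaws.com"))          # → True
# ...but a dotted bucket name adds labels, so verification fails.
print(wildcard_matches("*.s3.amazonaws.com", "bucket.with.dots.in.s3.amazonaws.com"))  # → False
```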
The issue is a certificate issue.
The non-virtual-hosting way ends up being a virtual hosting way through the AWS region redirect, and curl doesn't like the certificate of the virtual host. Boto also has the same issue: https://github.com/boto/boto/issues/2836
The workaround for GDAL is to define GDAL_HTTP_UNSAFESSL=1.
And apparently there's no need for GDAL to default to a non-virtual-hosting way when the bucket name has a dot in it, since AWS redirects it to a virtual host anyway. Perhaps this has changed since the time of the initial implementation.