Skip to content

S3: Virtual-hosted bucket style doesn't work properly with AWS_ENDPOINT_URL #56

@craigds

Description

@craigds

Using a bucket name with no dots in it in conjunction with AWS_ENDPOINT_URL breaks the URL assembly that arbiter is doing:

export AWS_ENDPOINT_URL='https://localhost.localstack.cloud:4566/'

# echo '[{"type": "readers.copc", "filename": "s3://my-dev-bucket/file.laz"}]' | CURL_VERBOSE=1 pdal pipeline --stdin
Curl config:
	timeout: 5s
	followRedirect: true
	verifyPeer: true
	caBundle: (default)
	caInfo: /opt/certifi/cacert.pem
	Proxy: (default)
* Could not resolve host: my-dev-bucket.https
* Closing connection
Curl failure: Couldn't resolve host name
^C

It seems to be assuming the endpoint url starts with a domain name rather than a scheme? so the result is probably something like https://my-dev-bucket.https://localhost.localstack.cloud:4566/file.laz

This seems fixed as long as the bucket name has at least one . in it, which forces arbiter not to use virtual-hosted style URLs.

Possible fixes could be any of:

  1. disable virtual hosted always (what's the upside? there seems to be a vague promise from AWS that non-virtual-hosted will go away eventually but they haven't given an actual date yet, and there's no proposed options for dot-named buckets)
  2. make it disableable via an env var (GDAL uses AWS_VIRTUAL_HOSTING=FALSE for this purpose)
  3. disable virtual hosting automatically when AWS_ENDPOINT_URL is in use
  4. fix the broken url assembly so it preserves the scheme.

(4) by itself would still be a pain because I'd have to add a subdomain alias for my docker container, but it's workable.

My suggestion would be to implement both (2) and (4) to fix this.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions