Output filer mangles bucket name when using s3:// schema and bucket name contains the characters "s3". #45

@JossWhittle

Description

Assume that TESK is deployed with a config file that sets the output endpoint to some S3 instance in http or https form:

[default]
endpoint_url=http://some.endpoint.com

Then, in the job JSON, the "url" for outputs is set to the following:

# This works! 
"url": "s3://output",

The s3 schema means "output" gets treated as the bucket name.
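As a quick sketch of how that split happens (this is the standard library's behaviour, not necessarily the filer's exact code):

```python
from urllib.parse import urlparse

# With the s3:// scheme, the component right after '//' becomes the netloc,
# which is what ends up being treated as the bucket name.
url = urlparse('s3://output')
print(url.scheme)  # s3
print(url.netloc)  # output
```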

# These all fail!  
"url": "s3://outputs3",
"url": "s3://s3output",

# Here's a less contrived example showing how this can happen even when you don't intentionally use "s3" to mean "s3"
"url": "s3://shoulders3486output",

The s3 schema is detected, but because the bucket name also contains "s3", it falsely triggers this regex:

match = re.search('^([^.]+).s3', self.netloc)

This mangles the bucket name, leading to a bucket-not-found error.
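The mangling is easy to reproduce in isolation. The pattern below is copied from the filer; everything else is an illustrative sketch. Note that the `.` in the pattern is unescaped, so it matches any character rather than a literal dot, which widens the false positives:

```python
import re

# Pattern copied from the filer; the unescaped '.' matches ANY character.
pattern = re.compile(r'^([^.]+).s3')

for netloc in ['output', 'outputs3', 'shoulders3486output']:
    match = pattern.search(netloc)
    if match:
        # The "bucket name" is silently truncated at the accidental 's3'.
        print(netloc, '->', match.group(1))
    else:
        print(netloc, '-> no match')
```

Running this shows `outputs3` truncated to `outpu` and `shoulders3486output` truncated to `shoulde`, while plain `output` is left alone.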

But we can trick it...

# This works! 
"url": "http://s3.foo.bar.baz/shoulders3486output",

HTTP is detected as the schema, but the netloc part of the URL contains "s3", so the transfer is treated as S3 due to this logic:

if 's3' in netloc:
    return S3Transput

The bucket name is now part of the URL path, not the netloc, so it doesn't get mangled.

The netloc (s3.foo.bar.baz) is never actually used for anything other than detecting whether the transfer is S3 or HTTP.
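To illustrate the trick (a hedged sketch; only the quoted routing check above is from the filer):

```python
from urllib.parse import urlparse

url = urlparse('http://s3.foo.bar.baz/shoulders3486output')

# 's3' appears in the netloc, so the routing check picks the S3 transput...
print(url.netloc)  # s3.foo.bar.baz

# ...but the bucket name now lives in the path, safely out of reach of
# the netloc regex, so it is not mangled.
print(url.path)    # /shoulders3486output
```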
