BAM IndexedReader from S3 url: I/O Error #216

@brainstorm

Description

This is a follow-up to #189 (comment) regarding feature = ["s3"]. Here is a minimal example that reads the header of a BAM file hosted on S3:

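use rust_htslib::bam::{IndexedReader, Read};
use url::Url;

pub fn bam_header(bucket: String, key: String) -> Vec<String> {
    // Build the s3:// URL for the object; unwrap() panics on a malformed URL.
    let s3_url = Url::parse(&format!("s3://{}/{}", bucket, key)).unwrap();
    // Opening the remote BAM is the call that fails in the logs below.
    let bam_reader = IndexedReader::from_url(&s3_url).unwrap();

    // Target names are raw bytes; convert them lossily into owned strings.
    bam_reader
        .header()
        .target_names()
        .into_iter()
        .map(|raw_name| String::from_utf8_lossy(raw_name).into_owned())
        .collect()
}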
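
To rule out the URL construction and the name conversion as the source of the failure, those two steps can be checked without any S3 access. This is only a sketch using the standard library: `s3_url_string` and `names_to_strings` are hypothetical helper names, not part of rust-htslib, and trimming a leading slash from the key is an assumption that the function above does not make.

```rust
/// Hypothetical helper: build the `s3://bucket/key` string that is later
/// handed to `Url::parse`.
fn s3_url_string(bucket: &str, key: &str) -> String {
    // Assumption: drop leading slashes on the key so the URL does not
    // end up with "//" right after the bucket name.
    format!("s3://{}/{}", bucket, key.trim_start_matches('/'))
}

/// Hypothetical helper: convert raw target names (byte slices) into owned
/// strings, replacing invalid UTF-8 sequences instead of failing.
fn names_to_strings(raw_names: &[&[u8]]) -> Vec<String> {
    raw_names
        .iter()
        .map(|raw| String::from_utf8_lossy(raw).into_owned())
        .collect()
}
```

If both helpers behave as expected for the bucket and key in question, the failure is more likely to come from opening the remote file than from the surrounding Rust code.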

Without the s3 feature enabled, opening the URL (predictably) fails with "Protocol not supported", much like this recently fixed htslib+pysam bug:

[E::hts_open_format] Failed to open file "s3://umccr-research-dev/htsget/htsnexus_test_NA12878.bam" : Protocol not supported
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: BamOpen { source: Open { target: "s3://umccr-research-dev/htsget/htsnexus_test_NA12878.bam" } }', src/main.rs:17:12

(...)

END RequestId: a2efde5b-5e19-4905-90e1-0b91badcb163
REPORT RequestId: a2efde5b-5e19-4905-90e1-0b91badcb163	Duration: 543.71 ms	Billed Duration: 600 ms	Memory Size: 128 MB	Max Memory Used: 15 MB	
RequestId: a2efde5b-5e19-4905-90e1-0b91badcb163 Error: Runtime exited with error: exit status 101
Runtime.ExitError

With S3 support enabled, the same simple code that retrieves target names from the BAM header fails with an I/O error instead:

START RequestId: 7197fc91-d554-4147-9e0d-a29fe3d6e0fb Version: $LATEST
[E::hts_open_format] Failed to open file "s3://umccr-research-dev/htsget/htsnexus_test_NA12878.bam" : I/O error
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: BamOpen { source: Open { target: "s3://umccr-research-dev/htsget/htsnexus_test_NA12878.bam" } }', src/main.rs:17:12
stack backtrace:
   0:           0x641674 - backtrace::backtrace::libunwind::trace::h234d741a55b60f88
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
   1:           0x641674 - backtrace::backtrace::trace_unsynchronized::h350b2c8c65b00d1d
                               at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   2:           0x641674 - std::sys_common::backtrace::_print_fmt::h4a536ea1c8e8e74a
                               at src/libstd/sys_common/backtrace.rs:78
   3:           0x641674 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had63074188e24509
                               at src/libstd/sys_common/backtrace.rs:59
   4:           0x67b09c - core::fmt::write::h0f3ca38b916f7bdd
                               at src/libcore/fmt/mod.rs:1069
   5:           0x63f6d3 - std::io::Write::write_fmt::h904ea4dad7931404
                               at src/libstd/io/mod.rs:1504
   6:           0x643b95 - std::sys_common::backtrace::_print::h5b567d4903ca6eb3
                               at src/libstd/sys_common/backtrace.rs:62
   7:           0x643b95 - std::sys_common::backtrace::print::hf98b9b1b18a4dc81
                               at src/libstd/sys_common/backtrace.rs:49
   8:           0x643b95 - std::panicking::default_hook::{{closure}}::h5fbf8e21242992f2
                               at src/libstd/panicking.rs:198
   9:           0x6438d2 - std::panicking::default_hook::hb4d89e36502020cd
                               at src/libstd/panicking.rs:218
  10:           0x6441a2 - std::panicking::rust_panic_with_hook::hc36f90fb81cc1268
                               at src/libstd/panicking.rs:511
  11:           0x643d8b - rust_begin_unwind
                               at src/libstd/panicking.rs:419
  12:           0x67a481 - core::panicking::panic_fmt::h31cb4ec4ac5347b3
                               at src/libcore/panicking.rs:111
  13:           0x67a2a3 - core::option::expect_none_failed::h3e3ee4886fcb0833
                               at src/libcore/option.rs:1268
  14:           0x402134 - bootstrap::main::h4cfb5e1da07e4c36
  15:           0x401903 - std::rt::lang_start::{{closure}}::h71ce4b28a2a11ce2
  16:           0x6444d1 - std::rt::lang_start_internal::{{closure}}::ha24276d619b0834a
                               at src/libstd/rt.rs:52
  17:           0x6444d1 - std::panicking::try::do_call::ha58b8718efdbddf5
                               at src/libstd/panicking.rs:331
  18:           0x6444d1 - std::panicking::try::h2d6d423bf379e813
                               at src/libstd/panicking.rs:274
  19:           0x6444d1 - std::panic::catch_unwind::h45b4b6133cb33025
                               at src/libstd/panic.rs:394
  20:           0x6444d1 - std::rt::lang_start_internal::h47125699e3ec3d7e
                               at src/libstd/rt.rs:51
  21:           0x402222 - main
END RequestId: 7197fc91-d554-4147-9e0d-a29fe3d6e0fb
REPORT RequestId: 7197fc91-d554-4147-9e0d-a29fe3d6e0fb	Duration: 543.11 ms	Billed Duration: 600 ms	Memory Size: 128 MB	Max Memory Used: 15 MB	
RequestId: 7197fc91-d554-4147-9e0d-a29fe3d6e0fb Error: Runtime exited with error: exit status 101
Runtime.ExitError

I have created this repository as a test/reproducer:

https://github.com/brainstorm/s3-rust-htslib-bam

@pmarks, @dlaehnemann, would you mind taking a peek at my code and letting me know if I'm doing something obviously wrong in there? I would really like to document this so I can take a stab at #198 and/or write a blog post on rust-htslib 101 to attract more devs/users ;)