feat: split get_reads in two rules #125
Conversation
…plitting bam into fastqs
Walkthrough
A new Snakemake rule,
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant Snakemake
participant get_bams
participant get_reads
User->>Snakemake: Trigger workflow
Snakemake->>get_bams: Download BAM file from URL
get_bams->>Snakemake: BAM file saved locally
Snakemake->>get_reads: Process local BAM file to extract reads
get_reads->>Snakemake: Reads extracted
Actionable comments posted: 0
🧹 Nitpick comments (1)
workflow/rules/download.smk (1)
1-15: Well-structured new rule for downloading BAM files.
The new get_bams rule looks good and follows Snakemake best practices. It:
- Properly outputs to a consistent location in resources/bams/
- Uses parameters to get the BAM URL dynamically
- Includes appropriate logging
- Sets a reasonable thread count (32) for concurrent downloads
- Includes retry logic (3 attempts) which is essential for handling network issues
- Uses curl for downloading, with proper output redirection
One potential improvement would be to add validation of the downloaded file:
 rule get_bams:
     output:
         bam="resources/bams/{benchmark}.bam",
     params:
         bam_url=get_benchmark_bam_url,
     log:
         "logs/download-bams/{benchmark}.log",
     conda:
         "../envs/tools.yaml"
     resources:
         sort_threads=lambda _, threads: max(threads - 2, 1),
     threads: 32
     retries: 3
     shell:
-        "curl {params.bam_url} --output {output.bam} 2> {log}"
+        "(curl {params.bam_url} --output {output.bam} && samtools quickcheck {output.bam}) 2> {log}"
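To illustrate what this kind of post-download validation can catch, here is a minimal Python sketch. It is not the implementation of samtools quickcheck; it only approximates two of the conditions such a check looks for in a BAM file: the BGZF magic bytes at the start and the 28-byte BGZF end-of-file marker at the end, as given in the SAM/BAM specification. The function name `looks_like_complete_bam` is hypothetical.

```python
import os

# 28-byte BGZF end-of-file marker from the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)

# Every BGZF block starts with these four bytes (gzip with an extra field).
BGZF_MAGIC = b"\x1f\x8b\x08\x04"


def looks_like_complete_bam(path):
    """Hypothetical helper: rough check that a BAM download was not truncated.

    Returns True only if the file starts with the BGZF magic bytes and
    ends with the BGZF EOF marker. It does not parse the BAM header.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        magic = fh.read(len(BGZF_MAGIC))
        # Jump to the last 28 bytes and compare them with the EOF marker.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        tail = fh.read()
    return magic == BGZF_MAGIC and tail == BGZF_EOF
```

A download interrupted partway through would normally lack the trailing EOF marker, so a check of this kind fails and, in the suggested rule, the failure would trigger the rule's retry handling.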
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (1)
workflow/rules/download.smk
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Formatting
🔇 Additional comments (1)
workflow/rules/download.smk (1)
18-38: Cleanly refactored read extraction rule.
The modifications to the get_reads rule are logically correct and complete the separation of concerns between downloading BAM files and extracting reads:
- Now properly takes the downloaded BAM file as input
- Log file path has been appropriately renamed to split-into-reads to reflect its current purpose
- Shell command correctly uses the input BAM file instead of downloading from URL
- Removed the retry logic as it's no longer needed (handled by upstream rule)
The changes improve the workflow by:
- Making each step more focused and easier to debug
- Avoiding redundant downloads if extraction fails
- Allowing for clearer monitoring of each process
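The point about avoiding redundant downloads can be sketched in plain Python. This is only an illustration of the idea, not Snakemake's scheduling logic: the names `ensure_bam` and `run_pipeline` and the `fetch` and `extract` callables are hypothetical stand-ins for the download and extraction steps, and the existence check loosely mimics how an existing output file lets a workflow engine skip a finished step.

```python
import os


def ensure_bam(bam_path, fetch):
    """Download step: call fetch only if the BAM is not already on disk."""
    if not os.path.exists(bam_path):
        fetch(bam_path)
    return bam_path


def run_pipeline(bam_path, fetch, extract):
    """Run the download and the read extraction as two separate steps.

    If extract raises, the downloaded BAM stays on disk, so a second
    call skips fetch and only repeats the extraction.
    """
    local_bam = ensure_bam(bam_path, fetch)
    extract(local_bam)
```

With a single combined step, a failure during extraction would force the download to run again on the next attempt; with two steps, only the failed step is repeated.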
Closed in favor of #126
Separate download of bam file and splitting the bam into reads in two rules.
Summary by CodeRabbit