Commit ce86db2

🚧 output group size table when using weighted sampling
This makes it easier to inspect the effect of a prefilter rule.
1 parent 90874b3 commit ce86db2

1 file changed, +3 -0 lines changed

‎workflow/snakemake_rules/main_workflow.smk

Lines changed: 3 additions & 0 deletions
@@ -297,6 +297,8 @@ rule subsample:
     params:
         group_by = _get_specific_subsampling_setting("group_by", optional=True),
         group_by_weights = _get_specific_subsampling_setting("group_by_weights", optional=True),
+        # only set this if using group_by_weights
+        output_group_by_weights = lambda wildcards: f"--output-group-by-sizes results/{wildcards.build_name}/sizes-{wildcards.subsample}.tsv" if _get_subsampling_settings(wildcards).get("group_by_weights", False) else "",
         sequences_per_group = _get_specific_subsampling_setting("seq_per_group", optional=True),
         subsample_max_sequences = _get_specific_subsampling_setting("max_sequences", optional=True),
         sampling_scheme = _get_specific_subsampling_setting("sampling_scheme", optional=True),
@@ -330,6 +332,7 @@ rule subsample:
             {params.sequences_per_group} \
             {params.subsample_max_sequences} \
             {params.sampling_scheme} \
+            {params.output_group_by_weights} \
             --output-strains {output.strains} 2>&1 | tee {log}
         """
