Commit fc0d22c

RLS Version 1.5.0 SemiBin2 beta
Big change is the addition of a `SemiBin2` script, which is still experimental, but should be a slightly nicer interface.

USER-VISIBLE IMPROVEMENTS SINCE v1.4.0

- Added a new option for ORF finding, called `fast-naive`, which is a very fast internal implementation.
- Added the possibility of bypassing ORF finding altogether by providing prodigal outputs directly (or any other gene prediction in the right format)
- Command line argument checking is more exhaustive instead of exiting at first error
- Added `--quiet` flag to reduce the amount of output printed
- Better `--help` (group required arguments separately)
- Add `--output-compression` option to compress outputs
- Add `--tag-output` option which allows for control of the output filenames (and also makes them anvi'o compatible; see discussion at [#123](https://github.com/BigDataBiology/SemiBin/issues/123))
- Add contig->bin mapping table ([#123](https://github.com/BigDataBiology/SemiBin/issues/123))
- `SemiBin.main.main1` and `SemiBin.main.main2` can now be called as a function with command line arguments (`main1` corresponds to _SemiBin1_ and `main2` corresponds to _SemiBin2_)

```python
import SemiBin.main
...
SemiBin.main.main2(['single_easy_bin', '--input-fasta', ...])
```
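
As a slightly fuller illustration of calling `main2` with a list of arguments, the sketch below builds the list separately from the call. The helper names `build_semibin2_args` and `run_semibin2` are hypothetical, and the `--input-bam` and `--output` flags are assumptions that do not appear in this commit; only `single_easy_bin`, `--input-fasta`, and `--quiet` are taken from the text above.

```python
def build_semibin2_args(fasta, bams, output_dir, quiet=True):
    """Build an argument list in the shape main2 is shown accepting above.

    `--input-bam` and `--output` are assumed flag names (not in this commit).
    """
    args = ["single_easy_bin", "--input-fasta", str(fasta)]
    args += ["--input-bam", *[str(b) for b in bams]]
    args += ["--output", str(output_dir)]
    if quiet:
        # `--quiet` is one of the flags added in this release.
        args.append("--quiet")
    return args


def run_semibin2(args):
    """Call SemiBin2 in-process; requires SemiBin >= 1.5.0 to be installed."""
    # Imported lazily so build_semibin2_args works without SemiBin installed.
    import SemiBin.main
    SemiBin.main.main2(args)
```

Keeping the list construction separate from the call makes it possible to log or inspect the arguments before running a binning job.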
1 parent ac618b1 commit fc0d22c

4 files changed: 30 additions and 12 deletions.
ChangeLog

Lines changed: 6 additions & 6 deletions
@@ -1,17 +1,17 @@
-Unreleased
+Version 1.5.0 (SemiBin2 beta) Jan 17 2023 by BigDataBiology
 * Add `SemiBin2` script
 * Added naive ORF finder
-* Make command line arguments more flexible for --sequencing-type argument
 * Add `--prodigal-output-faa` argument (#113)
+* Make command line arguments more flexible for --sequencing-type argument
 * Argument checking is more exhaustive instead of exiting at first error
 * Add `--quiet` argument
-* Better `--help` (group required arguments separately)
 * Add `--compression` option
-* Make SemiBin.main.main callable with a list of arguments
 * Add `--tag-output` option
-* Add contig->bin mapping table (#123)
+* Better `--help` (group required arguments separately)
+* Make SemiBin.main.main2 callable with a list of arguments
+* Add contig -> bin mapping table (#123)
 
-Version 1.4.0 Dec 2022 by BigDataBiology
+Version 1.4.0 Dec 15 2022 by BigDataBiology
 * Provide binning algorithm for assemblies from long read
 * Add `--allow-missing-mmseqs2` flag to `check_install` subcommand
 * Run Prodigal in multiple jobs without multiprocessing (#106)

SemiBin/semibin_version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = '1.4.0'
+__version__ = '1.5.0'

docs/semibin2.md

Lines changed: 16 additions & 3 deletions
@@ -7,13 +7,25 @@ They have the same functionality, but slightly different interfaces. The exact
 interface to `SemiBin2` should be considered as unstable (while we will strive
 to maintain backwards compatibility if you call the `SemiBin` script).
 
-# Differences between SemiBin2 and SemiBin1
+## Upgrading to SemiBin2
+
+1. If you are using the `easy_*` workflows, then they will probably continue to
+work exactly the same (except that you will get better results faster).
+2. Outputs are now **always** in a directory called `output_bins`.
+3. By default, bins are in file named as `SemiBin_{label}.fa.gz` (and
+compressed with _gzip_ as the name indicates).
+
+Points `2` and `3` may require some minor modifications to wrapper scripts.
+
+## Longer list of differences between SemiBin2 and SemiBin1
 
 The biggest different is that the default training mode is self-supervised mode.
 
 - Output bins are now **always** in a directory called `output_bins` (in
-- Output filenames are now anvi'o compatible (effectively, the default value of `--tag-output` is `SemiBin`) (see discussion in [#123](https://github.com/BigDataBiology/SemiBin/issues/123))
 _SemiBin1_, it actually depended on which parameters were used)
+- Output filenames are now anvi'o compatible (effectively, the default value of
+`--tag-output` is `SemiBin`), see discussion at
+[#123](https://github.com/BigDataBiology/SemiBin/issues/123).
 - `--compression` defaults to `gz` (instead of `none`)
 - ORF finder defaults to the `fast-naive` internal ORF finder
 - `--write-pre-reclustering-bins` is `False` by default
@@ -24,5 +36,6 @@ The biggest different is that the default training mode is self-supervised mode.
 A few arguments that were deprecated before are completely removed:
 - `--recluster`: it did nothing already as reclustering is default
 - `--mode`: Use `--train-from-many`
-- `--training-type`: Use `--semi-supervised` to use semi-supervised learning (although that is also deprecated)
+- `--training-type`: Use `--semi-supervised` to use semi-supervised learning
+(although that is also deprecated)
 
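
The upgrade notes added above say that wrapper scripts may need small changes because bins now land in `output_bins` and are named `SemiBin_{label}.fa.gz`. A minimal sketch of such an adjustment follows; it assumes `output_bins` sits directly under the directory passed as the SemiBin2 output directory, and the helper names `list_bins` and `count_contigs` are hypothetical.

```python
import gzip
from pathlib import Path


def list_bins(output_dir, tag="SemiBin"):
    """Return gzip-compressed bin files named <tag>_<label>.fa.gz.

    Assumes bins live in <output_dir>/output_bins, per the upgrade notes.
    """
    bins_dir = Path(output_dir) / "output_bins"
    return sorted(bins_dir.glob(f"{tag}_*.fa.gz"))


def count_contigs(bin_path):
    """Count FASTA header lines ('>') in one gzip-compressed bin file."""
    # Text mode ("rt") makes gzip decode bytes to str line by line.
    with gzip.open(bin_path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A wrapper that previously globbed for uncompressed FASTA files in a parameter-dependent location could switch to `list_bins(output_dir)` and read each file through `gzip.open`. If `--tag-output` is changed from its default, the `tag` argument would need to match.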

docs/whatsnew.md

Lines changed: 7 additions & 2 deletions
@@ -1,6 +1,11 @@
 # What's New
 
-## Unreleased github version
+## Version 1.5.0 (SemiBin2 beta)
+
+*Released Jan 17, 2023*
+
+Big change is the addition of a `SemiBin2` script, which is still experimental, but should be a slightly nicer interface.
+See [[upgrading to SemiBin2](semibin2)]
 
 ### User-visible improvements
 
@@ -10,7 +15,7 @@
 - Added `--quiet` flag to reduce the amount of output printed
 - Better `--help` (group required arguments separately)
 - Add `--output-compression` option to compress outputs
-- Add `--tag-output` option which allows for control of the output filenames (and also makes the anvi'o compatible)
+- Add `--tag-output` option which allows for control of the output filenames (and also makes the anvi'o compatible — see discussion at [#123](https://github.com/BigDataBiology/SemiBin/issues/123).
 - Add contig->bin mapping table ([#123](https://github.com/BigDataBiology/SemiBin/issues/123))
 - `SemiBin.main.main1` and `SemiBin.main.main2` can now be called as a function with command line arguments (`main1` corresponds to _SemiBin1_ and `main2` corresponds to _SemiBin2_)
 
