GitHub - ChengyuanSha/Smart-Alignment: Global Alignment with Scoring Matrix, Custom weight, Affine Gap Penalty and Bounded Dynamic Programming acceleration.

Global Alignment with Scoring Matrix, Custom Weight, Affine Gap Penalty and Bounded Dynamic Programming (DP) Acceleration

Introduction

Every Rosalind node have its own file in the format ACRN.py (in algorithms folder) where ACRN is the acronym for the node in the tree view
The main function in each node is main_XXX, eg. main function in ACRN.py ismain_ACRN(...) the main function input is what it says on the rosalind site.
Testing function is named test_function_name, eg. for ACRN.py -> test_ACRN.py
There are multiple tests per function, first two are positive, last one is negative. Test data are named form 1-3 in dataset (except for extension, there are 6 dataset, first 3 are used by experiment).

Code quality

Code quality and test coverage is ensured by the continuous integration server Codacy and python Coverage. Codacy uses Radon by default to calculate metrics from the source code.

We achieved 100% coverage on all algorithm files. Check Coverage: Run the following commands in project root directory.

coverage run -m unittest discover
coverage report

Running Instruction

Please install all requirements from requirements.txt.

figures.py contains the efficiency experiment (Comparing time efficiency between unbounded algorithm and bounded acceleration) and comparing optimal scores experiment (Comparing alignment scores from bounded algorithm with the optimal scores from unbounded algorithm).

experiments.py contains all other experiments performed by us.

Function usage

GAFF_extended(s, t, scoring_matrix, gap, gap_ext, conserved_seq="", conserved_strength=0, bound=-1,
                  ignore_start_gaps=False, ignore_end_gaps=False, auto_bound = False):

    Global Alignment with Scoring Matrix, Custom Weight, Affine Gap Penalty and Bounded DP Acceleration.

    Inputs: Two protein strings s and t in FASTA format.
            gap: a negative int representing gap opening penalty.
            gap_ext: a negative int representing gap extension penalty.
            conserved_seq: increase the weight on letters in the input sequence (string). Default empty string.
            conserved_strength: weight on conserved_seq. Default 0.
            bound: Define the range of bound dynamic programming. Negative number will run unbound version.
                   Default -1.
            auto_bound: Auto calculate the bound size using DNA length difference, default False.
            ignore_start_gaps: ignoring start gaps in alignment, default False.
            ignore_end_gaps: ignoring end gaps in alignment, default False.

    Returns: The maximum alignment score between s and t, followed by two augmented strings
             s′ and t′ representing an optimal alignment of s and t.

More detailed explanation

Our academic paper can be found in paper.pdf.

Name		Name	Last commit message	Last commit date
Latest commit History 95 Commits
algorithms		algorithms
datasets		datasets
tests		tests
.coveragerc		.coveragerc
LICENSE		LICENSE
README.md		README.md
paper.pdf		paper.pdf
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Global Alignment with Scoring Matrix, Custom Weight, Affine Gap Penalty and Bounded Dynamic Programming (DP) Acceleration

Introduction

Code quality

Running Instruction

Function usage

More detailed explanation

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

ChengyuanSha/Smart-Alignment

Folders and files

Latest commit

History

Repository files navigation

Global Alignment with Scoring Matrix, Custom Weight, Affine Gap Penalty and Bounded Dynamic Programming (DP) Acceleration

Introduction

Code quality

Running Instruction

Function usage

More detailed explanation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages