SelectionPressureAnalysis

This repository provides a set of custom scripts (a pseudo-pipeline rather than a single automated workflow) for analyzing selection pressure across multiple genes simultaneously.

Pipeline Overview

The pipeline includes the following steps:

1. Retrieve CDS Sequences of Orthologous Genes
   Extract the coding sequence (CDS) data for a set of orthologous genes.
2. Combine the FASTA Sequences
   Merge the individual FASTA files into a single file.
3. Clean and Remove Duplicates
   Clean the sequences and ensure that each FASTA file contains only unique sequences.
4. Align Sequences Using MAFFT
   Align the sequences with MAFFT.
5. Convert FASTA to PHYLIP Format
   Convert the aligned sequences from FASTA to PHYLIP format (version 3.2).
6. Apply pal2nal
   Use pal2nal to convert the protein alignment and the corresponding coding sequences into a codon alignment.
7. Generate Phylogenetic Trees Using RAxML
   Build phylogenetic trees from the aligned sequences using RAxML.
8. Perform CODEML Analysis
   Run CODEML to estimate the selection pressure acting on the gene sequences.
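The pure-Python parts of the steps above (combining FASTA files, removing duplicates, and converting to PHYLIP) can be sketched roughly as follows. This is a minimal illustration, not the code from the repository's scripts: the function names are hypothetical, duplicates are assumed to mean identical sequences (compared case-insensitively), and the PHYLIP output assumes the strict sequential layout with 10-character names.

```python
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into a list of (title, sequence) pairs."""
    records, title, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def combine_fasta(paths):
    """Read several FASTA files and return all records in one list."""
    records = []
    for path in paths:
        records.extend(read_fasta(Path(path).read_text()))
    return records


def remove_duplicates(records):
    """Keep the first record for each distinct sequence (assumption:
    'duplicate' means an identical sequence, ignoring case)."""
    seen, unique = set(), []
    for title, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((title, seq))
    return unique


def to_phylip(records, name_width=10):
    """Format aligned records as sequential PHYLIP text.

    Names are truncated or padded to name_width characters, so titles
    that share a long prefix must be made unique beforehand.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP requires sequences of equal length")
    lines = [f" {len(records)} {lengths.pop()}"]
    for title, seq in records:
        name = title[:name_width].ljust(name_width)
        lines.append(name + seq)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = read_fasta(">geneA\nATGAAA\n>geneB\natgaaa\n>geneC\nATGAAG\n")
    print(to_phylip(remove_duplicates(demo)))
```

Because strict PHYLIP names are cut to a fixed width, truncation and uniqueness of sequence titles need to be handled before conversion, which is presumably why the pipeline treats them as separate concerns.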

To run these scripts, you will need Python and the external tools used in the steps above, including MAFFT, pal2nal, RAxML, and CODEML (part of the PAML package).

Please ensure all required dependencies are installed before running the pipeline.
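A quick way to check that the external tools are available is to look them up on the `PATH` before starting. The sketch below is only an illustration: the executable names (`mafft`, `pal2nal.pl`, `raxmlHPC`, `codeml`) are assumptions and may differ on your system, for example if you installed a differently named RAxML build.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["mafft", "pal2nal.pl", "raxmlHPC", "codeml"]


def missing_tools(names):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```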
