Skip to content

abhijnarc/antibody

Repository files navigation

antibody

Antibody Engineering Project Documentation

Project Overview

Title: Antibody Engineering Using Immunized Rabbit Spleen RNA Sequences
Objective: Identify unique antibody sequences, predict their structures, and evaluate their binding potential through docking studies.
Background: This project aims to discover novel antibody sequences from immunized rabbits, leveraging computational tools for extraction, filtering, clustering, and analysis of immunogenic sequences.


Materials and Methods

Data Acquisition

  • Input Data: RNA sequences from immunized rabbit spleens and control samples.
  • File Formats: FASTQ/FASTA.
  • Source: [Specify source, e.g., sequencing platform].

Tools and Software

Tool/Software Version Purpose
TRUST4 [version] Antibody sequence extraction
MMseqs2 [version] Clustering and filtering sequences
IgFold [version] Antibody structure prediction
HADDOCK3 [name] Docking studies
Python [version] Scripting for automation

Workflow Steps

1. Preprocessing

  • Steps to clean and convert RNA sequences for analysis.
  • Commands or scripts used for preprocessing.

2. Antibody Identification

  • Parameters for running TRUST4 (e.g., alignment thresholds).
  • Key outputs: CDR1, CDR2, CDR3 regions.

3. Pseudo-Sequence Generation

  • Python script to parse and concatenate CDR regions.
  • Example command: python parse_cdrs.py input.fasta output.fasta

4. Clustering and Filtering

  • Clustering:
    mmseqs cluster inputDB outputDB tmpDir --min-seq-id 0.9
  • Filtering:
    mmseqs createsubdb targetDB inputDB filteredDB
  • Explanation of the unique sequence selection process.

5. Structure Prediction

  • Parameters and methods for IgFold.
  • Input and output examples.

6. Docking Studies

  • Docking software setup and scoring metrics.

Results

Data Outputs

  • Initial sequences: [number].
  • Filtered sequences: [number].
  • Unique sequences: [number].

Clustering Statistics

  • Number of clusters formed.
  • Size distribution of clusters.

Structural Predictions

  • Visualizations of antibody structures.

Docking Results

  • Binding scores of top candidates.

Challenges and Solutions

Challenge Solution
Computational bottlenecks during clustering Parallel processing or cloud computing
Ambiguities in CDR extraction Improved regex patterns and manual validation

Key Insights

  • Total unique sequences: [number].
  • Top-ranked candidates based on docking scores.

Reproducibility

Code and Scripts

  • Repository: [GitHub Link]
  • Example README file:
    ## Usage
    1. Preprocess sequences: `script1.py`.
    2. Run TRUST4: `trust4 --parameters`.
    3. Cluster sequences: `mmseqs cluster ...`.

Workflow Automation

  • Automated pipeline using [Nextflow].
  • Include diagram of workflow.

Environment

  • Dependencies:
    conda activate trust4 mmseqs2 igfold

Future Work

  • Experimental validation of top candidates.
  • Extend study to other species or antibody libraries.
  • Incorporate alternative docking algorithms for higher accuracy.

References

  • TRUST4: [cite].
  • MMseqs2: [cite].
  • IgFold: [cite].

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published