BEDICT-V2: Predicting base editing outcomes with an attention-based deep learning algorithm

Overview

BEDICT-V2 is an attention-based deep learning model that predicts base editing outcomes. This repository provides the source code and instructions for using the model. A web app is also available at https://go.bedict.app/

---

The folder structure:

BEDICT-V2
├── absolute_efficiency_model
│   ├── models
│   ├── output
│   └── src
├── dataset
├── main_py_files
│   ├── train.py
│   ├── ....
│   └── inference.py
├── notebooks
├── proportion_model
│   ├── output
│   └── src
├── utils
├── web_application
│   ├── templates
│   ├── static
│   └── app.py
├── README.md
└── requirements.txt

Environment Setup

Set up the environment

Create a virtual environment and install the required dependencies using Conda:

# Create a virtual environment
conda create --name bedict_v2

# Activate the virtual environment
conda activate bedict_v2

# Install Python
conda install -c anaconda python=3.10

# Install PyTorch (adjust the cudatoolkit version to match your CUDA setup)
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

# Install dependencies
pip install -r requirements.txt
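
After installation, a quick sanity check can confirm that the interpreter and PyTorch are visible from the new environment. The snippet below is a minimal sketch and not part of the repository; the helper name `check_environment` is hypothetical, and it only reports what it finds.

```python
import importlib.util
import sys


def check_environment():
    """Report the Python version and whether PyTorch/CUDA are available.

    Hypothetical helper for illustration; not part of BEDICT-V2.
    """
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if info["torch_installed"]:
        import torch  # imported lazily so the check also works without PyTorch

        info["torch_version"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch_installed` is reported as false, the PyTorch install step above probably did not complete in the active environment.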

Usage

Run Inference on Custom Sequences

You can use the pre-trained BEDICT-V2 models to run inference on your own DNA sequences. Choose between running locally via a notebook or using our web app.

🧪 Option 1: Local Inference Using the Notebook

Prepare your input file:
- Create an Excel file with:
  - Target sequences (20 bases long)
  - PAM sequences (4 bases long)
- Place this file in the dataset/ directory.

Open the notebook:
Navigate to the notebooks/ folder and open Inference_user_defined_sequence.ipynb.

Configure your run:
In the notebook, specify:
- The input Excel file name
- The editor name (e.g., ABE8e-NG)
- Whether you're predicting in vivo or in vitro

Run inference:
The notebook will automatically run:
- The absolute efficiency model
- The proportional model

It will then merge the predictions into a final result table.
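
To make the merge step more concrete, here is a minimal sketch of one way two such predictions could be combined for a single target. It is an assumption, not the notebook's confirmed logic: it supposes the absolute efficiency model yields one overall edited fraction per target, the proportional model yields each outcome's share among edited reads, and the final table holds their product. The function name `merge_predictions` and the example sequences are hypothetical.

```python
def merge_predictions(overall_efficiency, outcome_proportions):
    """Combine the two model outputs for one target sequence.

    ASSUMPTION (not confirmed by this README): `overall_efficiency` is the
    predicted fraction of edited reads (0..1), and `outcome_proportions` maps
    each edited outcome sequence to its share among edited reads.
    """
    if not 0.0 <= overall_efficiency <= 1.0:
        raise ValueError("overall_efficiency must be between 0 and 1")
    total = sum(outcome_proportions.values())
    if total <= 0:
        raise ValueError("outcome_proportions must have a positive sum")
    # Renormalise so the shares sum to 1, then scale by the overall efficiency.
    return {
        outcome: overall_efficiency * share / total
        for outcome, share in outcome_proportions.items()
    }


# Made-up numbers for illustration only.
example = merge_predictions(
    0.5,
    {"ACGTGCGTACGTAGCTAGCT": 0.75, "ACGTGCGTGCGTAGCTAGCT": 0.25},
)
```

Under this assumption, the merged values for a target add up to its overall efficiency, so each row of the final table can be read as an absolute outcome frequency.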

🌐 Option 2: Use the Web App (Easiest)

The easiest way to use BEDICT-V2 is through our web app at https://go.bedict.app/. Upload your sequences to get predictions with no local setup required.

📦 Note on Pre-trained Models

Pre-trained models are already included in the repository under corresponding folders, such as BEDICT-V2/absolute_efficiency_model/output/...

Train the Model on Your Own Dataset

To train BEDICT-V2 on your own dataset (e.g., screening data), follow the steps below:

1. Prepare the Data

An example dataset is provided in the dataset/ folder. Your dataset should be in Excel format and include the following columns:

- Target protospacer (20 bases)
- PAM sequence (4 bases)
- Outcome sequence (20 bases)
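
Checking sequence lengths before preprocessing can catch malformed rows early. The following is a minimal sketch using the lengths listed above; the keys `protospacer`, `pam`, and `outcome` are hypothetical stand-ins for the actual Excel column names, and the restriction to the letters A, C, G, T is an assumption.

```python
VALID_BASES = set("ACGT")

# Expected lengths taken from the column descriptions above.
EXPECTED_LENGTHS = {"protospacer": 20, "pam": 4, "outcome": 20}


def validate_record(record):
    """Return a list of problems found in one dataset row (empty if valid).

    `record` is a dict with the hypothetical keys 'protospacer', 'pam' and
    'outcome'; the real Excel column names may differ.
    """
    problems = []
    for field, expected_len in EXPECTED_LENGTHS.items():
        seq = str(record.get(field, "")).strip().upper()
        if len(seq) != expected_len:
            problems.append(
                f"{field}: expected {expected_len} bases, got {len(seq)}"
            )
        if not set(seq) <= VALID_BASES:
            problems.append(f"{field}: contains characters other than A, C, G, T")
    return problems


good_row = {
    "protospacer": "ACGTACGTACGTAGCTAGCT",
    "pam": "AGGT",
    "outcome": "ACGTGCGTACGTAGCTAGCT",
}
bad_row = {"protospacer": "ACGTN", "pam": "AGG", "outcome": "ACGTGCGTACGTAGCTAGCT"}
```

Each row of the spreadsheet could be passed through `validate_record` once it has been loaded, and rows with a non-empty result fixed or dropped before running the preprocessing script.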

2. Pre-process the Data

Use the preprocessing script to convert your Excel input into model-ready formats:

python main_py_files/generate_two_stage_model_data.py

3. Train the Model

Navigate to the main_py_files/ directory and run:

python train.py

This will run both the absolute efficiency model and the proportional model.
If needed, you can also run them separately using:

- Absolute_efficiency_main.py for the absolute efficiency model
- trainval_test_proportions_main.py for the proportional model
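
For scripted runs, the two training modes can be wrapped in a small driver. This is a sketch under stated assumptions: it supposes the two stage-specific scripts sit in main_py_files/ next to train.py and need no command-line arguments, neither of which this README confirms. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in this README; the directory is relative to the
# repository root. The location of the two separate scripts is an assumption.
SCRIPTS_DIR = Path("main_py_files")
COMBINED = ["train.py"]
SEPARATE = ["Absolute_efficiency_main.py", "trainval_test_proportions_main.py"]


def training_commands(separate=False):
    """Build the command lines for combined or stage-by-stage training."""
    scripts = SEPARATE if separate else COMBINED
    return [[sys.executable, name] for name in scripts]


def run_training(separate=False):
    """Run each script from main_py_files/, stopping at the first failure."""
    for command in training_commands(separate):
        subprocess.run(command, cwd=SCRIPTS_DIR, check=True)
```

Calling `run_training()` from the repository root mirrors `python train.py`, while `run_training(separate=True)` runs the two stages one after the other.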

Note:
Before training, be sure to specify the appropriate editor (e.g., ABE8e-NG) and whether you're working with in vivo or in vitro conditions in the configuration file.

4. Run Inference

Once the model is trained, navigate to the main_py_files/ directory and run the inference script:

python inference.py

License

uzh-dqbm-cmi/BEDICT-V2
