DrugPredict

A full-stack bioinformatics application that combines Python-based molecular analysis with a modern React/Next.js web interface for AI-powered drug discovery and compound evaluation.

Quick Start

Prerequisites

Node.js 18+ and npm
Python 3.8+ with conda/pip
Git for version control

Installation

Clone the repository

git clone https://github.com/williamhuang3/ml-based-drug-identifier.git
cd ml-based-drug-identifier

Set up Python environment

# Install Python dependencies
pip install -r requirements.txt

# Or using conda
conda install -c conda-forge rdkit -y
conda install -c conda-forge bash

Set up Node.js environment

# Install frontend dependencies
npm install

Start the development servers

# Option 1: Start both servers together
npm run dev-full

# Option 2: Start servers separately (two terminals)
npm run flask-dev    # Terminal 1: Flask backend
npm run dev          # Terminal 2: Next.js frontend

Open your browser and navigate to http://localhost:3000 (frontend) or http://localhost:5001 (API).

Usage

Web Interface

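Start the application with npm run dev-full (npm run dev alone starts only the Next.js frontend, so the Flask backend must already be running)
Enter a target name (e.g., "Coronavirus", "EGFR") or ChEMBL ID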
Click "Search & Analyze" to run the analysis pipeline
View results in the organized tabs:
- Overview: Summary statistics and target information
- Compounds: Detailed compound data table
- Statistics: Mann-Whitney U test results
- Visualizations: Molecular descriptor plots
- ML Predictions: Random Forest regression results

Command Line (Python)

# Run the Python analysis directly
python main.py

Follow the prompts to:

Enter a biological target for analysis
Wait for ChEMBL data retrieval and processing
Run PaDEL descriptor calculation: bash padel.sh
View generated plots and statistical results

📊 Analysis Pipeline

Target Query: Search the ChEMBL database for compounds targeting a specific protein
Data Preprocessing: Filter and clean compound data, remove duplicates
Bioactivity Classification: Label compounds based on IC50 thresholds
Molecular Descriptors: Calculate Lipinski descriptors using RDKit
Statistical Testing: Perform Mann-Whitney U tests between active/inactive groups
Visualization: Generate box plots, scatter plots, and distribution charts
Machine Learning: Train Random Forest model using PaDEL descriptors
Prediction: Generate IC50 predictions and evaluate model performance
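
To make the Statistical Testing step more concrete, here is a minimal sketch of the Mann-Whitney U statistic computed by direct pairwise comparison. It is not the project's own code: the function name mann_whitney_u and the sample molecular weights are hypothetical, chosen only for illustration. The sketch returns the U statistic for each group and does not compute a p-value.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y) for two samples.

    U_x counts the pairs (xi, yj) with xi > yj; a tied pair counts as 0.5.
    U_x + U_y always equals len(x) * len(y).
    """
    if not x or not y:
        raise ValueError("both groups must be non-empty")
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical molecular weights for the two bioactivity classes.
active_mw = [310.4, 325.1, 298.7, 342.0]
inactive_mw = [410.2, 398.5, 325.1, 455.9]

u_active, u_inactive = mann_whitney_u(active_mw, inactive_mw)
print(f"U(active) = {u_active}, U(inactive) = {u_inactive}")
```

A U value near zero for one group means its values are almost always lower than the other group's. For a p-value, scipy.stats.mannwhitneyu is the usual choice; the pairwise loop here is quadratic and is meant for reading, not for large datasets.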

🎯 Key Metrics

IC50 Classification Thresholds:
- Active: ≤ 1,000 nM
- Intermediate: > 1,000 nM and < 10,000 nM
- Inactive: ≥ 10,000 nM
Lipinski Descriptors:
- Molecular Weight (MW)
- Lipophilicity (LogP)
- Hydrogen Bond Donors
- Hydrogen Bond Acceptors
Model Performance Metrics:
- R² Score (coefficient of determination)
- RMSE (Root Mean Square Error)
- MAE (Mean Absolute Error)
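
The thresholds and performance metrics above can be sketched in plain Python as follows. This is an illustrative sketch, not the project's implementation: the names classify_ic50, regression_metrics, ACTIVE_MAX_NM, and INACTIVE_MIN_NM are hypothetical, and the sketch assumes that the boundary values 1,000 nM and 10,000 nM belong to the active and inactive classes respectively, as the ≤ and ≥ signs suggest.

```python
import math
from typing import Dict, Sequence

# Thresholds in nM, taken from the IC50 classification list above.
ACTIVE_MAX_NM = 1_000.0
INACTIVE_MIN_NM = 10_000.0


def classify_ic50(ic50_nm: float) -> str:
    """Label a compound from its IC50 value in nM."""
    if ic50_nm <= ACTIVE_MAX_NM:
        return "active"
    if ic50_nm >= INACTIVE_MIN_NM:
        return "inactive"
    return "intermediate"


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute R², RMSE, and MAE for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        # R² is undefined when all observed values are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Hypothetical IC50 values (nM) and model outputs, for illustration only.
labels = [classify_ic50(v) for v in (250.0, 5_000.0, 20_000.0)]
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(labels, metrics)
```

In practice, scikit-learn's r2_score, mean_squared_error, and mean_absolute_error compute the same quantities; the hand-written versions here only make the definitions explicit.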

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

William Huang - Project Creator
Data Professor (YouTube) - Inspiration and tutorials
ChEMBL Database - Compound and bioactivity data
