Multiple Disease Prediction Project ⚕️💉

This project leverages Streamlit to create a web-based application that predicts the likelihood of Diabetes, Heart Disease, and Parkinson's Disease based on user-provided health data.

🟢 Deployed at: [https://hiremeplsthx.streamlit.app/]

Features

✅ Predicts multiple diseases using trained ML models
✅ Displays model performance metrics (Accuracy, Precision, Recall, F1 Score)
✅ Includes sample data for quick testing
✅ Unified prediction interface with all three diseases accessible via the left sidebar
✅ User-friendly interface powered by Streamlit

Installation

Clone this repository:

```
git clone https://github.com/Bloodwingv2/Multiple_Disease_Prediction.git
cd Multiple_Disease_Prediction
```

⚙️ Create and Activate a Conda Environment

⚠️ Recommended: Creating a separate Conda environment helps isolate dependencies. However, you can skip this step if you're comfortable using global packages.

```
conda create -n disease_prediction_env python=3.10
conda activate disease_prediction_env
```

Install the required dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
streamlit run app.py
```

Usage

Select the desired disease prediction option from the sidebar.
Enter your health details in the provided input fields.
Click the Predict button to view the prediction result and model metrics.

✨ Key Enhancements

➕ Added "Healthy" and "Non-Healthy" buttons to simplify testing with pre-existing values.
🔽 Improved usability by integrating multiple diseases under a unified sidebar menu.
📊 Stored model performance metrics (accuracy, precision, etc.) in JSON format for easy data handling and streamlined updates.
📝 Utilized Markdown in Streamlit for a clearer and more informative presentation of results.
🧹 Enhanced code readability with well-structured comments and Markdown descriptions in Jupyter Notebook.
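
Because the metrics live in JSON, they can be loaded and rendered without retraining a model. Below is a minimal sketch of that idea, assuming a flat JSON object that maps metric names to floats. The function names `load_metrics` and `format_metrics_markdown` are hypothetical and are not taken from this repository.

```python
import json


def load_metrics(path):
    """Read a metrics JSON file into a dict of metric name -> float."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def format_metrics_markdown(metrics):
    """Render metrics as a Markdown bullet list with whole-number percentages."""
    lines = []
    for name, value in metrics.items():
        label = name.replace("_", " ").title()  # "f1_score" -> "F1 Score"
        lines.append(f"- **{label}**: {value:.0%}")
    return "\n".join(lines)


# Illustrative values only; in practice these would come from load_metrics(...)
example = {"accuracy": 0.89, "f1_score": 0.87, "recall": 0.85, "precision": 0.88}
print(format_metrics_markdown(example))
```

The returned string is plain Markdown, so it could be passed to a Markdown renderer such as `st.markdown` in Streamlit.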

🔍 Future Enhancements

🔧 Perform a comprehensive code review to improve stability and performance.
📈 Experiment with models such as Random Forest, SVM, and XGBoost for enhanced prediction accuracy.
🎨 Improve the UI design to provide a better user experience.

🗂️ Project Structure (Refer to the Jupyter Notebook and run each cell)

The project follows a structured development pipeline:

Environment Setup ⚙️
Dataset Acquisition 📄
Data Preprocessing 🧪
Model Training and Saving 🧠
Streamlit Deployment 🌐
Enhancements and Testing ✅

📄 Dataset Acquisition

📚 Datasets Used

Pima Indians Diabetes Database (Kaggle)
Indian Parkinson's Patient Records (Kaggle)
Parkinson's Disease Dataset (Kaggle)

📥 Loading the Datasets

```
import pandas as pd

# Load the datasets
diabetes = pd.read_csv("data/diabetes.csv")
heart = pd.read_csv("data/heart.csv")
parkinsons = pd.read_csv("data/parkinsons.csv")
```
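
A quick sanity check right after loading can surface missing values and duplicate rows before preprocessing. The sketch below is illustrative only: the helper name `summarize` and the tiny sample frame (with placeholder column names and values) are assumptions, not code or data from this repository.

```python
import pandas as pd


def summarize(df):
    """Return basic size and data-quality counts for a DataFrame."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing": int(df.isna().sum().sum()),      # total NaN cells
        "duplicates": int(df.duplicated().sum()),   # rows repeating an earlier row
    }


# Placeholder data standing in for a loaded CSV
sample = pd.DataFrame({"Glucose": [148, 85, None, 85], "Outcome": [1, 0, 1, 0]})
summary = summarize(sample)
print(summary)
```

The same helper could be called on each loaded frame, for example `summarize(diabetes)`.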

💾 Saving Processed Data for Future Use

```
# Save processed data
diabetes.to_csv("data/diabetes_cleaned.csv", index=False)
heart.to_csv("data/heart_cleaned.csv", index=False)
parkinsons.to_csv("data/parkinsons_cleaned.csv", index=False)
```

🧪 Data Preprocessing

🩺 Addressed missing values, outliers, and scaling issues for improved model performance.
🔄 Utilized StandardScaler to standardize each feature to zero mean and unit variance, so that features with large numeric ranges do not dominate the model.

```
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training features (X_train) and scale them
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
```
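
The snippet above fits the scaler on the training features. A minimal sketch of the full split-and-scale flow follows, assuming an 80/20 stratified split (the split ratio is not stated in this project) and synthetic placeholder data in place of the real datasets. The important detail is that the test split is transformed with the scaler fitted on the training split, never refitted.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder features and labels (not real patient data)
rng = np.random.default_rng(0)
X = rng.normal(loc=[120.0, 30.0], scale=[25.0, 6.0], size=(200, 2))
y = (X[:, 0] > 120.0).astype(int)

# Hold out 20% for evaluation, keeping the class balance similar in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics
```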

🧠 Train & Save the Models

🤖 Model Selection

Implemented Logistic Regression for its simplicity and effectiveness.
🔍 Future plans include exploring Random Forest, SVM, and XGBoost.

💾 Training & Saving Models with Scalers

```
from sklearn.linear_model import LogisticRegression
import joblib
import json

# Train and save the model
model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# Save model and scaler
joblib.dump({'model': model, 'scaler': scaler}, 'models/diabetes_model.pkl')

# Save performance metrics
metrics = {
    'accuracy': 0.89,
    'f1_score': 0.87,
    'recall': 0.85,
    'precision': 0.88
}

with open("metrics.json", "w") as f:
    json.dump(metrics, f)
```
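
The metric values above are written as literals. As a hedged sketch, they could instead be derived from held-out predictions with scikit-learn's metric functions. The helper name `compute_metrics` and the toy labels below are illustrative assumptions, not code or results from this repository.

```python
import json

from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def compute_metrics(y_true, y_pred):
    """Compute the four reported metrics for binary labels (1 = positive)."""
    return {
        "accuracy": round(float(accuracy_score(y_true, y_pred)), 4),
        "f1_score": round(float(f1_score(y_true, y_pred)), 4),
        "recall": round(float(recall_score(y_true, y_pred)), 4),
        "precision": round(float(precision_score(y_true, y_pred)), 4),
    }


# Toy labels standing in for y_test and model.predict(X_test_scaled)
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = compute_metrics(y_test, y_pred)
print(json.dumps(metrics, indent=2))
```

The resulting dictionary has the same keys as the literal one, so it can be written with `json.dump` in the same way.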

🌐 Streamlit Deployment

🧩 Key Features

🟢 "Fill Healthy Values" and 🔴 "Fill Diabetic Values" buttons simplify testing.
🔽 Implemented dropdown menus for binary options for improved user experience.
🔄 Utilized session states to manage dynamic inputs efficiently.
✅ Models are integrated with scalers to ensure accurate predictions in Streamlit.
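
To illustrate why the scaler is saved alongside the model, here is a minimal sketch of loading such a bundle and applying both at prediction time. The helper name `predict_with_bundle`, the temporary file, and the toy single-feature data are assumptions for illustration; the app's actual code may differ.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def predict_with_bundle(bundle, features):
    """Scale one row of raw inputs with the saved scaler, then classify it."""
    X = np.asarray(features, dtype=float).reshape(1, -1)
    X_scaled = bundle["scaler"].transform(X)
    return int(bundle["model"].predict(X_scaled)[0])


# Toy stand-in for a trained model: one feature, two well-separated classes
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

scaler = StandardScaler()
model = LogisticRegression().fit(scaler.fit_transform(X_train), y_train)

# Round-trip through joblib using a {'model', 'scaler'} dictionary
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_model.pkl")
    joblib.dump({"model": model, "scaler": scaler}, path)
    bundle = joblib.load(path)

low = predict_with_bundle(bundle, [0.5])
high = predict_with_bundle(bundle, [12.5])
print(low, high)
```

Passing raw, unscaled inputs straight to the model would put them on a different scale from the training data, which is the failure this pairing is meant to avoid.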

▶️ Running the Application

```
streamlit run app.py
```

Visit the live app here: [https://hiremeplsthx.streamlit.app/]

⭐ Contributing

If you find this project helpful, consider giving it a ⭐ and sharing your thoughts! Suggestions and improvements are welcome. 😊

📜 License

This project is licensed under the MIT License.

👨‍💻 Developed By

Mirang Bhandari
