Skip to content

SaketJha-323/Liver_Cirrhosis_Stage_Detection_System

Repository files navigation

🧬 Liver Cirrhosis Stage Detection Project

This project focuses on predicting the histologic stage of liver cirrhosis based on various medical and clinical indicators of a patient. Using a dataset from a Mayo Clinic study on primary biliary cirrhosis (PBC) conducted between 1974 and 1984, machine learning models were trained to classify patients into one of three stages of liver damage:

  • Stage 1Mild
  • Stage 2Moderate
  • Stage 3Severe

📊 Key Highlights

  • Dataset Source:

    • Mayo Clinic study on Primary Biliary Cirrhosis (PBC).
    • Period: 1974–1984.
  • Target Variable:

    • Stage of cirrhosis (1, 2, or 3).
  • Features Used:
    The dataset includes medical and biochemical attributes such as:

    • N_Days: Duration from registration to death/transplant/study end
    • Status: Patient status – C (Censored), CL (Censored due to liver transplant), D (Death)
    • Drug: Drug administered – D-penicillamine or placebo
    • Age: Age in days
    • Sex: M or F
    • Ascites, Hepatomegaly, Spiders: Presence of medical symptoms (Y/N)
    • Edema: Level of edema severity (N/S/Y)
    • Biochemical values:
      • Bilirubin (mg/dl), Cholesterol (mg/dl), Albumin (gm/dl), Copper (ug/day)
      • Alk_Phos (U/l), SGOT (U/ml), Triglycerides (mg/dl)
      • Platelets (per 1000 ml), Prothrombin time (sec)

🛠 Feature Engineering

  • Converted categorical features (e.g., Sex, Drug, Edema, etc.) to numerical format.
  • Handled missing data using appropriate imputation strategies.
  • Normalized numerical features for improved model performance.

🤖 Models Evaluated

A range of classification models were evaluated using StackingClassifier with different meta (final) estimators:

  • Models Tested:

    • Logistic Regression59.64% accuracy
    • SGD Classifier58.38% accuracy
    • LDA59.40% accuracy
    • Random Forest94.76% accuracy
    • XGBoost95.54% accuracy
    • SVM84.16% accuracy
    • KNN88.42% accuracy
    • Gaussian50.54% accuracy
  • Base Learners:

    • RandomForestClassifier
    • XGBClassifier
  • Final Estimators Tested:

    • SVC95.90% accuracy
    • LogisticRegression95.74% accuracy
    • SGDClassifier95.64% accuracy
    • LinearDiscriminantAnalysis95.80% accuracy
    • RandomForestClassifier95.86% accuracy
    • XGBClassifier96.14% accuracy
  • Final Estimator:

    • XGBClassifier was chosen as the final meta-model due to its superior performance.

🧪 Prediction Results

  • Test 1:

    • Input: Synthetic random patient data
    • Output:
      • Predicted Cirrhosis Stage: 1Stage 1 (Mild)
    • ✔️ Shows model is capable of detecting low-risk cases.
  • Test 2:

    • Input: Real sample from original dataset
    • Output:
      • Predicted Cirrhosis Stage: 3Stage 3 (Severe)
    • ✔️ Confirms model can detect advanced liver damage from clinical data.

📌 Usage

  • Input patient features (age, lab results, symptoms, etc.) in the required format.
  • The model will output the predicted stage of cirrhosis with a corresponding severity label.
  • Can assist medical professionals in early detection and prioritization of treatment.

Python Version

  • Python 3.12.5