# ClinicalGPT Medical Assistant

A sophisticated medical assistant application that combines large language models with trusted medical sources to provide accurate medical information and analysis.
- Features
- System Architecture
- Quick Start
  - Prerequisites
  - Installation
  - Using the Application
- Configuration
  - Environment Variables
  - Trusted Domains
- Key Components
  - Server (`server/`)
  - Utils (`utils/`)
- API Reference
  - Endpoints
  - Sample Request
- Security
- Contributing
- License
- Acknowledgments
## Features

- Intelligent Medical Queries: Get accurate responses to medical questions using state-of-the-art language models
- Web Search Integration: Automatic search and validation from trusted medical sources
- File Analysis: Process medical documents including:
  - Text files (`.txt`)
  - CSV data files
  - JSON documents
  - Medical images
  - PDF documents
- Modern Web Interface: Responsive design with real-time feedback
- History Management: Track and review past queries and analyses
- Multi-Device Support: Intelligent hardware acceleration on:
  - NVIDIA GPUs (CUDA)
  - AMD GPUs (ROCm)
  - Apple Silicon (MPS)
  - Intel NPUs
  - CPU fallback
- Medical Disclaimer: Transparent communication about AI limitations and the informational nature of responses
## System Architecture

```mermaid
graph TB
    %% Client Layer
    subgraph "Client Layer"
        WI[Web Interface]
        APIC[API Clients]
    end

    %% API Layer
    subgraph "API Layer"
        API[API Endpoints]
        HE[Health Endpoints]
        QE[Query Endpoints]
        FE[File Processing Endpoints]
        ME[Medical Term Detection]
    end

    %% Core Services
    subgraph "Core Services"
        QP[Query Processor]
        FP[File Processor]
        WS[Web Search Integration]
    end

    %% Model Layer
    subgraph "Model Management"
        ML[Model Loader]
        IE[Inference Engine]
        subgraph "Distribution Strategies"
            MP[Model Parallelism]
            PP[Pipeline Parallelism]
            LO[Layer Offloading]
        end
    end

    %% Hardware Layer
    subgraph "Hardware Acceleration"
        CUDA[NVIDIA CUDA]
        ROCM[AMD ROCm]
        MPS[Apple Silicon]
        NPU[Intel NPUs]
        CPU[CPU Fallback]
    end

    %% External Services
    subgraph "External Services"
        subgraph "Medical Data Sources"
            MAYO[Mayo Clinic]
            CDC[CDC]
            NIH[NIH]
            WEBMD[WebMD]
            PUBMED[PubMed]
            WHO[World Health Org.]
            REUTERS[Reuters Health]
        end
        OCR[OCR Services]
        PDF[PDF Processing]
    end

    %% Connections - Client to API
    WI --> API
    APIC --> API

    %% API Layer connections
    API --> HE
    API --> QE
    API --> FE
    API --> ME

    %% API to Core Services
    QE --> QP
    FE --> FP
    QP --> WS

    %% Core Services to Model Management
    QP --> ML
    QP --> IE
    FP --> ML
    FP --> IE

    %% Model Management internal connections
    ML --> MP
    ML --> PP
    ML --> LO
    MP --> IE
    PP --> IE
    LO --> IE

    %% Hardware Acceleration
    IE --> CUDA
    IE --> ROCM
    IE --> MPS
    IE --> NPU
    IE --> CPU

    %% External Services connections
    WS --> MAYO
    WS --> CDC
    WS --> NIH
    WS --> WEBMD
    WS --> PUBMED
    WS --> WHO
    WS --> REUTERS
    FP --> OCR
    FP --> PDF
```
```mermaid
sequenceDiagram
    participant User
    participant WebUI as Web Interface
    participant APILayer as API Endpoints
    participant QueryProc as Query Processor
    participant FileProc as File Processor
    participant WebSearch as Web Search Integration
    participant ModelMgmt as Model Management
    participant InfEngine as Inference Engine
    participant DistStrat as Distribution Strategies
    participant HWAccel as Hardware Acceleration
    participant ExtSrc as External Medical Sources

    %% User submits a query
    User->>WebUI: Enters medical query
    WebUI->>APILayer: POST /api/query
    APILayer->>QueryProc: Process query request

    %% Web search if enabled
    alt Web search enabled
        QueryProc->>WebSearch: Search for medical information
        WebSearch->>ExtSrc: Query trusted medical websites
        ExtSrc-->>WebSearch: Return medical information
        WebSearch-->>QueryProc: Return search results
        QueryProc->>QueryProc: Enhance prompt with web results
    end

    %% Model processing
    QueryProc->>ModelMgmt: Request model inference
    ModelMgmt->>DistStrat: Apply distribution strategy

    %% Choose appropriate hardware acceleration
    alt NVIDIA GPU Available
        DistStrat->>HWAccel: Use CUDA acceleration
    else AMD GPU Available
        DistStrat->>HWAccel: Use ROCm acceleration
    else Apple Silicon
        DistStrat->>HWAccel: Use MPS acceleration
    else Intel NPU
        DistStrat->>HWAccel: Use Intel NPU acceleration
    else
        DistStrat->>HWAccel: Use CPU fallback
    end

    %% Inference process
    HWAccel-->>InfEngine: Hardware-accelerated processing
    InfEngine-->>ModelMgmt: Return model response
    ModelMgmt-->>QueryProc: Return formatted response

    %% Combine results
    QueryProc-->>APILayer: Return combined results
    APILayer-->>WebUI: Return JSON response
    WebUI->>WebUI: Format response with Markdown
    WebUI->>WebUI: Apply medical term highlighting
    WebUI-->>User: Display formatted response

    %% Alternative flow for file upload
    rect rgb(71, 73, 73)
        Note over User,WebUI: File Upload Flow
        User->>WebUI: Uploads medical file
        WebUI->>APILayer: POST /api/process-file
        APILayer->>FileProc: Process uploaded file
        alt PDF Document
            FileProc->>FileProc: Extract text and structure
        else Image File
            FileProc->>FileProc: Perform OCR
        else CSV/JSON
            FileProc->>FileProc: Parse data structure
        else Text File
            FileProc->>FileProc: Process plain text
        end
        FileProc->>ModelMgmt: Request file analysis
        ModelMgmt->>InfEngine: Generate analysis
        InfEngine-->>ModelMgmt: Return analysis
        ModelMgmt-->>FileProc: Return analysis results
        FileProc-->>APILayer: Return processed results
        APILayer-->>WebUI: Return JSON response
        WebUI->>WebUI: Format file analysis results
        WebUI-->>User: Display file analysis
    end
```
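The hardware-selection branch in the diagram reduces to a preference chain. A minimal sketch of that chain (pure Python with the availability set passed in; real detection would query APIs such as `torch.cuda.is_available()`, and the backend names here simply mirror the diagram):

```python
# Backends in the order the diagram tries them; illustrative only --
# the project's actual detection logic lives in its model-management code.
PREFERENCE_ORDER = ("cuda", "rocm", "mps", "npu")

def pick_backend(available):
    """Return the first preferred backend present in `available`,
    falling back to the CPU, mirroring the alt-chain above."""
    for backend in PREFERENCE_ORDER:
        if backend in available:
            return backend
    return "cpu"
```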
## Quick Start

### Prerequisites

- Python 3.12 or higher
- PyTorch-compatible hardware (GPU recommended)
- Internet connection for web search features
- Microsoft C++ Build Tools: required on Windows to compile Python packages with C extensions (e.g., some dependencies for advanced file processing). Download them via the Visual Studio Build Tools installer and make sure "C++ build tools" is selected during installation.
### Installation

1. Clone the repository:

   ```shell
   git clone [repository-url]
   cd mastersDegree-finalProject
   ```

2. Run the setup script:

   ```shell
   run.bat
   ```

   The script will:
   - Create a virtual environment
   - Install dependencies
   - Configure PyTorch for your hardware
   - Start the server
   - Open the web interface in your default browser
### Using the Application

Access the web interface at `http://localhost:5000` in your browser; it opens automatically when you start the server with `run.bat`.
## Configuration

### Environment Variables

- `FLASK_DEBUG`: Enable/disable debug mode
- `PORT`: Server port (default: 5000)
- `MODEL_PATH`: Path to the model (default: `HPAI-BSC/Llama3.1-Aloe-Beta-8B`)
- `USE_INTEL_NPU`: Enable Intel NPU acceleration
- `USE_AMD_NPU`: Enable AMD NPU acceleration
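These variables could be read along the following lines (a sketch using the documented defaults; the `"1"` truthy convention and the helper itself are assumptions, not the server's actual implementation):

```python
import os

def load_settings(env=None):
    """Read the documented environment variables with their documented
    defaults. Illustrative only; the server may read them differently."""
    env = os.environ if env is None else env
    return {
        "debug": env.get("FLASK_DEBUG", "0") == "1",        # assumed convention
        "port": int(env.get("PORT", "5000")),
        "model_path": env.get("MODEL_PATH", "HPAI-BSC/Llama3.1-Aloe-Beta-8B"),
        "use_intel_npu": env.get("USE_INTEL_NPU", "0") == "1",
        "use_amd_npu": env.get("USE_AMD_NPU", "0") == "1",
    }
```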
### Trusted Domains

Edit `config.ini` to modify the list of trusted medical sources.
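The file's exact layout isn't shown here; a plausible `config.ini` section and a `configparser`-based reader might look like this (section name, key name, and domain list are hypothetical):

```python
import configparser

# Hypothetical config.ini layout -- the real file's section and key
# names may differ; this only shows one way such a list could be parsed.
SAMPLE_CONFIG = """
[trusted_domains]
domains = mayoclinic.org, cdc.gov, nih.gov, webmd.com, pubmed.ncbi.nlm.nih.gov, who.int
"""

def load_trusted_domains(text):
    """Parse a comma-separated domain list from an INI-style config."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("trusted_domains", "domains", fallback="")
    return {d.strip() for d in raw.split(",") if d.strip()}
```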
## Key Components

### Server (`server/`)

- Flask-based REST API
- Model management and inference
- Modular design with Strategy pattern for model distribution
- Support for model parallelism, pipeline parallelism, and partial offloading
- File processing and analysis
- Web search integration
- Web scraping functionality
- Modular architecture with provider-specific implementations
- Trusted domain verification
### Utils (`utils/`)

- File processing utilities
- Support for various document formats
- Medical term extraction and detection
- Text analysis tools
### Design Principles

- Modular Architecture: Components are organized into focused, reusable modules
- Strategy Pattern: Used for model distribution across different hardware setups
- Legacy Support: Backward compatibility layers for evolving interfaces
- Clear Separation of Concerns: Each module handles specific functionality
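As an illustration of the Strategy pattern described above, the distribution strategies could share one interface for placing model layers on devices. A toy sketch (class and method names are invented for this example, not the project's API):

```python
# Toy Strategy-pattern sketch: each strategy decides where layers live.
class DistributionStrategy:
    def place(self, layers, devices):
        raise NotImplementedError

class ModelParallelism(DistributionStrategy):
    """Split layers round-robin across all available devices."""
    def place(self, layers, devices):
        return {layer: devices[i % len(devices)]
                for i, layer in enumerate(layers)}

class LayerOffloading(DistributionStrategy):
    """Keep the first `budget` layers on the accelerator, offload the rest."""
    def __init__(self, budget):
        self.budget = budget  # how many layers fit on the accelerator
    def place(self, layers, devices):
        return {layer: (devices[0] if i < self.budget else "cpu")
                for i, layer in enumerate(layers)}
```

The benefit is that callers hold a `DistributionStrategy` and never branch on hardware themselves; swapping strategies is a one-line change.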
## API Reference

### Endpoints

- `GET /api/health`: Server health check
- `POST /api/query`: Process medical queries
- `POST /api/process-file`: Analyze medical files
- `GET /api/device-info`: Hardware acceleration info
- `GET /api/info`: API capabilities and status
### Sample Request

`POST /api/query`

```json
{
  "query": "What are the symptoms of type 2 diabetes?",
  "search_web": true
}
```
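Using only the standard library, a client for this endpoint might be built like so (illustrative; it assumes the JSON body shown above and the default port, and the response schema is not specified here):

```python
import json
from urllib import request

API_URL = "http://localhost:5000/api/query"  # default port from this README

def build_query_request(query, search_web=True, url=API_URL):
    """Build a urllib Request matching the sample body above."""
    body = json.dumps({"query": query, "search_web": search_web}).encode("utf-8")
    return request.Request(url, data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")

# With the server running:
#   with request.urlopen(build_query_request(
#           "What are the symptoms of type 2 diabetes?")) as resp:
#       print(json.load(resp))
```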
## Security

- Content validation and sanitization
- Trusted domain verification
- Input length restrictions
- Error handling and logging
- Medical disclaimer and usage limitations clearly stated
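For example, the input length restriction could be enforced along these lines (a minimal sketch; the limit and rules are illustrative, not the server's actual policy):

```python
MAX_QUERY_LENGTH = 2000  # hypothetical limit, not the server's real value

def validate_query(query):
    """Return a cleaned query string or raise ValueError."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    cleaned = query.strip()
    if len(cleaned) > MAX_QUERY_LENGTH:
        raise ValueError("query exceeds maximum length")
    return cleaned
```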
## Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Submit a pull request
## License

This project is licensed under the CCv1 License - see the LICENSE file for details.
## Acknowledgments

- Hugging Face for model hosting
- Trusted medical sources (NIH, CDC, Mayo Clinic, WHO, Reuters, etc.)
- Open-source medical research community