A comprehensive AI-powered feedback system for oral presentations in healthcare education, specifically designed for Summative OSCE (Objective Structured Clinical Examination) feedback. This system provides real-time analysis of medical presentations with detailed feedback on clinical reasoning, communication skills, and presentation structure.
- 🎤 Speech-to-Text: Automatic transcription using OpenAI Whisper
- 🧠 AI Analysis: AWS Bedrock with Claude Opus for clinical data extraction
- 📊 Clinical Analysis: OPQRST analysis, gap detection, flow analysis, and communication metrics
- 📝 Rubric Scoring: Automated scoring aligned with OSCE rubric criteria
- 💬 Detailed Feedback: Comprehensive feedback generation with evidence references
- 🌐 Web Interface: Modern React/Next.js frontend with real-time updates
- 📱 Responsive Design: Mobile-friendly interface for various devices
- OPQRST Analysis: Systematic evaluation of medical history elements
- Gap Detection: Identification of missing clinical information
- Flow Analysis: Assessment of presentation structure and organization
- Communication Metrics: Evaluation of communication effectiveness
- Rubric Scoring: Multi-dimensional scoring across clinical domains
- Teaching Points: Educational guidance for improvement
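To make the first of these concrete, here is a deliberately simplified, keyword-based sketch of an OPQRST coverage check. The real `opqrst_analyzer.py` works from LLM-extracted clinical data via Bedrock, so the cue lists and function name below are purely illustrative.

```python
# Illustrative only: a naive keyword scan for OPQRST coverage in a transcript.
OPQRST_CUES = {
    "Onset": ["started", "began", "onset"],
    "Provocation": ["worse", "better", "aggravat", "reliev"],
    "Quality": ["sharp", "dull", "burning", "pressure"],
    "Radiation": ["radiat", "spread"],
    "Severity": ["out of 10", "scale", "severity"],
    "Timing": ["constant", "intermittent", "hours", "days"],
}

def opqrst_coverage(transcript: str) -> dict[str, bool]:
    """Report which OPQRST elements appear to be addressed."""
    text = transcript.lower()
    return {element: any(cue in text for cue in cues)
            for element, cues in OPQRST_CUES.items()}

print(opqrst_coverage("The pain began two hours ago, sharp, rated 8 out of 10."))
```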
```
backend/
├── integrated_main.py         # Main FastAPI server
├── demo_pipeline_any_audio.py # Analysis pipeline
├── logging_backend.py         # Logging utilities
└── temp_uploads/              # Temporary file storage
```
```
rutgers-health-frontend/
├── src/
│   ├── app/                          # Next.js app router
│   ├── components/                   # React components
│   │   ├── UploadTab.tsx             # File upload interface
│   │   ├── FullTranscriptDisplay.tsx # Transcript viewer
│   │   ├── DetailedAnalysisTab.tsx   # Analysis results
│   │   ├── ResultsTab.tsx            # Scoring and metrics
│   │   └── DashboardTab.tsx          # System analytics
│   └── lib/                          # Utilities and stores
│       ├── store.ts                  # Zustand state management
│       ├── api.ts                    # API client
│       └── debugLogger.ts            # Debug logging
```
```
src/
├── opqrst_analyzer.py        # OPQRST analysis
├── gap_detector.py           # Gap detection
├── flow_analyzer.py          # Flow analysis
├── communication_metrics.py  # Communication evaluation
├── feedback_generator.py     # Feedback generation
└── bedrock_client.py         # AWS Bedrock integration
```
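To show how these modules might fit together, here is an illustrative sketch of the pipeline's orchestration; the stage functions are stubs standing in for the real implementations in `src/`, whose exact signatures are not documented here.

```python
# Hypothetical composition of the analysis stages; each stub stands in for a
# module under src/ and returns a placeholder result.
from typing import Callable

def opqrst_stage(transcript: str) -> dict:
    return {"onset": "covered"}                  # stand-in for opqrst_analyzer.py

def gap_stage(transcript: str) -> dict:
    return {"missing": ["medication history"]}   # stand-in for gap_detector.py

STAGES: dict[str, Callable[[str], dict]] = {
    "opqrst": opqrst_stage,
    "gaps": gap_stage,
}

def run_pipeline(transcript: str) -> dict:
    """Run every stage on the transcript and collect results by stage name."""
    return {name: stage(transcript) for name, stage in STAGES.items()}

print(run_pipeline("Patient reports chest pain since this morning."))
```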
- Python 3.9+
- Node.js 18+
- AWS Account with Bedrock access
- Audio files in WAV format
- Clone the repository:
  ```bash
  git clone https://github.com/patelshrey40/rutgers-health-ai-feedback.git
  cd rutgers-health-ai-feedback
  ```
- Install Python dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Configure AWS credentials by creating `aws-config.json` (a boto3 sketch follows these steps):
  ```json
  {
    "region": "us-east-1",
    "access_key_id": "YOUR_AWS_ACCESS_KEY_ID",
    "secret_access_key": "YOUR_AWS_SECRET_ACCESS_KEY",
    "session_token": "YOUR_AWS_SESSION_TOKEN"
  }
  ```
- Navigate to the frontend directory and install dependencies:
  ```bash
  cd rutgers-health-frontend
  npm install
  ```
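For reference, here is a minimal sketch of how the fields in `aws-config.json` might be used to create a Bedrock runtime client with boto3; the project's `bedrock_client.py` may wire this up differently.

```python
# Initialize a Bedrock runtime client from aws-config.json (illustrative).
import json
import boto3

with open("aws-config.json") as f:
    cfg = json.load(f)

bedrock = boto3.client(
    "bedrock-runtime",
    region_name=cfg["region"],
    aws_access_key_id=cfg["access_key_id"],
    aws_secret_access_key=cfg["secret_access_key"],
    aws_session_token=cfg.get("session_token"),
)
```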
```bash
# Navigate to the project root
cd /path/to/rutgers-health-ai-feedback

# Start the backend server
cd backend
python integrated_main.py
```
Expected output:
```
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
```
```bash
# Open a new terminal window
cd /path/to/rutgers-health-ai-feedback/rutgers-health-frontend

# Start the frontend development server
npm run dev
```
Expected output:
```
▲ Next.js 15.5.4
- Local:   http://localhost:3000
- Network: http://192.168.1.100:3000
✓ Ready in 2.3s
```
- Backend: visit http://localhost:8000; it should show the API welcome message
- Frontend: visit http://localhost:3000; it should show the application interface
```bash
# From the project root
docker-compose up -d
docker-compose ps
```
Create `start_backend.sh`:
```bash
#!/bin/bash
cd backend
python integrated_main.py
```
Create `start_frontend.sh`:
```bash
#!/bin/bash
cd rutgers-health-frontend
npm run dev
```
Make them executable:
```bash
chmod +x start_backend.sh start_frontend.sh
```
Press Ctrl+C in the terminal running the backend to stop the server, and Ctrl+C in the terminal running the frontend to stop the development server.
```bash
# Stop all services
docker-compose down

# Stop and remove volumes
docker-compose down -v
```
```bash
# Find the process using port 8000
lsof -ti:8000
# Kill it
kill -9 $(lsof -ti:8000)

# Find the process using port 3000
lsof -ti:3000
# Kill it
kill -9 $(lsof -ti:3000)

# Kill all matching Python processes (be careful!)
pkill -f "python.*integrated_main"
# Kill all matching Node processes (be careful!)
pkill -f "node.*next"
```
```bash
# Terminal 1: start the backend
cd /path/to/rutgers-health-ai-feedback/backend
python integrated_main.py

# Terminal 2: start the frontend (after the backend is running)
cd /path/to/rutgers-health-ai-feedback/rutgers-health-frontend
npm run dev
```
- Open a browser to http://localhost:3000
- Go to the Upload tab
- Upload a WAV file
- Wait for processing to complete
- View results in the Transcript, Detailed Analysis, Results, and Dashboard tabs
Press Ctrl+C in each terminal to stop the backend and the frontend.
```bash
# Morning: start the project
cd /path/to/project
cd backend && python integrated_main.py &
cd ../rutgers-health-frontend && npm run dev &

# Evening: stop the project
pkill -f "python.*integrated_main"
pkill -f "node.*next"
```
After making code changes, restart the backend with Ctrl+C and run it again; the frontend usually auto-reloads, or can be restarted with Ctrl+C followed by `npm run dev`.
```bash
# Check whether the ports are in use
netstat -an | grep :8000
netstat -an | grep :3000

# Check running processes
ps aux | grep python
ps aux | grep node
```
- Upload an audio file:
  - Navigate to the Upload tab
  - Select a WAV audio file
  - Click "Upload and Analyze"
  - Wait for processing to complete
- View the results:
  - Transcript tab: the full transcription
  - Detailed Analysis tab: comprehensive analysis results
  - Results tab: scoring and metrics
  - Dashboard tab: system analytics
Run the analysis pipeline directly:
```bash
python demo_pipeline_any_audio.py <case_id> [whisper_model]
```
Examples:
```bash
# Use the default case with the base model
python demo_pipeline_any_audio.py

# Use a specific case with the medium model
python demo_pipeline_any_audio.py 0042 medium

# Available Whisper models: tiny, base, small, medium, large
```
Place your audio files in:
```
data/shared-dataset/Oral_presentations_audio_out_anon/RUHH_Oral_{case_id}_bleeped_anon.wav
```
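Given that naming scheme, a small helper like the following (illustrative, not part of the repo) can build the expected path for a case:

```python
# Build the expected audio path for a case ID, per the naming scheme above.
from pathlib import Path

AUDIO_DIR = Path("data/shared-dataset/Oral_presentations_audio_out_anon")

def audio_path(case_id: str) -> Path:
    return AUDIO_DIR / f"RUHH_Oral_{case_id}_bleeped_anon.wav"

print(audio_path("0042"))
```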
The system writes comprehensive output files to `demo_output_whisper/`:
- `{case_id}_feedback.json`: complete analysis results
- `{case_id}_metadata.json`: processing metadata
- `{case_id}_structured.json`: structured clinical data
- `{case_id}_whisper_transcript.txt`: full transcription
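Since every artifact is plain JSON or text, results are easy to post-process. A small sketch of loading a finished run (the keys inside the feedback document are not specified here, so this just lists them):

```python
# Load one case's results back from demo_output_whisper/ for post-processing.
import json
from pathlib import Path

case_id = "0042"
out_dir = Path("demo_output_whisper")

feedback = json.loads((out_dir / f"{case_id}_feedback.json").read_text())
transcript = (out_dir / f"{case_id}_whisper_transcript.txt").read_text()

print(sorted(feedback))   # top-level sections of the analysis
print(transcript[:200])   # first 200 characters of the transcription
```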
- OPQRST Analysis: Coverage of medical history elements
- Gap Detection: Missing clinical information identification
- Flow Analysis: Presentation structure evaluation
- Communication Metrics: Communication effectiveness scores
- Rubric Scores: Multi-dimensional clinical scoring
- Detailed Feedback: Comprehensive improvement recommendations
```bash
# AWS configuration
export AWS_DEFAULT_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="your_access_key"
export AWS_SECRET_ACCESS_KEY="your_secret_key"
export AWS_SESSION_TOKEN="your_session_token"

# Optional: custom model settings
export WHISPER_MODEL="base"  # tiny, base, small, medium, large
export BEDROCK_MODEL_ID="anthropic.claude-3-opus-20240229-v1:0"
```
- tiny: Fastest, least accurate
- base: Good balance (recommended)
- small: Better accuracy
- medium: High accuracy
- large: Best accuracy, slowest
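The transcription step itself uses the standard openai-whisper API, roughly like the sketch below; `fp16=False` forces FP32, which is what Whisper requires on CPU (see the Whisper FP16 note in the troubleshooting items further down).

```python
# Transcribe one file with a chosen model size using the openai-whisper API.
import whisper

model = whisper.load_model("base")  # or "tiny", "small", "medium", "large"
result = model.transcribe("presentation.wav", fp16=False)  # FP32 on CPU
print(result["text"])
```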
```bash
# Test the complete workflow
python test_complete_workflow.py

# Test frontend integration
python test_frontend_integration.py

# Test the analysis modules
python test_analysis_modules.py
```
Enable debug logging:
```bash
# Backend debug
python logging_backend.py
```
For frontend debugging, open the browser console to view the debug logs.
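If you need more verbosity than the defaults, the standard library's `logging` module is the usual lever; a minimal sketch, assuming `logging_backend.py` builds on it (its actual configuration may differ):

```python
# Turn on verbose, timestamped logging for local debugging.
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)
logging.getLogger("pipeline").debug("debug logging enabled")
```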
- Whisper FP16 error
  - The system automatically falls back to FP32 on CPU
  - Ensure sufficient RAM for model loading
- AWS Bedrock access
  - Verify the AWS credentials in `aws-config.json`
  - Ensure Bedrock access is enabled in your AWS account
  - Check region compatibility
- Memory issues
  - Use smaller Whisper models (tiny, base)
  - Ensure sufficient RAM (8GB+ recommended)
  - Close other applications during processing
- Audio format issues
  - Only WAV files are supported
  - Ensure the audio quality is good
  - Check the file size (large files may time out); a quick pre-flight check is sketched below
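  A pre-flight check with the standard library's `wave` module can catch bad files before upload. This sketch is illustrative and independent of whatever validation the backend performs:

  ```python
  # Verify a file is a readable WAV and report its duration (stdlib only).
  import wave

  def check_wav(path: str) -> float:
      """Return duration in seconds; raises wave.Error for non-WAV input."""
      with wave.open(path, "rb") as w:
          return w.getnframes() / w.getframerate()

  print(f"{check_wav('presentation.wav'):.1f}s of audio")
  ```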
- Port already in use:
  ```bash
  # Check what's using port 8000
  lsof -ti:8000
  # Check what's using port 3000
  lsof -ti:3000
  # Kill the processes if needed
  kill -9 $(lsof -ti:8000)
  kill -9 $(lsof -ti:3000)
  ```
- Backend won't start:
  ```bash
  # Check the Python version
  python --version
  # Check if dependencies are installed
  pip list | grep fastapi
  # Reinstall dependencies
  pip install -r requirements.txt
  ```
- Frontend won't start:
  ```bash
  # Check the Node version
  node --version
  # Clear the npm cache
  npm cache clean --force
  # Reinstall dependencies
  rm -rf node_modules package-lock.json
  npm install
  ```
- Analysis not working:
  - Check that the AWS credentials are valid
  - Verify the audio file is in WAV format
  - Check the backend logs for error messages
  - Ensure sufficient disk space for output files
- Debug Tab: Use the Debug tab in the frontend to test API connections
- Console Logs: Check browser console for detailed error messages
- Backend Logs: Monitor terminal output for processing status
- Network Tab: Use browser dev tools to inspect API calls
```bash
# Check if the backend is running
curl http://localhost:8000/

# Check if the frontend is running
curl http://localhost:3000/

# Check system resources
top
htop

# Check disk space
df -h

# Check memory usage
free -h
```
```bash
# Terminal 1: backend
cd backend && python integrated_main.py

# Terminal 2: frontend
cd rutgers-health-frontend && npm run dev
```
```bash
# Method 1: press Ctrl+C in each terminal
# Method 2: kill by port
kill -9 $(lsof -ti:8000) $(lsof -ti:3000)
```
```bash
# Stop all processes
pkill -f "python.*integrated_main"
pkill -f "node.*next"

# Clear temporary files
rm -rf backend/temp_uploads/*
rm -rf demo_output_whisper/*

# Restart
cd backend && python integrated_main.py &
cd ../rutgers-health-frontend && npm run dev &
```
Project structure:
```
├── backend/
│   ├── integrated_main.py       # Main server
│   ├── temp_uploads/            # Uploaded files
│   └── demo_output_whisper/     # Analysis results
├── rutgers-health-frontend/
│   ├── src/components/          # React components
│   └── src/lib/                 # Utilities
├── src/                         # Analysis modules
├── data/                        # Sample audio files
├── aws-config.json              # AWS credentials
└── requirements.txt             # Python dependencies
```
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/
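A quick scripted smoke test of those endpoints (assuming the `requests` package is installed):

```python
# Ping the backend and frontend to confirm both are up.
import requests

for url in ("http://localhost:8000/", "http://localhost:3000/"):
    response = requests.get(url, timeout=5)
    print(url, response.status_code)
```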
```bash
# Optional: set these for custom configuration
export AWS_DEFAULT_REGION="us-east-1"
export WHISPER_MODEL="base"
export BEDROCK_MODEL_ID="anthropic.claude-3-opus-20240229-v1:0"
```
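On the Python side, these overrides can be read with plain `os.environ` lookups that fall back to the documented defaults; a sketch (the backend's actual config loading may differ):

```python
# Read optional overrides, falling back to the documented defaults.
import os

AWS_REGION = os.environ.get("AWS_DEFAULT_REGION", "us-east-1")
WHISPER_MODEL = os.environ.get("WHISPER_MODEL", "base")
BEDROCK_MODEL_ID = os.environ.get(
    "BEDROCK_MODEL_ID", "anthropic.claude-3-opus-20240229-v1:0"
)
```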
- Backend logs: terminal output where you run `python integrated_main.py`
- Frontend logs: browser console (F12 → Console)
- Debug logs: browser console, including detailed API calls
- Error logs: backend terminal and browser console
- Tiny Model: ~30 seconds
- Base Model: ~60 seconds (recommended)
- Small Model: ~90 seconds
- Medium Model: ~2-3 minutes
- Large Model: ~5-10 minutes
- RAM: 8GB+ recommended
- CPU: Multi-core processor
- Storage: 2GB+ for models and dependencies
- Network: Stable internet for AWS Bedrock access
- Audio files are processed locally
- Transcripts are stored temporarily in memory
- No persistent storage of sensitive medical data
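One pattern that satisfies these guarantees is to write each upload to `temp_uploads/`, process it, and delete it unconditionally; a minimal sketch (illustrative, `integrated_main.py` may handle this differently):

```python
# Ensure an uploaded recording never outlives its analysis run.
import os
import tempfile

def process_upload(data: bytes) -> None:
    # Assumes backend/temp_uploads/ already exists.
    fd, path = tempfile.mkstemp(suffix=".wav", dir="backend/temp_uploads")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # ... run transcription and analysis on `path` here ...
    finally:
        os.remove(path)  # no persistent copy of the audio is kept
```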
- AWS credentials should be kept secure
- Use environment variables for credentials
- Don't commit `aws-config.json` to version control
- Regularly rotate AWS access keys
- Use IAM roles with minimal required permissions
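A credential loader that follows these rules would prefer environment variables and touch `aws-config.json` only as a fallback; a hedged sketch, not the project's actual loader:

```python
# Prefer environment variables for AWS credentials; fall back to the config
# file only when they are unset.
import json
import os

def load_credentials() -> dict:
    if "AWS_ACCESS_KEY_ID" in os.environ:
        return {
            "region": os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
            "access_key_id": os.environ["AWS_ACCESS_KEY_ID"],
            "secret_access_key": os.environ["AWS_SECRET_ACCESS_KEY"],
            "session_token": os.environ.get("AWS_SESSION_TOKEN"),
        }
    with open("aws-config.json") as f:
        return json.load(f)
```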
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Python: Follow PEP 8 guidelines
- TypeScript: Use ESLint configuration
- React: Follow React best practices
- Documentation: Update README for new features
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for speech-to-text capabilities
- AWS Bedrock for LLM analysis
- Rutgers Health for medical education context
- Open source community for various dependencies
For issues and questions:
- Check the troubleshooting section
- Review existing GitHub issues
- Create a new issue with detailed information
- Include system logs and error messages
- v1.0.0: Initial release with basic functionality
- v1.1.0: Added web interface and real-time processing
- v1.2.0: Enhanced analysis pipeline and improved accuracy
- v1.3.0: Added comprehensive debugging and error handling
- v1.4.0: Full frontend-backend integration with production-ready features
```bash
# Nuclear option: reset everything
pkill -f python
pkill -f node
rm -rf backend/temp_uploads/*
rm -rf demo_output_whisper/*
rm -rf rutgers-health-frontend/node_modules
rm -rf rutgers-health-frontend/.next

# Reinstall everything
pip install -r requirements.txt
cd rutgers-health-frontend && npm install
```
```bash
# Check memory usage
free -h
top

# Kill heavy processes
pkill -f whisper
pkill -f python

# Restart with a smaller model
export WHISPER_MODEL="tiny"
```
```bash
# Test the AWS connection
aws sts get-caller-identity

# Check Bedrock access
aws bedrock list-foundation-models --region us-east-1
```
```bash
# Check disk space
df -h

# Clean up old files
find demo_output_whisper/ -name "*.json" -mtime +7 -delete
find backend/temp_uploads/ -name "*.wav" -mtime +1 -delete
```
- Check system resources with `htop`
- Monitor logs for errors
- Clean temporary files if needed
- Update dependencies: `pip install -r requirements.txt --upgrade`
- Clean old analysis files
- Check AWS quota usage
- Review and rotate AWS credentials
- Update documentation
- Performance optimization review
- Check this README first
- Review troubleshooting section
- Check GitHub issues: https://github.com/patelshrey40/rutgers-health-ai-feedback/issues
- Create new issue with:
- System information (OS, Python version, Node version)
- Error messages (full stack trace)
- Steps to reproduce
- Log files
```bash
# When reporting issues, include the output of:
python --version
node --version
npm --version
pip list | grep -E "(fastapi|whisper|boto3)"
npm list --depth=0
```
- Backend terminal output
- Browser console logs
- System logs if available
- Error screenshots
Built with ❤️ for medical education and healthcare training
Last Updated: October 2024
Version: 1.4.0
Maintainer: Rutgers Health AI Team