A microservice-based system that performs Automatic Speech Recognition (ASR) on English audio files and translates the text to Persian. The system is built with Django and uses an Event-Driven Architecture (EDA) with RabbitMQ for communication between services.
- System Architecture
- Prerequisites
- Installation
- Running the System
- Testing
- Usage
- Features
- Performance Tuning
- Docker Deployment
- Dependencies
## System Architecture

The system consists of three main components:
- API Gateway (Django): Handles file uploads and translation status requests
- ASR Service: Performs speech-to-text conversion using VOSK
- Translation Service: Translates English text to Persian using Argostranslate
All components communicate asynchronously through RabbitMQ events.
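As a rough illustration of this event flow, here is a minimal pika sketch of one service publishing an event for another to consume; the queue name and event fields are placeholders, not the project's actual message contract:

```python
import json

import pika

# Connect to a local RabbitMQ broker; queue and event names here are
# illustrative placeholders, not the project's actual contract.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="asr_requests", durable=True)

# Publish an "audio uploaded" event for the ASR service to pick up
event = {"event": "audio_uploaded", "file_id": "unique-identifier"}
channel.basic_publish(
    exchange="",
    routing_key="asr_requests",
    body=json.dumps(event),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```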
## Prerequisites

- Python 3.11+
- RabbitMQ Server
- VOSK English model (vosk-model-small-en-us-0.15)
- Docker
- Prometheus & Grafana (for monitoring)
- PostgreSQL (recommended) or SQLite
- Redis (for caching)
## Installation

- Clone the repository:

```bash
git clone https://github.com/arfa79/ASR-Translator-as-Microservice.git
cd ASR-Translator-as-Microservice
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Set up environment variables:

```bash
# Generate a .env file with secure settings
python generate_env.py
# Or create .env manually with necessary settings:
# SECRET_KEY, DB_* settings, etc.
```

- Set up PostgreSQL:

```bash
# Install PostgreSQL if not already installed
# On Ubuntu/Debian:
sudo apt install postgresql postgresql-contrib
# Create database
sudo -u postgres createdb asr_translator
# Or use the database settings you specified in the .env file
```

- Download VOSK model:
  - Download vosk-model-small-en-us-0.15
  - Extract it to the project root directory (a quick load check is sketched below)
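To confirm the model extracted where the services expect it, a quick sanity check with the vosk package (run from the project root):

```python
from vosk import KaldiRecognizer, Model

# Load the model from the directory extracted above and create a
# recognizer for 16 kHz audio; failure here usually means a wrong path.
model = Model("vosk-model-small-en-us-0.15")
recognizer = KaldiRecognizer(model, 16000)
print("VOSK model loaded successfully")
```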
- Set up RabbitMQ:
  - Install Erlang
  - Install RabbitMQ Server
  - Start RabbitMQ service
- Set up Redis (optional, but recommended for caching):

```bash
# Install Redis if not already installed
# On Ubuntu/Debian:
sudo apt install redis-server
# Start Redis
sudo service redis-server start
```

- Initialize Django:

```bash
python manage.py migrate
python manage.py createsuperuser  # Optional, for admin access
```

Alternatively, configure the environment from the example template:

- Clone the repository:

```bash
git clone https://github.com/yourusername/ASR-Translator-as-Microservice.git
cd ASR-Translator-as-Microservice
```

- Create a `.env` file from the example template:

```bash
cp env.example .env
```

- Edit the `.env` file to configure your environment:
  - Update `SECRET_KEY` with a secure key
  - Set database credentials
  - Configure other settings as needed
## Running the System

You need to run three components in separate terminals:

- Django Server:

```bash
python manage.py runserver
```

- ASR Service:

```bash
python asr_system.py
```

- Translation Service:

```bash
python translator_agent.py
```

- (Optional) Run with metrics collection and autoscaling:

```bash
# Verify dependencies and configure autoscaling
./setup_autoscaling.sh
# Run the integrated system
python -m asr_translator.main
```

Alternatively, build and run everything with Docker Compose:

- Build and start all services:

```bash
docker-compose up -d
```

- Check service status:

```bash
docker-compose ps
```

- Access the application:
- Web API: http://localhost:8000/
- RabbitMQ Management: http://localhost:15672/ (username/password from .env)
- Prometheus: http://localhost:9090/
- Grafana: http://localhost:3000/ (default admin/admin)
## Testing

The system includes a comprehensive testing suite in the tests/ directory.

To run all tests using pytest:

```bash
# Run all tests
cd tests
pytest
```

- `test_vosk_model.py`: Tests for the VOSK speech recognition functionality
  - `test_model_loading`: Verifies VOSK model loading
  - `test_recognizer_creation`: Tests KaldiRecognizer creation
  - `test_audio_processing`: Tests standard audio file processing
  - `test_8k_audio_processing`: Tests 8kHz audio file processing
- Integration Tests: Tests for API endpoints and service communication
- Performance Tests: Tests for system performance under load
The conftest.py file contains fixtures used across tests:

- `vosk_model`: Loads the VOSK model for testing
- `data_dir`: Creates and manages the test data directory
- `sample_audio_file`: Provides sample audio for testing
- `service_url`: Configures the service URL for testing
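For orientation, a sketch of what such fixtures could look like in tests/conftest.py; the real implementations in the repository may differ:

```python
import os

import pytest
from vosk import Model


@pytest.fixture(scope="session")
def vosk_model():
    """Load the VOSK model once for the whole test session."""
    return Model("vosk-model-small-en-us-0.15")


@pytest.fixture
def data_dir(tmp_path):
    """Create a per-test data directory that pytest cleans up."""
    d = tmp_path / "data"
    d.mkdir()
    return d


@pytest.fixture
def service_url():
    """Service URL, overridable through an environment variable."""
    return os.environ.get("SERVICE_URL", "http://localhost:8000")
```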
## Usage

- Upload Audio File:

```
POST http://localhost:8000/upload/
Content-Type: multipart/form-data
Body: audio=@your-file.wav
```

Response:

```
{
  "status": "accepted",
  "file_id": "unique-identifier",
  "message": "File uploaded successfully and processing has begun"
}
```

- Check Translation Status:
```
GET http://localhost:8000/translation/
```

Response:

```
{
  "file_id": "unique-identifier",
  "translation": "Persian translation"  # If completed
}
```

or

```
{
  "file_id": "unique-identifier",
  "status": "transcribing|translating"  # If in progress
}
```
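Putting the two endpoints together, a small client sketch using requests; passing file_id as a query parameter to /translation/ is an assumption here, so adjust it to match the actual API:

```python
import time

import requests

BASE = "http://localhost:8000"

# Upload a WAV file for processing
with open("your-file.wav", "rb") as f:
    response = requests.post(f"{BASE}/upload/", files={"audio": f})
file_id = response.json()["file_id"]

# Poll until the Persian translation is ready (file_id as a query
# parameter is an assumption; the interval is arbitrary)
while True:
    status = requests.get(f"{BASE}/translation/", params={"file_id": file_id}).json()
    if "translation" in status:
        print(status["translation"])
        break
    time.sleep(2)
```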
## Features

- Asynchronous processing using event-driven architecture
- Automatic file cleanup after processing
- Health monitoring for both services
- Rate limiting for API endpoints
- Comprehensive error handling and logging
- Support for WAV audio files
- Automatic retry logic for service connections (see the sketch below)
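As a rough sketch of what the connection retry logic can look like (the retry count, delay, and backoff here are illustrative, not the services' actual settings):

```python
import time

import pika
from pika.exceptions import AMQPConnectionError


def connect_with_retry(host="localhost", retries=5, delay=2.0):
    """Try to connect to RabbitMQ, backing off between attempts."""
    for attempt in range(1, retries + 1):
        try:
            return pika.BlockingConnection(pika.ConnectionParameters(host=host))
        except AMQPConnectionError:
            if attempt == retries:
                raise  # give up after the last attempt
            time.sleep(delay * attempt)  # linear backoff
```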
## Performance Tuning

- Streaming Processing: ASR processing in chunks for immediate feedback
- Parallel Processing: Large audio files split into segments and processed concurrently
- Model Caching: VOSK models loaded once and kept in memory
- Translation Caching: Redis-based caching for translations to avoid redundant work
- Message Priorities: RabbitMQ message priorities based on file size
- CPU Affinity Settings: Services assigned to specific CPU cores
- Message Compression: zlib compression for RabbitMQ messages (see the sketch after this list)
- HTTP Streaming Responses: Real-time updates to clients
- PostgreSQL Database: High-performance database for production use
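To make the priority and compression ideas concrete, a hedged pika sketch; the queue name and the size-to-priority mapping are placeholders, not the project's actual values:

```python
import json
import zlib

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
# x-max-priority enables per-message priorities on the queue
channel.queue_declare(queue="asr_requests", durable=True,
                      arguments={"x-max-priority": 10})

payload = json.dumps({"file_id": "unique-identifier"}).encode()
body = zlib.compress(payload)  # compress before publishing

channel.basic_publish(
    exchange="",
    routing_key="asr_requests",
    body=body,
    properties=pika.BasicProperties(
        priority=5,                  # e.g. derived from file size
        content_encoding="deflate",  # consumers must zlib.decompress()
        delivery_mode=2,
    ),
)
connection.close()
```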
### Monitoring

The system includes a built-in metrics collection system using Prometheus:

- Setup Monitoring Stack:

```bash
./monitoring/setup_monitoring.sh
cd monitoring
docker-compose up -d
```

- Available Metrics:
  - Request Rates: Audio uploads, ASR requests, and translations
  - Processing Times: Duration measurements for each step
  - Resource Usage: Memory and CPU monitoring
  - Queue Sizes: RabbitMQ queue monitoring
  - Cache Hit Ratio: Translation cache performance
- Access Dashboards:
  - Prometheus: http://localhost:9090
  - Grafana: http://localhost:3000 (login with admin/admin)
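For a sense of how such metrics are exported with prometheus-client (pinned in requirements.txt), a minimal sketch; the metric names are illustrative, not necessarily the ones the services register:

```python
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative metric names; the services' real names may differ
UPLOADS = Counter("audio_uploads_total", "Audio files uploaded")
ASR_SECONDS = Histogram("asr_processing_seconds", "Time spent on ASR")

start_http_server(8001)  # expose /metrics for Prometheus to scrape


@ASR_SECONDS.time()      # record the duration of each call
def run_asr(path):
    ...                  # speech-to-text work goes here


def handle_upload():
    UPLOADS.inc()        # count each upload
```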
### Autoscaling

The system can dynamically scale based on workload metrics:

- Setup Autoscaling:

```bash
# Verify dependencies and configure autoscaling
./setup_autoscaling.sh
# Enable autoscaling
export ENABLE_AUTOSCALING=True
```

- Scaling Logic:
  - Scales up when queue sizes exceed thresholds
  - Scales up when CPU usage is too high
  - Scales up when processing times are too long
  - Scales down during low load periods
- Configuration: Customize thresholds via environment variables:

```bash
export QUEUE_HIGH_THRESHOLD=10
export CPU_HIGH_THRESHOLD=70.0
export PROCESSING_TIME_THRESHOLD=30.0
```
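A simplified sketch of a scaling decision based on these thresholds; the project's actual controller may weigh the metrics differently:

```python
import os

# Read the same thresholds configured above, with matching defaults
QUEUE_HIGH = int(os.environ.get("QUEUE_HIGH_THRESHOLD", 10))
CPU_HIGH = float(os.environ.get("CPU_HIGH_THRESHOLD", 70.0))
TIME_HIGH = float(os.environ.get("PROCESSING_TIME_THRESHOLD", 30.0))


def scaling_decision(queue_size, cpu_percent, avg_processing_time):
    """Return +1 to scale up, -1 to scale down, 0 to hold steady."""
    if (queue_size > QUEUE_HIGH
            or cpu_percent > CPU_HIGH
            or avg_processing_time > TIME_HIGH):
        return 1
    if queue_size == 0 and cpu_percent < CPU_HIGH / 2:
        return -1  # low load: release a worker
    return 0
```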
## Docker Deployment

The Docker setup includes the following services:

- web: Django API server for handling HTTP requests
- asr_worker: Speech recognition worker service using VOSK
- translator_worker: Text translation worker service using Argostranslate
- db: PostgreSQL database for persistent data storage
- redis: Redis cache for improved performance
- rabbitmq: Message broker for communication between services
- prometheus: Metrics collection for monitoring
- grafana: Visualization dashboard for metrics
The Docker Compose configuration is designed to work with the application's internal CPU affinity and resource management:
- CPU Limits: Set to zero (`cpus: '0'`) to allow the application to manage its own CPU allocation through CPU affinity settings.
- CPU Reservations: Set minimum CPU resources that containers should have access to.
- Memory Limits: Set higher than required to accommodate peak usage and prevent OOM kills.
This approach allows the ASR and Translator services to:
- Run their internal CPU affinity optimizations without container interference
- Dynamically scale CPU usage based on workload
- Properly handle parallel processing of audio files
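For reference, CPU affinity of the kind described above can be set from inside a worker process with psutil; psutil is an assumed extra dependency here (it is not in the requirements.txt list below), and cpu_affinity is available on Linux and Windows but not macOS:

```python
import psutil

# Pin the current worker process to CPUs 0 and 1 (illustrative cores)
proc = psutil.Process()
proc.cpu_affinity([0, 1])
print(proc.cpu_affinity())  # -> [0, 1]
```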
If you observe resource-related issues:
- Check application logs for affinity or resource errors
- Adjust the container settings in `docker-compose.yml` as follows:
  - Increase memory limits if you see OOM errors
  - Adjust CPU reservations based on host capacity
  - Consider setting a `cpus` limit if the application consumes too many resources
For production deployments on multi-CPU systems, you may want to pin specific containers to specific CPUs to match the application's internal CPU affinity settings:
```bash
# Example: Run containers with specific CPU pinning (Docker run example)
docker run --cpuset-cpus="0,1" --name asr_worker_1 your-asr-image
```

This ensures the application's internal CPU affinity matches the container's CPU allocation.
To scale the worker services:

```bash
# Scale ASR workers to 3 instances
docker-compose up -d --scale asr_worker=3
# Scale translator workers to 2 instances
docker-compose up -d --scale translator_worker=2
```

To view logs:

```bash
# View logs from all services
docker-compose logs
# View logs from a specific service
docker-compose logs web
# Follow logs in real-time
docker-compose logs -f asr_worker
```

To run database migrations:

```bash
docker-compose exec web python manage.py makemigrations
docker-compose exec web python manage.py migrate
```

To create an admin user:

```bash
docker-compose exec web python manage.py createsuperuser
```

To back up and restore the database:

```bash
docker-compose exec db pg_dump -U postgres asr_translator > backup.sql
cat backup.sql | docker-compose exec -T db psql -U postgres asr_translator
```

To stop the services:

```bash
# Stop services but keep volumes and networks
docker-compose down
# Stop services and remove volumes (WARNING: This will delete all data)
docker-compose down -v
```

### Troubleshooting

Check logs for the failing container:
```bash
docker-compose logs [service-name]
```

Ensure PostgreSQL is running and the connection details in .env are correct:

```bash
docker-compose exec db psql -U postgres -c "SELECT 1"
```

Check if models are correctly mounted in the volumes:

```bash
docker-compose exec asr_worker ls -la /app/models/vosk
```

If you're experiencing issues related to CPU or memory:

```bash
# Check container resource usage
docker stats
# View container details including resource limits
docker inspect asr_worker_1 | grep -A 20 "HostConfig"
```

For production deployments:
- Use proper SSL/TLS termination with a reverse proxy like Nginx
- Set `DEBUG=False` in the .env file
- Use strong, unique passwords for all services
- Consider using Docker Swarm or Kubernetes for advanced orchestration
- Set up regular backups of the database and media files
- Use proper monitoring and alerting
- The Docker Compose setup exposes several ports to the host. In production, consider restricting access using a proper network configuration.
- Default credentials are included in the env.example file. Always change these for production deployments.
- Secret management: Consider using Docker secrets or a dedicated solution like HashiCorp Vault for managing sensitive information.
Database optimizations:

- Proper indexes have been added to commonly queried fields
- Custom QuerySets and Managers optimize database access patterns (see the sketch after this list)
- Bulk operations are used for efficiency with large datasets
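A sketch of the custom QuerySet/Manager pattern in Django terms; the model and field names are hypothetical, not the project's actual schema:

```python
from django.db import models


class TranslationQuerySet(models.QuerySet):
    def completed(self):
        return self.filter(status="completed")

    def pending(self):
        return self.filter(status__in=["transcribing", "translating"])


class Translation(models.Model):
    # db_index=True adds the indexes on commonly queried fields
    file_id = models.CharField(max_length=64, db_index=True)
    status = models.CharField(max_length=20, db_index=True)
    text = models.TextField(blank=True)

    objects = TranslationQuerySet.as_manager()

# Bulk insert issues one query for many rows instead of one per row:
# Translation.objects.bulk_create(rows, batch_size=500)
```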
The deploy section in the compose file includes resource reservations and limits for the worker containers. The default configuration is designed to work with the application's internal CPU affinity and resource management features, but you may need to adjust based on your server capacity and workload requirements.
## Dependencies

The project uses dependencies with specific versions, as defined in requirements.txt:
- Web Framework: Django==5.2, djangorestframework==3.14.0
- Speech Recognition: vosk==0.3.45, SoundFile==0.10.3.post1
- Translation: argostranslate==1.8.0
- Messaging: pika==1.3.2
- Caching: redis==4.5.5, django-redis==5.2.0
- Database: psycopg2-binary==2.9.6
- Monitoring: prometheus-client==0.16.0
- HTTP: requests==2.28.2
- Utils: python-dotenv==1.0.0, numpy==1.24.3