ASR-Translator-as-Microservice

A microservice-based system that performs Automatic Speech Recognition (ASR) on English audio files and translates the text to Persian. The system is built with Django and uses an Event-Driven Architecture (EDA) with RabbitMQ for communication between services.

System Architecture

The system consists of three main components:

  1. API Gateway (Django): Handles file uploads and translation status requests
  2. ASR Service: Performs speech-to-text conversion using VOSK
  3. Translation Service: Translates English text to Persian using Argostranslate

All components communicate asynchronously through RabbitMQ events.
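Since every hop between services is a RabbitMQ message, it helps to see what one of those messages might look like on the wire. The sketch below builds and parses a zlib-compressed JSON event (the README's "Message Compression" feature); the event name, field names, and helper functions are illustrative assumptions, not the project's actual schema.

```python
import json
import zlib

def encode_event(event_type: str, file_id: str, payload: dict) -> bytes:
    """Serialize an event to JSON and zlib-compress it for the broker.

    Event names and fields here ("audio_uploaded", "path") are
    illustrative; check the service code for the real routing keys.
    """
    body = json.dumps({"event": event_type, "file_id": file_id, **payload})
    return zlib.compress(body.encode("utf-8"))

def decode_event(raw: bytes) -> dict:
    """Reverse of encode_event: decompress and parse a message body."""
    return json.loads(zlib.decompress(raw).decode("utf-8"))

msg = encode_event("audio_uploaded", "abc-123", {"path": "/tmp/audio.wav"})
print(decode_event(msg)["event"])  # audio_uploaded
```

A consumer on the other side of the queue would call decode_event on the delivered body before dispatching on the "event" field.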

Prerequisites

  • Python 3.11+
  • RabbitMQ Server
  • VOSK English model (vosk-model-small-en-us-0.15)
  • Docker
  • Prometheus & Grafana (for monitoring)
  • PostgreSQL (recommended) or SQLite
  • Redis (for caching)

Installation

Traditional Setup

  1. Clone the repository:
git clone https://github.com/arfa79/ASR-Translator-as-Microservice.git
cd ASR-Translator-as-Microservice
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up environment variables:
# Generate a .env file with secure settings
python generate_env.py

# Or create .env manually with the necessary settings:
# SECRET_KEY, DB_* settings, etc.
  5. Set up PostgreSQL:
# Install PostgreSQL if not already installed
# On Ubuntu/Debian:
sudo apt install postgresql postgresql-contrib

# Create the database
sudo -u postgres createdb asr_translator

# Or use the database settings you specified in the .env file
  6. Download the VOSK model (vosk-model-small-en-us-0.15):

  7. Set up RabbitMQ:

  8. Set up Redis (optional, but recommended for caching):

# Install Redis if not already installed
# On Ubuntu/Debian:
sudo apt install redis-server

# Start Redis
sudo service redis-server start
  9. Initialize Django:
python manage.py migrate
python manage.py createsuperuser  # Optional, for admin access
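Step 4 relies on generate_env.py to produce a secure SECRET_KEY. Its exact implementation may differ from what is shown here; the following is a minimal stand-alone sketch of generating such a key with the stdlib secrets module, in case you need to create the .env file by hand.

```python
import secrets
import string

# Character set commonly used for Django SECRET_KEY values; the real
# generate_env.py may use a different alphabet or key length.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*(-_=+)"

def make_secret_key(length: int = 50) -> str:
    """Generate a cryptographically random Django-style SECRET_KEY."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(f"SECRET_KEY={make_secret_key()}")
```

Paste the printed line into your .env file alongside the DB_* settings.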

Docker Setup

  1. Clone the repository:
git clone https://github.com/arfa79/ASR-Translator-as-Microservice.git
cd ASR-Translator-as-Microservice
  2. Create a .env file from the example template:
cp env.example .env
  3. Edit the .env file to configure your environment:
    • Update SECRET_KEY with a secure key
    • Set database credentials
    • Configure other settings as needed

Running the System

Without Docker

You need to run three components in separate terminals:

  1. Django Server:
python manage.py runserver
  2. ASR Service:
python asr_system.py
  3. Translation Service:
python translator_agent.py
  4. (Optional) Run with metrics collection and autoscaling:
# Verify dependencies and configure autoscaling
./setup_autoscaling.sh

# Run the integrated system
python -m asr_translator.main

With Docker

  1. Build and start all services:
docker-compose up -d
  2. Check service status:
docker-compose ps
  3. Access the application:

Testing

The system includes a comprehensive testing suite in the tests/ directory:

To run all tests using pytest:

# Run all tests
cd tests
pytest

Test Structure

  • test_vosk_model.py: Tests for the VOSK speech recognition functionality

    • test_model_loading: Verifies VOSK model loading
    • test_recognizer_creation: Tests KaldiRecognizer creation
    • test_audio_processing: Tests standard audio file processing
    • test_8k_audio_processing: Tests 8kHz audio file processing
  • Integration Tests: Tests for API endpoints and service communication

  • Performance Tests: Tests for system performance under load

Test Configuration

The conftest.py file contains fixtures used across tests:

  • vosk_model: Loads the VOSK model for testing
  • data_dir: Creates and manages test data directory
  • sample_audio_file: Provides sample audio for testing
  • service_url: Configures the service URL for testing
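The sample_audio_file fixture above needs WAV data in the format VOSK expects. The real fixture in conftest.py may ship a recorded clip instead; the sketch below generates a synthetic mono 16 kHz clip with the stdlib wave module, so tests need no binary assets. The function name and parameters are illustrative.

```python
import math
import os
import struct
import tempfile
import wave

def write_sample_wav(path: str, seconds: float = 1.0, rate: int = 16000) -> str:
    """Write a mono 16-bit sine-wave WAV file, usable as a test fixture.

    Produces silence-free audio at the given sample rate; VOSK models
    typically expect mono 16-bit PCM input.
    """
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        n = int(seconds * rate)
        frames = b"".join(
            struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * 440 * i / rate)))
            for i in range(n)
        )
        wf.writeframes(frames)
    return path

path = write_sample_wav(os.path.join(tempfile.gettempdir(), "sample.wav"))
```

In a pytest fixture you would wrap this in a function decorated with @pytest.fixture and yield the path, cleaning the file up afterwards.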

Usage

API Endpoints

  1. Upload Audio File:
POST http://localhost:8000/upload/
Content-Type: multipart/form-data
Body: audio=@your-file.wav

Response:

{
    "status": "accepted",
    "file_id": "unique-identifier",
    "message": "File uploaded successfully and processing has begun"
}
  2. Check Translation Status:
GET http://localhost:8000/translation/

Response:

{
    "file_id": "unique-identifier",
    "translation": "Persian translation"  # If completed
}

or

{
    "file_id": "unique-identifier",
    "status": "transcribing|translating"  # If in progress
}
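A client polling the translation endpoint has to distinguish the two response shapes above. The helper below classifies a decoded JSON payload; the function name and return format are illustrative, not part of the API.

```python
def parse_translation_response(payload: dict) -> tuple:
    """Classify a /translation/ response into (state, detail).

    Mirrors the two documented response shapes: a finished payload
    carries "translation", an in-progress one carries "status".
    """
    if "translation" in payload:
        return ("completed", payload["translation"])
    return ("in_progress", payload.get("status", "unknown"))

print(parse_translation_response({"file_id": "x", "translation": "سلام"}))
print(parse_translation_response({"file_id": "x", "status": "transcribing"}))
```

A polling loop would call the endpoint (e.g. with requests), feed the JSON body to this helper, and retry while the state is "in_progress".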

Features

Core Features

  • Asynchronous processing using event-driven architecture
  • Automatic file cleanup after processing
  • Health monitoring for both services
  • Rate limiting for API endpoints
  • Comprehensive error handling and logging
  • Support for WAV audio files
  • Automatic retry logic for service connections

Performance Optimizations

  • Streaming Processing: ASR processing in chunks for immediate feedback
  • Parallel Processing: Large audio files split into segments and processed concurrently
  • Model Caching: VOSK models loaded once and kept in memory
  • Translation Caching: Redis-based caching for translations to avoid redundant work
  • Message Priorities: RabbitMQ message priorities based on file size
  • CPU Affinity Settings: Services assigned to specific CPU cores
  • Message Compression: zlib compression for RabbitMQ messages
  • HTTP Streaming Responses: Real-time updates to clients
  • PostgreSQL Database: High-performance database for production use
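The translation-caching optimization follows a simple pattern: hash the English source text into a key, check the cache before translating, and store the result afterwards. The production system uses Redis; in this sketch a plain dict stands in for the Redis client so the pattern is runnable anywhere, and the key scheme is an assumption.

```python
import hashlib

class TranslationCache:
    """Cache translations keyed by a hash of the source text.

    A dict stands in for Redis here; with django-redis the get/set
    calls would go through the configured cache backend instead.
    """
    def __init__(self):
        self._store = {}

    @staticmethod
    def key(text: str) -> str:
        # Hashing keeps keys short and avoids encoding issues in Redis.
        return "trans:" + hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str):
        return self._store.get(self.key(text))

    def set(self, text: str, translation: str) -> None:
        self._store[self.key(text)] = translation

cache = TranslationCache()
if cache.get("hello world") is None:   # cache miss: translate, then store
    cache.set("hello world", "سلام دنیا")
print(cache.get("hello world"))
```

With Redis, the set call would also take a TTL (e.g. setex) so stale translations expire rather than growing the cache without bound.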

Performance Monitoring

The system includes a built-in metrics collection system using Prometheus:

  1. Setup Monitoring Stack:
./monitoring/setup_monitoring.sh
cd monitoring
docker-compose up -d
  2. Available Metrics:

    • Request Rates: Audio uploads, ASR requests, and translations
    • Processing Times: Duration measurements for each step
    • Resource Usage: Memory and CPU monitoring
    • Queue Sizes: RabbitMQ queue monitoring
    • Cache Hit Ratio: Translation cache performance
  3. Access Dashboards:

Autoscaling

The system can dynamically scale based on workload metrics:

  1. Setup Autoscaling:
# Verify dependencies and configure autoscaling
./setup_autoscaling.sh

# Enable autoscaling
export ENABLE_AUTOSCALING=True
  2. Scaling Logic:

    • Scales up when queue sizes exceed thresholds
    • Scales up when CPU usage is too high
    • Scales up when processing times are too long
    • Scales down during low load periods
  3. Configuration: Customize thresholds via environment variables

export QUEUE_HIGH_THRESHOLD=10
export CPU_HIGH_THRESHOLD=70.0
export PROCESSING_TIME_THRESHOLD=30.0
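The scale-up rules above can be sketched as a single decision function that reads the threshold environment variables. The function name and exact comparison logic are illustrative; the real autoscaler lives inside the asr_translator package.

```python
import os

def should_scale_up(queue_size: int, cpu_percent: float,
                    processing_time: float) -> bool:
    """Apply the documented scale-up rules against the threshold env vars.

    Scales up when any one signal (queue depth, CPU usage, or
    processing time) exceeds its configured threshold.
    """
    queue_high = int(os.getenv("QUEUE_HIGH_THRESHOLD", "10"))
    cpu_high = float(os.getenv("CPU_HIGH_THRESHOLD", "70.0"))
    time_high = float(os.getenv("PROCESSING_TIME_THRESHOLD", "30.0"))
    return (
        queue_size > queue_high
        or cpu_percent > cpu_high
        or processing_time > time_high
    )

print(should_scale_up(queue_size=15, cpu_percent=40.0, processing_time=5.0))
print(should_scale_up(queue_size=2, cpu_percent=40.0, processing_time=5.0))
```

Scale-down during low load would use the mirrored condition (all signals below their thresholds for some cool-down period) to avoid flapping.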

Docker Deployment

Service Architecture

The Docker setup includes the following services:

  1. web: Django API server for handling HTTP requests
  2. asr_worker: Speech recognition worker service using VOSK
  3. translator_worker: Text translation worker service using Argostranslate
  4. db: PostgreSQL database for persistent data storage
  5. redis: Redis cache for improved performance
  6. rabbitmq: Message broker for communication between services
  7. prometheus: Metrics collection for monitoring
  8. grafana: Visualization dashboard for metrics

Resource Management and CPU Affinity

Container Resource Settings

The Docker Compose configuration is designed to work with the application's internal CPU affinity and resource management:

  • CPU Limits: Set to zero (cpus: '0') to allow the application to manage its own CPU allocation through CPU affinity settings.
  • CPU Reservations: Set minimum CPU resources that containers should have access to.
  • Memory Limits: Set higher than required to accommodate peak usage and prevent OOM kills.

This approach allows the ASR and Translator services to:

  1. Run their internal CPU affinity optimizations without container interference
  2. Dynamically scale CPU usage based on workload
  3. Properly handle parallel processing of audio files
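At the process level, the kind of CPU affinity control described above is available on Linux through the stdlib. The sketch below pins the current process to a set of cores, with a fallback for platforms that lack affinity support; the services' real affinity logic may choose cores differently.

```python
import os

def pin_to_cores(cores):
    """Pin the current process to the given CPU cores, if supported.

    Uses os.sched_setaffinity (Linux-only). Returns the resulting
    affinity set, or None on platforms or environments where affinity
    cannot be changed (e.g. macOS, or a restricted container).
    """
    try:
        os.sched_setaffinity(0, set(cores))   # 0 = current process
        return os.sched_getaffinity(0)
    except (AttributeError, OSError):
        return None

affinity = pin_to_cores([0])
```

Inside a container, this only works as intended if the core IDs requested are within the container's cpuset, which is why the compose file leaves CPU limits open.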

Adjusting Resource Settings

If you observe resource-related issues:

  1. Check application logs for affinity or resource errors
  2. Adjust the container settings in docker-compose.yml as follows:
    • Increase memory limits if you see OOM errors
    • Adjust CPU reservations based on host capacity
    • Consider setting a cpus limit if the application consumes too many resources

CPU Pinning for Production

For production deployments on multi-CPU systems, you may want to pin specific containers to specific CPUs to match the application's internal CPU affinity settings:

# Example: Run containers with specific CPU pinning (Docker run example)
docker run --cpuset-cpus="0,1" --name asr_worker_1 your-asr-image

This ensures the application's internal CPU affinity matches the container's CPU allocation.

Scaling Docker Services

To scale the worker services:

# Scale ASR workers to 3 instances
docker-compose up -d --scale asr_worker=3

# Scale translator workers to 2 instances
docker-compose up -d --scale translator_worker=2

Accessing Docker Logs

# View logs from all services
docker-compose logs

# View logs from a specific service
docker-compose logs web

# Follow logs in real-time
docker-compose logs -f asr_worker

Common Docker Tasks

Database Migrations

docker-compose exec web python manage.py makemigrations
docker-compose exec web python manage.py migrate

Creating a Superuser

docker-compose exec web python manage.py createsuperuser

Backing Up the Database

docker-compose exec db pg_dump -U postgres asr_translator > backup.sql

Restoring the Database

cat backup.sql | docker-compose exec -T db psql -U postgres asr_translator

Stopping Docker Services

# Stop services but keep volumes and networks
docker-compose down

# Stop services and remove volumes (WARNING: This will delete all data)
docker-compose down -v

Docker Troubleshooting

Container Won't Start

Check logs for the failing container:

docker-compose logs [service-name]

Database Connection Issues

Ensure PostgreSQL is running and the connection details in .env are correct:

docker-compose exec db psql -U postgres -c "SELECT 1"

Models Not Loading

Check if models are correctly mounted in the volumes:

docker-compose exec asr_worker ls -la /app/models/vosk

Resource-Related Issues

If you're experiencing issues related to CPU or memory:

# Check container resource usage
docker stats

# View container details including resource limits
docker inspect asr_worker_1 | grep -A 20 "HostConfig"

Docker Production Deployment Notes

For production deployments:

  1. Use proper SSL/TLS termination with a reverse proxy like Nginx
  2. Set DEBUG=False in the .env file
  3. Use strong, unique passwords for all services
  4. Consider using Docker Swarm or Kubernetes for advanced orchestration
  5. Set up regular backups of the database and media files
  6. Use proper monitoring and alerting

Docker Security Considerations

  • The Docker Compose setup exposes several ports to the host. In production, consider restricting access using a proper network configuration.
  • Default credentials are included in the env.example file. Always change these for production deployments.
  • Secret management: Consider using Docker secrets or a dedicated solution like HashiCorp Vault for managing sensitive information.

Performance Tuning

Database Optimization

  • Proper indexes have been added to commonly queried fields
  • Custom QuerySets and Managers optimize database access patterns
  • Bulk operations are used for efficiency with large datasets

Container Performance

The deploy section in the compose file includes resource reservations and limits for the worker containers. The default configuration is designed to work with the application's internal CPU affinity and resource management features, but you may need to adjust based on your server capacity and workload requirements.

Dependencies

The project uses dependencies with specific versions as defined in requirements.txt:

  • Web Framework: Django==5.2, djangorestframework==3.14.0
  • Speech Recognition: vosk==0.3.45, SoundFile==0.10.3.post1
  • Translation: argostranslate==1.8.0
  • Messaging: pika==1.3.2
  • Caching: redis==4.5.5, django-redis==5.2.0
  • Database: psycopg2-binary==2.9.6
  • Monitoring: prometheus-client==0.16.0
  • HTTP: requests==2.28.2
  • Utils: python-dotenv==1.0.0, numpy==1.24.3
