Microsoft Translator Provider

Microsoft Azure Translator integration for the MT Providers framework.

Overview

This provider connects Microsoft Azure Translator to the MT Providers framework. It supports both synchronous and asynchronous operations, automatic retries, rate limiting, and typed exceptions for error handling.

Installation

Prerequisites

Python 3.8 or higher
Azure Translator subscription key
Azure region identifier

Install from PyPI

pip install mt_provider_microsoft

Install for Development

git clone https://github.com/assystant/mt-provider-microsoft.git
cd mt-provider-microsoft
pip install -e ".[test,docs]"

Features

✅ Single and Batch Translations: Translate individual texts or process multiple texts efficiently
✅ Async Support: Full async/await support with aiohttp for non-blocking operations
✅ Rate Limiting: Built-in rate limiting to respect API quotas
✅ Automatic Retries: Configurable retry logic with exponential backoff
✅ Error Handling: Comprehensive error handling with detailed error messages
✅ Region Support: Multi-region deployment support
✅ Response Metadata: Includes detected language and confidence scores
✅ Type Safety: Full type annotations with mypy support
✅ Framework Integration: Seamless integration with MT Providers ecosystem

Configuration

Basic Configuration

from mt_providers.types import TranslationConfig

config = TranslationConfig(
    api_key="your-azure-translator-key",
    region="westus2",  # Your Azure region
    timeout=30,        # Optional: request timeout in seconds
    rate_limit=10,     # Optional: requests per second
)

Environment Variables

import os
from mt_providers.types import TranslationConfig

config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION", "westus2"),
    timeout=int(os.getenv("AZURE_TRANSLATOR_TIMEOUT", "30")),
)
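
`os.getenv` returns `None` when a variable is unset, so a missing key would only surface later as an authentication failure. A minimal sketch of failing fast instead; `require_env` is a hypothetical helper written for this example, not part of the MT Providers framework:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

It could then be used as `api_key=require_env("AZURE_TRANSLATOR_KEY")` when building the configuration.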

Configuration Options

Option	Type	Required	Default	Description
`api_key`	str	Yes	-	Azure Translator subscription key
`region`	str	Yes	-	Azure region (e.g., "westus2", "eastus")
`endpoint`	str	No	Microsoft default	Custom API endpoint URL
`timeout`	int	No	30	Request timeout in seconds
`rate_limit`	int	No	None	Maximum requests per second
`retry_attempts`	int	No	3	Number of retry attempts
`retry_backoff`	float	No	1.0	Retry backoff multiplier
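
The table can be turned into a quick pre-flight check. The sketch below works on a plain dict of options and only encodes the rules stated in the table (required fields, positive numbers); `check_options` is a hypothetical helper, and the framework may perform its own validation:

```python
from typing import Any, Dict, List


def check_options(options: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a dict of configuration options."""
    problems: List[str] = []
    # api_key and region are the only required options.
    for required in ("api_key", "region"):
        if not options.get(required):
            problems.append(f"{required} is required")
    timeout = options.get("timeout", 30)
    if not isinstance(timeout, int) or timeout <= 0:
        problems.append("timeout must be a positive integer")
    rate_limit = options.get("rate_limit")
    if rate_limit is not None and (not isinstance(rate_limit, int) or rate_limit <= 0):
        problems.append("rate_limit must be a positive integer or None")
    return problems
```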

Usage Examples

Basic Translation

from mt_providers import get_provider
from mt_providers.types import TranslationConfig

# Configure the provider
config = TranslationConfig(
    api_key="your-azure-translator-key",
    region="westus2"
)

# Get the Microsoft provider
translator = get_provider("microsoft")(config)

# Translate a single text
result = translator.translate("Hello world", "en", "es")
print(f"Translation: {result['translated_text']}")  # "¡Hola mundo!"
print(f"Detected language: {result['metadata']['detected_language']}")

Batch Translation

# Translate multiple texts efficiently
texts = [
    "Hello world",
    "How are you?", 
    "Good morning",
    "Thank you very much"
]

results = translator.bulk_translate(texts, "en", "es")

for i, result in enumerate(results):
    print(f"{texts[i]} → {result['translated_text']}")
    # Hello world → ¡Hola mundo!
    # How are you? → ¿Cómo estás?
    # Good morning → Buenos días
    # Thank you very much → Muchas gracias

Async Translation

import asyncio

async def async_translate_example():
    # Single async translation
    result = await translator.translate_async("Hello world", "en", "fr")
    print(f"Async result: {result['translated_text']}")  # "Bonjour le monde"
    
    # Batch async translation
    texts = ["Hello", "World", "Python"]
    results = await translator.bulk_translate_async(texts, "en", "de")
    
    for text, result in zip(texts, results):
        print(f"{text} → {result['translated_text']}")

# Run async function
asyncio.run(async_translate_example())
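
Async calls pay off most when several requests are in flight at once. The sketch below translates one text into several target languages concurrently and caps the number of simultaneous requests with a semaphore. `translate_many` and `fake_translate` are hypothetical names; `fake_translate` stands in for a real call so the example runs without credentials:

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def translate_many(
    translate: Callable[[str, str], Awaitable[str]],
    text: str,
    targets: List[str],
    max_concurrency: int = 5,
) -> Dict[str, str]:
    """Translate one text into several target languages concurrently."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(target: str) -> str:
        async with semaphore:  # cap the number of in-flight requests
            return await translate(text, target)

    results = await asyncio.gather(*(one(t) for t in targets))
    return dict(zip(targets, results))


async def fake_translate(text: str, target: str) -> str:
    """Stand-in for a real async translation call."""
    await asyncio.sleep(0)
    return f"[{target}] {text}"
```

With a real translator, the callable passed in would wrap `translator.translate_async` and return the `translated_text` field of its result.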

Error Handling

from mt_providers.exceptions import (
    ConfigurationError,
    TranslationError, 
    ProviderError
)
from mt_providers.types import TranslationStatus

try:
    result = translator.translate("Hello", "en", "es")
    
    if result['status'] == TranslationStatus.SUCCESS:
        print(f"Success: {result['translated_text']}")
    else:
        print(f"Translation failed: {result['error']}")
        
except ConfigurationError as e:
    print(f"Configuration error: {e}")
except TranslationError as e:
    print(f"Translation error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

Advanced Configuration

# Custom endpoint and advanced settings
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    endpoint="https://custom-endpoint.cognitiveservices.azure.com/translator/text/v3.0/translate",
    timeout=60,
    rate_limit=50,  # 50 requests per second
    retry_attempts=5,
    retry_backoff=2.0
)

translator = get_provider("microsoft")(config)

API Reference

MicrosoftTranslator Class

The main translator class that implements the MT Providers interface.

Methods

`translate(text: str, source_lang: str, target_lang: str) -> TranslationResult`

Translates a single text synchronously.

Parameters:

text (str): Text to translate (max 5000 characters)
source_lang (str): Source language code (ISO 639-1, e.g., "en", "es")
target_lang (str): Target language code (ISO 639-1, e.g., "fr", "de")

Returns:

TranslationResult: Dictionary with translation results and metadata

Example:

result = translator.translate("Hello", "en", "es")
# Returns: {
#     'translated_text': '¡Hola!',
#     'status': TranslationStatus.SUCCESS,
#     'metadata': {
#         'detected_language': 'en',
#         'confidence': 0.95,
#         'provider': 'microsoft',
#         'model': 'azure-translator-3.0'
#     }
# }

`bulk_translate(texts: List[str], source_lang: str, target_lang: str) -> List[TranslationResult]`

Translates multiple texts in a single batch request.

Parameters:

texts (List[str]): List of texts to translate (max 100 texts)
source_lang (str): Source language code
target_lang (str): Target language code

Returns:

List[TranslationResult]: List of translation results

`translate_async(text: str, source_lang: str, target_lang: str) -> TranslationResult`

Asynchronous version of translate().

`bulk_translate_async(texts: List[str], source_lang: str, target_lang: str) -> List[TranslationResult]`

Asynchronous version of bulk_translate().

Supported Languages

Microsoft Translator supports 100+ languages. Common language codes include:

Language	Code	Language	Code
English	en	Spanish	es
French	fr	German	de
Italian	it	Portuguese	pt
Russian	ru	Chinese (Simplified)	zh
Japanese	ja	Korean	ko
Arabic	ar	Hindi	hi

For the complete list, see Microsoft's language support documentation.

Error Handling

Exception Types

The provider raises specific exceptions for different error scenarios:

from mt_providers.exceptions import (
    ConfigurationError,     # Invalid configuration
    TranslationError,       # Translation-specific errors
    ProviderError,          # Provider-specific errors
    RateLimitError,         # Rate limit exceeded
    TimeoutError           # Request timeout
)

Error Response Handling

try:
    result = translator.translate("Hello", "en", "invalid-lang")
except TranslationError as e:
    print(f"Translation failed: {e}")
    print(f"Error code: {e.error_code}")
    print(f"Provider: {e.provider}")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.retry_after} seconds")
except TimeoutError as e:
    print(f"Request timed out after {e.timeout} seconds")

Status Codes

Translation results include status information:

from mt_providers.types import TranslationStatus

result = translator.translate("Hello", "en", "es")

if result['status'] == TranslationStatus.SUCCESS:
    print("Translation successful")
elif result['status'] == TranslationStatus.ERROR:
    print(f"Translation failed: {result['error']}")
elif result['status'] == TranslationStatus.PARTIAL:
    print("Partial success (some texts failed in batch)")
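
For batch calls it can help to know which items failed so that only those are retried. A small sketch, assuming each result is a dict with a `status` key as in the examples above; the `Status` enum here is a local stand-in for `TranslationStatus`, and its member values are invented for the example:

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class Status(Enum):
    """Local stand-in for mt_providers.types.TranslationStatus."""
    SUCCESS = "success"
    ERROR = "error"
    PARTIAL = "partial"


def summarize(results: List[dict]) -> Dict[str, int]:
    """Count batch results by status name."""
    return dict(Counter(r["status"].name for r in results))


def failed_indices(results: List[dict]) -> List[int]:
    """Return the positions of results whose status is not SUCCESS."""
    return [i for i, r in enumerate(results) if r["status"] is not Status.SUCCESS]
```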

Limits and Quotas

Azure Translator Limits

Character limit: 5,000 characters per request
Batch size: Maximum 100 texts per batch request
Rate limits: Varies by subscription tier
- Free tier: 2M characters/month
- Standard tier: Configurable quotas
Text length: No separate limit beyond the per-request character limit, but performance is best for texts under 1,000 characters
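
The two request limits above (text count and total characters) can be respected together by packing texts greedily into batches. This is a sketch under the assumption that the character limit applies to the sum of Python string lengths in a request; how the service itself counts characters may differ. `pack_batches` is a hypothetical helper:

```python
from typing import List


def pack_batches(texts: List[str], max_texts: int = 100, max_chars: int = 5000) -> List[List[str]]:
    """Group texts into batches that respect both a count and a character budget."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_chars = 0
    for text in texts:
        if len(text) > max_chars:
            raise ValueError("a single text exceeds the character budget; split it first")
        # Start a new batch when adding this text would break either limit.
        if current and (len(current) >= max_texts or current_chars + len(text) > max_chars):
            batches.append(current)
            current, current_chars = [], 0
        current.append(text)
        current_chars += len(text)
    if current:
        batches.append(current)
    return batches
```

Each returned batch could then be passed to `bulk_translate` in turn.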

Provider Limits

Timeout: Default 30 seconds (configurable)
Retries: Default 3 attempts with exponential backoff
Rate limiting: Configurable requests per second

# Example: High-throughput configuration
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    timeout=60,
    rate_limit=100,     # 100 requests/second
    retry_attempts=5,
    retry_backoff=1.5
)
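
To get a feel for how `retry_attempts` and `retry_backoff` interact, the sketch below computes a retry schedule. The exact formula the provider uses is not documented here; this example assumes the delay doubles on every attempt and starts at `retry_backoff` seconds, with a cap added for illustration:

```python
from typing import List


def backoff_delays(retry_attempts: int = 3, retry_backoff: float = 1.0,
                   max_delay: float = 60.0) -> List[float]:
    """Delays in seconds before each retry, doubling each time and capped at max_delay."""
    return [min(retry_backoff * (2 ** attempt), max_delay) for attempt in range(retry_attempts)]
```

Under that assumption, the defaults wait 1, 2, and 4 seconds, while `retry_attempts=5, retry_backoff=1.5` waits 1.5, 3, 6, 12, and 24 seconds.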

Troubleshooting

Common Issues

1. Authentication Errors

# Error: Invalid subscription key
ConfigurationError: Invalid subscription key. Check your API key and region.

# Solution: Verify your credentials
config = TranslationConfig(
    api_key="your-valid-key",  # Check Azure portal
    region="westus2"           # Match your resource region
)

2. Rate Limiting

# Error: Too Many Requests
RateLimitError: Rate limit exceeded. Retry after 60 seconds.

# Solution: Implement backoff or reduce rate
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    rate_limit=10  # Reduce requests per second
)
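
If requests are issued from your own loop rather than through the provider, a client-side limiter can keep them under the quota. The sketch below spaces calls at least `1 / rate` seconds apart; `MinIntervalLimiter` is a hypothetical class and does not reflect how the provider's built-in `rate_limit` is implemented. The clock and sleep functions are injectable so the behavior can be tested without waiting:

```python
import time
from typing import Callable


class MinIntervalLimiter:
    """Client-side limiter that spaces calls at least 1/rate seconds apart."""

    def __init__(self, rate_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if rate_per_second <= 0:
            raise ValueError("rate_per_second must be positive")
        self._interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the next slot relative to when this call actually proceeds.
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return delay
```

Calling `limiter.wait()` immediately before each `translator.translate(...)` call would enforce the spacing.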

3. Language Code Issues

# Error: Unsupported language
TranslationError: Language 'xyz' is not supported.

# Solution: Use valid ISO 639-1 codes
result = translator.translate("Hello", "en", "es")  # ✓ Valid
result = translator.translate("Hello", "english", "spanish")  # ✗ Invalid
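
A cheap syntactic check can catch inputs such as "english" before a request is sent. The pattern below is an assumption about what a plausible code looks like (two or three lowercase letters, optionally followed by subtags such as "-BR"); it does not confirm that the service supports the code, and `looks_like_language_code` is a hypothetical helper:

```python
import re

# Two or three lowercase letters, optionally followed by subtags like "-BR" or "-Hans".
_LANG_CODE = re.compile(r"[a-z]{2,3}(-[A-Za-z0-9]{2,8})*")


def looks_like_language_code(code: str) -> bool:
    """Return True if the string has the shape of a language code."""
    return _LANG_CODE.fullmatch(code) is not None
```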

4. Text Length Issues

# Error: Text too long
TranslationError: Text exceeds maximum length of 5000 characters.

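# Solution: Split long texts
from mt_providers.types import TranslationStatus

def translate_long_text(text, source, target):
    max_length = 4500  # Leave buffer for safety
    if len(text) <= max_length:
        return translator.translate(text, source, target)

    # Split into fixed-size chunks (note: this may cut a word or sentence in two)
    chunks = [text[i:i+max_length] for i in range(0, len(text), max_length)]
    # Translate one chunk per request: sending several 4500-character chunks
    # in a single batch would exceed the 5000-character request limit
    results = [translator.translate(chunk, source, target) for chunk in chunks]

    # Report success only if every chunk succeeded
    failed = [r for r in results if r['status'] != TranslationStatus.SUCCESS]
    return {
        'translated_text': ''.join(r.get('translated_text', '') for r in results),
        'status': failed[0]['status'] if failed else TranslationStatus.SUCCESS,
        'metadata': results[0]['metadata']
    }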
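
Cutting at fixed offsets can split a word in half, which tends to hurt translation quality. A sketch of a splitter that prefers to break at spaces and only falls back to a hard cut when a chunk contains none; `split_on_whitespace` is a hypothetical helper:

```python
from typing import List


def split_on_whitespace(text: str, max_length: int = 4500) -> List[str]:
    """Split text into chunks of at most max_length characters, breaking at spaces when possible."""
    chunks: List[str] = []
    remaining = text
    while len(remaining) > max_length:
        # Look for the last space inside the window; fall back to a hard cut.
        cut = remaining.rfind(" ", 0, max_length + 1)
        if cut <= 0:
            cut = max_length
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip(" ")
    if remaining:
        chunks.append(remaining)
    return chunks
```

Because the separating space is dropped at each break, translated chunks produced this way would be rejoined with `' '.join(...)` rather than `''.join(...)`.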

Debug Mode

Enable debug logging for troubleshooting:

import logging
from mt_providers import configure_logging

# Enable debug logging
configure_logging(level=logging.DEBUG)

# Now all API calls will be logged
translator = get_provider("microsoft")(config)
result = translator.translate("Hello", "en", "es")
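
The same effect can usually be achieved with the standard `logging` module alone. The sketch below assumes the framework logs under a logger named "mt_providers"; that name is a guess based on the package name, so adjust it if the library uses a different one. `enable_debug_logging` is a hypothetical helper:

```python
import logging


def enable_debug_logging(logger_name: str = "mt_providers") -> logging.Logger:
    """Attach a stream handler and set DEBUG level on one named logger."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Scoping the level to one logger keeps debug output from unrelated libraries out of the console.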

Integration Examples

Web Application Integration

import os

from flask import Flask, request, jsonify
from mt_providers import get_provider
from mt_providers.types import TranslationConfig

app = Flask(__name__)

# Initialize translator
config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION")
)
translator = get_provider("microsoft")(config)

@app.route('/translate', methods=['POST'])
def translate_text():
    data = request.json
    
    try:
        result = translator.translate(
            data['text'],
            data['source_lang'],
            data['target_lang']
        )
        return jsonify(result)
    except Exception as e:
        return jsonify({'error': str(e)}), 400

if __name__ == '__main__':
    app.run(debug=True)
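
The handler above indexes `data['text']` directly, so a missing field or a non-JSON body ends up as a generic error message. A sketch of validating the payload first; `payload_errors` is a hypothetical helper that works on the parsed body and has no Flask dependency:

```python
from typing import Any, Dict, List, Optional

REQUIRED_FIELDS = ("text", "source_lang", "target_lang")


def payload_errors(data: Optional[Dict[str, Any]]) -> List[str]:
    """Return a list of problems with a /translate request body (empty if valid)."""
    if not isinstance(data, dict):
        return ["request body must be a JSON object"]
    errors: List[str] = []
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"'{field}' must be a non-empty string")
    return errors
```

In the route, a non-empty result could be returned as `jsonify({'errors': errors}), 400` before the translator is called.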

Async Web Framework (FastAPI)

import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from mt_providers import get_provider
from mt_providers.types import TranslationConfig

app = FastAPI()

class TranslationRequest(BaseModel):
    text: str
    source_lang: str
    target_lang: str

# Initialize async translator
config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION")
)
translator = get_provider("microsoft")(config)

@app.post("/translate")
async def translate_text(request: TranslationRequest):
    try:
        result = await translator.translate_async(
            request.text,
            request.source_lang,
            request.target_lang
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))

Batch Processing Pipeline

import asyncio
from typing import List
import pandas as pd

async def translate_dataframe(df: pd.DataFrame, text_column: str, 
                            source_lang: str, target_lang: str) -> pd.DataFrame:
    """Translate a column in a pandas DataFrame."""
    
    # Batch translate all texts
    texts = df[text_column].tolist()
    batch_size = 100  # Microsoft's batch limit
    
    all_results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
        results = await translator.bulk_translate_async(batch, source_lang, target_lang)
        all_results.extend(results)
    
    # Add translated column
    df[f'{text_column}_translated'] = [r['translated_text'] for r in all_results]
    df[f'{text_column}_confidence'] = [r['metadata']['confidence'] for r in all_results]
    
    return df

# Usage
df = pd.read_csv('multilingual_data.csv')
df_translated = asyncio.run(translate_dataframe(df, 'content', 'auto', 'en'))
df_translated.to_csv('translated_data.csv', index=False)

Best Practices

1. Configuration Management

# Use environment variables for sensitive data
import os
from dataclasses import dataclass

@dataclass
class Config:
    azure_key: str = os.getenv("AZURE_TRANSLATOR_KEY")
    azure_region: str = os.getenv("AZURE_TRANSLATOR_REGION", "westus2")
    timeout: int = int(os.getenv("TRANSLATION_TIMEOUT", "30"))
    rate_limit: int = int(os.getenv("TRANSLATION_RATE_LIMIT", "10"))

config = Config()
translation_config = TranslationConfig(
    api_key=config.azure_key,
    region=config.azure_region,
    timeout=config.timeout,
    rate_limit=config.rate_limit
)

2. Error Handling Strategy

from mt_providers.exceptions import TranslationError, RateLimitError
import time

def robust_translate(text, source, target, max_retries=3):
    """Translate with robust error handling."""
    
    for attempt in range(max_retries):
        try:
            return translator.translate(text, source, target)
            
        except RateLimitError as e:
            if attempt < max_retries - 1:
                time.sleep(e.retry_after or 60)
                continue
            raise
            
        except TranslationError as e:
            if e.error_code == "TEMPORARY_ERROR" and attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
                continue
            raise

3. Performance Optimization

# Use batch translation for multiple texts
texts = ["Hello", "World", "Python", "Translation"]

# ✗ Inefficient: Multiple API calls
results = []
for text in texts:
    result = translator.translate(text, "en", "es")
    results.append(result)

# ✓ Efficient: Single batch API call
results = translator.bulk_translate(texts, "en", "es")

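# ✓ Even better: Async batch translation
# (await is only valid inside an async function, so wrap the call)
import asyncio

async def translate_batch():
    return await translator.bulk_translate_async(texts, "en", "es")

results = asyncio.run(translate_batch())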
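
When inputs contain many repeated strings, translating each distinct text once also reduces character usage. A sketch, assuming a callable that takes a list of texts and returns translations in the same order; `translate_unique` is a hypothetical helper:

```python
from typing import Callable, List


def translate_unique(texts: List[str],
                     bulk_translate: Callable[[List[str]], List[str]]) -> List[str]:
    """Translate each distinct text once and map results back to the original order."""
    unique = list(dict.fromkeys(texts))  # de-duplicate while preserving first-seen order
    translated = dict(zip(unique, bulk_translate(unique)))
    return [translated[t] for t in texts]
```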

4. Caching Strategy

import hashlib

class CachedTranslator:
    def __init__(self, translator):
        self.translator = translator
        self._cache = {}
    
    def _cache_key(self, text, source, target):
        """Generate cache key for translation."""
        content = f"{text}:{source}:{target}"
        return hashlib.md5(content.encode()).hexdigest()
    
    def translate(self, text, source, target):
        """Translate with caching."""
        cache_key = self._cache_key(text, source, target)
        
        if cache_key in self._cache:
            return self._cache[cache_key]
        
        result = self.translator.translate(text, source, target)
        self._cache[cache_key] = result
        return result

# Usage
cached_translator = CachedTranslator(translator)
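
The cache above grows without bound. For long-running processes, a size-limited variant that evicts the least recently used entry may be safer. The sketch below uses `collections.OrderedDict` and takes any callable with the `(text, source, target)` signature; `LRUTranslationCache` is a hypothetical class:

```python
from collections import OrderedDict
from typing import Any, Callable, Tuple


class LRUTranslationCache:
    """Wrap a translate callable with a bounded least-recently-used cache."""

    def __init__(self, translate: Callable[[str, str, str], Any], max_entries: int = 1024) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._translate = translate
        self._max_entries = max_entries
        self._entries: "OrderedDict[Tuple[str, str, str], Any]" = OrderedDict()

    def translate(self, text: str, source: str, target: str) -> Any:
        key = (text, source, target)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        result = self._translate(text, source, target)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result
```

It could be created as `LRUTranslationCache(translator.translate, max_entries=10_000)`. Using the tuple itself as the key avoids hashing and cannot confuse texts that contain the separator character.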

Development

Setting Up Development Environment

# Clone the repository
git clone https://github.com/assystant/mt-provider-microsoft.git
cd mt-provider-microsoft

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[test,docs,dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=mt_provider_microsoft --cov-report=html

# Run only async tests
pytest -k "async"

# Run with verbose output
pytest -v

Code Quality

# Format code
black mt_provider_microsoft/ tests/

# Sort imports
isort mt_provider_microsoft/ tests/

# Lint code
flake8 mt_provider_microsoft/ tests/

# Type checking
mypy mt_provider_microsoft/

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Quick Start for Contributors

1. Fork the repository
2. Create a feature branch: git checkout -b feature/amazing-feature
3. Make your changes and add tests
4. Ensure all tests pass: pytest
5. Ensure code quality: black . && isort . && flake8
6. Commit your changes: git commit -m 'Add amazing feature'
7. Push to the branch: git push origin feature/amazing-feature
8. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Documentation: MT Providers Documentation
Issues: GitHub Issues
Discussions: GitHub Discussions
Azure Support: Azure Translator Documentation

Changelog

See CHANGELOG.md for a detailed history of changes.

Made with ❤️ by the MT Providers team
