Microsoft Azure Translator integration for the MT Providers framework.
This provider connects the MT Providers framework to the Microsoft Azure Translator service. It supports synchronous and asynchronous translation, automatic retries, rate limiting, and typed exceptions for error handling.
- Installation
- Features
- Quick Start
- Configuration
- Usage Examples
- API Reference
- Error Handling
- Limits and Quotas
- Contributing
- License
- Python 3.8 or higher
- Azure Translator subscription key
- Azure region identifier
pip install mt_provider_microsoft
git clone https://github.com/assystant/mt-provider-microsoft.git
cd mt-provider-microsoft
pip install -e ".[test,docs]"
- ✅ Single and Batch Translations: Translate individual texts or process multiple texts efficiently
- ✅ Async Support: Full async/await support with aiohttp for non-blocking operations
- ✅ Rate Limiting: Built-in rate limiting to respect API quotas
- ✅ Automatic Retries: Configurable retry logic with exponential backoff
- ✅ Error Handling: Comprehensive error handling with detailed error messages
- ✅ Region Support: Multi-region deployment support
- ✅ Response Metadata: Includes detected language and confidence scores
- ✅ Type Safety: Full type annotations with mypy support
- ✅ Framework Integration: Seamless integration with MT Providers ecosystem
from mt_providers.types import TranslationConfig
config = TranslationConfig(
    api_key="your-azure-translator-key",
    region="westus2",  # Your Azure region
    timeout=30,        # Optional: request timeout in seconds
    rate_limit=10,     # Optional: requests per second
)
import os
from mt_providers.types import TranslationConfig
config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION", "westus2"),
    timeout=int(os.getenv("AZURE_TRANSLATOR_TIMEOUT", "30")),
)
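Because `os.getenv` returns `None` for an unset variable, a missing key in the configuration above may only surface on the first request. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper introduced here for illustration and is not part of mt_providers.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical usage: pass the result to TranslationConfig(api_key=...)
# api_key = require_env("AZURE_TRANSLATOR_KEY")
```

Calling the helper at startup turns a silent `None` into an immediate, readable error.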
Option | Type | Required | Default | Description |
---|---|---|---|---|
api_key | str | Yes | - | Azure Translator subscription key |
region | str | Yes | - | Azure region (e.g., "westus2", "eastus") |
endpoint | str | No | Microsoft default | Custom API endpoint URL |
timeout | int | No | 30 | Request timeout in seconds |
rate_limit | int | No | None | Maximum requests per second |
retry_attempts | int | No | 3 | Number of retry attempts |
retry_backoff | float | No | 1.0 | Retry backoff multiplier |
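The table does not spell out how `retry_backoff` combines with `retry_attempts`. As a rough illustration only, the sketch below assumes the wait before retry n is `retry_backoff * 2**n`; that formula and the `backoff_delays` helper are assumptions made for this example, not the provider's confirmed behavior.

```python
from typing import List


def backoff_delays(retry_attempts: int = 3, retry_backoff: float = 1.0) -> List[float]:
    """Delays in seconds before each retry, assuming delay = retry_backoff * 2**n."""
    return [retry_backoff * (2 ** n) for n in range(retry_attempts)]


print(backoff_delays())        # [1.0, 2.0, 4.0]
print(backoff_delays(5, 2.0))  # [2.0, 4.0, 8.0, 16.0, 32.0]
```

Under this assumption, raising `retry_attempts` from 3 to 5 with a multiplier of 2.0 increases the worst-case total wait from 7 to 62 seconds, which is worth weighing against `timeout`.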
from mt_providers import get_provider
from mt_providers.types import TranslationConfig
# Configure the provider
config = TranslationConfig(
    api_key="your-azure-translator-key",
    region="westus2"
)
# Get the Microsoft provider
translator = get_provider("microsoft")(config)
# Translate a single text
result = translator.translate("Hello world", "en", "es")
print(f"Translation: {result['translated_text']}") # "¡Hola mundo!"
print(f"Detected language: {result['metadata']['detected_language']}")
# Translate multiple texts efficiently
texts = [
    "Hello world",
    "How are you?",
    "Good morning",
    "Thank you very much"
]
results = translator.bulk_translate(texts, "en", "es")
for text, result in zip(texts, results):
    print(f"{text} → {result['translated_text']}")
# Hello world → ¡Hola mundo!
# How are you? → ¿Cómo estás?
# Good morning → Buenos días
# Thank you very much → Muchas gracias
import asyncio
async def async_translate_example():
    # Single async translation
    result = await translator.translate_async("Hello world", "en", "fr")
    print(f"Async result: {result['translated_text']}")  # "Bonjour le monde"
    # Batch async translation
    texts = ["Hello", "World", "Python"]
    results = await translator.bulk_translate_async(texts, "en", "de")
    for text, result in zip(texts, results):
        print(f"{text} → {result['translated_text']}")
# Run async function
asyncio.run(async_translate_example())
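The async methods can also be run concurrently, for example to translate one text into several target languages at once. The sketch below shows the pattern with `asyncio.gather`; `FakeTranslator` and `translate_to_many` are hypothetical names used so the example runs without credentials, and the fake's output format is invented for illustration.

```python
import asyncio
from typing import Dict, List


class FakeTranslator:
    """Stand-in for the provider so the sketch runs without credentials."""

    async def translate_async(self, text: str, source: str, target: str) -> dict:
        await asyncio.sleep(0)  # Yield control, as a real network call would
        return {"translated_text": f"[{target}] {text}"}


async def translate_to_many(translator, text: str, source: str,
                            targets: List[str]) -> Dict[str, str]:
    """Run one translate_async call per target language concurrently."""
    results = await asyncio.gather(
        *(translator.translate_async(text, source, t) for t in targets)
    )
    # gather() returns results in the same order as the awaitables passed in
    return {t: r["translated_text"] for t, r in zip(targets, results)}


out = asyncio.run(translate_to_many(FakeTranslator(), "Hello", "en", ["fr", "de"]))
print(out)  # {'fr': '[fr] Hello', 'de': '[de] Hello'}
```

With the real provider, the configured `rate_limit` would presumably still apply to the concurrent calls, so the speed-up depends on the quota.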
from mt_providers.exceptions import (
    ConfigurationError,
    TranslationError,
    ProviderError
)
from mt_providers.types import TranslationStatus
try:
    result = translator.translate("Hello", "en", "es")
    if result['status'] == TranslationStatus.SUCCESS:
        print(f"Success: {result['translated_text']}")
    else:
        print(f"Translation failed: {result['error']}")
except ConfigurationError as e:
    print(f"Configuration error: {e}")
except TranslationError as e:
    print(f"Translation error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
# Custom endpoint and advanced settings
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    endpoint="https://custom-endpoint.cognitiveservices.azure.com/translator/text/v3.0/translate",
    timeout=60,
    rate_limit=50,  # 50 requests per second
    retry_attempts=5,
    retry_backoff=2.0
)
translator = get_provider("microsoft")(config)
The main translator class that implements the MT Providers interface.
translate(text: str, source_lang: str, target_lang: str) -> TranslationResult
Translates a single text synchronously.
Parameters:
- text (str): Text to translate (max 5000 characters)
- source_lang (str): Source language code (ISO 639-1, e.g., "en", "es")
- target_lang (str): Target language code (ISO 639-1, e.g., "fr", "de")
Returns:
- TranslationResult: Dictionary with translation results and metadata
Example:
result = translator.translate("Hello", "en", "es")
# Returns: {
# 'translated_text': 'Hola',
# 'status': TranslationStatus.SUCCESS,
# 'metadata': {
# 'detected_language': 'en',
# 'confidence': 0.95,
# 'provider': 'microsoft',
# 'model': 'azure-translator-3.0'
# }
# }
bulk_translate(texts: List[str], source_lang: str, target_lang: str) -> List[TranslationResult]
Translates multiple texts in a single batch request.
Parameters:
- texts (List[str]): List of texts to translate (max 100 texts)
- source_lang (str): Source language code
- target_lang (str): Target language code
Returns:
- List[TranslationResult]: List of translation results
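Since a batch request accepts at most 100 texts, longer lists have to be split before calling bulk_translate(). A minimal sketch of such a splitter follows; `chunked` is a hypothetical helper name, not part of the provider.

```python
from typing import Iterator, List


def chunked(texts: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


batches = list(chunked([f"text {i}" for i in range(250)]))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each yielded batch can then be passed to bulk_translate() and the results concatenated in order.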
translate_async(text: str, source_lang: str, target_lang: str) -> TranslationResult
Asynchronous version of translate().
bulk_translate_async(texts: List[str], source_lang: str, target_lang: str) -> List[TranslationResult]
Asynchronous version of bulk_translate().
Microsoft Translator supports 100+ languages. Common language codes include:
Language | Code | Language | Code |
---|---|---|---|
English | en | Spanish | es |
French | fr | German | de |
Italian | it | Portuguese | pt |
Russian | ru | Chinese (Simplified) | zh |
Japanese | ja | Korean | ko |
Arabic | ar | Hindi | hi |
For the complete list, see Microsoft's language support documentation.
The provider raises specific exceptions for different error scenarios:
from mt_providers.exceptions import (
    ConfigurationError,  # Invalid configuration
    TranslationError,    # Translation-specific errors
    ProviderError,       # Provider-specific errors
    RateLimitError,      # Rate limit exceeded
    TimeoutError         # Request timeout
)
try:
    result = translator.translate("Hello", "en", "invalid-lang")
except TranslationError as e:
    print(f"Translation failed: {e}")
    print(f"Error code: {e.error_code}")
    print(f"Provider: {e.provider}")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.retry_after} seconds")
except TimeoutError as e:
    print(f"Request timed out after {e.timeout} seconds")
Translation results include status information:
from mt_providers.types import TranslationStatus
result = translator.translate("Hello", "en", "es")
if result['status'] == TranslationStatus.SUCCESS:
    print("Translation successful")
elif result['status'] == TranslationStatus.ERROR:
    print(f"Translation failed: {result['error']}")
elif result['status'] == TranslationStatus.PARTIAL:
    print("Partial success (some texts failed in batch)")
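For batch calls, it can help to separate the texts that failed so they can be retried or logged. The sketch below assumes each item returned by bulk_translate() carries its own 'status' key, which the text above suggests but does not confirm. The `TranslationStatus` enum here is a local stand-in with assumed values, and `split_by_status` is a hypothetical helper.

```python
from enum import Enum
from typing import List, Tuple


class TranslationStatus(Enum):
    """Stand-in for mt_providers.types.TranslationStatus (values are assumptions)."""
    SUCCESS = "success"
    ERROR = "error"


def split_by_status(texts: List[str], results: List[dict]) -> Tuple[List[dict], List[str]]:
    """Return the successful results and the source texts whose translation failed."""
    ok, failed = [], []
    for text, result in zip(texts, results):
        if result["status"] == TranslationStatus.SUCCESS:
            ok.append(result)
        else:
            failed.append(text)
    return ok, failed


results = [
    {"status": TranslationStatus.SUCCESS, "translated_text": "Hola"},
    {"status": TranslationStatus.ERROR, "error": "unsupported language"},
]
ok, failed = split_by_status(["Hello", "???"], results)
print(len(ok), failed)  # 1 ['???']
```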
- Character limit: 5,000 characters per request
- Batch size: Maximum 100 texts per batch request
- Rate limits: Varies by subscription tier
  - Free tier: 2M characters/month
  - Standard tier: Configurable quotas
- Text length: No hard limit, but optimal performance under 1000 characters
- Timeout: Default 30 seconds (configurable)
- Retries: Default 3 attempts with exponential backoff
- Rate limiting: Configurable requests per second
# Example: High-throughput configuration
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    timeout=60,
    rate_limit=100,  # 100 requests/second
    retry_attempts=5,
    retry_backoff=1.5
)
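To make the `rate_limit` setting concrete, the sketch below shows one simple way a client-side throttle can space requests at least 1/rate seconds apart. It is an illustration of the idea only: `IntervalLimiter` is a hypothetical class, and the provider's built-in limiter may use a different algorithm.

```python
import time


class IntervalLimiter:
    """Space calls at least 1/rate seconds apart (a simple client-side throttle)."""

    def __init__(self, rate_per_second: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the number of seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot relative to whichever is later: now or the reserved slot
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


limiter = IntervalLimiter(rate_per_second=100)
for _ in range(3):
    limiter.wait()  # A request would be issued after this returns
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real waiting.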
# Error: Invalid subscription key
ConfigurationError: Invalid subscription key. Check your API key and region.
# Solution: Verify your credentials
config = TranslationConfig(
    api_key="your-valid-key",  # Check Azure portal
    region="westus2"           # Match your resource region
)
# Error: Too Many Requests
RateLimitError: Rate limit exceeded. Retry after 60 seconds.
# Solution: Implement backoff or reduce rate
config = TranslationConfig(
    api_key="your-key",
    region="westus2",
    rate_limit=10  # Reduce requests per second
)
# Error: Unsupported language
TranslationError: Language 'xyz' is not supported.
# Solution: Use valid ISO 639-1 codes
result = translator.translate("Hello", "en", "es") # ✓ Valid
result = translator.translate("Hello", "english", "spanish") # ✗ Invalid
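A cheap local check can catch obviously malformed codes such as "english" before a request is sent. The sketch below only tests the shape of the code (a short lowercase tag with optional subtags); it does not know which languages the service supports, so a well-formed but unsupported code like 'xyz' still passes. `looks_like_language_code` is a hypothetical helper, not part of the provider.

```python
import re

# Shape check only: 2-3 lowercase letters plus optional subtags such as "zh-Hans" or "pt-BR".
_LANG_CODE = re.compile(r"^[a-z]{2,3}(-[A-Za-z0-9]{2,8})*$")


def looks_like_language_code(code: str) -> bool:
    """Return True if the string has the general shape of a language tag."""
    return bool(_LANG_CODE.match(code))


print(looks_like_language_code("es"))       # True
print(looks_like_language_code("zh-Hans"))  # True
print(looks_like_language_code("spanish"))  # False
```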
# Error: Text too long
TranslationError: Text exceeds maximum length of 5000 characters.
# Solution: Split long texts
def translate_long_text(text, source, target):
    max_length = 4500  # Leave buffer for safety
    if len(text) <= max_length:
        return translator.translate(text, source, target)
    # Split into fixed-size chunks and translate them as one batch.
    # Note: fixed offsets can cut words or sentences in half, and
    # texts longer than 100 chunks exceed the batch limit.
    chunks = [text[i:i+max_length] for i in range(0, len(text), max_length)]
    results = translator.bulk_translate(chunks, source, target)
    # Status and metadata are taken from the first chunk only
    return {
        'translated_text': ''.join(r['translated_text'] for r in results),
        'status': results[0]['status'],
        'metadata': results[0]['metadata']
    }
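Splitting at fixed character offsets can cut a word in half, which tends to hurt translation quality. One possible refinement is to break at spaces instead. This is a sketch under simple assumptions (space-separated text, hard split only for a single oversized word); `split_on_whitespace` is a hypothetical helper.

```python
from typing import List


def split_on_whitespace(text: str, max_length: int = 4500) -> List[str]:
    """Split text into chunks of at most max_length characters, breaking at spaces."""
    chunks, current = [], ""
    for word in text.split(" "):
        candidate = word if not current else f"{current} {word}"
        if len(candidate) <= max_length:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than max_length is hard-split as a fallback
            while len(word) > max_length:
                chunks.append(word[:max_length])
                word = word[max_length:]
            current = word
    if current:
        chunks.append(current)
    return chunks


print(split_on_whitespace("one two three four", max_length=9))
# ['one two', 'three', 'four']
```

When chunks are produced this way, the translated pieces should be rejoined with a space rather than an empty string.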
Enable debug logging for troubleshooting:
import logging
from mt_providers import configure_logging
# Enable debug logging
configure_logging(level=logging.DEBUG)
# Now all API calls will be logged
translator = get_provider("microsoft")(config)
result = translator.translate("Hello", "en", "es")
import os
from flask import Flask, request, jsonify
from mt_providers import get_provider
from mt_providers.types import TranslationConfig
app = Flask(__name__)
# Initialize translator
config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION")
)
translator = get_provider("microsoft")(config)
@app.route('/translate', methods=['POST'])
def translate_text():
    data = request.json
    try:
        result = translator.translate(
            data['text'],
            data['source_lang'],
            data['target_lang']
        )
        return jsonify(result)
    except Exception as e:
        return jsonify({'error': str(e)}), 400
if __name__ == '__main__':
    app.run(debug=True)  # Development server only
import os
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from mt_providers import get_provider
from mt_providers.types import TranslationConfig
app = FastAPI()
class TranslationRequest(BaseModel):
    text: str
    source_lang: str
    target_lang: str
# Initialize async translator
config = TranslationConfig(
    api_key=os.getenv("AZURE_TRANSLATOR_KEY"),
    region=os.getenv("AZURE_TRANSLATOR_REGION")
)
translator = get_provider("microsoft")(config)
@app.post("/translate")
async def translate_text(request: TranslationRequest):
    try:
        result = await translator.translate_async(
            request.text,
            request.source_lang,
            request.target_lang
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
import asyncio
from typing import List
import pandas as pd
async def translate_dataframe(df: pd.DataFrame, text_column: str,
                              source_lang: str, target_lang: str) -> pd.DataFrame:
    """Translate a column in a pandas DataFrame."""
    # Batch translate all texts
    texts = df[text_column].tolist()
    batch_size = 100  # Microsoft's batch limit
    all_results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
        results = await translator.bulk_translate_async(batch, source_lang, target_lang)
        all_results.extend(results)
    # Add translated column
    df[f'{text_column}_translated'] = [r['translated_text'] for r in all_results]
    df[f'{text_column}_confidence'] = [r['metadata']['confidence'] for r in all_results]
    return df
# Usage
df = pd.read_csv('multilingual_data.csv')
df_translated = asyncio.run(translate_dataframe(df, 'content', 'auto', 'en'))
df_translated.to_csv('translated_data.csv', index=False)
# Use environment variables for sensitive data
import os
from dataclasses import dataclass
@dataclass
class Config:
    azure_key: str = os.getenv("AZURE_TRANSLATOR_KEY")
    azure_region: str = os.getenv("AZURE_TRANSLATOR_REGION", "westus2")
    timeout: int = int(os.getenv("TRANSLATION_TIMEOUT", "30"))
    rate_limit: int = int(os.getenv("TRANSLATION_RATE_LIMIT", "10"))
config = Config()
translation_config = TranslationConfig(
    api_key=config.azure_key,
    region=config.azure_region,
    timeout=config.timeout,
    rate_limit=config.rate_limit
)
from mt_providers.exceptions import TranslationError, RateLimitError
import time
def robust_translate(text, source, target, max_retries=3):
    """Translate with robust error handling."""
    for attempt in range(max_retries):
        try:
            return translator.translate(text, source, target)
        except RateLimitError as e:
            if attempt < max_retries - 1:
                time.sleep(e.retry_after or 60)
                continue
            raise
        except TranslationError as e:
            if e.error_code == "TEMPORARY_ERROR" and attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
                continue
            raise
# Use batch translation for multiple texts
import asyncio
texts = ["Hello", "World", "Python", "Translation"]
# ✗ Inefficient: Multiple API calls
results = []
for text in texts:
    result = translator.translate(text, "en", "es")
    results.append(result)
# ✓ Efficient: Single batch API call
results = translator.bulk_translate(texts, "en", "es")
# ✓ Even better: Async batch translation (await must run inside a coroutine)
async def translate_batch():
    return await translator.bulk_translate_async(texts, "en", "es")
results = asyncio.run(translate_batch())
import hashlib
class CachedTranslator:
    def __init__(self, translator):
        self.translator = translator
        self._cache = {}  # Unbounded: grows with every distinct request
    def _cache_key(self, text, source, target):
        """Generate cache key for translation."""
        content = f"{text}:{source}:{target}"
        return hashlib.md5(content.encode()).hexdigest()
    def translate(self, text, source, target):
        """Translate with caching."""
        cache_key = self._cache_key(text, source, target)
        if cache_key in self._cache:
            return self._cache[cache_key]
        result = self.translator.translate(text, source, target)
        self._cache[cache_key] = result
        return result
# Usage
cached_translator = CachedTranslator(translator)
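The dictionary cache above never evicts entries, so a long-running service may want a size bound. One possible approach is a small least-recently-used cache built on `collections.OrderedDict`. This is a sketch, not part of the provider; `LRUCache` and its `max_size` parameter are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most max_size entries, evicting the least recently used one."""

    def __init__(self, max_size: int = 1000):
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # Mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # Drop the oldest entry


cache = LRUCache(max_size=2)
cache.put(("Hello", "en", "es"), "Hola")
cache.put(("World", "en", "es"), "Mundo")
cache.get(("Hello", "en", "es"))            # Touch "Hello" so "World" becomes oldest
cache.put(("Python", "en", "es"), "Python")
print(cache.get(("World", "en", "es")))     # None
print(cache.get(("Hello", "en", "es")))     # Hola
```

Using the (text, source, target) tuple directly as the key avoids hashing the text by hand, since tuples of strings are already hashable.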
# Clone the repository
git clone https://github.com/assystant/mt-provider-microsoft.git
cd mt-provider-microsoft
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install development dependencies
pip install -e ".[test,docs,dev]"
# Install pre-commit hooks
pre-commit install
# Run all tests
pytest
# Run with coverage
pytest --cov=mt_provider_microsoft --cov-report=html
# Run only async tests
pytest -k "async"
# Run with verbose output
pytest -v
# Format code
black mt_provider_microsoft/ tests/
# Sort imports
isort mt_provider_microsoft/ tests/
# Lint code
flake8 mt_provider_microsoft/ tests/
# Type checking
mypy mt_provider_microsoft/
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Ensure all tests pass: pytest
- Ensure code quality: black . && isort . && flake8
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: MT Providers Documentation
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Azure Support: Azure Translator Documentation
See CHANGELOG.md for a detailed history of changes.
Made with ❤️ by the MT Providers team