- Overview
- Prerequisites
- Deployment Steps
- Deployment Validation
- Running the Guidance
- Next Steps
- Cleanup
- FAQ, known issues, additional considerations, and limitations
- Notices
- Authors
This Guidance provides a comprehensive machine translation pipeline using Amazon Bedrock and other AWS services. It addresses the challenge of efficiently translating large volumes of text while maintaining high quality through automated quality assessment and estimation.
The solution enables organizations to:
- Process batch translations at scale
- Evaluate translation quality using AI-powered assessment
- Estimate quality metrics using specialized models
- Store translation memory for future reference and improvement
The architecture follows these key steps:
- Upload source sentences to the Amazon Simple Storage Service (Amazon S3) Input bucket to initiate the translation process.
- AWS Step Functions initiates the translation pipeline orchestration workflow.
- An AWS Lambda function begins execution, retrieving configuration parameters from AWS Systems Manager Parameter Store for secure management.
- AWS Lambda generates and stores translation prompts in the Amazon S3 Input bucket.
- AWS Lambda invokes Amazon Bedrock foundation models to perform machine translation.
- The translated outputs from Amazon Bedrock are stored in the Amazon S3 Model Output bucket.
- AWS Lambda retrieves the translated content from the Amazon S3 Model Output bucket.
- AWS Lambda sends the translations to an Amazon SageMaker endpoint for COMET score estimation.
- The Amazon SageMaker evaluation results are consolidated with the output and stored in the Amazon S3 Consolidated Results bucket.
- AWS Lambda invokes Amazon Bedrock for LLM-driven qualitative assessment.
- Amazon Bedrock's quality assessment results are stored in the Amazon S3 Consolidated Results bucket.
- AWS Glue processes the consolidated results from the Amazon S3 Consolidated Results bucket for end-user consumption and analysis.
You are responsible for the cost of the AWS services used while running this Guidance. As of June 2025, the cost of running this Guidance with the default settings in the US East (N. Virginia) Region is approximately X per month for processing Y records.
We recommend creating a Budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.
The following table provides a sample cost breakdown for deploying this Guidance with the default parameters in the US East (N. Virginia) Region for one request of around 200 characters.
| AWS service | Dimensions | Cost [USD] |
|---|---|---|
| Amazon Aurora PostgreSQL (*) | Aurora Serverless v2, 1 ACU per hour, storage amount (100 GB), 1 hour usage | $0.25 |
| AWS Secrets Manager | Number of secrets (2), average duration of each secret (30 days), number of API calls (3 per month) | $0.80 |
| Amazon Elastic Container Registry | Amount of data stored (1 GB per month) | $0.10 |
| Amazon SageMaker (Inference) | Instance name (ml.c4.2xlarge), number of Batch Transform jobs per month (1), number of instances per job (1), hour(s) per instance per job (0.5) | $0.24 |
| Amazon Bedrock (Nova) | 1.00 K input tokens, 1.00 K output tokens | $0.004 |
| AWS Glue | Number of DPUs for Apache Spark job (10), number of DPUs for Python Shell job (0.0625) | $0.75 |
| AWS Lambda | Amount of ephemeral storage allocated (512 MB), number of requests (10) | $0.10 |
| AWS Step Functions | State transitions per workflow (12) | $0.10 |
(*) Please note that the Aurora database is optional and only relevant when translation memory is used. Cost will vary depending on usage and data stored.
- Docker or Podman for building the SageMaker container image
- AWS account with permissions to create and manage the following services:
- Amazon Bedrock
- AWS Lambda
- Amazon S3
- AWS Step Functions
- Amazon Aurora PostgreSQL
- Amazon SageMaker
- AWS Glue
- AWS IAM
- Amazon Bedrock model access must be enabled for the following model ID(s):
- Amazon Nova Pro (us.amazon.nova-pro-v1:0)
This Guidance uses aws-cdk. If you are using aws-cdk for the first time, perform the following bootstrapping:
cdk bootstrap aws://ACCOUNT-NUMBER/REGION
Replace `ACCOUNT-NUMBER` with your AWS account number and `REGION` with your target AWS Region.
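For example, with a hypothetical account ID of 123456789012 deploying to us-east-1:
cdk bootstrap aws://123456789012/us-east-1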
- Amazon Bedrock has default quotas for API requests per second. You may need to request a quota increase for production workloads.
- Amazon SageMaker has instance type limits that may require an increase for larger deployments.
This Guidance is best suited for regions where Amazon Bedrock and all required models are available. As of the latest update, the following regions are recommended:
- US East (N. Virginia)
- US West (Oregon)
- Europe (Frankfurt)
The solution needs access to two S3 buckets, respectively for input and output data. You may use pre-existing buckets or create new ones. Throughout the guide we will refer to these buckets as "your-input-bucket-name" and "your-output-bucket-name".
This Guidance deploys the COMETKiwi quality estimation model as a self-hosted SageMaker endpoint. At first runtime, the endpoint downloads open-source weights from Hugging Face, so you will need to provide a Hugging Face access token with sufficient permissions for the endpoint to work. The screenshot below provides an example of the minimal set of permissions required:
Please note that the open-source version is intended for experimentation and testing only. For commercial use, please refer to the Widn.ai Marketplace listings.
- Clone the repository using the following command:
git clone https://github.com/aws-samples/guidance-for-machine-translation-pipelines-using-amazon-bedrock.git
- Navigate to the repository folder:
cd guidance-for-machine-translation-pipelines-using-amazon-bedrock
- Create and activate a Python virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install the required dependencies:
pip install -r deployment/requirements.txt
- Build the quality estimation image:
cd source/sagemaker/
source ./build_and_push.sh
Feel free to update `AWS_DEFAULT_REGION` beforehand; the initial implementation defaults to us-east-1.
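For example, a minimal way to target a different Region (the Region shown is illustrative):
export AWS_DEFAULT_REGION=us-west-2
source ./build_and_push.sh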
- Update the `cdk.json` context with your configuration:
{
  "context": {
    "input_bucket_name": "your-input-bucket-name",
    "output_bucket_name": "your-output-bucket-name",
    "quality_estimation_sgm_model_name": "your-quality-estimation-model-name",
    "quality_estimation_sgm_endpoint_name": "your-quality-estimation-endpoint-name",
    "quality_estimation_sgm_image_uri": "your-quality-estimation-sgm-endpoint-image-uri",
    "quality_estimation_sgm_topic_name": "sagemaker-quality-estimation-inference-topic",
    "hugging_face_token": "your-hugging-face-token",
    "config_secret_name": "workflow-bedrock-config",
    "marketplace_endpoint_name": "your-marketplace-endpoint-name"
  }
}
Note that `quality_estimation_sgm_topic_name` defaults to `sagemaker-quality-estimation-inference-topic`, `config_secret_name` defaults to `workflow-bedrock-config`, and `marketplace_endpoint_name` is optional.
- Navigate to the deployment directory:
cd ../../deployment
- Bootstrap your AWS environment (if you haven't already):
cdk bootstrap
- Synthesize the CloudFormation template:
cdk synth
- Deploy the stacks:
cdk deploy --all
Alternatively, if you have access to the Amazon Q CLI, you can use the agent to execute the deployment steps on your behalf while being prompted periodically for your approval or any missing information. To do so, run the following command:
q chat
Once the q chat session is initialized, run:
> /context add README.md
> /context add DEPLOYMENT_PROMPT.md
> Deploy the solution in guided mode please!
Make sure your credentials are properly configured prior to initiating the Q chat session.
If the Q CLI hangs for any reason, you can press Ctrl+C to stop the current action and prompt it to resume the deployment, for example by saying "Keep going!".
Once the solution is deployed, feel free to ask Q to run a test for you.
If you encounter any environment-related issue during deployment, the Q CLI agent can also be a great resource to help you troubleshoot and resolve it.
- Open the AWS CloudFormation console and verify that all stacks have been created successfully with a status of "CREATE_COMPLETE":
- DatabaseStack
- SageMakerStack
- WorkflowStack
- Verify the Step Functions state machine has been created:
- Open the AWS Step Functions console
- Look for a state machine named "BatchMachineTranslationStateMachineCDK"
- Verify the state machine definition matches the expected workflow
- Verify the SageMaker endpoint is active:
- Open the Amazon SageMaker console
- Navigate to Endpoints
- Verify the quality estimation endpoint is "InService"
- Run the following CLI command to validate the deployment:
aws stepfunctions describe-state-machine --state-machine-arn arn:aws:states:<region>:<account-id>:stateMachine:BatchMachineTranslationStateMachineCDK
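If the deployment succeeded, the state machine status should be ACTIVE; one quick way to check just that field is to add a query filter:
aws stepfunctions describe-state-machine \
  --state-machine-arn arn:aws:states:<region>:<account-id>:stateMachine:BatchMachineTranslationStateMachineCDK \
  --query 'status' --output text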
- Create a JSON file matching the following structure:
[
  {
    "source_text": "<your_source_text>",
    "source_lang": "<source_language>",
    "target_lang": "<target_language>"
  }
]
You can find a sample file (`sample_test.json`) in the `sample_data` folder.
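For illustration, a minimal single-record input file might look like this (the text and language codes are placeholders):
[
  {
    "source_text": "Hello, how are you today?",
    "source_lang": "en",
    "target_lang": "fr"
  }
]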
- Upload the JSON file to your input S3 bucket:
aws s3 cp your-input-file.json s3://your-input-bucket-name/inputs/your-input-file.json
- Start the Step Functions workflow with the following input:
aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:<region>:<account-id>:stateMachine:BatchMachineTranslationStateMachineCDK \
--input '{"callerId": "user123", "inputFileKey": "inputs/your-input-file.json", "inputBucket": "your-input-bucket-name", "outputBucket": "your-output-bucket-name"}'
You may also start the workflow from the console by opening the state machine and choosing Start execution.
The pipeline accepts the following input parameters:
- `callerId` (required): A unique identifier for tracking the translation job and organizing output files
- `inputFileKey` (required): The S3 object key path to your JSON input file containing the translation requests
- `inputBucket` (required): Bucket containing the input file
- `outputBucket` (required): Bucket that will contain all files generated by the state machine
These parameters enable the workflow to process your data and store results in an organized manner within your output bucket.
Once the state machine has run you should be able to view its entire execution and status from the Step Functions console.
Please note that the state machine invokes Bedrock through the real-time API for any input file that includes 100 records or fewer (right side of the state machine). Beyond this threshold, the state machine switches to Amazon Bedrock batch inference mode (left side of the state machine). Both modes perform the same actions; the main difference is how the state machine steps interact with Amazon Bedrock (real-time inference vs. batch job).
The pipeline uses several Lambda functions with configurable environment variables:
Prompt Generator Function:
- `DATABASE_SECRET_ARN`: ARN of the Aurora credentials secret
- `CLUSTER_ARN`: ARN of the Aurora PostgreSQL cluster
- `DATABASE_NAME`: Name of the translation memory database
- `WORKFLOW_SECRET_ARN`: ARN of the workflow configuration secret
- `DEFAULT_SOURCE_LANG`: Default source language (default: "en")
- `DEFAULT_TARGET_LANG`: Default target language (default: "fr")
- `ENABLE_TRANSLATION_MEMORY`: Enable translation memory lookup (default: "true")
Quality Score Estimation:
- `SAGEMAKER_ENDPOINT_NAME`: Name of the SageMaker quality estimation endpoint
- `QUALITY_ESTIMATION_MODE`: Mode for quality estimation model hosting ("MARKETPLACE_SELF_HOSTED" or "OPEN_SOURCE_SELF_HOSTED"; default: "OPEN_SOURCE_SELF_HOSTED")
Translation and Assessment Functions:
- `WORKFLOW_SECRET_ARN`: ARN of the workflow configuration secret containing Bedrock model IDs and inference profiles
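If you need to change one of these values after deployment, one option is the AWS CLI. Note that this call replaces the function's entire environment, so include every variable you want to keep; the function name below is a placeholder for the actual deployed name:
aws lambda update-function-configuration \
  --function-name <prompt-generator-function-name> \
  --environment "Variables={ENABLE_TRANSLATION_MEMORY=false,DEFAULT_TARGET_LANG=de}"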
- Monitor the execution in the AWS Step Functions console:
- Navigate to the Step Functions console
- Select the "BatchMachineTranslationStateMachineCDK" state machine
- Find your execution and monitor its progress
- Once the workflow completes, the results will be available in your output S3 bucket:
aws s3 ls s3://your-output-bucket-name/user123/<execution-id>/analysis/
- Download the results:
aws s3 cp s3://your-output-bucket-name/user123/<execution-id>/analysis/results.jsonl .
- The results file contains:
- Source text
- Translated text
- Quality assessment scores
- Quality estimation metrics
- Recommendations for improvement
Result files are stored in your output S3 bucket following this path structure:
s3://{output_bucket_name}/{callerId}/{executionId}/analysis/
To find the execution ID:
- Step Functions Console: Navigate to your state machine execution and copy the execution ARN's ID portion
- CLI: Use the execution ARN returned from the `start-execution` command
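For example, since the execution ID is the final colon-separated segment of the execution ARN, a shell one-liner can extract it:
EXECUTION_ARN="arn:aws:states:<region>:<account-id>:execution:BatchMachineTranslationStateMachineCDK:<execution-id>"
echo "${EXECUTION_ARN##*:}"   # prints the execution ID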
For rapid data exploration, use Amazon S3 Select directly from the S3 console:
- Navigate to the results JSON file in your S3 bucket
- Select Actions → Query with S3 Select
- Configure the following settings:
- Format: JSON
- JSON Content type: Lines
- Compression: None
- Output settings: JSON
- Run the default query:
SELECT * FROM s3object s LIMIT 5
This allows you to quickly explore translation results, quality assessments, and identify records that need attention without downloading the entire file.
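Beyond the default query, S3 Select can filter on the result fields described below; for example, a sketch that lists only flagged records:
SELECT s.recordId, s.score FROM s3object s WHERE s.assessment.overall_status = 'NEEDS_ATTENTION' LIMIT 10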
Each line in the results file follows this JSON structure:
{
"recordId": "string",
"assessment": {
"dimensions": {
"accuracy": {
"comment": "string",
"status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
},
"fluency": {
"comment": "string",
"status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
},
"style": {
"comment": "string",
"status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
},
"terminology": {
"comment": "string",
"status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
}
},
"overall_status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|ERROR"
},
"source_language": "string",
"source_text": "string",
"target_language": "string",
"translated_text": "string",
"score": "number"
}
Field Descriptions:
- `recordId`: Unique identifier for each translation record
- `assessment.dimensions`: Quality evaluation across four dimensions (accuracy, fluency, style, terminology)
- `assessment.overall_status`: Overall quality assessment result
- `source_language`: Source language code
- `source_text`: Original text to be translated
- `target_language`: Target language code
- `translated_text`: Generated translation
- `score`: COMET quality estimation score (numerical value)
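Once downloaded, the JSON Lines file can also be filtered locally; for example, assuming jq is installed, this lists the record IDs and scores of entries flagged as needing attention:
jq -r 'select(.assessment.overall_status == "NEEDS_ATTENTION") | [.recordId, .score] | @tsv' results.jsonl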
The translation memory feature enhances translation quality by leveraging previously translated content. To enable this functionality, you need to initialize the translation memory database with sample data.
Follow these steps to set up and populate the translation memory database:
- Navigate to the database initialization notebook:
cd source/database/init_notebook/
- Open the Jupyter notebook using one of these options:
Option A: Local Jupyter
jupyter notebook database_init.ipynb
Option B: Amazon SageMaker Studio
- Open Amazon SageMaker Studio in the AWS Console
- Upload the `database_init.ipynb` file to your SageMaker Studio environment
- Open the notebook and select a Python 3 kernel
- Follow the notebook instructions to:
- Install required Python dependencies
- Load sample WMT19 French-German translation data
- Generate embeddings using Amazon Bedrock's Titan model
- Populate the Aurora PostgreSQL database with translation pairs and embeddings
- Test vector similarity search functionality
- Update database connection parameters: Before running the notebook, replace the placeholder values with your actual CloudFormation output values:
- `DatabaseSecretArn`: ARN of the Aurora credentials secret
- `DatabaseClusterArn`: ARN of the Aurora PostgreSQL cluster
- `DatabaseName`: Name of the translation memory database
Once initialized, the translation memory will automatically provide context to improve translation consistency for similar text segments.
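If the notebook's embedding calls fail, one way to confirm that Bedrock model access is enabled is a direct CLI invocation (the Titan model ID below is an assumption; match it to the one the notebook uses):
aws bedrock-runtime invoke-model \
  --model-id amazon.titan-embed-text-v2:0 \
  --body '{"inputText": "bonjour"}' \
  --cli-binary-format raw-in-base64-out \
  embedding.json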
Important: Ensure that the `ENABLE_TRANSLATION_MEMORY` environment variable is set to "true" in the Prompt Generator Lambda function to activate translation memory functionality.
The solution provides flexible configuration options for model selection and prompt customization, enabling caller-specific settings for cost optimization and quality control.
You can configure different Amazon Bedrock models or inference profiles for specific callers by updating the workflow configuration secret in AWS Secrets Manager. This enables:
- Cost tracking: Different models have varying costs, allowing you to optimize expenses per use case
- Quality optimization: Select models best suited for specific language pairs or content types
- Performance tuning: Choose faster models for real-time scenarios or more accurate models for critical translations
To configure caller-specific models:
- Navigate to AWS Secrets Manager console
- Find the secret named `workflow-bedrock-config` (or your custom secret name)
- Update the JSON configuration with caller-specific model mappings:
{
  "bedrock_model_id": "us.amazon.nova-pro-v1:0",
  "bedrock_model_id.premium_user": "us.amazon.nova-pro-v1:0",
  "assessment_model_id": "us.amazon.nova-pro-v1:0",
  "assessment_model_id.premium_user": "us.amazon.nova-pro-v1:0",
  "assessment_model_id.budget_user": "arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/abc123def456"
}
Keys without a caller suffix set the default model; keys suffixed with a caller ID (for example, `.premium_user`) override the default for that caller. Values may be either a Bedrock model ID or an application inference profile ARN.
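To apply an updated configuration from the CLI instead of the console, you can write the new JSON with put-secret-value (the secret name below assumes the default):
aws secretsmanager put-secret-value \
  --secret-id workflow-bedrock-config \
  --secret-string file://workflow-bedrock-config.json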
You may need to double-check that the roles attached to the Lambda functions in the state machine are granted access to your custom inference profile.
Prompt templates can be customized for different translation scenarios and caller requirements:
Translation Prompts: Modify `source/lambda/prompt_generator/prompt_template.txt` to:
- Add domain-specific instructions
- Include formatting requirements
- Specify tone or style preferences
Quality Assessment Prompts: Update `source/lambda/quality_assessment/prompt_template.txt` to:
- Focus on specific quality criteria
- Add industry-specific evaluation metrics
- Customize scoring rubrics
Prompt templates support dynamic variables including `{source_language}`, `{target_language}`, `{source_text}`, and `{context}` for flexible content generation.
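For illustration, a customized translation template using these variables might look like the following sketch (not the template shipped with the repository):
You are a professional translator. Translate the following text from {source_language} to {target_language}.
Preserve product names, numbers, and placeholders exactly as written.
Relevant translation memory context:
{context}
Text to translate:
{source_text}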
After deploying this Guidance, you can enhance it in the following ways:
- Customize Translation Models: Experiment with different Amazon Bedrock models to find the best fit for your specific language pairs and content types.
- Fine-tune Quality Assessment: Adjust the quality assessment prompts in `source/lambda/quality_assessment/prompt_template.txt` to focus on specific aspects of translation quality relevant to your use case.
- Integrate with Translation Memory: Extend the database functionality to build a more robust translation memory system that can suggest translations based on previous similar content.
- Add Human Review Workflow: Implement a human review step for translations that don't meet quality thresholds, using Amazon Augmented AI (A2I).
- Implement Domain-specific Terminology: Create and maintain terminology databases for specific domains to ensure consistent translation of technical terms.
- Scale for Production: Adjust the infrastructure for production workloads by:
  - Increasing Lambda function timeouts and memory
  - Scaling SageMaker endpoints
  - Implementing auto-scaling for Aurora PostgreSQL
  - Adding monitoring and alerting
To avoid incurring future charges, follow these steps to delete all resources:
- Empty the S3 buckets:
aws s3 rm s3://your-input-bucket-name --recursive
aws s3 rm s3://your-output-bucket-name --recursive
- Delete the CDK stacks:
cd deployment
cdk destroy --all
- Delete the ECR repository containing the quality estimation image:
aws ecr delete-repository --repository-name <repository-name> --force
Q: What languages are supported by this solution?
A: The solution supports all languages available in the Amazon Bedrock models you choose to use. Amazon Nova Pro supports a wide range of languages, but performance may vary across language pairs.

Q: How can I improve translation quality?
A: You can improve translation quality by providing more context in your input data, fine-tuning the prompts, and using the quality assessment feedback to iteratively improve your translations.

Q: Can this solution handle specialized terminology?
A: Yes, you can include specialized terminology in the context field of your input data. For more robust terminology handling, consider extending the solution with a terminology database.
- The quality estimation model may occasionally time out for very large batches. Consider breaking large translation jobs into smaller batches.
- Some language pairs may have lower quality scores due to limitations in the underlying models.
- This Guidance creates an Amazon SageMaker endpoint that is billed per hour regardless of usage.
- The Aurora PostgreSQL database is provisioned as Serverless v2, which has minimum capacity units that will be billed even during idle periods.
- Consider implementing data encryption at rest and in transit for sensitive content.
For any feedback, questions, or suggestions, please use the Issues tab of this repository.
Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided "as is" without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.
- Narcisse Zekpa: nzzekpa@amazon.com
- Daniel Suarez: dssouto@amazon.com
- Deepika Suresh: dpsure@amazon.com