Guidance for Machine Translation pipelines using Generative AI on AWS

Table of Contents


  1. Overview
  2. Prerequisites
  3. Deployment Steps
  4. Deployment Validation
  5. Running the Guidance
  6. Next Steps
  7. Cleanup
  8. FAQ, known issues, additional considerations, and limitations
  9. Notices
  10. Authors

Overview

This Guidance provides a comprehensive machine translation pipeline using Amazon Bedrock and other AWS services. It addresses the challenge of efficiently translating large volumes of text while maintaining high quality through automated quality assessment and estimation.

The solution enables organizations to:

  • Process batch translations at scale
  • Evaluate translation quality using AI-powered assessment
  • Estimate quality metrics using specialized models
  • Store translation memory for future reference and improvement

Architecture Diagram

The architecture follows these key steps:

  1. Upload source sentences to the Amazon Simple Storage Service (Amazon S3) Input Bucket to initiate the translation process.
  2. AWS Step Functions initiates the translation pipeline orchestration workflow.
  3. An AWS Lambda function begins execution, retrieving configuration parameters from AWS Systems Manager Parameter Store for secure configuration management.
  4. AWS Lambda generates and stores translation prompts in the Amazon S3 Input Bucket.
  5. AWS Lambda invokes Amazon Bedrock foundation models to perform machine translation.
  6. The translated outputs from Amazon Bedrock are stored in the Amazon S3 Model Output bucket.
  7. AWS Lambda retrieves the translated content from the Amazon S3 Model Output bucket.
  8. AWS Lambda sends the translations to an Amazon SageMaker endpoint for COMET score estimation.
  9. The Amazon SageMaker evaluation results are consolidated with the model output and stored in the Amazon S3 Consolidated Results bucket.
  10. AWS Lambda invokes Amazon Bedrock for LLM-driven qualitative assessment.
  11. Amazon Bedrock's quality assessment results are stored in the Amazon S3 Consolidated Results bucket.
  12. AWS Glue processes the consolidated results from the Amazon S3 Consolidated Results bucket for end-user consumption and analysis.

Cost

You are responsible for the cost of the AWS services used while running this Guidance. As of June 2025, the cost for running this Guidance with the default settings in the US East (N. Virginia) Region is approximately X per month for processing Y records.

We recommend creating a Budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.
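For reference, a minimal boto3 sketch for creating a monthly cost budget. The budget name and limit below are illustrative placeholders, not values prescribed by this Guidance:

import boto3

# Sketch: create a simple monthly cost budget for the deployment account.
budgets = boto3.client("budgets")
budgets.create_budget(
    AccountId="123456789012",  # replace with your AWS account ID
    Budget={
        "BudgetName": "translation-pipeline-budget",  # hypothetical name
        "BudgetLimit": {"Amount": "100", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
)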

Sample Cost Table

The following table provides a sample cost breakdown for deploying this Guidance with the default parameters in the US East (Ohio) Region for one request of around 200 characters.

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon Aurora PostgreSQL* | Aurora Serverless v2, 1 ACU per hour, 100 GB storage, 1 hour of usage | $0.25 |
| AWS Secrets Manager | 2 secrets, 30-day average duration per secret, 3 API calls per month | $0.80 |
| Amazon Elastic Container Registry | 1 GB of data stored per month | $0.10 |
| Amazon SageMaker (Inference) | ml.c4.2xlarge instance, 1 Batch Transform job per month, 1 instance per job, 0.5 hours per instance per job | $0.24 |
| Amazon Bedrock (Nova) | 1.00 K input tokens, 1.00 K output tokens | $0.004 |
| AWS Glue | 10 DPUs for the Apache Spark job, 0.0625 DPUs for the Python Shell job | $0.75 |
| AWS Lambda | 512 MB ephemeral storage allocated, 10 requests | $0.10 |
| AWS Step Functions | 12 state transitions per workflow | $0.10 |

*The Aurora database is optional and only required when translation memory is used; its cost will vary with usage and data stored.

Prerequisites

Third-party tools

  • Docker or Podman for building the SageMaker container image

AWS account requirements

  • AWS account with permissions to create and manage the following services:

    • Amazon Bedrock
    • AWS Lambda
    • Amazon S3
    • AWS Step Functions
    • Amazon Aurora PostgreSQL
    • Amazon SageMaker
    • AWS Glue
    • AWS IAM
  • Amazon Bedrock model access must be enabled for the following model ID(s):

    • Amazon Nova Pro (us.amazon.nova-pro-v1:0)

AWS CDK bootstrap

This Guidance uses the AWS CDK. If you are using the AWS CDK for the first time, bootstrap your environment:

cdk bootstrap aws://ACCOUNT-NUMBER/REGION

Replace ACCOUNT-NUMBER with your AWS account number and REGION with your target AWS region.
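For example, for account 123456789012 targeting us-east-1:

cdk bootstrap aws://123456789012/us-east-1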

Service limits (if applicable)

  • Amazon Bedrock has default quotas for API requests per second. You may need to request a quota increase for production workloads.
  • Amazon SageMaker has instance type limits that may require an increase for larger deployments.
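You can inspect your current quotas with the AWS CLI before deploying. For example, to list Amazon Bedrock quotas (assuming the service code bedrock):

aws service-quotas list-service-quotas --service-code bedrock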

Supported Regions (if applicable)

This Guidance is best suited for regions where Amazon Bedrock and all required models are available. As of the latest update, the following regions are recommended:

  • US East (N. Virginia)
  • US West (Oregon)
  • Europe (Frankfurt)

Input and output buckets

The solution needs access to two S3 buckets, one for input data and one for output data. You may use pre-existing buckets or create new ones. Throughout this guide, these buckets are referred to as "your-input-bucket-name" and "your-output-bucket-name".
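If you need to create new buckets, for example:

aws s3 mb s3://your-input-bucket-name
aws s3 mb s3://your-output-bucket-name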

HuggingFace Access Token for COMET Score estimation model

This Guidance deploys the CometKiwi quality estimation model as a self-hosted SageMaker endpoint. At first runtime, the endpoint downloads the open-source model weights from Hugging Face, so you will need to provide a Hugging Face access token with sufficient permissions for the endpoint to work. The screenshot below shows an example of the minimal set of permissions required:

(Screenshot: minimal Hugging Face token permissions)

Please note that the open-source version is intended for experimentation and testing only. For commercial use, please refer to the Widn.ai Marketplace listings.
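As a quick sanity check, a minimal sketch using the huggingface_hub Python package to confirm a token is valid before deploying:

from huggingface_hub import HfApi

# Sketch: confirm the token authenticates against the Hugging Face Hub.
api = HfApi(token="your-hugging-face-token")
print(api.whoami()["name"])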

Deployment Steps (required)

  1. Clone the repository:
git clone https://github.com/aws-samples/guidance-for-machine-translation-pipelines-using-amazon-bedrock.git
  2. Navigate to the repository folder:
cd guidance-for-machine-translation-pipelines-using-amazon-bedrock
  3. Create and activate a Python virtual environment:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  4. Install the required dependencies:
pip install -r deployment/requirements.txt
  5. Build the quality estimation image:
cd source/sagemaker/
source ./build_and_push.sh

Feel free to update AWS_DEFAULT_REGION beforehand; the implementation defaults to us-east-1.

  6. Update the cdk.json context with your configuration:
{
  "context": {
    "input_bucket_name": "your-input-bucket-name",
    "output_bucket_name": "your-output-bucket-name",
    "quality_estimation_sgm_model_name": "your-quality-estimation-model-name",
    "quality_estimation_sgm_endpoint_name": "your-quality-estimation-endpoint-name",
    "quality_estimation_sgm_image_uri": "your-quality-estimation-sgm-endpoint-image-uri",
    "quality_estimation_sgm_topic_name": "sagemaker-quality-estimation-inference-topic",
    "hugging_face_token": "your-hugging-face-token",
    "config_secret_name": "workflow-bedrock-config",
    "marketplace_endpoint_name": "your-marketplace-endpoint-name"
  }
}

Note: quality_estimation_sgm_topic_name defaults to sagemaker-quality-estimation-inference-topic, config_secret_name defaults to workflow-bedrock-config, and marketplace_endpoint_name is optional.
  7. Navigate to the deployment directory:
cd ../../deployment
  8. Bootstrap your AWS environment (if you haven't already):
cdk bootstrap
  9. Synthesize the CloudFormation template:
cdk synth
  10. Deploy the stacks:
cdk deploy --all

Deployment Steps using Amazon Q CLI (experimental)

Alternatively, if you have access to the Amazon Q CLI, you can have the agent execute the deployment steps on your behalf, prompting you periodically for approval or any missing information. To do so, run the following command:

q chat

Once the q chat session is initialized run:

> /context add README.md
> /context add DEPLOYMENT_PROMPT.md
> Deploy the solution in guided mode please!

Make sure your credentials are properly configured prior to initiating the Q Chat session.

If the Q CLI hangs for any reason, you can press Ctrl+C to stop the current action and prompt it to resume the deployment, for example with "Keep going!".

Once the solution is deployed, feel free to ask Q to run a test for you.

If you encounter any environment-related issue during deployment, the Q CLI agent can also be a great resource to help you troubleshoot and resolve it.

Deployment Validation

  1. Open the AWS CloudFormation console and verify that all stacks have been created successfully with a status of "CREATE_COMPLETE":

    • DatabaseStack
    • SageMakerStack
    • WorkflowStack
  2. Verify the Step Functions state machine has been created:

    • Open the AWS Step Functions console
    • Look for a state machine named "BatchMachineTranslationStateMachineCDK"
    • Verify the state machine definition matches the expected workflow
  3. Verify the SageMaker endpoint is active:

    • Open the Amazon SageMaker console
    • Navigate to Endpoints
    • Verify the quality estimation endpoint is "InService"
  4. Run the following CLI command to validate the deployment:

aws stepfunctions describe-state-machine --state-machine-arn arn:aws:states:<region>:<account-id>:stateMachine:BatchMachineTranslationStateMachineCDK
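You can also confirm the SageMaker endpoint status from the CLI (using the endpoint name you configured in cdk.json):

aws sagemaker describe-endpoint --endpoint-name your-quality-estimation-endpoint-name --query EndpointStatus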

Running the Guidance

Prepare Input Data

  1. Create a JSON file matching the following structure:
[
    {
        "source_text": <your_source_text>,
        "source_lang": <source_language>,
        "target_lang": <target_language>
    }
]

You can find a sample file (sample_test.json) in the sample_data folder.

  2. Upload the JSON file to your input S3 bucket:
aws s3 cp your-input-file.json s3://your-input-bucket-name/inputs/your-input-file.json
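Alternatively, a minimal Python sketch that builds and uploads an input file with boto3 (the example record is hypothetical):

import json
import boto3

# Hypothetical translation request records; adjust to your own content.
records = [
    {"source_text": "Hello, world!", "source_lang": "en", "target_lang": "fr"},
]

# Write the input file in the structure expected by the workflow.
with open("your-input-file.json", "w") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Upload under the inputs/ prefix of the input bucket.
boto3.client("s3").upload_file(
    "your-input-file.json", "your-input-bucket-name", "inputs/your-input-file.json"
)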

Start the Translation Workflow

  1. Start the Step Functions workflow with the following input:
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:<region>:<account-id>:stateMachine:BatchMachineTranslationStateMachineCDK \
  --input '{"callerId": "user123", "inputFileKey": "inputs/your-input-file.json", "inputBucket": "your-input-bucket-name", "outputBucket": "your-output-bucket-name"}'

You may also start the workflow from the console by opening the state machine and choosing Start execution.

(Screenshot: state machine)

The pipeline accepts the following input parameters:

  • callerId (required): A unique identifier for tracking the translation job and organizing output files
  • inputFileKey (required): The S3 object key path to your JSON input file containing the translation requests
  • inputBucket (required): Name of the S3 bucket containing the input file
  • outputBucket (required): Name of the S3 bucket where the state machine stores all generated files

These parameters enable the workflow to process your data and store results in an organized manner within your output bucket.

Once the state machine has run you should be able to view its entire execution and status from the Step Functions console.

(Screenshot: state machine execution)

Please note that the state machine is configured to invoke Bedrock through the real-time API for any input file containing 100 records or fewer (right side of the state machine). Beyond this threshold, the state machine switches to Amazon Bedrock batch inference mode (left side of the state machine). Both modes perform the same actions; the main difference is how the state machine steps interact with Amazon Bedrock (real-time inference versus a batch job).

Lambda Function Configuration

The pipeline uses several Lambda functions with configurable environment variables:

Prompt Generator Function:

  • DATABASE_SECRET_ARN: ARN of the Aurora credentials secret
  • CLUSTER_ARN: ARN of the Aurora PostgreSQL cluster
  • DATABASE_NAME: Name of the translation memory database
  • WORKFLOW_SECRET_ARN: ARN of the workflow configuration secret
  • DEFAULT_SOURCE_LANG: Default source language (default: "en")
  • DEFAULT_TARGET_LANG: Default target language (default: "fr")
  • ENABLE_TRANSLATION_MEMORY: Enable translation memory lookup (default: "true")

Quality Score Estimation:

  • SAGEMAKER_ENDPOINT_NAME: Name of the SageMaker quality estimation endpoint
  • QUALITY_ESTIMATION_MODE: Mode for quality estimation model hosting ("MARKETPLACE_SELF_HOSTED" or "OPEN_SOURCE_SELF_HOSTED", default: OPEN_SOURCE_SELF_HOSTED)

Translation and Assessment Functions:

  • WORKFLOW_SECRET_ARN: ARN of the workflow configuration secret containing Bedrock model IDs and inference profiles
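As a minimal sketch (not the actual handler code), a Lambda function might read these variables as follows:

import os

# Defaults mirror the documented values above.
default_source_lang = os.environ.get("DEFAULT_SOURCE_LANG", "en")
default_target_lang = os.environ.get("DEFAULT_TARGET_LANG", "fr")
translation_memory_enabled = os.environ.get("ENABLE_TRANSLATION_MEMORY", "true").lower() == "true"
workflow_secret_arn = os.environ["WORKFLOW_SECRET_ARN"]  # required; raises KeyError if unset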
  2. Monitor the execution in the AWS Step Functions console:
    • Navigate to the Step Functions console
    • Select the "BatchMachineTranslationStateMachineCDK" state machine
    • Find your execution and monitor its progress

View Results

  1. Once the workflow completes, the results will be available in your output S3 bucket:
aws s3 ls s3://your-output-bucket-name/user123/<execution-id>/analysis/
  2. Download the results:
aws s3 cp s3://your-output-bucket-name/user123/<execution-id>/analysis/results.jsonl .
  3. The results file contains:
    • Source text
    • Translated text
    • Quality assessment scores
    • Quality estimation metrics
    • Recommendations for improvement

Accessing Solution Output

Result files are stored in your output S3 bucket following this path structure:

s3://{output_bucket_name}/{callerId}/{executionId}/analysis/

To find the execution ID:

  • Step Functions Console: Navigate to your state machine execution and copy the execution ARN's ID portion
  • CLI: Use the execution ARN returned from the start-execution command
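As a hedged example, you can list the result objects with boto3:

import boto3

caller_id = "user123"            # the callerId passed to the workflow
execution_id = "<execution-id>"  # taken from the execution ARN

s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket="your-output-bucket-name",
    Prefix=f"{caller_id}/{execution_id}/analysis/",
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])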

Quick Data Review with S3 Select

For rapid data exploration, use Amazon S3 Select directly from the S3 console:

  1. Navigate to the results JSON file in your S3 bucket
  2. Select Actions > Query with S3 Select
  3. Configure the following settings:
    • Format: JSON
    • JSON Content type: Lines
    • Compression: None
    • Output settings: JSON
  4. Run the default query: SELECT * FROM s3object s LIMIT 5

This allows you to quickly explore translation results, quality assessments, and identify records that need attention without downloading the entire file.
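For example, to surface only records flagged for attention (field names follow the output schema below), you can replace the default query with:

SELECT s.recordId, s.score, s.translated_text FROM s3object s WHERE s.assessment.overall_status = 'NEEDS_ATTENTION' LIMIT 10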

Output JSON Schema

Each line in the results file follows this JSON structure:

{
  "recordId": "string",
  "assessment": {
    "dimensions": {
      "accuracy": {
        "comment": "string",
        "status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
      },
      "fluency": {
        "comment": "string",
        "status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
      },
      "style": {
        "comment": "string",
        "status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
      },
      "terminology": {
        "comment": "string",
        "status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|NOT_ASSESSED"
      }
    },
    "overall_status": "MEETS_REQUIREMENTS|NEEDS_ATTENTION|ERROR"
  },
  "source_language": "string",
  "source_text": "string",
  "target_language": "string",
  "translated_text": "string",
  "score": "number"
}

Field Descriptions:

  • recordId: Unique identifier for each translation record
  • assessment.dimensions: Quality evaluation across four dimensions (accuracy, fluency, style, terminology)
  • assessment.overall_status: Overall quality assessment result
  • source_language: Source language code
  • source_text: Original text to be translated
  • target_language: Target language code
  • translated_text: Generated translation
  • score: COMET quality estimation score (numerical value)
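As an illustrative Python sketch, you can triage the downloaded results.jsonl for records needing review (the 0.8 threshold is an arbitrary example, not a recommendation from this Guidance):

import json

THRESHOLD = 0.8  # arbitrary example threshold for COMET scores

with open("results.jsonl") as f:
    for line in f:
        record = json.loads(line)
        needs_review = (
            record["score"] < THRESHOLD
            or record["assessment"]["overall_status"] == "NEEDS_ATTENTION"
        )
        if needs_review:
            print(record["recordId"], record["score"], record["assessment"]["overall_status"])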

Translation Memory

The translation memory feature enhances translation quality by leveraging previously translated content. To enable this functionality, you need to initialize the translation memory database with sample data.

Initialize Translation Memory Database

Follow these steps to set up and populate the translation memory database:

  1. Navigate to the database initialization notebook:

    cd source/database/init_notebook/
  2. Open the Jupyter notebook using one of these options:

    Option A: Local Jupyter

    jupyter notebook database_init.ipynb

    Option B: Amazon SageMaker Studio

    • Open Amazon SageMaker Studio in the AWS Console
    • Upload the database_init.ipynb file to your SageMaker Studio environment
    • Open the notebook and select a Python 3 kernel
  3. Follow the notebook instructions to:

    • Install required Python dependencies
    • Load sample WMT19 French-German translation data
    • Generate embeddings using Amazon Bedrock's Titan model
    • Populate the Aurora PostgreSQL database with translation pairs and embeddings
    • Test vector similarity search functionality
  4. Update database connection parameters: Before running the notebook, replace the placeholder values with your actual CloudFormation output values:

    • DatabaseSecretArn: ARN of the Aurora credentials secret
    • DatabaseClusterArn: ARN of the Aurora PostgreSQL cluster
    • DatabaseName: Name of the translation memory database

Once initialized, the translation memory will automatically provide context to improve translation consistency for similar text segments.

Important: Ensure that the ENABLE_TRANSLATION_MEMORY environment variable is set to "true" in the Prompt Generator Lambda function to activate translation memory functionality.
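For context, a minimal sketch of generating an embedding for a text segment via Amazon Bedrock (the Titan model ID shown is an assumption; use the one referenced in the notebook):

import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> list:
    # Invoke the Titan text embeddings model and return the embedding vector.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",  # assumed model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

print(len(embed("Hello, world!")))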

Model Configuration and Customization

The solution provides flexible configuration options for model selection and prompt customization, enabling caller-specific settings for cost optimization and quality control.

Caller-Specific Model Configuration

You can configure different Amazon Bedrock models or inference profiles for specific callers by updating the workflow configuration secret in AWS Secrets Manager. This enables:

  • Cost tracking: Different models have varying costs, allowing you to optimize expenses per use case
  • Quality optimization: Select models best suited for specific language pairs or content types
  • Performance tuning: Choose faster models for real-time scenarios or more accurate models for critical translations

To configure caller-specific models:

  1. Navigate to AWS Secrets Manager console
  2. Find the secret named workflow-bedrock-config (or your custom secret name)
  3. Update the JSON configuration with caller-specific model mappings:
{
  "bedrock_model_id": "us.amazon.nova-pro-v1:0",
  "bedrock_model_id.premium_user": "us.amazon.nova-pro-v1:0",
  "assessment_model_id": "us.amazon.nova-pro-v1:0",
  "assessment_model_id.premium_user": "us.amazon.nova-pro-v1:0",
  "assessment_model_id.budget_user": "arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/abc123def456"
}

Keys without a caller suffix set the default model; a key suffixed with a caller ID overrides the model for that caller. Values can be Bedrock model IDs or application inference profile ARNs, as in the budget_user example above.

If you use a custom inference profile, double-check that the IAM roles attached to the Lambda functions in the state machine are granted access to it.
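The caller-suffix convention implies a lookup with fallback; a hypothetical sketch (check the Lambda source for the actual resolution logic):

def resolve_model_id(config: dict, key: str, caller_id: str) -> str:
    # Prefer a caller-specific entry such as "bedrock_model_id.premium_user",
    # falling back to the unsuffixed default entry.
    return config.get(f"{key}.{caller_id}", config[key])

# Example: resolve_model_id(config, "bedrock_model_id", "premium_user")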

Custom Prompt Templates

Prompt templates can be customized for different translation scenarios and caller requirements:

Translation Prompts: Modify source/lambda/prompt_generator/prompt_template.txt to:

  • Add domain-specific instructions
  • Include formatting requirements
  • Specify tone or style preferences

Quality Assessment Prompts: Update source/lambda/quality_assessment/prompt_template.txt to:

  • Focus on specific quality criteria
  • Add industry-specific evaluation metrics
  • Customize scoring rubrics

Prompt templates support dynamic variables including {source_language}, {target_language}, {source_text}, and {context} for flexible content generation.
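For illustration, a translation prompt template using these variables might look like the following (a hypothetical example, not the shipped template):

Translate the following text from {source_language} to {target_language}.
Preserve the original formatting and tone.

Reference translations (may be empty):
{context}

Text to translate:
{source_text}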

Next Steps

After deploying this Guidance, you can enhance it in the following ways:

  1. Customize Translation Models: Experiment with different Amazon Bedrock models to find the best fit for your specific language pairs and content types.

  2. Fine-tune Quality Assessment: Adjust the quality assessment prompts in source/lambda/quality_assessment/prompt_template.txt to focus on specific aspects of translation quality relevant to your use case.

  3. Integrate with Translation Memory: Extend the database functionality to build a more robust translation memory system that can suggest translations based on previous similar content.

  4. Add Human Review Workflow: Implement a human review step for translations that don't meet quality thresholds, using Amazon Augmented AI (A2I).

  5. Implement Domain-specific Terminology: Create and maintain terminology databases for specific domains to ensure consistent translation of technical terms.

  6. Scale for Production: Adjust the infrastructure for production workloads by:

    • Increasing Lambda function timeouts and memory
    • Scaling SageMaker endpoints
    • Implementing auto-scaling for Aurora PostgreSQL
    • Adding monitoring and alerting

Cleanup

To avoid incurring future charges, follow these steps to delete all resources:

  1. Empty the S3 buckets:
aws s3 rm s3://your-input-bucket-name --recursive
aws s3 rm s3://your-output-bucket-name --recursive
  2. Delete the CDK stacks:
cd deployment
cdk destroy --all
  3. Delete the ECR repository containing the quality estimation image:
aws ecr delete-repository --repository-name <repository-name> --force

FAQ, known issues, additional considerations, and limitations

Frequently Asked Questions

Q: What languages are supported by this solution?
A: The solution supports all languages available in the Amazon Bedrock models you choose to use. Amazon Nova Pro supports a wide range of languages, but performance may vary across language pairs.

Q: How can I improve translation quality?
A: You can improve translation quality by providing more context in your input data, fine-tuning the prompts, and using the quality assessment feedback to iteratively improve your translations.

Q: Can this solution handle specialized terminology?
A: Yes, you can include specialized terminology in the context field of your input data. For more robust terminology handling, consider extending the solution with a terminology database.

Known Issues

  • The quality estimation model may occasionally time out for very large batches. Consider breaking large translation jobs into smaller batches.
  • Some language pairs may have lower quality scores due to limitations in the underlying models.

Additional Considerations

  • This Guidance creates an Amazon SageMaker endpoint that is billed per hour regardless of usage.
  • The Aurora PostgreSQL database is provisioned as Serverless v2, which has minimum capacity units that will be billed even during idle periods.
  • Consider implementing data encryption at rest and in transit for sensitive content.

For any feedback, questions, or suggestions, please use the Issues tab of this repository.

Notices

Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided "as is" without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.

Authors
