Bedrock issue with latency optimized inference and prompt caching #4626

@Nathan-Kr

Describe the bug

Using latency-optimized inference together with prompt caching throws:
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the Converse operation: The model returned the following errors: system.0.cache_control: Extra inputs are not permitted

The error only occurs when I set both latency-optimized inference and a cache point. If I remove either one, the request works as intended.
This may be a limitation of latency-optimized mode, but I couldn't find any documentation saying so.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

A Converse request should succeed when both latency-optimized inference and cache points are set.

Current Behavior

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the Converse operation: The model returned the following errors: system.0.cache_control: Extra inputs are not permitted

Reproduction Steps

import boto3

BEDROCK_REGION = "us-east-2"
BEDROCK_MODEL_ID = "us.anthropic.claude-3-5-haiku-20241022-v1:0"

bedrock = boto3.client("bedrock-runtime", region_name=BEDROCK_REGION)

# Setting performanceConfig together with a cachePoint triggers the ValidationException.
response = bedrock.converse(
    modelId=BEDROCK_MODEL_ID,
    performanceConfig={"latency": "optimized"},
    system=[
        {
            "text": (
                "You are an app that creates playlists for a radio station "
                "that plays rock and pop music. Only return song names and the artist."
            )
        },
        {
            "cachePoint": {
                "type": "default"
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Make me a playlist of 5 songs."}
            ]
        }
    ])

print(response)

Possible Solution

No response
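
Until the service behavior is clarified, one possible client-side workaround is to retry without latency optimization when this specific error comes back. The sketch below is not a fix and has not been verified against Bedrock. It assumes the exception is a botocore `ClientError` carrying the parsed error in `exc.response["Error"]`, and it assumes that keeping the cache point matters more than latency. The helper names `is_cache_control_rejection` and `converse_with_fallback` are hypothetical, not part of boto3.

```python
def is_cache_control_rejection(exc):
    """Return True if exc looks like the ValidationException reported in this issue."""
    # Assumption: botocore's ClientError exposes the parsed error as exc.response["Error"].
    response = getattr(exc, "response", None) or {}
    error = response.get("Error", {})
    return (
        error.get("Code") == "ValidationException"
        and "cache_control" in error.get("Message", "")
    )


def converse_with_fallback(client, **request):
    """Send the request as given; on the cache_control rejection, retry once
    without performanceConfig so the cache point is kept."""
    try:
        return client.converse(**request)
    except Exception as exc:
        # Re-raise anything that is not the specific error from this issue.
        if "performanceConfig" not in request or not is_cache_control_rejection(exc):
            raise
        # Assumption: drop latency optimization rather than the cache point.
        fallback = {k: v for k, v in request.items() if k != "performanceConfig"}
        return client.converse(**fallback)
```

It would be called with the same keyword arguments as in the reproduction, for example `converse_with_fallback(bedrock, modelId=BEDROCK_MODEL_ID, performanceConfig={"latency": "optimized"}, system=[...], messages=[...])`. The retry doubles the request count whenever the error occurs, so it is only a stopgap.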

Additional Information/Context

No response

SDK version used

1.40.37

Environment details (OS name and version, etc.)

Linux fedora 6.13.9-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 29 01:29:31 UTC 2025 x86_64 GNU/Linux

Metadata

Labels

  • bedrock
  • p3: This is a minor priority issue
  • service-api: This issue is caused by the service API, not the SDK implementation.
