
Bug: ai-proxy does not correctly escape AWS Bedrock inference profile ARN in signature and request #14309


Closed
muscionig opened this issue Feb 25, 2025 · 8 comments


@muscionig

muscionig commented Feb 25, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

Kong 3.9.0

Current Behavior

When using AWS Bedrock inference profiles, which are formatted as arn:aws:bedrock:us-east-1:<account_id>:application-inference-profile/<profile_id>, Kong is unable to route the request.
I have experienced two behaviors:

  1. If the ai-proxy is configured with the unescaped ARN, Kong fails with:
{"error":{"message":"transformation failed from type bedrock://llm/v1/chat: no generation candidates received from Bedrock, or max_tokens too short"}}
  2. When configured with the escaped ARN (arn%3Aaws%3Abedrock%3Aus-east-1%3A<account_id>%3Aapplication-inference-profile%2F<profile_id>), Kong fails with:
{"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method."}

Expected Behavior

Kong should correctly format and escape the ARN when generating the SigV4 signature and constructing the request URL to match AWS’s expected format.

  • The Converse API requires the URL to be formatted as:
    /model/<model-id or inference-profile ARN>/converse
    
  • The AWS CLI (in debug mode) shows that this URL is then encoded when making the request to Bedrock (see the sketch below).
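
A minimal sketch of that expected URL construction, using ngx.escape_uri and placeholder account/profile IDs (illustrative only, not the plugin's actual code):

-- Sketch only: build the Converse URL for an inference-profile ARN, with
-- the ARN URL-encoded as a single path segment. IDs are placeholders.
local arn  = "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123"
local path = "/model/" .. ngx.escape_uri(arn) .. "/converse"
-- path is now:
-- /model/arn%3Aaws%3Abedrock%3Aus-east-1%3A123456789012%3Aapplication-inference-profile%2Fabc123/converse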

Steps To Reproduce

  1. Create the plugin with the unescaped model.name ARN

    • Example configuration (following the Kong documentation):
      curl -X POST http://localhost:8001/services/bedrock-service/plugins \
       --header 'Content-Type: application/json' \
       --header 'accept: application/json' \
       --data '{
       "name": "ai-proxy",
       "instance_name": "ai-proxy-bedrock",
       "config": {
         "route_type": "llm/v1/chat",
         "auth": {
           "allow_override": false,
           "aws_access_key_id": "<YOUR_AWS_ACCESS_KEY_ID>",
           "aws_secret_access_key": "<YOUR_AWS_SECRET_ACCESS_KEY>"
         },
         "model": {
           "provider": "bedrock",
           "name": "arn:aws:bedrock:us-east-1:<account_id>:application-inference-profile/<profile_id>",
           "options": {
             "bedrock": {
               "aws_region": "us-east-1"
             }
           }
         }
       }
       }'
      
      This will fail with the transformation error reported above.
  2. Create the plugin with the escaped model.name ARN

    • Example configuration (following the Kong documentation):
      curl -X POST http://localhost:8001/services/bedrock-service/plugins \
       --header 'Content-Type: application/json' \
       ...
           "provider": "bedrock",
           "name": "arn%3Aaws%3Abedrock%3Aus-east-1%3A<account_id>%3Aapplication-inference-profile%2F<profile_id>",
       ...
       }'
      
      This will fail with the SigV4 error.

Anything else?

  • I checked the AWS CLI with the inference profile to rule out authentication errors; the profile works when calling the converse endpoint directly.
  • I verified the URL behavior by enabling the File Log plugin on /dev/stdout.
@muscionig
Author

I have attempted a fix in PR #14310.

While going through the source code I noticed that the plugin allows setting and overriding the upstream_path. I tried that, but the overridden path is then used for the SigV4 authentication.

Another possible approach would be to support inference profiles by recommending the model.options.upstream_path override, and by adding ngx.unescape_uri(parsed_url.path) in the authentication process (see the sketch below).
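
For illustration, a minimal sketch of that alternative, with hypothetical field names (this is not the actual plugin or lua-resty-aws code):

-- Hypothetical sketch: keep the escaped path for routing, but hand the
-- unescaped path to the SigV4 signing step so the signer canonicalises
-- the raw ARN itself. Field names are illustrative only.
local parsed_url = {
  path = "/model/arn%3Aaws%3Abedrock%3Aus-east-1%3A123456789012"
      .. "%3Aapplication-inference-profile%2Fabc123/converse",
}

local signing_request = {
  method = "POST",
  path   = ngx.unescape_uri(parsed_url.path),
}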

@muscionig muscionig changed the title Bug: AI-Proxy Does Not Correctly Escape AWS Bedrock Inference Profile ARN in Signature Bug: ai-proxy does not correctly escape AWS Bedrock inference profile ARN in signature and request Feb 25, 2025
@tysoekong
Contributor

tysoekong commented Feb 25, 2025

Hey @muscionig, thanks so much for this. I actually hadn't seen that you can specify the model ARN directly for Bedrock Converse; I thought that was just for InvokeModel, so I didn't check it. But yes, I had this same problem with "upstream_path", which is:

  1. fixed, but
  2. getting removed soon; the field is not needed with the next release.

I will either ask if your fix can be merged directly, or I will bundle it into the next 3.10 main and tag you as a contributor (we already have a giant PR of fixes coming, so it's quicker).

@muscionig
Author

Hi @tysoekong, thanks for the update and for considering my changes! I’m totally fine with the fix being integrated into a different PR and with the collaborator approach.

Is the PR public? I’d love to test it in the meantime to ensure everything works as expected.

@ttyS0e
Contributor

ttyS0e commented Mar 13, 2025

Hey @muscionig

I'm not sure that what you've found is the entire problem.

With URL escaping fixed in just the profile ARN, the logged request path is:

2025/03/13 00:12:57 [notice] 2603#0: [kong] bedrock.lua:configure_request:654 [ai-proxy]----------------------------------------------------------------------+
2025/03/13 00:12:57 [notice] 2603#0: |"/arn%3Aaws%3Abedrock%3Aus-west-1%3A123456789012%3Aprompt/PROMPT12345:1"                                                |
2025/03/13 00:12:57 [notice] 2603#0: +------------------------------------------------------------------------------------------------------------------------+

I still get the "no generation candidates received" error. Adding a pre-function to log the response, Bedrock is returning:

{"Output":{"__type":"com.amazon.coral.service#UnknownOperationException"},"Version":"1.0"}

I think this SHOULD work even without any of the escaping (i.e. setting model.name to arn:aws:bedrock:us-west-1:123456789012:prompt/PROMPT12345:1), but there's actually some other issue here. I am not sure what it won't accept about this request path; I need to ask AWS. Searching suggests it might be related to a wrong Content-Type?

@muscionig
Author

muscionig commented Mar 13, 2025

Hi @ttyS0e,

It’s definitely possible there’s more to this than what I’ve found. Here’s how I debugged it:

  1. I created an inference-profile via the API (since it’s not yet available in the console).
  2. I invoked the profile using:
    aws bedrock-runtime converse --model-id <inference-profile-arn> [...] --debug
    This outputs the URL the CLI uses, which is URL-encoded (as mentioned in the main description).
  3. On Kong:
    • Using the ARN as <model-id> without escaping results in:
      {"error":{"message":"transformation failed from type bedrock://llm/v1/chat: no generation candidates received from Bedrock, or max_tokens too short"}}
    • Using the URL-encoded ARN returns:
      {"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details."}
      This error typically occurs when the request reaches an AWS service endpoint but the SigV4 signing is incorrect. That’s why I suspected the issue lies in escaping (assuming this request is hitting Bedrock).

I also found a useful example of how SigV4 should handle URL escaping in aws-sdk-go-v2 (the AWS SDK for Go v2):
aws/aws-sdk-go-v2#L31.

My hypothesis is that Kong is trying to route after the / in the ARN, but I could be entirely mistaken.

EDIT: looking at your logs, I would try to escape the remaining / before PROMPT12345:1 to see if you get the same behavior I am experiencing.

@muscionig
Author

muscionig commented Mar 18, 2025

@ttyS0e

I've identified the bug and implemented a working fix for inference-profiles. TL;DR: The issue lies between ai-proxy and lua-resty-aws, specifically in how lua-resty-aws escapes the path and then signs the ai-proxy request.

Issue in lua-resty-aws

The signing function used in lua-resty-aws is located here:
https://github.com/Kong/lua-resty-aws/blob/main/src/resty/aws/request/signatures/v4.lua

This function calls canonicalise_path:
https://github.com/Kong/lua-resty-aws/blob/21dbf8cd7351622d467143b7f294c9d70199aa91/src/resty/aws/request/signatures/utils.lua#L37-L62

canonicalise_path escapes the URI derived from path, but it does not escape any /, as these are assumed to be routable. As a result, the canonicalURI is:

INPUT: /model/arn:aws:bedrock:us-east-1:<account_id>:application-inference-profile/<profile_id>/converse 
OUTPUT: /model/arn%3Aaws%3Abedrock%3Aus-east-1%3A<account_id>%3Aapplication-inference-profile/<profile_id>/converse
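
For illustration, the behavior described above can be approximated by escaping each path segment while leaving the / separators untouched (a sketch, not the actual lua-resty-aws implementation):

-- Approximation of the described behavior, not the real lua-resty-aws code:
-- escape every path segment, but keep "/" as-is because it is assumed to be
-- a routable separator.
local function canonicalise_path_sketch(path)
  return (path:gsub("[^/]+", function(segment)
    return ngx.escape_uri(segment)
  end))
end

-- canonicalise_path_sketch(
--   "/model/arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123/converse")
-- -> "/model/arn%3Aaws%3Abedrock%3Aus-east-1%3A123456789012%3Aapplication-inference-profile/abc123/converse"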

AWS CLI and SigV4 for ARNs

When using the AWS CLI with an unescaped ARN, the canonical request and request path are:

aws bedrock-runtime converse --model-id arn:aws:bedrock:us-east-1:<account_id>:application-inference-profile/<profile_id> [...] --debug

Output:

## Canonical request  

POST  
/model/arn%253Aaws%253Abedrock%253Aus-east-1%253A<account_id>%253Aapplication-inference-profile%252F<profile_id>/converse  

content-type: application/json  
host: bedrock-runtime.us-east-1.amazonaws.com  
x-amz-date: <date>  
x-amz-security-token: <token>  

<signed_headers>  
[...]  

## Request  

....method=POST, url=https://bedrock-runtime.us-east-1.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-east-1%3A<account_id>%3Aapplication-inference-profile%2F<profile_id>/converse, headers={'....  

Key Observations

  1. canonicalURI: double-encoded model name in the URL
  2. Request path: single-encoded model name in the URL (see the sketch below)
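
The two encodings can be reproduced by escaping the model name once for the request path and escaping that already-escaped value again for the canonical request (a sketch with placeholder IDs, not the CLI's internals):

-- Sketch of the two encodings observed above. Escaping the already-escaped
-- value turns "%" into "%25", which is where %253A / %252F come from in the
-- canonical request.
local arn = "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123"

local single_encoded = ngx.escape_uri(arn)             -- arn%3Aaws%3A...%2Fabc123
local double_encoded = ngx.escape_uri(single_encoded)  -- arn%253Aaws%253A...%252Fabc123

local request_path  = "/model/" .. single_encoded .. "/converse"  -- sent on the wire
local canonical_uri = "/model/" .. double_encoded .. "/converse"  -- what SigV4 signs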

Fix Implementation

To resolve this without modifying lua-resty-aws, I leveraged an existing mechanism: the function does not canonicalize the path if a canonicalURI is explicitly provided as a request parameter. I have escaped the model name within the ai-proxy.

I updated my PR with this fix: #14310. Here is the updated part of the code:

-- Escape the model name once and pass the result as an explicit canonicalURI,
-- so lua-resty-aws skips its own path canonicalisation.
local canonicalURI = fmt(
  ai_shared.operation_map[DRIVER_NAME][conf.route_type].path,
  ngx.escape_uri(conf.model.name),
  "converse")

local r = {
  headers = {},
  method = ai_shared.operation_map[DRIVER_NAME][conf.route_type].method,
  path = parsed_url.path,
  host = parsed_url.host,
  port = tonumber(parsed_url.port) or 443,
  body = kong.request.get_raw_body(),
  canonicalURI = canonicalURI,
}

NOTE: I also updated the subrequest function, though I’m not entirely sure when that function is called. However, the fix should be applied there as well.

Result

After rebuilding Kong with my fix and setting up ai-proxy, I passed the model name as:

arn%3Aaws%3Abedrock%3Aus-east-1%3A<account_id>%3Aapplication-inference-profile%2F<profile_id>

This results in a valid call to Bedrock.

NOTE: it may be more user-friendly to do the double escape within the ai-proxy rather than in the model parameters. Let me know if you would like me to change that in my fix, unless this will be included in another PR.

Let me know if you would like more details!

@muscionig
Author

muscionig commented Apr 26, 2025

@ttyS0e

I have been running with this fix for over a month now, and I have just committed an update to my PR #14310. It fixes the streaming case on converse-stream and ensures the current behavior is maintained if the model.name is not an ARN. Also, the plugin now does the escaping, so the user can just input the unescaped ARN.

I noticed this PR: #14416. Was this the one you mentioned my contribution could be included in?

@subnetmarco
Member

@muscionig we are going to look into this issue as part of the next release, stay tuned.
