bug: Amazon Q CLI fails for all model calls when a really large file is added to context #1254

Open

walmsles opened this issue Apr 16, 2025 · 2 comments · May be fixed by #1331
@walmsles

Operating system

MacOS (Sequoia 15.4)

Expected behaviour

Bug: Amazon Q CLI fails with ValidationException when large file is added to context

Description

When using Amazon Q CLI, adding a large file to the context causes all model inference operations to fail with a ValidationException error. The CLI commands that don't require model inference (like /context show) continue to work, but any attempt to chat with the model results in an error.

Expected behavior

Amazon Q should either:

  • Successfully process the file if it's within acceptable limits
  • Return a clear error message about token size limitations before attempting inference
  • Gracefully handle the oversized file without breaking model inference capabilities

Actual behavior

All requests requiring model inference fail with a ValidationException error after adding a large file to context. Commands that don't require model inference (like /context show) continue to work normally.

Context information

/context show

🌍 global:
.amazonq/rules/**/*.md
README.md
AmazonQ.md

👤 profile (default):
amazon_q/*.md ./t.md (1 match)

1 matched file in use:
👤 [~295900 tokens] /Users/m.walmsley/dev/open-source/test/./t.md

Total: ~295900 tokens

hello

Amazon Q is having trouble responding right now:
0: unhandled error (ValidationException)
1: service error
2: unhandled error (ValidationException)
3: Error { code: "ValidationException", message: "Improperly formed request.", aws_request_id: "a6641e95-548e-4f5a-a7a0-e8cd23e7c6e3" }

Location:
crates/q_cli/src/cli/chat/mod.rs:668

BACKTRACE
1: backtrace::backtrace::trace::h976c2bd252d3a769
at :
2: backtrace::capture::Backtrace::new::h61944a637b13bacd
at :
3: color_eyre::config::EyreHook::into_eyre_hook::{{closure}}::he363a1d13cffc841
at :
4: eyre::error::<impl core::convert::From for eyre::Report>::from::h24ff12ad7df51aa1
at :
5: q_cli::cli::chat::ChatContext::try_chat::{{closure}}::h61eb9def6e01d349
at :
6: q_cli::cli::chat::chat::{{closure}}::h75365372f3391e05
at :
7: q_cli::cli::Cli::execute::{{closure}}::h39c461beca91f57e
at :
8: q_cli::main::{{closure}}::h8cf890f0b810cff0
at :
9: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::h69e1a9c9f94f69df
at :
10: tokio::runtime::context::runtime::enter_runtime::h309a865907b44753
at :
11: tokio::runtime::runtime::Runtime::block_on::h8f3a850c0b7753b1
at :
12: q_cli::main::ha217496dbe4ef1e3
at :
13: std::sys::backtrace::__rust_begin_short_backtrace::hc7cc976bfd4313cb
at :
14: std::rt::lang_start::{{closure}}::hd2ca6dad83ca0cb5
at :
15: std::rt::lang_start_internal::hacda2dedffd2edb4
at :
16: _main
at :

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.

Environment

  • Operating System: macOS
  • Amazon Q CLI version: 1.7.3
  • File size that triggered the issue: ~800KB
  • Token count that triggered the issue: ~296K tokens
  • Total context tokens: ~298,860 tokens

Additional context

This issue significantly impacts productivity as users cannot use large reference files in their workflows. The error message "Improperly formed request" is not helpful in diagnosing the actual issue, which appears to be related to context token limitations.

Possible solutions

  • Implement proper token size validation with clear error messages before attempting to process (see the sketch after this list)
  • Add graceful error handling that preserves model inference capabilities even when token limits are exceeded
  • Provide documentation on maximum context size limitations
  • Improve error messages to clearly indicate when the issue is related to context size
  • Consider implementing automatic file chunking, summarization, or truncation for large files
  • Add a warning when approaching token limits during context addition
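
A minimal sketch of what the first bullet could look like, assuming a roughly 150K-token budget and a crude characters-per-token heuristic; the constant names, the helper function, and the heuristic are illustrative assumptions, not the actual q_cli implementation.

use std::fs;
use std::path::Path;

// Assumed budget and heuristic for illustration only; neither value comes from q_cli.
const MAX_CONTEXT_TOKENS: usize = 150_000;
const CHARS_PER_TOKEN: usize = 3; // rough: the ~800KB file above counted as ~296K tokens

fn estimate_tokens(text: &str) -> usize {
    text.chars().count() / CHARS_PER_TOKEN
}

// Validate a file against the remaining budget before it is ever sent to the model,
// returning a clear, size-related error instead of a generic ValidationException.
fn check_context_file(path: &Path, used_tokens: usize) -> Result<usize, String> {
    let text = fs::read_to_string(path).map_err(|e| format!("{}: {e}", path.display()))?;
    let tokens = estimate_tokens(&text);
    if used_tokens + tokens > MAX_CONTEXT_TOKENS {
        Err(format!(
            "{} is ~{} tokens; adding it would exceed the ~{} token context limit ({} tokens already in use). Run /context show for details.",
            path.display(), tokens, MAX_CONTEXT_TOKENS, used_tokens
        ))
    } else {
        Ok(tokens)
    }
}

fn main() {
    match check_context_file(Path::new("t.md"), 2_960) {
        Ok(tokens) => println!("Added (~{tokens} tokens)"),
        Err(msg) => eprintln!("Warning: {msg}"),
    }
}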

Actual behaviour

[screenshot of the error output]

Steps to reproduce

  1. Install and configure Amazon Q CLI
  2. Start a new conversation with q chat
  3. Add a large file to the context using /context add <large-file> (in my case, a file of ~800KB containing ~296K tokens)
  4. Attempt to make any request that requires model inference (like a simple "hello")

Environment

[q-details]
version = "1.7.3"
hash = "3e4ae79d371315e80ddac772b43fff2cba314104"
date = "2025-04-10T05:48:16.435186Z (6d ago)"
variant = "full"

[system-info]
os = "macOS 15.4.0 (24E248)"
chip = "Apple M2 Max"
total-cores = 12
memory = "32.00 GB"

[environment]
cwd = "/Users/USER/dev/open-source/test"
cli-path = "/Users/USER/dev/open-source/test"
os = "Mac"
shell-path = "/bin/zsh"
shell-version = "5.9"
terminal = "iTerm 2"
install-method = "brew"

[env-vars]
PATH = "/Users/USER/tools:/Users/USER/bin:/Users/USER/.pyenv/shims:/Users/USER/.nvm/versions/node/v22.4.0/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/usr/local/go/bin:/Users/USER/tools:/Users/USER/bin:/Users/USER/.nvm/versions/node/v22.4.0/bin:/Applications/iTerm.app/Contents/Resources/utilities:/Users/USER/Library/Application Support/JetBrains/Toolbox/scripts:/Users/USER/.local/bin:/Users/USER/Library/Application Support/JetBrains/Toolbox/scripts"
QTERM_SESSION_ID = "ea9e223656664f2b850c8cfb45dd7b5c"
Q_SET_PARENT_CHECK = "1"
Q_TERM = "1.7.3"
SHELL = "/bin/zsh"
TERM = "xterm-256color"
__CFBundleIdentifier = "com.googlecode.iterm2"
GoodluckH self-assigned this Apr 18, 2025

@GoodluckH
Collaborator

Currently we have a 200k token limit; we can use 150k as the limit for context files.

We can add the following validations:

  1. Hard validation: upon adding new rules, we stop the user if the newly matched context files would exceed the limit.
  2. Soft validation: still allow users to add files, but intelligently drop the largest files when the user is interacting with the model, and prompt the user with "Some context files are skipped due to size limits, run /context show to learn more".
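
As a rough sketch of the soft-validation idea (not the actual q_cli code; the 150K budget, types, and function names are assumptions): keep every configured file, but greedily drop the largest ones right before inference and report what was skipped.

const MAX_CONTEXT_TOKENS: usize = 150_000; // assumed budget from the discussion above

#[derive(Debug)]
struct ContextFile {
    path: String,
    tokens: usize,
}

// Drop the largest files first until the remaining set fits the budget,
// returning (kept, skipped) so the user can be told what was left out.
fn drop_largest_until_fit(mut files: Vec<ContextFile>) -> (Vec<ContextFile>, Vec<ContextFile>) {
    files.sort_by(|a, b| b.tokens.cmp(&a.tokens)); // largest offenders first
    let mut total: usize = files.iter().map(|f| f.tokens).sum();
    let (mut kept, mut skipped) = (Vec::new(), Vec::new());
    for file in files {
        if total > MAX_CONTEXT_TOKENS {
            total -= file.tokens;
            skipped.push(file);
        } else {
            kept.push(file);
        }
    }
    (kept, skipped)
}

fn main() {
    let files = vec![
        ContextFile { path: "README.md".into(), tokens: 1_200 },
        ContextFile { path: "AmazonQ.md".into(), tokens: 1_700 },
        ContextFile { path: "t.md".into(), tokens: 295_900 },
    ];
    let (kept, skipped) = drop_largest_until_fit(files);
    if !skipped.is_empty() {
        println!("Some context files are skipped due to size limits, run /context show to learn more");
    }
    println!("kept: {kept:?}");
}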

@walmsles
Author

walmsles commented Apr 23, 2025

> Currently we have a 200k token limit; we can use 150k as the limit for context files.

I feel reaching this limit is unlikely, but it would be nice to fail gracefully all the same.

> We can add the following validations:
>
>   1. Hard validation: upon adding new rules, we stop the user if the newly matched context files would exceed the limit.

Given the globbing nature of the config, a warning when adding configuration pushes the scanned context over the threshold would be nice, including "run /context show to learn more" so you can see what is excluded.

>   2. Soft validation: still allow users to add files, but intelligently drop the largest files when the user is interacting with the model, and prompt the user with "Some context files are skipped due to size limits, run /context show to learn more".

I dislike the idea of "intelligently dropping the largest files"; dropping the last-found files in order of discovery makes more sense to me. It is a more declarative outcome, which I feel is critical here: that way I can tweak the context order in the profile and get the most important context in first, otherwise I am guessing at how to fix it.
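
To make that alternative concrete, a minimal sketch (same caveats: the budget and all names are assumptions, not q_cli's implementation) that keeps files strictly in the order they were discovered and skips any file that no longer fits, so reordering the profile directly controls what survives.

const MAX_CONTEXT_TOKENS: usize = 150_000; // assumed budget, as above

struct ContextFile {
    path: String,
    tokens: usize,
}

// Keep files in discovery order; any file that no longer fits within the
// remaining budget is skipped and reported, so earlier files always win.
fn keep_in_discovery_order(files: Vec<ContextFile>) -> (Vec<ContextFile>, Vec<ContextFile>) {
    let (mut kept, mut skipped) = (Vec::new(), Vec::new());
    let mut used = 0usize;
    for file in files {
        if used + file.tokens <= MAX_CONTEXT_TOKENS {
            used += file.tokens;
            kept.push(file);
        } else {
            skipped.push(file);
        }
    }
    (kept, skipped)
}

fn main() {
    // Discovery order mirrors the profile: global rules first, then profile globs.
    let files = vec![
        ContextFile { path: "README.md".into(), tokens: 1_200 },
        ContextFile { path: "AmazonQ.md".into(), tokens: 1_700 },
        ContextFile { path: "t.md".into(), tokens: 295_900 },
    ];
    let (kept, skipped) = keep_in_discovery_order(files);
    for f in &skipped {
        eprintln!("skipped {} (~{} tokens) due to context size limits", f.path, f.tokens);
    }
    println!("{} of {} files kept", kept.len(), kept.len() + skipped.len());
}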
