Context surrogacy/decreasing boilerplate #2825
kfsone started this conversation in Feature Requests
Especially once you start working with longer source files, it can be very difficult to do anything with Roo, because the context is a conversation rather than a discourse: there is no summation and no redaction.
In one example, I watched as Roo consumed a dictionary fragment, noticed it was incomplete, and tried reading it again. On the third read it must have exceeded the context length, because the model dropped the original task and instead set to work completing the missing 99% of the dictionary itself.
There are numerous situations where, even without MCP, Roo ought to have a way to redact. The most common that I see: it reads a file, makes several changes to it, and then reads the file again.
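As a minimal sketch of what that could look like, assuming a linear context represented as a list of entries (the entry shapes and field names here are my own illustration, not Roo's internals): once a file is re-read, every earlier read of the same file is stale and can be collapsed to a one-line tombstone.

```python
# Hypothetical sketch: collapse superseded file reads in a linear context.
# The "kind"/"path" entry shape is an assumption for illustration only.

def redact_stale_reads(context):
    """Keep only the most recent read of each file; replace earlier
    reads with a short tombstone so the model still sees they happened."""
    last_read = {}  # path -> index of the latest read entry for that path
    for i, entry in enumerate(context):
        if entry["kind"] == "read_file":
            last_read[entry["path"]] = i

    redacted = []
    for i, entry in enumerate(context):
        if entry["kind"] == "read_file" and last_read[entry["path"]] != i:
            # An earlier, now-stale read: drop the content, keep a marker.
            redacted.append({
                "kind": "redacted",
                "note": f'[earlier read of {entry["path"]} superseded below]',
            })
        else:
            redacted.append(entry)
    return redacted
```

For a sequence like read `foo.py` / edit `foo.py` / read `foo.py`, the first read's full contents would be replaced by the tombstone while the latest read survives intact.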
Take the following exchange:
the AI has added ~50 tokens to the context already, none of which have any real value. The user might, possibly, have avoided this by @ing foo.py in their request. This may seem like a small waste, but let's look at how it progresses.
If the AI nails it on this first try and you have nothing else to do, this is OK.
However, the way LLMs are extending their context windows means that most models will not know when to stop. Further, models are being biased towards a scattershot/"spew first, ask permission later" strategy.
For example, in the above case the model immediately started making changes without asking the user if there was a reason not to collect more than one shell at a time.
So now the user responds, and I hope that showing it this way drives home the issue and makes clear why surrogacy/compaction/redaction/resolution is critical:
Proposal
Instead of treating the context as linear text, Roo needs to tag/label/attribute portions of it; it needs to provide the AI with redactable tools, and perhaps the way to do this is to have the model take scoped actions.
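One way to picture "tagged, redactable context" (names and structure here are my own assumptions, purely illustrative): each portion of the context carries a label, and the model is given a tool that redacts by label, replacing a whole labelled run with a one-line surrogate.

```python
# Hypothetical sketch of an attributed context: every portion carries a
# label, and redaction operates on labels rather than raw text offsets.
# Class and method names are illustrative assumptions, not Roo's API.

class TaggedContext:
    def __init__(self):
        self.entries = []  # list of (tag, text) pairs, in order

    def append(self, tag, text):
        self.entries.append((tag, text))

    def redact(self, tag, summary):
        """Tool the model could call: collapse every entry under `tag`
        into a single summary line, kept at the first entry's position."""
        replaced = False
        new_entries = []
        for t, text in self.entries:
            if t == tag:
                if not replaced:
                    new_entries.append((tag, summary))
                    replaced = True
                # subsequent entries with this tag are discarded
            else:
                new_entries.append((t, text))
        self.entries = new_entries

    def render(self):
        """Flatten back to the linear text the model actually sees."""
        return "\n".join(text for _, text in self.entries)
```

With this in place, a failed probe (say, three "file not found" attempts) could be redacted down to a single line such as "located config at assets/subdir1" instead of lingering verbatim.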
Leaving this in the context exposes Roo to the risk of LLM attention landing on things like "file not found" and "subdir1" without the assets/ prefix. As the context grows, it becomes increasingly likely that the model will use "subdir1" without "assets/", because that form was mentioned at least twice.
The majority of this text is non-constructive: if you ran the same preceding context with a different seed, chances are the model would do the right thing the first time around.
The "TBD" here is where we need to introduce something new: a markup for the model to track that it is starting a subtask/activity, a way for it to indicate that follow-on sections are retries/do-overs (while temporarily leaving the intermediate tokens in the context), and finally a way for it to conclude the activity and discard the sub-activity token overhead entirely.
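The begin/retry/conclude lifecycle above could be sketched as a stream of markers plus a compaction pass; everything here (marker names, the stream shape) is an assumption for illustration. BEGIN opens a scoped activity, RETRY marks a do-over inside it, and END carries the model's concluding summary, which is the only part that survives compaction.

```python
# Hypothetical activity markup: BEGIN opens a sub-activity, RETRY marks a
# do-over within it, END closes it with a summary. Compaction then drops
# everything inside the scope except that summary. Names are illustrative.

BEGIN, RETRY, END, TEXT = "begin", "retry", "end", "text"

def compact(stream):
    """Replace each completed activity with its END summary, discarding
    the intermediate (and retried) tokens entirely."""
    out, stack = [], []
    for kind, payload in stream:
        if kind == BEGIN:
            stack.append(payload)          # remember the activity id
        elif kind == END:
            stack.pop()
            if not stack:                  # only top-level summaries survive
                out.append((TEXT, payload))
        else:  # RETRY or TEXT
            if not stack:                  # outside any activity: keep as-is
                out.append((kind, payload))
            # inside an activity: intermediate tokens are discarded
    return out
```

Applied to the dictionary incident above, the two aborted reads and their partial contents would collapse into the single concluding note, so a retry can never balloon the context past its limit.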
At this point, Roo can eliminate a bunch of deadweight: