Commit a90b9c7

(chore) some AI slop leftover
1 parent 9476392 commit a90b9c7


_posts/2025-04-21-alignment-tightrope.md

Lines changed: 2 additions & 2 deletions
@@ -17,13 +17,13 @@ I've had my flight home from Istanbul with one layover. It offered me a time win
Silver-Sutton started straight: we need agents that learn mainly from the data they themselves create as they prod, poke and rewrite the world minute by minute. It instantly clicked with thoughts I’ve been wrestling with in one of my own drafts on evaluation benchmarks. I'd (rather poorly) sketched the idea in this tweet:
<blockquote class="twitter-tweet"><a href="https://twitter.com/futurisold/status/1912099520998965274">April 15, 2025</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

- If the distribution keeps shifting under the model’s feet, the old leaderboard mindset collapses. So I picture two kinds of tests: the familiar, **static**, frozen benchmarks we already obsess over, and a new, **dynamic**, volatile class of challenges the agent spawns for itself whenever it feels the floor wobble—spikes of entropy where it stalls, branches, backtracks, and later grades its own performance. I expect a surge in research on information-theoretic measures applied to this latter dynamic class. For instance, xjdr's [entropix](https://github.com/xjdr-alt/entropix) project is a great example of this. **Context-aware sampling** is a promising idea for measuring uncertainty at runtime.
+ Ergo, leaderboards don't mean much when the distribution won't sit still. As data drifts, they stop measuring progress and start rewarding hacks. So I picture two kinds of tests: the familiar, **static**, frozen benchmarks we already obsess over, and a new, **dynamic**, volatile class of challenges the agent spawns for itself whenever it feels the floor wobble—moments where it stalls, branches off, backtracks, and then figures out how well it did. I expect a surge in research on information-theoretic measures applied to this latter dynamic class. xjdr's [entropix](https://github.com/xjdr-alt/entropix) project is a great example. **Context-aware sampling** is a promising idea for measuring uncertainty at runtime.

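To make "measuring uncertainty at runtime" concrete, here is a minimal sketch of the kind of signal I have in mind (my own illustration, not entropix's actual code): compute the Shannon entropy of the next-token distribution at each decoding step and flag the spikes where the model starts to wobble.

```python
import numpy as np

def token_entropy(logits: np.ndarray) -> float:
    """Shannon entropy (in nats) of the next-token distribution for one step."""
    z = logits - logits.max()          # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def flag_uncertain_steps(step_logits, threshold: float = 3.0):
    """Return (flagged_step_indices, per_step_entropies).

    `step_logits` is one logit vector per generated token. The threshold is a
    placeholder; in practice you would calibrate it (and likely combine entropy
    with related signals such as varentropy) before deciding to branch,
    backtrack, or re-sample at the flagged steps.
    """
    entropies = [token_entropy(np.asarray(l)) for l in step_logits]
    flagged = [i for i, h in enumerate(entropies) if h > threshold]
    return flagged, entropies
```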
The recently improved memory feature in ChatGPT could enable what Silver-Sutton envision for agents, namely **guidance** based on long-term trends and the user's specific goals. They further noted that simple goals in a complex environment "*may often require a wide variety of skills to be mastered*". I suspect we'll see more and more agents tested against Minecraft-like environments in which skill acquisition is crucial. One of my closest friends used [symbolicai](https://github.com/ExtensityAI/symbolicai)'s [contracts](https://futurisold.github.io/2025-03-01-dbc/) together with a distilled version of DeepSeek's R1 to create off-the-shelf higher-order [expressions](https://arxiv.org/pdf/2402.00854) that expanded their agent's toolbox. It worked flawlessly.

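For a rough idea of what a contract-wrapped, higher-order expression can look like, here is a hypothetical sketch in plain Python (this is not symbolicai's actual contracts API; the names are made up for illustration): a factory that wraps an LLM-backed tool in pre- and post-condition checks before it gets registered in the agent's toolbox.

```python
from typing import Callable, Dict

def contracted(pre: Callable[[str], bool], post: Callable[[str], bool]):
    """Higher-order expression: wrap a tool with pre/post-condition checks.
    A single retry stands in for the richer 'remedy' step a real contract
    framework would perform."""
    def wrap(tool: Callable[[str], str]) -> Callable[[str], str]:
        def checked(query: str) -> str:
            if not pre(query):
                raise ValueError(f"precondition failed for input: {query!r}")
            answer = tool(query)
            if not post(answer):
                answer = tool(query)  # naive remedy: try once more
                if not post(answer):
                    raise ValueError("postcondition failed after retry")
            return answer
        return checked
    return wrap

TOOLBOX: Dict[str, Callable[[str], str]] = {}  # the agent's growing toolbox

@contracted(pre=lambda q: len(q.strip()) > 0, post=lambda a: a.strip() != "")
def summarize(query: str) -> str:
    # placeholder for a call into a distilled reasoning model
    return f"summary of: {query}"

TOOLBOX["summarize"] = summarize
```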
Since LLMs coupled with external memory act as a [universal computer](https://arxiv.org/pdf/2301.04589), they provide a rich medium for the internal computation of the agent to unfold. Moreover, the underlying transformer architecture [can implement a broad class of standard machine learning algorithms in context](https://arxiv.org/abs/2306.04637). Given that most reasoning LLMs are designed to imitate human reasoning in textual form, Silver-Sutton raised the natural question of whether this provides a good basis for the optimal instance of a universal computer. It might very well be that the answer is no; the authors of [Coconut](https://arxiv.org/abs/2412.06769) **certainly** agree.

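As a toy illustration of that medium (my own sketch of the stored-program idea in the linked paper; `call_model` is a hypothetical stand-in for a real LLM call): the model acts as a bounded controller, while an external key-value store provides the unbounded memory it reads and writes between steps.

```python
import json
from typing import Dict

def call_model(prompt: str) -> str:
    """Hypothetical LLM call, assumed to return a JSON action such as
    {"op": "write", "key": "scratch", "value": "...", "done": false}."""
    raise NotImplementedError

def run(task: str, max_steps: int = 100) -> Dict[str, str]:
    memory: Dict[str, str] = {"task": task}  # external, unbounded store
    for _ in range(max_steps):
        # the model only ever sees a bounded snapshot; long-lived state
        # persists outside of it, in `memory`
        action = json.loads(call_model(json.dumps(memory)))
        if action["op"] == "write":
            memory[action["key"]] = action["value"]
        if action.get("done"):
            break
    return memory
```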
- Personally, when it comes to reasoning, I believe that natural language, despite its ambiguities and inefficiencies, might still be an optimal substrate for an agent’s internal computation. Why? Because verbal reasoning is our civilization’s oldest compression scheme for thought. Across centuries, we've distilled complex chains of logic, intuition, and abstraction into shared textual formats. It's error-prone—of course it is; if it weren’t, we’d all be doing formal math by default—but it’s also the only medium that’s scaled collective reasoning across billions of minds and thousands of years. In that sense, natural language isn’t just a tool we use; it’s where we buried most of our epistemic heritage. Moreover, according to [Vann McGee](https://ocw.mit.edu/courses/24-242-logic-ii-spring-2004/20c09c32f5c237d1fb4207a83153dbb5_why_study_comptt.pdf), it's decidable:
+ Personally, when it comes to reasoning, I believe that natural language, despite its ambiguities and inefficiencies, might still be an optimal substrate for an agent’s internal computation. Why? Because verbal reasoning is our civilization’s oldest compression scheme for thought. Across centuries, we've distilled complex chains of logic, intuition, and abstraction into shared textual formats. It's error-prone—of course it is; if it weren’t, we’d all be doing formal math by default—but it’s also the only medium that’s scaled collective reasoning across billions of minds and thousands of years. In that sense, natural language is where we buried most of our epistemic heritage. Moreover, according to [Vann McGee](https://ocw.mit.edu/courses/24-242-logic-ii-spring-2004/20c09c32f5c237d1fb4207a83153dbb5_why_study_comptt.pdf), it's decidable:
> “Recursion theory is concerned with problems that can be solved by following a rule or a system of rules. Linguistic conventions, in particular, are rules for the use of a language, and so human language is the sort of rule-governed behavior to which recursion theory applies. Thus, if, as seems likely, an English-speaking child learns a system of rules that enable her to tell which strings of English words are English sentences, then the set of English sentences has to be a decidable set. This observation puts nontrivial constraints upon what the grammar of a natural language can look like. As Wittgenstein never tired of pointing out, when we learn the meaning of a word, we learn how to use the word. That is, we learn a rule that governs the word’s use.”

It's a good segue into scientific research. While certain processes can be virtualized and simulated, fast-forwarded to explore millions of configurations in seconds, reality doesn’t grant us that luxury. We're still bottlenecked by the tempo of the physical world. Experiments take time, materials have constraints, and feedback loops are often slow and noisy. I appreciated Silver-Sutton's almost hidden definition of reality, as if tucked between parentheses like an Easter egg: "*open-ended problems with a plurality of seemingly ill-defined rewards.*" That’s exactly what scientific exploration is. Even our best simulators operate under assumptions, and Wolfram's principle of [computational irreducibility](https://mathworld.wolfram.com/ComputationalIrreducibility.html) adds another layer of humility: for many systems, there is no shortcut; you just have to run the damn thing. This is why grounding agents in the real world matters. As they note, "*Without this grounding, an agent, no matter how sophisticated, will become an echo chamber of existing human knowledge.*" That line hit home. I once had the idea that future research infrastructure should integrate with lab equipment that exposes REST APIs. Silver-Sutton talk about a similar idea, though they phrase it as *digital interfaces*. The point was the same: self-managing experimental pipelines. The agent shouldn't only write code or generate hypotheses, but also trigger physical experiments, wait for real-world results, and loop them back into the reasoning chain. That's one of the core bets we're making at [ExtensityAI](https://www.extensity.ai/): that research is the most valuable currency of the future.

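Here is roughly what I had in mind, as a sketch (the endpoint, payloads, and the `agent` interface are all hypothetical, not any real instrument's API): the agent submits an experiment over REST, blocks on the tempo of the physical world, then folds the measurement back into its next hypothesis.

```python
import time
import requests

LAB_API = "https://lab.example.com/api"  # hypothetical instrument endpoint

def run_experiment(protocol: dict, poll_every: float = 60.0) -> dict:
    """Submit a protocol and block until the instrument reports results.
    The physical world sets the pace here; there is no fast-forward."""
    job = requests.post(f"{LAB_API}/experiments", json=protocol, timeout=30).json()
    while True:
        status = requests.get(f"{LAB_API}/experiments/{job['id']}", timeout=30).json()
        if status["state"] == "finished":
            return status["results"]
        time.sleep(poll_every)

def research_loop(agent, hypothesis: str, iterations: int = 3) -> str:
    """Hypothesize -> run a physical experiment -> observe -> revise."""
    for _ in range(iterations):
        protocol = agent.design_experiment(hypothesis)  # hypothetical agent methods
        results = run_experiment(protocol)
        hypothesis = agent.revise(hypothesis, results)
    return hypothesis
```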