I'd like to quickly summarize what is different about our approach and why it matters.
Our work was inspired by brilliant research done at MIT CSAIL on "Recursive Language Models" (RLMs). One of the controversies has been whether these models are just a formalization of what agents like Claude Code already do, or whether they bring genuinely new capabilities to the table.
By outperforming Claude on the major long-context benchmark, we provide a strong signal that something fundamentally new is happening. (In other words, it's not "just Claude Code" because it demonstrably outperforms Claude Code in the long-context regime.)
Where our contribution, LCM, differs from RLMs is how we handle recursion. RLMs use "symbolic recursion" -- i.e., they have the LLM write a script that recursively calls the LLM in order to manipulate the context, which is stored in a REPL. This provides maximum flexibility... but it often goes wrong, since the LLM may write imperfect scripts.
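To make that concrete, here is a minimal sketch of the kind of script an RLM might author at runtime (the `llm()` helper and the split/merge strategy are my illustrative stand-ins, not the RLM authors' actual code):

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to the underlying language model."""
    raise NotImplementedError

def recurse(context: str, query: str, max_chars: int = 8_000) -> str:
    # Base case: the context fits in a single call, so answer directly.
    if len(context) <= max_chars:
        return llm(f"Context:\n{context}\n\nQuestion: {query}")
    # Recursive case: split the context, recurse on each half,
    # then ask the model to merge the partial answers.
    mid = len(context) // 2
    left = recurse(context[:mid], query, max_chars)
    right = recurse(context[mid:], query, max_chars)
    return llm(f"Merge these partial answers to '{query}':\n- {left}\n- {right}")
```

The failure mode is exactly the one above: nothing stops a model-authored script from splitting mid-sentence, recursing in unproductive ways, or merging badly.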
LCM attempts to decompose the recursion from RLMs into deterministic primitives so that the control flow can be managed by an engine rather than left to the whims of the LLM. In practice, this means we replace bespoke scripts with two mechanisms: (1) a DAG-based context management system that works like paged virtual memory, but for conversations and files instead of memory pages; and (2) operator-level recursion, such as a "Map" operator for LLMs, which lets one tool call process thousands of tasks.
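As a rough sketch of those two primitives (the names `ContextNode`, `expand`, and `lcm_map` are simplified illustrations, not the exact API from the paper):

```python
from dataclasses import dataclass, field
from concurrent.futures import ThreadPoolExecutor

@dataclass
class ContextNode:
    """One node in the context DAG: a summary plus links to the
    full-resolution children it was compacted from (like a page-table
    entry pointing at evicted pages)."""
    summary: str
    children: list["ContextNode"] = field(default_factory=list)

def expand(node: ContextNode) -> list[str]:
    # Deterministic "page-in": swap a summary for its immediate
    # children's summaries. One DAG layer, no LLM in the loop.
    return [child.summary for child in node.children] or [node.summary]

def lcm_map(task: str, items: list[str], llm) -> list[str]:
    # Operator-level recursion: one tool call fans out over many
    # inputs; the engine, not a model-written script, owns the loop.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(lambda item: llm(f"{task}\n\n{item}"), items))
```

The point of the design is that the recursion structure is fixed by the engine; the LLM only fills in the per-item work.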
An analogy we draw in the paper is the evolution from goto statements (of Dijkstra's "Go To Statement Considered Harmful" fame) to structured programming. RLMs are maximally expressive, but all of that power comes with the risk of things going awry. We have built a more mechanistic system, which can provide stronger guarantees when deployed in production with today's models.
Happy to answer any questions! Thanks for taking a look at the paper!
> Because expansion can recover arbitrarily large volumes of earlier conversation, this tool is restricted to sub-agents spawned via the Task tool; the main agent cannot call it directly. This restriction prevents uncontrolled context growth in the primary interaction loop.
What if lcm_expand is called on a summary that covers thousands of messages, immediately flooding the sub-agent's own context window?
Does lcm_expand unroll only one "layer" of the DAG, with further layers unrolled by another sub-agent if needed?
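For concreteness, here is the distinction I am picturing (hypothetical names, not necessarily how the paper implements it):

```python
def expand_one_layer(node):
    """Unroll a single DAG layer: just the summary's direct children."""
    return node.children

def expand_fully(node):
    """Unroll everything reachable. This is the flooding risk: a summary
    covering thousands of messages lands in one sub-agent's context."""
    if not node.children:
        return [node]
    out = []
    for child in node.children:
        out.extend(expand_fully(child))
    return out
```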
It has two differences:
1. It does not store chat history, reasoning traces, etc., only workflow artifacts (requirements, codebase analysis, implementation plan, etc.). I frankly do not believe those things are relevant.
2. It is significantly simpler and more lightweight, using only markdown files (roughly as sketched below).
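Roughly this (the paths and names are just my own convention):

```python
from pathlib import Path

ARTIFACTS = Path(".agent")  # e.g. .agent/requirements.md

def save_artifact(name: str, content: str) -> None:
    # Each workflow artifact is a plain markdown file; no chat history,
    # no reasoning traces, no DAG.
    ARTIFACTS.mkdir(exist_ok=True)
    (ARTIFACTS / f"{name}.md").write_text(content)

def load_artifact(name: str) -> str:
    return (ARTIFACTS / f"{name}.md").read_text()

# Usage: save_artifact("implementation_plan", plan_text)
```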
Much of this feels like a technical report of what they did, and makes me feel like we've reached the ICO whitepaper phase. I have very similar features in my custom coding agent; they seem pretty common sense to have. Are you really throwing away the compacted history? Saving it doesn't seem like a feature; not saving it seems like a gap. Same for making it available via tools/search: pretty standard stuff. Then too, the ADK framework I use handles parallel agents/tools.