<aside> 🚀
When we took a closer look at the architecture of Claude Code, we experienced a profound sense of serendipity: its core design philosophies are remarkably similar to our Complementary RL framework. From asynchronous memory extraction (extractMemory) to cross-session refinement (autoDream) and on-demand retrieval, the parallels are striking. In this post, we use Claude Code's memory system as a lens to explore the step-by-step development and the unwritten chain-of-thought behind Complementary RL.
</aside>

<aside> 💡
TL;DR
A closer look at Claude Code reveals a striking alignment with our Complementary RL framework: true agent memory must be subtractive, actively evolving, and ultimately internalized. Here are the core findings from our journey to build this architecture:
- Through extractMemory and autoDream, we found that raw context is just noise. To be useful, memory must be actively distilled into generalizable rules.
</aside>

**Phase 1: Extracting and Evolving (The Rollout Process)**

As LLMs take on increasingly complex roles, transitioning from reactive chatbots to autonomous agents, our expectations of them scale accordingly. We are moving beyond simple, context-level memorization; we now expect agents to construct their own persistent memory systems. They need to retain critical insights, avoid repeating past mistakes, and evolve.
**The Instinct to Add:** To fulfill this demand for active, complex reasoning, the early design phase of agent memory was almost entirely additive. We tried to force agents to remember everything by engineering highly complex structures: intricate knowledge graphs, massive tabular databases, and dense vector stores paired with convoluted hybrid retrieval mechanisms.
**The Subtractive Shift:** However, as underlying models have grown more capable, a clear counter-trend has emerged in cutting-edge tools like Claude Code. The prevailing design principle is becoming subtractive. Instead of over-engineering the storage, these systems strip memory down to a raw, hierarchical file system. We simply provide the model with an index of what it knows, equip it with the right tools, and trust the agent to read the files only when it deems them necessary.
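As a rough illustration, a file-system memory in this spirit needs little more than a directory of Markdown files, a lightweight index, and an on-demand read tool. The layout and function names below are our own sketch, not Claude Code's actual implementation:

```python
from pathlib import Path

MEMORY_ROOT = Path("memory")  # hypothetical layout: one Markdown file per topic

def memory_index() -> str:
    """Build the lightweight index injected into the system prompt:
    file names plus first-line summaries, never the full contents."""
    entries = []
    for path in sorted(MEMORY_ROOT.glob("*.md")):
        lines = path.read_text().splitlines()
        summary = lines[0] if lines else "(empty)"
        entries.append(f"- {path.name}: {summary}")
    return "\n".join(entries)

def read_memory(name: str) -> str:
    """Tool the agent calls on demand, only when it decides a file matters."""
    path = MEMORY_ROOT / name
    return path.read_text() if path.is_file() else f"(no memory file named {name!r})"
```

The point is what is absent: no embeddings, no schema, no retrieval pipeline. The index keeps the context window clean, and the read tool makes retrieval a deliberate decision by the agent rather than an automatic injection.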
<aside> 💡
This shift is undeniably driven by improved LLM capabilities, like advanced tool-calling. But it is also driven by a harder truth: unfiltered context pollution degrades reasoning. The ideal memory design pattern is not a rigid database schema, but an evolving, model-driven workspace.
</aside>
With this paradigm shift in mind, let's examine how Claude Code puts it into practice.
Instead of relying on sophisticated databases, vector stores, or complex data structures, Claude Code manages its entire memory system using raw Markdown files. Embracing this radically subtractive design, Claude operates on a streamlined, four-tiered memory lifecycle:
- **CLAUDE.md**: Explicit, user-defined rules, conventions, and architectural decisions, loaded directly into the system prompt at the start of every session.
- **autoDream**: Operating on a periodic time window or session interval, autoDream actively reads existing memories, merges redundancies, and prunes outdated information. This ensures the memory bank continuously evolves rather than simply bloating over time.
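A consolidation pass of this kind can be sketched in a few lines. Here `summarize` stands in for an LLM call that merges duplicates and drops stale rules; the function names and the file-per-topic layout are our assumptions, not autoDream's actual internals:

```python
from pathlib import Path

def auto_dream(memory_root: Path, summarize) -> None:
    """Periodic consolidation pass (an autoDream-style sketch):
    read each memory file, ask a model to merge redundancies and prune
    outdated entries, then write the distilled version back.
    `summarize` is a stand-in for an LLM call (an assumption here)."""
    for path in sorted(memory_root.glob("*.md")):
        distilled = summarize(path.read_text())
        if distilled.strip():
            path.write_text(distilled)   # memory evolves in place
        else:
            path.unlink()                # nothing generalizable survived: prune it
```

Deleting a file outright when nothing generalizable survives is the subtractive principle taken to its end: the memory bank shrinks as readily as it grows.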