Reference · Concept

The Karpathy memory system, explained

Persistent, file-based memory that your AI reads at the start of every session, explained in plain language, plus how to build one yourself.

The one-line version

A Karpathy-style memory system is a small set of plain text or markdown files that your AI reads at the start of every session and updates as it works. The chat window forgets; the files don't. That's the whole idea.

On the name. This pattern is named after Andrej Karpathy's framing of how to think about LLMs; it is not an official project of his. The mental model: the model is the processor, the context window is RAM, and anything you want to survive past a single session has to live in files, outside the chat. The system below is an independent take on that idea, built for daily use.

Why the context window is the bottleneck

Every session with an AI starts from near-zero. You re-explain your project, your preferences, the mistake you corrected yesterday. The context window is finite and resets, so the same context gets re-loaded by hand, forever. People try to fix this by chasing a bigger or smarter model. But a bigger engine doesn't help if the car has no trunk.

The fix is to stop treating memory as something the model provides and start treating it as something you own. You write down what the AI should never have to be told twice, you keep it in files, and you make reading those files the first step of every session.
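As a rough illustration of "reading those files first", here is a minimal Python sketch. It assumes a hypothetical `memory/` folder of markdown files and a hypothetical helper named `build_preamble`; neither name comes from the system described here. It only shows the general shape: gather the files in a fixed order and turn them into one block of text that can be placed at the top of a session.

```python
from pathlib import Path


def build_preamble(memory_dir: str) -> str:
    """Concatenate every markdown file in memory_dir into one text block.

    Hypothetical helper for illustration only. Files are read in sorted
    order so the same folder always produces the same preamble.
    """
    sections = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        if body:  # skip empty files so they add no noise to the context
            # Label each section with its file name so the reader
            # (human or model) can tell where a note came from.
            sections.append(f"## {path.name}\n{body}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # "memory" is an assumed folder name; point this at your own notes.
    print(build_preamble("memory"))
```

How the resulting text reaches the model depends on the tool you use: some assistants read a designated file automatically, others need the text pasted or passed in as the first message.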

What the system actually contains

The part that makes it stick: enforcement

Notes rot. A memory system that depends on you remembering to update it will drift out of date and become a liability. So the system assumes rot and fights it mechanically: a hook that flags changed files for re-documentation, a scheduled pass that hunts stale pages and dead links, and a rule that work isn't "done" until the docs reflect it. None of that needs intelligence. It needs plumbing.
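To make the "plumbing" point concrete, here is a small Python sketch of what a scheduled pass for stale pages and dead links might look like. It is an assumption-laden illustration, not the system's actual tooling: the function names, the `memory/` folder, the 30-day threshold, and the choice to check only relative markdown links are all hypothetical. It does not cover the hook that flags changed files.

```python
from __future__ import annotations

import re
import time
from pathlib import Path

# Matches markdown links like [text](target) and captures the target,
# excluding any "#anchor" suffix. Pure in-page anchors are not matched.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)[^)]*\)")


def find_dead_links(memory_dir: str) -> list[tuple[str, str]]:
    """Return (page, target) pairs for relative links whose target is missing."""
    root = Path(memory_dir)
    dead = []
    for page in sorted(root.rglob("*.md")):
        text = page.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if "://" in target or target.startswith("mailto:"):
                continue  # external links are out of scope for this sketch
            if not (page.parent / target).exists():
                dead.append((page.relative_to(root).as_posix(), target))
    return dead


def find_stale_pages(
    memory_dir: str, max_age_days: float, now: float | None = None
) -> list[str]:
    """Return pages whose last modification is older than max_age_days."""
    root = Path(memory_dir)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return [
        page.relative_to(root).as_posix()
        for page in sorted(root.rglob("*.md"))
        if page.stat().st_mtime < cutoff
    ]


if __name__ == "__main__":
    # Assumed folder name and threshold; adjust both to your own setup.
    for page, target in find_dead_links("memory"):
        print(f"dead link in {page}: {target}")
    for page in find_stale_pages("memory", max_age_days=30):
        print(f"stale page: {page}")
```

A script like this could be run from any scheduler you already have. File modification time is a crude proxy for staleness, so treat its output as a list of pages to review rather than a verdict.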

Why file-based and model-agnostic

Because the tools change every week and the files don't. When a new model ships, you point it at the same memory and it picks up where the last one left off, with your context instead of a blank chat. The model is the engine, and the engine swaps. The memory is the car. Plain files also mean you can read, edit, version, and own the whole thing without a vendor in the loop.

Frequently asked

Is this an official Karpathy project?
No. It's a community pattern named after his framing of context as the scarce resource. The implementation here is independent.

How is it different from RAG or vector memory?
RAG retrieves chunks from an embedded store by similarity at query time. This is smaller, human-readable, and deterministic: a curated index and a few high-value pages, files you can open and edit by hand. They can coexist; the memory layer is the durable, owned core.

Where does CLAUDE.md fit?
CLAUDE.md is the front door: the file the assistant reads first, which points to the rest of the memory and tells the model how to bootstrap itself each session.

Do I need to be technical?
To start, no. One folder and one index file change the experience in a day. The enforcement layer is where it helps to have someone wire the plumbing.

Build one this afternoon

The full system is documented and free. The essay explains the why; the build guide hands your AI the how. Forkable in an afternoon.
