For about two years there was a reflex answer to a hard question. Where does an AI agent keep what it knows? In a vector database. Embed everything the agent sees, store the vectors, and when it needs to remember, embed the query and pull back whatever sits nearest in the high-dimensional dark. "Agent memory" and "a vector store" became almost synonymous — to the point that buying one and calling the problem solved was the standard move.
In 2026 that reflex is visibly breaking, and the most interesting agents are reaching for something far older: files.
The reversal, stated out loud
You rarely get to watch an architectural assumption get named and rejected in the same sentence, but Volcengine's OpenViking does exactly that. Its design "abandons the fragmented vector storage model of traditional RAG" and "innovatively adopts a file system paradigm" to organize what an agent needs — memories, resources, and skills — as a browsable tree you walk with ls and find under a viking:// path. Context management becomes "traceable file operations rather than vague semantic matching." Memory is something you can open and read, not a ranking you have to trust.
The same conclusion arrived at Manus by the harder road of production. Its team treats the file system as the agent's context directly: "unlimited in size, persistent by nature, and directly operable by the agent itself." The agent writes a todo.md and keeps rewriting it to recite its own plan back into recent attention; it offloads a bulky web page to a file and keeps only the path, so the content is restorably compressed — gone from the window, one read away from returning. None of that is a vector lookup. It's a working directory.
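The two moves described above, offloading bulky content to a file while keeping only its path, and rewriting a todo.md so the plan is recited back into recent context, can be sketched in a few lines. This is a minimal illustration, not Manus's implementation: the helper names (offload, restore, rewrite_todo), the temporary working directory, and the checkbox format are all assumptions made for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical working directory; a temp dir keeps the sketch self-contained.
workdir = Path(tempfile.mkdtemp(prefix="agent_workdir_"))


def offload(name: str, content: str) -> str:
    """Write bulky content to a file and return only its path."""
    path = workdir / name
    path.write_text(content, encoding="utf-8")
    return str(path)


def restore(path: str) -> str:
    """Read offloaded content back when the agent needs it again."""
    return Path(path).read_text(encoding="utf-8")


def rewrite_todo(done: list[str], pending: list[str]) -> str:
    """Overwrite todo.md with the current plan and return the text to recite."""
    lines = [f"- [x] {item}" for item in done] + [f"- [ ] {item}" for item in pending]
    text = "\n".join(lines) + "\n"
    (workdir / "todo.md").write_text(text, encoding="utf-8")
    return text


# A stand-in for a large fetched web page.
page = "<html>" + "lorem ipsum " * 5000 + "</html>"
page_ref = offload("page_001.html", page)

# The context keeps a short reference instead of the page body.
context = [f"Fetched page saved at {page_ref}"]

# Rewriting the plan puts the current goals at the end of the context.
plan = rewrite_todo(done=["fetch page"], pending=["extract pricing table", "write summary"])
context.append(plan)

print(len(page), "characters offloaded;", sum(len(c) for c in context), "kept in context")
```

The compression is restorable because restore(page_ref) returns the page byte for byte; nothing is summarized away, it is only moved out of the window.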
And Anthropic shipped the pattern as a primitive. The memory tool is a directory of files the model reads and writes with ordinary file commands, living outside the context window. The company's own measurements on an internal agentic-search evaluation put performance up 29% over baseline from merely clearing stale tool results, and up 39% once a file-backed memory is added, because the durable facts survive in a file while the transient exhaust gets evicted.
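To make "a directory of files the model reads and writes with ordinary file commands" concrete, here is a rough sketch of what the application side of such a tool might look like. It is not Anthropic's API: the handle function, the three command names, and the MEMORY_ROOT location are hypothetical, chosen only to show the shape of the idea, including the check that keeps model-supplied paths inside the memory directory.

```python
import tempfile
from pathlib import Path

# Hypothetical memory root; a real deployment would use a persistent directory.
MEMORY_ROOT = Path(tempfile.mkdtemp(prefix="agent_memory_")).resolve()


def _resolve(relative: str) -> Path:
    """Map a model-supplied path into the memory root, rejecting escapes."""
    target = (MEMORY_ROOT / relative).resolve()
    if target != MEMORY_ROOT and MEMORY_ROOT not in target.parents:
        raise ValueError(f"path escapes memory root: {relative}")
    return target


def handle(command: str, path: str, text: str = "") -> str:
    """Dispatch one file command. The command names are illustrative only."""
    target = _resolve(path)
    if command == "view":
        if target.is_dir():
            return "\n".join(sorted(p.name for p in target.iterdir()))
        return target.read_text(encoding="utf-8")
    if command == "create":
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        return f"wrote {path}"
    if command == "delete":
        target.unlink()
        return f"deleted {path}"
    raise ValueError(f"unknown command: {command}")


# The model records a durable fact, then later lists and reads it back.
handle("create", "facts/user.md", "Prefers metric units.\n")
print(handle("view", "facts"))
print(handle("view", "facts/user.md"))
```

Because the store is just a directory, anything evicted from the context window can be recovered with a single view call, and a human can inspect the same files with ordinary tools.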
Why files win the working-memory job
This isn't nostalgia for the shell. It's that an agent's working memory needs three properties, and a filesystem provides all three while a nearest-neighbors index provides none.
Exact addressing. A path returns one specific file, deterministically. A vector search returns the top-k things that look similar to your query — useful when you don't know the exact item, useless when you do. "Hand me the plan I wrote two steps ago" is a lookup by name, not by resemblance.
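A toy comparison shows the difference between lookup by name and lookup by resemblance. The bag-of-words embed function below is a deliberately crude stand-in for a real embedding model, and the note paths and contents are invented for the example.

```python
import math


def embed(text: str) -> dict[str, int]:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    counts: dict[str, int] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical notes the agent wrote at successive steps.
notes = {
    "plans/step_1.md": "crawl the pricing pages",
    "plans/step_2.md": "crawl the pricing pages and extract tables",
    "plans/step_3.md": "summarize the extracted tables",
}


def top_k(query: str, k: int = 2) -> list[str]:
    """Return the k note paths most similar to the query."""
    q = embed(query)
    ranked = sorted(notes, key=lambda p: cosine(q, embed(notes[p])), reverse=True)
    return ranked[:k]


# Exact addressing: one path, one deterministic answer.
exact = notes["plans/step_2.md"]

# Similarity search: a ranked list of candidates that resemble the query.
similar = top_k("crawl pricing pages")

print(exact)
print(similar)
```

In this toy setup the similarity search ranks step_1 above step_2, because the shorter note resembles the query more closely. The agent that wanted step_2 specifically gets it only as the runner-up, whereas the path lookup cannot return anything else.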
Mutation in place. When a fact changes, you overwrite the file or delete it. In a vector store you re-embed and upsert, and until you do, the stale vector keeps surfacing as a plausible-looking match. State that changes constantly — which is what working memory is — fights the embedding model the whole way.
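The stale-vector problem is easy to reproduce with a toy store. ToyVectorStore below is a hypothetical in-memory class that ranks by Jaccard word overlap rather than real embeddings, and the deploy-target fact is invented for the example; the point is only the update semantics.

```python
import tempfile
from pathlib import Path

# File-backed fact: overwriting leaves exactly one current value.
fact_file = Path(tempfile.mkdtemp(prefix="agent_state_")) / "deploy_target.txt"
fact_file.write_text("deploy target is staging\n", encoding="utf-8")
fact_file.write_text("deploy target is production\n", encoding="utf-8")
file_view = fact_file.read_text(encoding="utf-8")


def embed(text: str) -> set[str]:
    """Toy 'embedding': the set of lowercase words."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity, standing in for vector distance."""
    return len(a & b) / len(a | b) if a | b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store keyed by row id."""

    def __init__(self) -> None:
        self.rows: dict[str, tuple[set[str], str]] = {}

    def upsert(self, row_id: str, text: str) -> None:
        self.rows[row_id] = (embed(text), text)

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.rows.values(), key=lambda r: jaccard(q, r[0]), reverse=True)
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
store.upsert("note-1", "deploy target is staging")
# The correction is added as a new note; nobody retracts note-1.
store.upsert("note-2", "we switched the deploy target to production yesterday")
stale_hit = store.search("what is the deploy target")[0]

# Only an explicit re-embed and upsert under the same id fixes the result.
store.upsert("note-1", "deploy target is production")
fresh_hit = store.search("what is the deploy target")[0]

print(file_view.strip(), "|", stale_hit, "|", fresh_hit)
```

Until the second upsert runs, the outdated note is the best match for the question, which is the "plausible-looking match" failure described above. The file never had a window in which both values coexisted.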
Preserved order. Directories, filenames, and line order keep sequence. Similarity ranking flattens it: a semantic search over your history can tell you what looks like the thing you're asking about, but not what happened, in what order — which is why agents that lean on a vector store for episodic memory keep losing the thread and repeating yesterday's mistake.
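A short sketch of the ordering point: an append-only log file replays events in the order they happened, while a similarity ranking over the same events reorders them by resemblance to the query. The event strings, the episode.log name, and the word-overlap scoring are assumptions for illustration.

```python
import tempfile
from pathlib import Path

log = Path(tempfile.mkdtemp(prefix="agent_log_")) / "episode.log"

# Hypothetical episode: a failed migration, a fix, and a successful retry.
events = [
    "ran migration on staging",
    "migration failed on missing column",
    "added missing column",
    "ran migration on staging",
    "migration succeeded",
]

# Append each event with a zero-padded step number; line order is the timeline.
with log.open("a", encoding="utf-8") as fh:
    for step, event in enumerate(events, start=1):
        fh.write(f"{step:04d}\t{event}\n")

# Reading the file back recovers what happened, in the order it happened.
replayed = [line.split("\t", 1)[1] for line in log.read_text(encoding="utf-8").splitlines()]


def overlap(query: str, text: str) -> float:
    """Toy similarity: Jaccard overlap of words."""
    q, t = set(query.split()), set(text.split())
    return len(q & t) / len(q | t)


# Ranking by similarity answers "what looks like this?", not "what came next?".
ranked = sorted(events, key=lambda e: overlap("migration failed", e), reverse=True)

print(replayed)
print(ranked)
```

The ranked list puts the failure first and the success second, with the fix that connected them pushed to the bottom. The log keeps the fix between the two, which is the information an agent needs to avoid repeating the mistake.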
The vector database didn't fail. It got demoted to the one job it was always right for — and freed from three it was never built to do.
What the vector database is still for
The correction here is not "vector databases were a mistake." It's that they were being asked to carry two jobs that only sound like one. The job they're genuinely good at is fuzzy semantic recall over a large external corpus you didn't write — documentation, a support archive, a code index — where you don't know the exact item and approximate match is the entire point. That job has a name, and it isn't memory: it's retrieval-augmented generation. The distinction that the file-system shift makes legible is that RAG reads from a corpus someone else curated while memory is a store the agent writes to itself — and the moment your "memory" is really a write-heavy, mutable, ordered scratchpad, you've wandered out of the vector store's competence.
Pick the wrong tool and the failure modes diverge in nasty ways. A vector store used as working memory can't reliably return the exact note the agent just made, can't cleanly retract a fact that's now false, and quietly poisons itself when the agent's own mistaken output gets embedded and recalled later as trusted truth. A filesystem used for open-domain semantic search over millions of documents is just grep with extra steps. Each is a perfectly good answer to the other question.
The hybrids are the proof, not the exception
The systems built to last in this space are not picking a side; they're splitting the work. cognee builds a knowledge graph and vectors over ingested data through an extract-cognify-load pipeline. Zep's Graphiti models facts as a temporal graph so it can answer "what changed, and when." Letta runs the agent inside a runtime that owns its own files and context. In every case the structured, addressable, mutable part of memory lives in a graph or a filesystem, and embeddings are reserved for the part that is genuinely about semantic similarity. That's the same lesson the mem0/Zep/Letta frameworks already encode as a choice about where memory lives, now made explicit: don't make one index do both jobs.
So the practical rule for 2026 is smaller and sharper than "use a vector database for memory." Ask what the data actually is. If it's the agent's own working state — the plan, the scratch notes, the artifacts it's editing as it goes — give it a filesystem and page it in just-in-time. If it's fuzzy recall over a big external corpus you didn't author, that's where the vector database earns its keep. The reason 2026's agents write to files isn't that someone reinvented the directory. It's that the field finally noticed it had been filing its working state under the wrong cabinet.
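For readers who prefer the rule as a checklist, it can be written down as a small decision function. This is one possible encoding, not a standard: the DataProfile fields, the choose_store name, and the "hybrid" fallback for mixed cases are all assumptions layered on top of the argument above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical description of a piece of data the agent must store."""
    authored_by_agent: bool   # did the agent write it as part of its own work?
    changes_often: bool       # is it rewritten or retracted as the task proceeds?
    needs_exact_lookup: bool  # will it be fetched by name rather than by resemblance?


def choose_store(profile: DataProfile) -> str:
    """Return 'filesystem', 'vector_store', or 'hybrid' for the given profile."""
    # The agent's own working state: plans, scratch notes, artifacts in progress.
    if profile.authored_by_agent and (profile.changes_often or profile.needs_exact_lookup):
        return "filesystem"
    # Fuzzy recall over an external corpus nobody needs to address by name.
    if not profile.authored_by_agent and not profile.needs_exact_lookup:
        return "vector_store"
    # Anything else mixes both jobs, so split the work instead of forcing one index.
    return "hybrid"


print(choose_store(DataProfile(authored_by_agent=True, changes_often=True, needs_exact_lookup=True)))
print(choose_store(DataProfile(authored_by_agent=False, changes_often=False, needs_exact_lookup=False)))
```

The interesting branch is the last one: data that is external but must be addressed exactly, or agent-authored but static, is where the hybrid systems described earlier tend to live.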



