For three years the reflex was automatic: to do retrieval-augmented generation, you chunk the document, embed the chunks, drop the vectors in a database, and at query time pull the nearest neighbors by cosine similarity. Nearly every serious RAG stack, whether the vectors live in pgvector, Pinecone, or Qdrant, is a variation on that spine. PageIndex, an MIT-licensed project from VectifyAI now sitting near 33.7k GitHub stars, does none of it. There are no embeddings anywhere in its retrieval path.

Instead, it reads a document and builds a tree: a machine-optimized table of contents where every node carries a title, a short summary, and a page range. When a question comes in, an LLM reads the tree and reasons about which nodes probably contain the answer — the way an analyst flips straight to "Item 7A. Quantitative and Qualitative Disclosures About Market Risk" instead of scanning the whole 10-K. The content of the chosen nodes is what feeds the final answer.
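As a rough sketch of that structure (not PageIndex's actual schema or API), the tree can be modeled as nested nodes, each holding a title, a summary, and a page range. The names `Node`, `render_outline`, and `find_nodes` are illustrative assumptions, the page numbers are invented, and the LLM selection step is replaced here by a hard-coded set of titles.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in the tree index: a section title, summary, and page range."""
    title: str
    summary: str
    start_page: int
    end_page: int
    children: list["Node"] = field(default_factory=list)


def render_outline(node: Node, depth: int = 0) -> str:
    """Flatten the tree into indented text that could be placed in an LLM prompt."""
    indent = "  " * depth
    line = f"{indent}- {node.title} (pp. {node.start_page}-{node.end_page}): {node.summary}"
    return "\n".join([line] + [render_outline(c, depth + 1) for c in node.children])


def find_nodes(node: Node, titles: set[str]) -> list[Node]:
    """Collect the nodes whose titles the (hypothetical) LLM step selected."""
    hits = [node] if node.title in titles else []
    for child in node.children:
        hits.extend(find_nodes(child, titles))
    return hits


# Hypothetical 10-K fragment; titles below the root and all page numbers are made up.
ITEM_7A = "Item 7A. Quantitative and Qualitative Disclosures About Market Risk"
report = Node("Annual Report", "Full 10-K filing", 1, 131, [
    Node("Item 7. MD&A", "Management discussion of results", 40, 62),
    Node(ITEM_7A, "Interest-rate and currency exposure", 63, 66),
])

outline = render_outline(report)

# In a real system an LLM would read `outline` plus the question and return titles.
# Here the selection is hard-coded to stand in for that reasoning step.
chosen = find_nodes(report, {ITEM_7A})
pages = [(n.start_page, n.end_page) for n in chosen]
```

Only the pages in `pages` would then be read to produce the final answer, which is the point of the design: the model sees the outline first and the content second.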

The number that gets the attention

On FinanceBench, a benchmark of question-answering over long financial documents, VectifyAI's Mafin 2.5 pipeline built on PageIndex reports 98.7% accuracy. Traditional vector RAG on the same task lands around 50% (MarkTechPost). The project's one-line thesis is similarity ≠ relevance: an embedding finds passages that look like your query, but the passage that actually answers a financial question ("what was the change in operating margin, and why?") is often written in language that shares almost no surface tokens with the question.

That gap is real. It is also domain-shaped, and reading it as a universal scoreboard is the mistake. FinanceBench consists of long, deeply structured single documents (10-Ks, filings, prospectuses): exactly the material where a table of contents is meaningful and where 512-token chunking shatters one argument into a dozen disconnected chunks. Show PageIndex a corpus with real hierarchy and it wins convincingly. That's the finding. It is not the whole story.

Cost didn't disappear: it moved

Here's the part the leaderboard hides. Vector RAG and PageIndex don't differ mainly on accuracy; they differ on where the cost lives.

Vector RAG is index-time expensive, query-time cheap. You pay once to embed the corpus, and from then on every query is a sub-millisecond approximate-nearest-neighbor lookup costing fractions of a cent. The pain is keeping that index fresh.

PageIndex inverts it. Building the tree is comparatively light, but there is an LLM reasoning pass on every single query, so its bill scales with query volume and document depth, not corpus size. And "comparatively light" is not free: structuring one 131-page report with ~137 nodes takes ~137 LLM calls, roughly one per node, and a 50-document set runs to ~7,000 calls before it answers anything (Towards Data Science).
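The indexing arithmetic is easy to check. The sketch below assumes one LLM call per node and that every document in the set resembles the 131-page report; both are simplifications, and `indexing_calls` is a hypothetical helper, not part of PageIndex.

```python
def indexing_calls(nodes_per_doc: int, num_docs: int, calls_per_node: int = 1) -> int:
    """Estimate LLM calls needed to structure a corpus before any query is served."""
    return nodes_per_doc * num_docs * calls_per_node


# Figures from the text: ~137 nodes for one 131-page report.
single_report = indexing_calls(nodes_per_doc=137, num_docs=1)

# Assumption: all 50 documents have about the same node count as that report.
fifty_reports = indexing_calls(nodes_per_doc=137, num_docs=50)

print(single_report, fifty_reports)
```

Under those assumptions the 50-document total comes to 6,850 calls, which is where the rounded ~7,000 figure comes from.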

You're not choosing between vectors and reasoning. You're choosing whether to pay once at index time and fractions of a cent per query, or to pay comparatively little up front and then buy an LLM's attention for every question you ask.
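That trade can be written as a two-line cost model: a fixed index cost plus a per-query cost. The sketch below finds the query volume at which the two profiles cost the same. Every dollar figure in it is a placeholder chosen for illustration, not a measured price for any vector database or for PageIndex, and `CostProfile` and `break_even_queries` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostProfile:
    index_cost: float      # one-time cost to build the index, in dollars
    cost_per_query: float  # marginal cost of each query, in dollars

    def total(self, queries: int) -> float:
        """Total spend after serving `queries` queries."""
        return self.index_cost + self.cost_per_query * queries


def break_even_queries(a: CostProfile, b: CostProfile) -> Optional[float]:
    """Query count at which both profiles cost the same, or None if they never cross."""
    slope = a.cost_per_query - b.cost_per_query
    if slope == 0:
        return None
    q = (b.index_cost - a.index_cost) / slope
    return q if q > 0 else None


# Placeholder numbers only: swap in your own measured costs.
vector_rag = CostProfile(index_cost=50.0, cost_per_query=0.0001)
tree_rag = CostProfile(index_cost=5.0, cost_per_query=0.05)

crossover = break_even_queries(vector_rag, tree_rag)
low_volume = (vector_rag.total(100), tree_rag.total(100))
high_volume = (vector_rag.total(10_000), tree_rag.total(10_000))
```

With these placeholder inputs the per-query approach is cheaper at low volume and the pay-once approach is cheaper at high volume. If your measured index cost for the tree is higher than for the vectors, the function returns `None` and the vectors are cheaper at every volume, which is worth knowing before you commit.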

The scaling story flips

That single fact reorganizes the whole decision. PageIndex is made for a bounded, high-value corpus queried thoughtfully: one annual report you'll interrogate fifty ways, one contract a lawyer needs answered precisely, a single regulatory filing. Low query volume, enormous value per answer, deep structure — the economics and the accuracy both line up.

Push it toward consumer scale and it breaks in a way vectors don't. You cannot hold a million-document tree in a context window, and you cannot afford a reasoning traversal on every request to a high-QPS search box. This is precisely where ANN over a vector index is unbeatable: it was engineered for sub-millisecond lookups over millions of vectors at almost no marginal cost. For a big, flat, heterogeneous corpus, vectors still win on speed and cost, full stop. (It's the same "which shape is your problem" logic that separates GraphRAG from plain vector RAG and long-context from retrieval: the winner is set by the workload, not the leaderboard.)

Where this actually lands: hybrid

The practitioners who've run both aren't picking a side. They're stacking them: use vector ANN as the coarse filter to cut ten thousand documents down to a hundred, then let PageIndex reason over that shortlist for the final, precise hop. The vectors do the scaling; the reasoning does the relevance. VectifyAI's own follow-on work ("Proxy-Pointer RAG") is explicitly chasing vectorless accuracy at vector-RAG scale and cost — which is a tacit admission that neither pure approach is the destination.
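The two-stage shape can be sketched in a few lines. This is a minimal illustration under stated assumptions, not anyone's production pipeline: exact cosine ranking stands in for a real ANN index, the two-dimensional vectors and document ids are invented, and `stub_reasoner` is a hypothetical placeholder for the LLM pass that would reason over each shortlisted document's tree.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def coarse_filter(query_vec: Vector, doc_vecs: dict[str, Vector], k: int) -> list[str]:
    """Stage 1: exact cosine top-k, standing in for an ANN lookup."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]


def hybrid_retrieve(
    question: str,
    query_vec: Vector,
    doc_vecs: dict[str, Vector],
    k: int,
    reason_over: Callable[[str, list[str]], list[str]],
) -> list[str]:
    """Stage 1 narrows the corpus cheaply; stage 2 reasons only over the shortlist."""
    shortlist = coarse_filter(query_vec, doc_vecs, k)
    return reason_over(question, shortlist)


def stub_reasoner(question: str, shortlist: list[str]) -> list[str]:
    """Placeholder for the LLM step: keeps documents whose id mentions the year asked about."""
    return [doc_id for doc_id in shortlist if "2022" in doc_id]


# Invented toy data: three documents embedded in two dimensions.
doc_vecs = {
    "10k_2023": [0.9, 0.1],
    "10k_2022": [0.8, 0.3],
    "press_release": [0.1, 0.9],
}
query_vec = [1.0, 0.0]

shortlist = coarse_filter(query_vec, doc_vecs, k=2)
answer_docs = hybrid_retrieve(
    "What drove the change in operating margin in 2022?",
    query_vec, doc_vecs, k=2, reason_over=stub_reasoner,
)
```

The design choice worth noticing is that the expensive step only ever sees `k` documents, so its cost is bounded by the shortlist size rather than by the corpus.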

So the right question isn't "are vectors obsolete?" They aren't. It's the one the two architectures actually put in front of you: for this corpus, at this query volume, do you want your cost at index time or query time? Answer that honestly and PageIndex stops looking like a vector-killer and starts looking like what it is — the sharpest tool on the bench for deep, bounded, structured documents, and the wrong tool for a web-scale search box. The 98.7 is not a verdict on embeddings. It's a map of where reasoning-based retrieval is worth paying for.