For a couple of years the open-source vector database question had a tidy answer shaped like a staircase. Prototyping on your laptop? Chroma. Mid-size production app that wants hybrid search and modules? Weaviate. Hundreds of millions or billions of vectors? Milvus. You located yourself on the scale axis and read off the name. It was a good heuristic, and it has quietly stopped being true.
What broke it is that every one of these projects grew toward the others. The staircase flattened into a plain.
The convergence that ate the old answer
Chroma, the "just pip install it" one, rewrote its core in Rust and stood up Chroma Cloud, a serverless offering that spins up a database in under thirty seconds. So the prototyping tool now scales out. Milvus, the billion-vector behemoth, ships Milvus Lite, an embedded Python mode you install with pip install pymilvus[milvus-lite], so the heavyweight now runs on your laptop. Weaviate sits in the middle and could always go either way, single binary or cluster. The result: all three span laptop to cluster. "Which one scales" no longer separates them, because they all do.
The feature moats are eroding on the same schedule. Hybrid search — keyword BM25 fused with vector similarity — was Weaviate's signature differentiator. Milvus 2.5 made it native. Chroma Cloud offers it. The thing you used to switch databases for is becoming a checkbox all of them tick.
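To make "fused" concrete, here is a minimal sketch of reciprocal rank fusion, one common way to merge a keyword ranking with a vector ranking. It is an illustration only, not the fusion method of any particular database: the function name, the document ids, and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from a BM25 keyword query, one from a vector query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the whole point of hybrid search: neither the keyword signal nor the vector signal has to win alone.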
So if the spec sheets are converging, what's actually left to decide on?
Default posture: where the tool assumes you are
Chroma's default posture is a single process on your machine. You import it, it persists to a folder, and you are doing retrieval in five minutes with no services to operate. The Rust rewrite means that simplicity no longer costs you raw speed the way the old Python core did. Everything about its design assumes you'd rather start coding than start provisioning — and reaches for the cloud only when one process genuinely isn't enough.
Milvus's default posture is the opposite: a distributed system you operate. Milvus Lite exists, but the project's center of gravity is the cluster: separated compute and storage, and a menu of index types that includes the disk-based DiskANN and the GPU-accelerated CAGRA. It is the architecture you reach for when "billions" is a literal word and not a figure of speech. It's a graduated project of the LF AI & Data Foundation, with the maturity that implies. The price of that ceiling is the floor: run it distributed and you are also running etcd for metadata, object storage like MinIO, and a message queue like Pulsar or Kafka. The benchmark gap between these databases is smaller than the Day-2 operations gap, and Milvus is where that operations gap is widest.
Weaviate's posture is a production service from the start. Not embedded-in-your-process, not a distributed system you babysit — a server you run, with the RAG-pipeline batteries (vectorizers, generative modules, rerankers) already included. It assumes you're building an application that will go to production and want the surrounding machinery in the database rather than glued on. If you want one component to own embedding, search, and reranking together, that's the bet Weaviate makes for you.
How to actually choose
Stop asking which one scales; they all do. Ask where you are right now and what you're willing to operate. If you want to be retrieving vectors before lunch with zero infrastructure, Chroma's single-process default is the path of least resistance. If you genuinely have, or will soon have, billions of vectors and a team to run a distributed system, Milvus has the highest ceiling and the steepest ops bill. If you want neither extreme, just a production-grade service with the RAG pipeline built in, Weaviate's modular middle is built for exactly that.
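The questions above can be sketched as a small decision helper. This is a toy encoding of the article's heuristic, not a sizing tool: the Situation fields, the function name, and the literal one-billion threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    vectors: int                  # vectors you have now or will have soon
    has_ops_team: bool            # someone can run a distributed system
    wants_builtin_pipeline: bool  # embedding, search, reranking in one component


def pick_default(s: Situation) -> str:
    """Return the database whose default posture matches where you are today."""
    # "Billions" is taken literally; the cutoff is an illustrative assumption.
    if s.vectors >= 1_000_000_000 and s.has_ops_team:
        return "Milvus"
    # A production service with the RAG pipeline built in.
    if s.wants_builtin_pipeline:
        return "Weaviate"
    # Otherwise start with a single process and graduate later.
    return "Chroma"


print(pick_default(Situation(vectors=50_000, has_ops_team=False, wants_builtin_pipeline=False)))
```

Note that scale alone never selects Milvus here: without a team to operate the cluster, the helper falls through to one of the lighter postures, which mirrors the advice to choose for where you are rather than where you hope to be.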
As the feature lists converge, the decision quietly migrates from "what can it do" to "what architecture am I signing up to run." Pick the default posture that matches where you actually are, not the one that matches where you hope to be — you can graduate later, and all three now let you. For the managed-and-Postgres end of this same decision, see pgvector vs Pinecone vs Qdrant; for what you put into these databases, the best embedding models for RAG agents.



