You can tell embedded vector stores from the rest of the field by what they lack: a server. No process to deploy, no port to expose, no replica to page you at 3am. The vectors live in a file or a library that loads straight into your agent's address space. That is the whole pitch, and it is a good one. The trap is assuming the three serious options (sqlite-vec, DuckDB's vss extension, and LanceDB) are interchangeable because they share that pitch. They are not, and the thing that separates them is the least glamorous property in the stack: what happens to the index when the data changes.
The tier nobody benchmarks honestly
Most comparisons of embedded stores reach for recall@10 and queries-per-second, then declare a winner by a few milliseconds. That number is real and almost never the thing that bites you. An agent's corpus is rarely static. It ingests documents, re-embeds them when the model changes, deletes stale memories, appends conversation turns. The benchmark measures a frozen dataset; production measures churn. So the honest axis is not "how fast does it search" but "how gracefully does it absorb writes." On that axis the three diverge sharply.
If you have already ruled out the client-server vector databases (the pgvector vs Pinecone vs Qdrant camp) because you do not want to run infrastructure, this is the tier you land in. Choosing well here is mostly about being honest about your data's mutability.
sqlite-vec: exact, honest, and linear
sqlite-vec is the one that tells you the truth on the tin. It is a SQLite extension in C that adds a vec0 virtual table and does exact, brute-force KNN: every query scans every stored vector. There is no approximate index. The ANN work, IVF and DiskANN and friends, has lived as tracking issue #25 since 2024 and, as of mid-2026, has not shipped in a release. You will see -ivf and -diskann source files in the tree; treat them as construction, not a foundation.
This sounds like a limitation and it is, but it buys you something rare: perfect recall and zero index to corrupt. Writes are just SQL inserts. There is no graph to rebuild, no checkpoint that re-serializes a structure, no flag named "experimental." For a few hundred thousand vectors riding inside a SQLite file you already ship, it is the most predictable choice on the board. Past a million or so high-dimensional rows, the linear scan stops being free and you feel it on every query.
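To make "exact, brute-force KNN" concrete, here is a minimal sketch of the same idea using only Python's standard sqlite3 module. It is not sqlite-vec's API: the memories table and the pack, unpack, and knn helpers are hypothetical names for illustration, and the distance loop runs in Python where sqlite-vec does its scan in C behind a vec0 table. What carries over is the shape: writes are plain INSERTs, and every query visits every row.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats as a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack(): BLOB back to a tuple of floats."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def knn(db, query, k):
    """Exact KNN: L2 distance to every stored row, then keep the k nearest."""
    scored = [
        (math.dist(query, unpack(blob)), rowid)
        for rowid, blob in db.execute("SELECT rowid, embedding FROM memories")
    ]
    return [rowid for _, rowid in sorted(scored)[:k]]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (embedding BLOB NOT NULL)")

# A write is one INSERT; there is no index structure to update or rebuild.
for vec in ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]):
    db.execute("INSERT INTO memories (embedding) VALUES (?)", (pack(vec),))

nearest = knn(db, [1.0, 0.0], k=2)  # [1, 3]

# A delete is just as cheap, and the very next query reflects it.
db.execute("DELETE FROM memories WHERE rowid = ?", (nearest[0],))
after_delete = knn(db, [1.0, 0.0], k=2)  # [3, 2]
```

Query cost in this sketch grows linearly with the row count, which is the trade described above: nothing to rebuild after a write, but nothing to skip at read time either.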
sqlite-vec does not have an approximate index, and that is a feature until the moment it is a wall.
DuckDB vss: HNSW with an asterisk
DuckDB is the 39k-star analytical engine, and its vss extension bolts on a real HNSW index, which means real approximate search at scale. The asterisk is persistence. By default an HNSW index can only be built on in-memory tables. To put it on a disk-backed database you must SET hnsw_enable_experimental_persistence = true, and the name is not decoration: WAL recovery is not implemented for custom indexes, so an unclean shutdown with uncommitted changes can corrupt the index. Worse for mutable workloads, there are no incremental updates to the persistent index; every checkpoint re-serializes the entire structure to disk.
That shape is fine, even excellent, for what DuckDB is for: load a batch, build the index, run analytical similarity queries, throw it away. It is uncomfortable for an agent that mutates its memory all day. If your retrieval is part of a larger analytical pipeline, leaning on the engine you already have is a sound move; the vss extension is the pragmatic choice precisely because it rides inside DuckDB.
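A rough sketch of that batch shape from Python, assuming the third-party duckdb package is installed and the vss extension can be downloaded. The memories table, the memories_hnsw index name, and the build_batch_index helper are illustrative, not part of DuckDB. The SET flag is the one named above; the explicit CHECKPOINT before closing is a cautious habit here, not a documented requirement.

```python
# Statements to run, in order, before creating an HNSW index on a
# disk-backed database. The flag must be set before CREATE INDEX.
HNSW_PREPARE = [
    "INSTALL vss",
    "LOAD vss",
    "SET hnsw_enable_experimental_persistence = true",
]


def build_batch_index(path, rows, dim):
    """Load one batch, build the index once, then shut down cleanly.

    rows is a list of (id, vector) pairs, each vector a list of dim floats.
    """
    import duckdb  # third-party; imported lazily so this sketch loads without it

    con = duckdb.connect(path)
    try:
        for statement in HNSW_PREPARE:
            con.execute(statement)
        con.execute(f"CREATE TABLE memories (id INTEGER, embedding FLOAT[{dim}])")
        # Insert the whole batch first, so the index is built once over it.
        con.executemany(
            f"INSERT INTO memories VALUES (?, CAST(? AS FLOAT[{dim}]))", rows
        )
        con.execute("CREATE INDEX memories_hnsw ON memories USING HNSW (embedding)")
        # Flush to disk now rather than leaving changes pending at shutdown.
        con.execute("CHECKPOINT")
    finally:
        con.close()
```

A call such as build_batch_index("batch.duckdb", [(1, [1.0, 0.0, 0.0])], dim=3) would produce a file you can reopen and query with ORDER BY array_distance(embedding, ...) LIMIT k. The sketch deliberately has no update path: under the constraints above, the comfortable pattern is to rebuild per batch, not to mutate in place.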
LanceDB: built for the data that moves
LanceDB is the only one of the three that was designed from scratch as an embedded vector engine rather than borrowed from a database. The Rust core sits on the Lance columnar format, which gives it the properties the other two lack in combination: MVCC versioning with time travel, ACID transactions, and zero-copy column evolution. You can add a new embedding column without rewriting existing rows, roll back to a prior version, and query datasets larger than RAM via memory-mapped on-disk storage. Its ANN indexes (IVF-PQ and HNSW variants) coexist with that mutable, versioned substrate instead of fighting it.
The cost is that it is a separate dependency with its own format, not a feature riding inside SQLite or DuckDB. You are adopting an engine, not extending one you already trust. For a static or rarely-changing corpus that is overkill. For an agent whose memory is constantly appended, re-embedded, and pruned, it is the only one of the three that does not make you choose between approximate scale and safe mutation.
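The versioning idea is easier to see in miniature. The toy below is not LanceDB's API or the Lance file format; VersionedTable, Version, and their methods are hypothetical names, and everything lives in memory. It only illustrates the general pattern behind time travel: each write commits a new immutable snapshot that shares existing fragments with earlier ones, so old versions stay readable and nothing is rewritten.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One immutable snapshot: shared row fragments plus a set of deleted ids."""
    fragments: tuple = ()
    deleted: frozenset = frozenset()


class VersionedTable:
    def __init__(self):
        self.versions = [Version()]  # version 0 is the empty table

    def _commit(self, version):
        self.versions.append(version)
        return len(self.versions) - 1  # the new version number

    def append(self, rows):
        """Add rows as a new fragment; earlier fragments are reused, not copied."""
        head = self.versions[-1]
        return self._commit(Version(head.fragments + (tuple(rows),), head.deleted))

    def delete(self, ids):
        """Record deletions in a new version; stored fragments are untouched."""
        head = self.versions[-1]
        return self._commit(Version(head.fragments, head.deleted | frozenset(ids)))

    def rows(self, version=-1):
        """Read any committed version; the default is the latest."""
        snap = self.versions[version]
        return [
            row for frag in snap.fragments for row in frag
            if row[0] not in snap.deleted
        ]


table = VersionedTable()
v1 = table.append([(1, [1.0, 0.0]), (2, [0.0, 1.0])])
v2 = table.delete([1])
v3 = table.append([(3, [0.5, 0.5])])

latest_ids = [row_id for row_id, _ in table.rows()]   # [2, 3]
v1_ids = [row_id for row_id, _ in table.rows(v1)]     # [1, 2]
```

Reading v1 after two later writes still returns the original rows, which is the behavior an agent wants when a bad re-embedding run needs to be rolled back.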
How to actually choose
Skip the QPS shootout. Ask what your data does. Small, exact, already-in-SQLite, modest scale: sqlite-vec. Batch-loaded, analytical, throwaway index, you live in DuckDB anyway: vss. Mutable, versioned, larger-than-RAM, churning all day: LanceDB. The index model and its behavior under writes are the decision; recall and latency are downstream of it. If you are still upstream of this choice entirely, start with the index algorithms and our wider take on the best vector database before you commit to a format you will be married to.
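The rule of thumb above can be written down as a small function. This is a sketch under stated assumptions, not a benchmark-derived policy: the Workload fields and choose_store are hypothetical names, the 500,000-row cutoff is an arbitrary stand-in for "a few hundred thousand vectors", and the final fallback to LanceDB for large corpora outside DuckDB is my reading of the trade-offs, not something the comparison proves.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    vectors: int             # expected number of stored vectors
    mutates_all_day: bool    # appended, re-embedded, and pruned continuously
    batch_analytics: bool    # load a batch, index it, query it, discard it
    host: str = ""           # engine already in the stack: "sqlite", "duckdb", or ""


def choose_store(w, exact_scan_limit=500_000):
    """Pick a store from write behavior first, scale second."""
    # Throwaway index inside an existing DuckDB pipeline: use what is there.
    if w.batch_analytics and w.host == "duckdb" and not w.mutates_all_day:
        return "DuckDB vss"
    # Small enough for an exact scan: writes are plain inserts, recall is perfect.
    if w.vectors <= exact_scan_limit:
        return "sqlite-vec"
    # Large corpus that needs approximate search alongside safe mutation.
    return "LanceDB"


small = choose_store(Workload(200_000, False, False, "sqlite"))    # "sqlite-vec"
batch = choose_store(Workload(5_000_000, False, True, "duckdb"))   # "DuckDB vss"
churn = choose_store(Workload(5_000_000, True, False))             # "LanceDB"
```

Note that neither recall nor latency appears as an input; the function only asks what the data does, which is the point of the section.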



