Docling vs Unstructured vs LlamaParse: Parsing Documents for RAG in 2026

The choice you think you're making

You have a folder of PDFs — invoices, 10-Ks, a few decade-old scanned research papers — and you want them inside a RAG pipeline or an agent that won't hallucinate the numbers. So you go looking, and the internet hands you the same bracket it's been handing people since 2024: Docling vs Unstructured vs LlamaParse. Open vs open vs hosted. Pick a corner.

That bracket is stale. Not wrong, exactly — those tools are all real, all maintained, all in production somewhere right now. But the question they were built to answer has quietly changed underneath them, and most of the comparison posts haven't noticed.

Here's the lay of the land as it actually stands in June 2026:

docling-project/docling (Python, ★ 61.9k): IBM's gen-AI document converter, an LF AI & Data project
Unstructured-IO/unstructured (Python, ★ 15k): Broad-format open-source ETL for LLM ingestion
datalab-to/marker (Python, ★ 36.3k): Fast, accurate PDF-to-markdown + JSON
opendatalab/MinerU (Python, ★ 68.2k): PDFs and Office docs to LLM-ready markdown/JSON
run-llama/llama_cloud_services (TypeScript, ★ 4.3k): LlamaParse + LlamaExtract SDKs for LlamaCloud

Note who's at the top of the star chart. It isn't IBM's polished pipeline or the hosted API everyone benchmarks against. It's MinerU, an open-source project out of Shanghai AI Lab, at 68.2k. And note the footnote nobody quotes: the llama_cloud_services package is mid-migration, with LlamaParse moving to slimmer dedicated SDKs. The hosted incumbent is reorganizing while the open repos pull away.

The distinction that died

For two years the real axis of this argument was how a parser reads a page. Unstructured and classic Docling are pipelines: layout detection, then OCR, then table-structure recognition, then reading-order reassembly — a chain of specialist models, each one a place where the page can fall apart. LlamaParse went the other way: hand the whole page to a large multimodal model and let it write the markdown. Slower, pricier per page, but it didn't drop the thread on a gnarly nested table.

That was a genuine tradeoff. Pick speed and self-hosting, or pick a frontier model's judgment.

It's gone. Look at OmniDocBench, the CVPR 2025 benchmark that's become the closest thing this field has to a scoreboard. On its latest, stricter revision, the top of the leaderboard isn't a GPT-class model or Gemini 3 Pro doing whole-page vision. It's MinerU2.5-Pro, a 1.2-billion-parameter model (arXiv 2509.22186) scoring around 95.7 overall. Right behind it sits a cluster of other sub-2B specialists, GLM-OCR and PaddleOCR-VL among them. Trailing all of them are the massive frontier VLMs (Gemini 3 Pro, Qwen3-VL-235B), the very models the small ones were supposed to be budget imitations of. The tiny open checkpoints are beating the giants outright.

The new winners are vision models small enough to run on a single consumer GPU — which means the OCR-versus-layout question and the open-versus-hosted question collapsed into the same answer at the same time.

The way MinerU2.5 does it is the tell: it decodes a downsampled image for global layout, then crops and re-reads regions at high resolution. That's the OCR step and the layout step fused into one model with two passes. The pipeline-vs-LLM distinction didn't get won by either side. It got dissolved.
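To make the two-pass idea concrete, here is a minimal sketch of the control flow, not MinerU's actual code. The names detect_layout and read_crop are hypothetical stand-ins for the two model calls, and the Region type is invented for illustration. The only real work shown is the coordinate mapping from the thumbnail back to the full-resolution page, which is what lets the second pass crop at native detail.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Size = Tuple[int, int]            # (width, height) in pixels


@dataclass
class Region:
    """Hypothetical layout result: a block type and its box on the thumbnail."""
    kind: str   # e.g. "text", "table", "formula"
    box: Box


def scale_box(box: Box, thumb_size: Size, full_size: Size) -> Box:
    """Map a box from thumbnail coordinates to full-resolution coordinates."""
    sx = full_size[0] / thumb_size[0]
    sy = full_size[1] / thumb_size[1]
    left, top, right, bottom = box
    return (round(left * sx), round(top * sy), round(right * sx), round(bottom * sy))


def two_pass_parse(
    full_size: Size,
    thumb_size: Size,
    detect_layout: Callable[[Size], List[Region]],   # assumed: pass 1, coarse layout
    read_crop: Callable[[str, Box], str],            # assumed: pass 2, high-res read
) -> List[str]:
    """Pass 1 finds regions on the thumbnail; pass 2 re-reads each at full size."""
    outputs = []
    for region in detect_layout(thumb_size):
        crop_box = scale_box(region.box, thumb_size, full_size)
        outputs.append(read_crop(region.kind, crop_box))
    return outputs


# Stub "models" so the sketch runs without any weights.
def fake_layout(thumb_size: Size) -> List[Region]:
    return [Region("text", (10, 20, 110, 60)), Region("table", (10, 80, 240, 300))]


def fake_reader(kind: str, box: Box) -> str:
    return f"{kind}@{box}"


blocks = two_pass_parse((2000, 2800), (500, 700), fake_layout, fake_reader)
```

The design point is that the expensive high-resolution read only ever sees small crops, which is one plausible reason a model this size can afford it.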

What you should actually optimize for

If parsers are converging on quality, what separates them? Not plain text. Every tool here nails plain prose; benchmark that and you'll measure noise.

The variance is entirely in tables and reading order — which is exactly why OmniDocBench scores them as separate tracks (table structure via TEDS, reading order via edit distance) instead of one blended number. A multi-column page where the parser stitches the wrong columns together — or a financial table that arrives in your vector store as a flat run of digits — is the failure that quietly poisons RAG retrieval months later. Your chunks look fine. Your answers are subtly, confidently wrong. Speed-first tools like Marker are excellent on clean prose but give up ground on dense tables to MinerU's heavier pipeline; that gap is the decision.
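If you want to check reading order on your own documents, a rough version of the edit-distance idea fits in a few lines. This is a simplified sketch, not OmniDocBench's scoring code: it assumes you have labeled each block with an ID and have a reference ordering, and it computes a plain Levenshtein distance over those IDs, normalized by the longer sequence. The function names are made up for this example.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of block IDs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def reading_order_error(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Normalized edit distance: 0.0 is a perfect order, 1.0 is fully wrong."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 0.0
    return edit_distance(predicted, reference) / longest


# A two-column page: the reference reads the left column, then the right.
reference = ["L1", "L2", "L3", "R1", "R2", "R3"]
# A parser that stitches across columns, row by row.
stitched = ["L1", "R1", "L2", "R2", "L3", "R3"]

error = reading_order_error(stitched, reference)
```

Every block in the stitched output is present and correctly transcribed, so a text-only accuracy check would pass it. Only an order-aware score like this one shows the damage.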

So the honest selection guide, stripped of the corner-picking:

Many formats, compliance-bound, self-hosted ETL — Unstructured is still the workhorse. It eats EML, PPTX, HTML, the long tail. Just don't expect it to win the table benchmark.
Clean gen-AI document model with real ecosystem glue — Docling, with native LangChain/LlamaIndex/Haystack hooks and an IBM-sized maintenance commitment.
Hard PDFs, want zero infra, fine paying per page — LlamaParse, eyes open about the SDK shuffle and the per-page model bill.
Tables and reading order are the whole game, and you have a GPU — MinerU2.5 or a sub-2B VLM parser. This is the lane that got better this year, and it's the open, local one.

The part nobody puts in the comparison table

The quiet story of 2026 is that the best document parsing stopped being a service you buy and became a checkpoint you download. The hosted option isn't worse — LlamaParse on a frontier model is genuinely excellent on a messy scan — it's that the reason to reach for hosted parsing (you can't run anything this good yourself) evaporated. You can. It fits on one card.

The pragmatic move is boring and correct: route the easy 80% through whatever pipeline you already run, and send the documents that actually carry tables and multi-column layouts to a VLM parser — increasingly a local one. That triage is the architecture now. Not which logo you picked.
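A triage step like that can start out very small. The sketch below is one possible shape, with every name invented for illustration: PageStats is assumed to come from some cheap pre-pass you already have (a layout detector or PDF metadata), and the thresholds are placeholders to tune on your own corpus, not recommended values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PageStats:
    """Hypothetical per-page signals from a cheap pre-pass."""
    table_count: int
    column_count: int


def needs_vlm(
    pages: Iterable[PageStats],
    table_threshold: int = 1,    # assumed: any table sends the document to the VLM
    column_threshold: int = 2,   # assumed: two or more columns counts as multi-column
) -> bool:
    """True if any page carries tables or a multi-column layout."""
    return any(
        p.table_count >= table_threshold or p.column_count >= column_threshold
        for p in pages
    )


def route(pages: List[PageStats]) -> str:
    """Pick a parser lane; the lane names are labels, not real services."""
    return "vlm_parser" if needs_vlm(pages) else "default_pipeline"


memo = [PageStats(table_count=0, column_count=1)]
filing = [PageStats(table_count=0, column_count=1), PageStats(table_count=3, column_count=2)]

lanes = {"memo": route(memo), "filing": route(filing)}
```

Routing per document rather than per page keeps the output of each file consistent, at the cost of sending some easy pages through the heavier parser.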

The bracket was never really Docling vs Unstructured vs LlamaParse. It was "can I keep this on my own hardware and still get the tables right?" For the first time, the answer is yes, and it costs a download, not a contract.

Dex Mareno
