There are two honest ways to answer the question "what should my agent do next," and the frameworks have split along exactly that line. One camp says: tell me up front. Draw the graph, name every node, label every edge, and I'll walk it and save my place as I go. The other camp says: don't draw anything. Write steps that react to events, and the path will fall out of which events fire. Both work. They fail differently, and they make different things hard.
This is the real choice behind "LlamaIndex Workflows vs LangGraph," and almost every comparison buries it under feature checklists.
LangGraph: the graph is the program
LangGraph (langchain-ai/langgraph, ~36k stars, tagline "Build resilient agents") is a state machine you draw on purpose. You define a typed state object, register nodes that read and write it, and wire edges — including conditional ones — between them. The control flow is a literal graph you can look at. Loops are edges back to an earlier node. Branches are conditional edges. Nothing about where execution goes is implicit; it's in the wiring.
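To make "it's in the wiring" concrete, here is a library-free sketch of the explicit-graph idea. It is not LangGraph's API: `NODES`, `EDGES`, `END`, and `run` are names invented for this illustration, and the write/review loop is a made-up example.

```python
from typing import Callable, Dict, List, Tuple, TypedDict


class State(TypedDict):
    draft: str
    revisions: int


END = "__end__"  # hypothetical sentinel for this sketch


def write(state: State) -> State:
    # Node: extend the draft and count the revision.
    return {"draft": state["draft"] + "x", "revisions": state["revisions"] + 1}


def review(state: State) -> State:
    # Node: a no-op reviewer; the routing decision lives on the edge below.
    return state


def after_review(state: State) -> str:
    # Conditional edge: loop back to "write" until three revisions exist.
    return "write" if state["revisions"] < 3 else END


# The whole control flow is readable from these two tables.
NODES: Dict[str, Callable[[State], State]] = {"write": write, "review": review}
EDGES: Dict[str, Callable[[State], str]] = {
    "write": lambda s: "review",  # plain edge
    "review": after_review,       # conditional edge, which is also the loop
}


def run(entry: str, state: State) -> Tuple[State, List[str]]:
    # Walk the graph from the entry node, recording the path taken.
    path: List[str] = []
    node = entry
    while node != END:
        path.append(node)
        state = NODES[node](state)
        node = EDGES[node](state)
    return state, path


final_state, path = run("write", {"draft": "", "revisions": 0})
```

Everything about where execution can go is in `EDGES`. Changing the loop condition means editing one routing function, not hunting through node bodies.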
The payoff arrived in earnest with 1.0, which reached general availability on October 22, 2025. The headline feature is durability. A checkpointer persists state at every super-step, so if the server restarts mid-run or a long job is interrupted, the graph picks up where it left off instead of starting over. That same machinery powers interrupt(): human-in-the-loop isn't a callback, it's a checkpointed pause. The draft sits in saved state, not a blocked thread, and survives a deploy until someone resumes it with a command.
That is the whole pitch, and it's a good one. If your hardest requirement is "recover from failure" or "gate this on a human three days from now," you want the framework that treats the saved state as the foundation rather than an add-on.
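The save-after-every-step idea is easy to sketch without any framework. The toy below is an assumption-laden stand-in, not LangGraph's checkpointer: `MemoryCheckpointer`, `STEPS`, the `thread_id` key, and the simulated crash are all invented for illustration.

```python
import json
from typing import Dict, List, Optional

STEPS: List[str] = ["fetch", "summarize", "publish"]


def apply_step(name: str, state: dict) -> dict:
    # Each step only records that it ran; real nodes would do actual work.
    return {**state, "done": state["done"] + [name]}


class MemoryCheckpointer:
    """Keeps the latest JSON snapshot per thread id (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def save(self, thread_id: str, next_index: int, state: dict) -> None:
        self._store[thread_id] = json.dumps({"next": next_index, "state": state})

    def load(self, thread_id: str) -> Optional[dict]:
        raw = self._store.get(thread_id)
        return json.loads(raw) if raw is not None else None


def run(thread_id: str, saver: MemoryCheckpointer,
        crash_before: Optional[str] = None) -> dict:
    # Start from the last snapshot if one exists, otherwise from scratch.
    snapshot = saver.load(thread_id) or {"next": 0, "state": {"done": []}}
    index, state = snapshot["next"], snapshot["state"]
    while index < len(STEPS):
        if STEPS[index] == crash_before:
            raise RuntimeError("simulated restart")
        state = apply_step(STEPS[index], state)
        index += 1
        saver.save(thread_id, index, state)  # checkpoint after every step
    return state


saver = MemoryCheckpointer()
try:
    run("job-1", saver, crash_before="publish")  # dies after two steps
except RuntimeError:
    pass
resumed = run("job-1", saver)  # picks up at "publish", not at "fetch"
```

The second call never repeats `fetch` or `summarize`, because the snapshot already records that the next step is index 2. A real deployment would swap the in-memory dict for a database so the snapshot outlives the process.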
The question was never "can it loop or branch" — both do. It's whether you want the control flow drawn for you to inspect, or derived from events you can't see all at once.
One caveat worth being honest about: checkpoints are save points, not magic. On resume, LangGraph re-runs the interrupted node from the top, including any LLM or API calls that sit before the interrupt, rather than continuing from the next line. Durable, yes; free of replay semantics you have to design around, no.
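A small simulation shows why the replay matters. This is a toy model of the behavior described above, not LangGraph code: `Interrupt`, `interrupt`, `fake_llm`, and `approval_node` are hypothetical stand-ins, and the pause is modeled with a plain exception.

```python
from typing import Dict, List, Optional


class Interrupt(Exception):
    """Raised to pause a node (hypothetical stand-in for this sketch)."""

    def __init__(self, question: str) -> None:
        super().__init__(question)
        self.question = question


llm_calls: List[str] = []  # records every simulated model call


def fake_llm(prompt: str) -> str:
    llm_calls.append(prompt)
    return f"draft for {prompt}"


def interrupt(question: str, resume_value: Optional[str]) -> str:
    # First pass: no answer yet, so pause. On resume: hand back the answer.
    if resume_value is None:
        raise Interrupt(question)
    return resume_value


def approval_node(state: Dict[str, str],
                  resume_value: Optional[str] = None) -> Dict[str, str]:
    draft = fake_llm(state["topic"])  # sits before the pause, so it runs again
    verdict = interrupt(f"Approve? {draft}", resume_value)
    return {**state, "draft": draft, "verdict": verdict}


state = {"topic": "release notes"}
try:
    approval_node(state)  # first run pauses at the interrupt
except Interrupt:
    pass
result = approval_node(state, resume_value="approved")  # resume re-enters at the top
```

After the resume, `llm_calls` holds two entries for one approval. The usual way to design around this is to put expensive or non-idempotent calls in their own node ahead of the one that pauses, so the replay only repeats cheap work.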
LlamaIndex Workflows: the graph is implicit
LlamaIndex Workflows takes the opposite stance. You write @step-decorated async functions, each typed to the events it consumes and the events it emits. There's no graph to draw. A step fires when its input event lands on the queue, does its work, and emits the next event. Control flow emerges from that wiring of producers and consumers. A loop is just a step that emits an event some earlier step consumes; a branch is a step emitting one of several event types.
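The event-driven shape can be sketched in plain Python. This is a toy dispatcher, not the Workflows API: `on_event`, `HANDLERS`, `run`, and the three event classes are invented here, and handlers are registered by an explicit argument for simplicity.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List, Tuple


@dataclass
class Start:
    attempts: int = 0


@dataclass
class Drafted:
    attempts: int


@dataclass
class Stop:
    result: str


HANDLERS: Dict[type, Callable] = {}


def on_event(event_type: type) -> Callable:
    # Hypothetical decorator: register a step as the consumer of one event type.
    def register(fn: Callable) -> Callable:
        HANDLERS[event_type] = fn
        return fn
    return register


@on_event(Start)
def draft(ev: Start) -> Drafted:
    return Drafted(attempts=ev.attempts + 1)


@on_event(Drafted)
def judge(ev: Drafted):
    # Branch: emit Stop when good enough, otherwise loop by emitting Start.
    if ev.attempts >= 2:
        return Stop(result=f"accepted after {ev.attempts} attempts")
    return Start(attempts=ev.attempts)


def run(first: object) -> Tuple[str, List[str]]:
    # Pop events off the queue and hand each to whichever step consumes it.
    queue: Deque[object] = deque([first])
    trace: List[str] = []
    while queue:
        ev = queue.popleft()
        trace.append(type(ev).__name__)
        if isinstance(ev, Stop):
            return ev.result, trace
        queue.append(HANDLERS[type(ev)](ev))
    raise RuntimeError("queue drained without a Stop event")


result, trace = run(Start())
```

No edge list exists anywhere in this file. The loop is only visible in `trace`, which reads Start, Drafted, Start, Drafted, Stop, or by noticing that `judge` can emit the event `draft` consumes.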
The ergonomic win is real. For genuinely branchy, looping, multi-agent logic, you write less ceremony: no central graph definition to keep in sync, no edge list to maintain. You add behavior by adding a step that listens for an event. Shared state lives in a Context object that's passed between steps and can be serialized and resumed. There are also typed start/stop events, per-step retry policies, streaming of intermediate events, and optional human-in-the-loop events that let you pause a run and resume it later.
The structurally important news is that Workflows became its own standalone package — pip install llama-index-workflows (current release 2.22.1 as of June 2026), with its own repository and release cadence, decoupled from the run-llama/llama_index monorepo (~50k stars, now billed as a "document agent and OCR platform"). You can orchestrate with it without dragging in the retrieval stack.
The cost of the event-driven model is the mirror image of LangGraph's benefit. Because the topology is implicit, you lose the at-a-glance map — to know what runs after what, you trace events, not edges. And while Context is serializable and steps can pause, the first-class, checkpoint-everything-by-default durability story is the one LangGraph leads with. Workflows gives you the pieces; you assemble more of the recovery yourself.
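The map is implicit, but it is not unrecoverable. As a rough sketch of what "tracing events" means, the snippet below rebuilds an edge list from type annotations. It is not part of Workflows: the step functions, the `ev` parameter-name convention, and `derive_edges` are all assumptions made for this illustration.

```python
from typing import Callable, Dict, List, Tuple, Union, get_args, get_origin, get_type_hints


class Start: pass
class Drafted: pass
class Stop: pass


# Hypothetical steps: only the annotations matter for this sketch.
def draft(ev: Start) -> Drafted: ...
def judge(ev: Drafted) -> Union[Start, Stop]: ...


def emitted_types(annotation: object) -> List[type]:
    # A Union return type means the step can branch to several events.
    if get_origin(annotation) is Union:
        return list(get_args(annotation))
    return [annotation]


def derive_edges(steps: List[Callable]) -> List[Tuple[str, str, str]]:
    # Map each event type to the step that consumes it.
    # Assumes every step names its event parameter "ev".
    consumers: Dict[type, str] = {
        get_type_hints(fn)["ev"]: fn.__name__ for fn in steps
    }
    edges: List[Tuple[str, str, str]] = []
    for fn in steps:
        for emitted in emitted_types(get_type_hints(fn)["return"]):
            # An event nobody consumes ends the run in this toy model.
            target = consumers.get(emitted, "<end>")
            edges.append((fn.__name__, emitted.__name__, target))
    return edges


edges = derive_edges([draft, judge])
```

Each tuple reads as (producer, event, consumer), so the loop from `judge` back to `draft` shows up as an ordinary edge. The point of the exercise is that this map has to be derived; in the explicit-graph model it is the thing you wrote down in the first place.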
So which one
| Dimension | LlamaIndex Workflows | LangGraph |
|---|---|---|
| Core abstraction | Steps that emit/consume typed events | Nodes + edges over a shared typed state |
| Control flow | Emergent, derived from event wiring | Explicit, drawn as a graph up front |
| Shared state | Context object, serializable | Typed state channels, reducer-based |
| Durable execution / resume | Serializable Context; you wire up persistence | First-class checkpointers, durable by default |
| Human-in-the-loop | Emit a human-response event, await async | interrupt() / resume backed by checkpoints |
| Visibility of topology | Implicit — trace the events | Explicit — inspect nodes and edges |
| Ecosystem | LlamaIndex retrieval / RAG / document stack | LangChain agents, tools, platform |
| Reach for it when | Flexible event-driven composition, less boilerplate | Durable recovery, visible topology, approval gates |
Strip away the marketing and the decision is small. If the thing that will hurt you in production is a workflow that has to survive a crash, resume cleanly, or wait days for an approval, and you want to see the control flow when you debug it at 3 a.m., LangGraph's explicit-graph-plus-checkpointer model is built for exactly that, and the 1.0 release signals that the API has settled. If the thing that will hurt you is ceremony, meaning a central graph you keep rewiring as the branching logic grows, Workflows lets the structure stay implicit and the code stay light, especially if you're already inside LlamaIndex for retrieval.
Worth saying plainly: this isn't always either/or. A common 2026 production shape is Workflows (or LlamaIndex generally) for the retrieval layer and LangGraph for the durable orchestration on top. The frameworks disagree about where control flow should live. Your stack doesn't have to.
If you're weighing the orchestration layer more broadly, the same "drawn vs derived" tension shows up across the field: LangGraph against CrewAI and AutoGen trades the explicit graph for declared roles and agent dialogue, Apache Burr frames it as state machine versus graph, and if durability is the whole point, LangGraph checkpointing versus Temporal is the comparison that actually decides it.



