Most framework comparisons line up checkboxes — memory, streaming, tool support, observability — and let the rows blur together because eventually everyone ships everything. That misses the one decision you can't refactor your way out of later: what counts as an action. Before an agent can plan, call a tool, or recover from an error, something has to define the unit of work the LLM emits each turn. smolagents, LangGraph, and CrewAI give three different answers, and that single disagreement explains most of why they feel so unlike each other in practice.

This piece is deliberately about that axis. If you want the broader three-way comparison of orchestration styles, the companion piece "LangGraph vs CrewAI vs AutoGen" covers that ground; here the question is narrower and sharper: code-as-action vs graph vs role-team.

smolagents: the action is a block of Python

Barebones library for agents that "think in code" — actions are executable Python

Hugging Face's smolagents makes the most contrarian bet of the three. Its flagship CodeAgent doesn't emit a JSON object naming one tool and its arguments. It writes a snippet of Python, and that snippet is the action — executed in an interpreter, with the result fed back as the next observation.

This isn't a stylistic preference. It traces directly to CodeAct, the paper "Executable Code Actions Elicit Better LLM Agents" (Wang et al., ICML 2024), which argued that the standard JSON-per-step action space is needlessly constrained. Python is already a universal interface for composition: you can loop, branch, store intermediate values, and nest several tool calls inside one action. The paper reported up to ~20% higher success rates on agent benchmarks; Hugging Face's introductory blog post reports that writing actions as code uses roughly 30% fewer steps — and therefore ~30% fewer LLM calls — than handing the model a dictionary of tools to fill in.

The JSON agent asks "which one tool next?" The code agent asks "what's the program that solves this?" — and writes it.
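To make the contrast concrete, here is a minimal sketch in plain Python. It is not smolagents' API: the `get_price` tool, the price table, and both runner functions are hypothetical, and the "model output" is hard-coded. It only shows why one code action can stand in for several JSON actions.

```python
# Hypothetical tool and data, for illustration only.
PRICES = {"apple": 1.5, "pear": 2.0, "fig": 4.0}

def get_price(item: str) -> float:
    return PRICES[item]

TOOLS = {"get_price": get_price}

def run_json_actions(actions: list) -> list:
    """JSON style: each action names one tool, so each costs one model turn."""
    observations = []
    for action in actions:
        tool = TOOLS[action["tool"]]
        observations.append(tool(**action["args"]))
    return observations

def run_code_action(code: str) -> dict:
    """Code style: one snippet can loop over and combine several tool calls."""
    namespace = dict(TOOLS)
    exec(code, namespace)  # unsandboxed; acceptable only for this trusted toy snippet
    return namespace

# Three separate turns, and the model still has to add the numbers afterwards.
json_actions = [
    {"tool": "get_price", "args": {"item": "apple"}},
    {"tool": "get_price", "args": {"item": "pear"}},
    {"tool": "get_price", "args": {"item": "fig"}},
]
json_observations = run_json_actions(json_actions)

# One turn: the loop and the sum live inside the action itself.
code_action = "total = sum(get_price(item) for item in ['apple', 'pear', 'fig'])"
code_result = run_code_action(code_action)["total"]
```

The tool calls are identical in both cases; what changes is how many round trips to the model it takes to issue them.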

The cost of this bet is the obvious one: you are executing model-generated code, so sandboxing matters (smolagents supports restricted execution and remote sandboxes). The payoff is a tiny, auditable codebase and an agent that composes naturally. If your instinct is "I want the least framework possible and I trust code over schemas," this is your camp.
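As a rough illustration of what "restricted execution" can mean, the sketch below checks a snippet's imports against an allowlist before anything runs. This is an assumption-laden toy, not how smolagents implements its executor: the `ALLOWED_IMPORTS` set and both function names are hypothetical, and an import check alone is nowhere near a real sandbox.

```python
import ast

ALLOWED_IMPORTS = {"math", "statistics"}  # hypothetical allowlist

def check_imports(code: str) -> None:
    """Raise ValueError if the snippet imports a module outside the allowlist."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; treat them as disallowed.
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                raise ValueError(f"import of {module!r} is not allowed")

def is_allowed(code: str) -> bool:
    """Convenience wrapper: True when every import passes the allowlist."""
    try:
        check_imports(code)
    except ValueError:
        return False
    return True

# NOTE: this only inspects import statements. It does not stop __import__,
# attribute tricks, or resource exhaustion, so it is not a security boundary.
```

For anything untrusted, a static check like this belongs in front of an isolated runtime, not in place of one.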

LangGraph: the action is a transition in a graph you built

Low-level orchestration for stateful, long-running agents modeled as explicit state graphs

LangGraph sits at the opposite pole on the specify-vs-delegate spectrum. You don't hand it a goal and step back; you draw the machine. An agent is a state graph — nodes are functions that read and write a shared state object, and edges (including conditional ones) decide what runs next. The action space is whatever you wired into those nodes.
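The shape of that idea fits in a few lines of plain Python. What follows is a toy state machine, not LangGraph's API: the node names, the approval rule, and `run_graph` are all hypothetical. It assumes only what the paragraph above says, namely that nodes read and write shared state and that edges pick the next node.

```python
from typing import Callable

State = dict
Node = Callable[[State], State]

def draft(state: State) -> State:
    attempts = state["attempts"] + 1
    return {**state, "attempts": attempts, "draft": f"draft v{attempts}"}

def review(state: State) -> State:
    # Hypothetical rule: the reviewer approves the second attempt.
    return {**state, "approved": state["attempts"] >= 2}

def route_after_review(state: State) -> str:
    # Conditional edge: loop back to drafting until the review passes.
    return "END" if state["approved"] else "draft"

NODES = {"draft": draft, "review": review}
EDGES = {
    "draft": lambda state: "review",  # unconditional edge
    "review": route_after_review,     # conditional edge
}

def run_graph(start: str, state: State, max_steps: int = 20):
    """Walk the graph from `start`, returning the final state and the node trace."""
    current, trace = start, []
    for _ in range(max_steps):
        trace.append(current)
        state = NODES[current](state)
        current = EDGES[current](state)
        if current == "END":
            return state, trace
    raise RuntimeError("step limit reached before END")

final_state, trace = run_graph("draft", {"attempts": 0})
```

Every possible path is visible in `EDGES` before anything runs, which is the point: the model can fill in what a node does, but it cannot invent a transition you did not draw.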

That's more boilerplate, and LangGraph doesn't pretend otherwise; it bills itself as a low-level orchestration framework. What you buy with the verbosity is control and durability: the docs center on durable execution that survives failures, human-in-the-loop interrupts, persistent memory, and inspectable state, with observability through LangSmith. Its design is inspired by Pregel (Google's graph-processing model) and Apache Beam, with a NetworkX-flavored graph API.

The mental model here isn't "write code as you go"; it's "define the control flow up front and let state move through it." When the thing you're building is a long-running workflow that must branch, retry, pause for a human, and resume exactly where it left off, that explicitness stops feeling like overhead and starts feeling like the only sane option. If you're weighing it against a more managed runtime, the companion piece "Claude Agent SDK vs LangGraph" digs into that trade specifically.
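Pause-and-resume is easier to reason about with a small example. The sketch below is plain Python, not LangGraph's checkpointer or interrupt API: the step names, the JSON checkpoint format, and the `run` function are hypothetical. It assumes the workflow's whole position can be serialized, which is what lets it stop for a human and pick up later.

```python
import json
from typing import Optional

STEPS = ["fetch", "summarize", "human_approval", "publish"]  # hypothetical workflow

def run(checkpoint: Optional[str] = None, approval: Optional[bool] = None) -> str:
    """Advance until finished or until a human decision is needed.

    Returns the state as a JSON string, standing in for a persisted checkpoint.
    """
    state = json.loads(checkpoint) if checkpoint else {"next": 0, "log": []}
    state["status"] = "running"
    while state["next"] < len(STEPS):
        step = STEPS[state["next"]]
        if step == "human_approval":
            if approval is None:
                # Interrupt: persist the position and hand control back.
                state["status"] = "waiting_for_human"
                return json.dumps(state)
            state["log"].append(f"human_approval={approval}")
            if not approval:
                state["status"] = "rejected"
                return json.dumps(state)
        else:
            state["log"].append(step)
        state["next"] += 1
    state["status"] = "done"
    return json.dumps(state)

paused = run()                          # stops at the approval step
resumed = run(paused, approval=True)    # continues from the saved position
```

Because the checkpoint is just data, the resume call could happen in another process or days later; the earlier steps are not re-run.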

CrewAI: the action is a role doing its job

Lean, independent framework orchestrating role-based agent "crews" plus event-driven "flows"
crewAIInc/crewAI · Python · ★ 54k

CrewAI models the agent system as a team of personas. You define agents with a role, a goal, and a backstory, hand the crew a set of tasks, and let them collaborate and delegate. The action space is abstracted away behind the metaphor: you specify who is on the team and what they're responsible for, and the framework handles the turn-taking.
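A stripped-down sketch of that metaphor, in plain Python rather than CrewAI's own classes: the `Agent` and `Task` dataclasses, the `work` callable standing in for an LLM call, and `run_crew` are all hypothetical. It assumes the simplest form of turn-taking, where each task receives the previous task's output.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    role: str
    goal: str
    backstory: str
    work: Callable[[str], str]  # hypothetical stand-in for an LLM call

@dataclass
class Task:
    description: str
    agent: Agent

def run_crew(tasks: list, brief: str) -> list:
    """Sequential hand-off: each agent works on what the previous one produced."""
    transcript, context = [], brief
    for task in tasks:
        context = task.agent.work(context)
        transcript.append(f"{task.agent.role}: {context}")
    return transcript

researcher = Agent("researcher", "collect facts", "ex-librarian",
                   lambda text: f"notes on {text}")
writer = Agent("writer", "draft the piece", "former journalist",
               lambda text: f"article from {text}")

transcript = run_crew(
    [Task("research the topic", researcher), Task("write it up", writer)],
    brief="agent frameworks",
)
```

Notice what the caller never writes: there is no tool schema and no control flow, only who is on the team and in what order they work.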

Two facts are worth pinning down. First, CrewAI is independent — its docs state it was "built from scratch, completely independent of LangChain or other agent frameworks," which surprises people who remember its early lineage. Second, it isn't only role-play improv: alongside Crews (autonomous role teams) it ships Flows, an event-driven abstraction for when you need precise, deterministic control. That's CrewAI quietly conceding that pure delegation isn't always enough — the same tension LangGraph addresses by default.
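For contrast with the role-team style, here is what "event-driven" means in miniature. This is a toy dispatcher in plain Python, not CrewAI's Flows API: the `listen` decorator, the event names, and `run_flow` are hypothetical. It assumes only that each step runs when a named event fires and may emit the next one.

```python
from collections import defaultdict, deque

LISTENERS = defaultdict(list)  # event name -> handlers registered for it

def listen(event: str):
    """Register the decorated function as a handler for `event`."""
    def register(handler):
        LISTENERS[event].append(handler)
        return handler
    return register

# Each handler returns (next_event_or_None, new_payload).
@listen("start")
def fetch(payload):
    return "fetched", payload + ["fetch"]

@listen("fetched")
def summarize(payload):
    return "summarized", payload + ["summarize"]

@listen("summarized")
def publish(payload):
    return None, payload + ["publish"]

def run_flow(event: str, payload: list) -> list:
    """Dispatch events in order until no handler emits a new one."""
    queue = deque([(event, payload)])
    result = payload
    while queue:
        event, payload = queue.popleft()
        for handler in LISTENERS[event]:
            next_event, result = handler(payload)
            if next_event is not None:
                queue.append((next_event, result))
    return result

steps_run = run_flow("start", [])
```

The order of execution is fixed by which events the handlers emit, not by agents negotiating among themselves, which is the determinism the paragraph above is pointing at.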

CrewAI is the fastest of the three to a working multi-agent demo, and its star count reflects that pull. The trade is that the role metaphor is opinionated: it's wonderful when your problem decomposes cleanly into "a researcher, a writer, an editor," and awkward when it doesn't.

How to choose

Ignore the leaderboard. The decision is about your action space, and three honest questions settle it:

The framing that actually predicts your experience is specify vs delegate, code vs JSON. LangGraph is maximum specify; CrewAI is maximum delegate; smolagents reframes the whole question by changing what an action is. If you're also weighing the planning loop on top of any of these, that's a separate decision about reasoning strategies — orthogonal to the action space, and worth getting right too.