Pydantic AI v2.0.0 went stable on June 23, after seven betas. Then, in the nine days that followed, the team shipped three more releases: v2.1.0, v2.2.0, and v2.3.0. That cadence is the first thing worth noticing. A framework that pushes four releases in a week and a half is not coasting on a milestone; it's a project that just cleared a large refactor and is sprinting to build on it. The version number is the boring part. The architecture underneath it is a real bet on where agent frameworks are heading.
Capabilities: one primitive to bundle the four things
V1, like most first-generation agent libraries, spread an agent's configuration across several surfaces — you registered tools here, set instructions there, passed model settings at the call site, hung hooks off events. V2's central idea is a capability: a single composable unit that bundles an agent's tools, hooks, instructions, and model settings into one reusable thing that reaches every layer of the agent.
That sounds like tidying, but it changes what an agent is in your codebase. Instead of an agent being a pile of settings you assemble at construction time, it becomes a composition of capabilities — units you can name, type, test in isolation, and reuse across agents. A "web-research" capability carries its own tools, its own instructions, its own model settings, and drops into any agent as one import. Configuration stops being incidental and becomes an interface.
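To make the shape concrete, here is a minimal sketch of the idea in plain Python. It does not use the Pydantic AI API: `Capability`, `Agent`, `search_web`, and every field name below are hypothetical stand-ins, chosen only to show how bundling tools, instructions, hooks, and model settings into one unit turns an agent into a composition.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


# Hypothetical types for illustration only; NOT the Pydantic AI V2 API.
@dataclass
class Capability:
    """One reusable unit bundling the four things an agent needs."""
    name: str
    tools: tuple[Callable[..., Any], ...] = ()
    instructions: tuple[str, ...] = ()
    hooks: tuple[Callable[[str], None], ...] = ()
    model_settings: dict[str, Any] = field(default_factory=dict)


@dataclass
class Agent:
    """An agent is just the composition of its capabilities."""
    capabilities: list[Capability]

    @property
    def tools(self) -> dict[str, Callable[..., Any]]:
        return {t.__name__: t for c in self.capabilities for t in c.tools}

    @property
    def instructions(self) -> str:
        return "\n".join(line for c in self.capabilities for line in c.instructions)

    @property
    def model_settings(self) -> dict[str, Any]:
        merged: dict[str, Any] = {}
        for c in self.capabilities:
            merged.update(c.model_settings)  # later capabilities override earlier ones
        return merged


def search_web(query: str) -> list[str]:
    """Stand-in tool; a real one would call a search API."""
    return [f"result for {query}"]


web_research = Capability(
    name="web-research",
    tools=(search_web,),
    instructions=("Cite a source for every claim.",),
    model_settings={"temperature": 0.2},
)

concise = Capability(
    name="concise",
    instructions=("Answer in three sentences or fewer.",),
    model_settings={"max_tokens": 300},
)

agent = Agent(capabilities=[web_research, concise])
```

The point of the sketch is the last line: the agent's tool set, prompt, and settings are all derived from the list of capabilities, so each capability can be tested on its own and reused in another agent unchanged.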
A small core and a fast Harness — with a graduation path
The more interesting decision is structural. V2 splits the project in two. The core stays deliberately small and stable: the agent loop, the providers, the capability and hooks API, and only the capabilities fundamental to every agent. Everything else lives in the Harness — the first-party "batteries": memory, guardrails, context management, filesystem access, code mode, and more — where it can move fast without destabilizing the foundation.
A capability can graduate from the Harness into core once it proves broadly essential. That single sentence is a governance model disguised as an architecture.
That graduation path is the clever bit. It lets the framework experiment aggressively in the Harness — ship a memory system, iterate, break it — while promising that the core you build against changes slowly. It's the same instinct that keeps a language's standard library conservative while its package ecosystem churns. For anyone who got burned by a breaking change in a fast-moving agent framework, an explicit stable/experimental boundary is worth more than any single feature.
The real story: harness-first, not graph
Step back and the naming gives away the bet. Over the last year, a large part of the field converged on the graph as the agent abstraction — nodes, edges, explicit state machines — a direction so dominant that every serious framework seemed to grow a graph, LangGraph most of all. Pydantic AI V2 goes the other way. It keeps a plain agent loop and asks you to compose capabilities around it — a harness-first design, the same shape as the OpenAI and Anthropic agent SDKs.
The two philosophies optimize for different failure modes. Graphs make control flow explicit and inspectable — you can see exactly which node runs next, which is a gift for complex, branchy, human-in-the-loop workflows. Harnesses keep the loop simple and push complexity into composable, swappable units — a gift for teams who want the model to drive and the framework to stay out of the way. V2 is a wager that for most agents, the loop is not the hard part and shouldn't be modeled as if it were. If you're choosing a stack today, that's the axis to decide on first — and it reframes the Pydantic AI vs OpenAI Agents SDK vs Agno question as less about features and more about which mental model you want to live in.
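The difference is easier to see side by side. The sketch below runs the same toy task twice: once as a plain loop where the model decides what happens next, and once as an explicit graph of named nodes. Everything here is an assumption made for illustration: `fake_model` is a scripted stand-in for an LLM, and `run_loop`, `run_graph`, and the node names are hypothetical, not taken from Pydantic AI, LangGraph, or any other library.

```python
from typing import Callable


def fake_model(history: list[str]) -> dict:
    """Scripted stand-in for an LLM: ask for one tool call, then answer."""
    if not any(m.startswith("tool:") for m in history):
        return {"type": "tool_call", "tool": "lookup", "arg": "pydantic"}
    return {"type": "final", "text": f"answer based on {history[-1]}"}


TOOLS: dict[str, Callable[[str], str]] = {"lookup": lambda arg: f"docs for {arg}"}


# Harness style: one plain loop, the model drives.
def run_loop(prompt: str, max_steps: int = 5) -> str:
    history = [f"user:{prompt}"]
    for _ in range(max_steps):
        step = fake_model(history)
        if step["type"] == "final":
            return step["text"]
        history.append("tool:" + TOOLS[step["tool"]](step["arg"]))
    raise RuntimeError("step limit reached")


# Graph style: each node names its successor, so control flow is inspectable.
def call_model(state: dict) -> str:
    state["step"] = fake_model(state["history"])
    return "finish" if state["step"]["type"] == "final" else "call_tool"


def call_tool(state: dict) -> str:
    step = state["step"]
    state["history"].append("tool:" + TOOLS[step["tool"]](step["arg"]))
    return "call_model"


NODES: dict[str, Callable[[dict], str]] = {
    "call_model": call_model,
    "call_tool": call_tool,
}


def run_graph(prompt: str, max_steps: int = 10) -> tuple[str, list[str]]:
    state: dict = {"history": [f"user:{prompt}"], "step": None}
    node, trace = "call_model", []
    for _ in range(max_steps):
        if node == "finish":
            return state["step"]["text"], trace
        trace.append(node)  # record every node visited
        node = NODES[node](state)
    raise RuntimeError("step limit reached")
```

Both functions produce the same answer. What the graph version buys is the `trace` of visited nodes and a place to hang branching logic; what the loop version buys is about half the code and nothing to model until you need it.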
Providers are now commodity plug-ins
The point releases make a quieter argument. Claude Sonnet 5 support landed in v2.2.0 within days of the model's own launch, and v2.3.0 (July 2) added a native Z.AI provider with thinking support — first-class integration for the open-weight GLM family. A framework adding a Chinese open-weight model as a native provider, days after adding a US frontier model, tells you the provider layer has become a commodity plug-in: the differentiation has moved up the stack, to how you compose behavior, not which model you can reach.
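What "commodity plug-in" means in code is roughly this: the agent depends on a narrow interface, and any backend that satisfies it can be swapped in by name. The sketch below is a generic illustration using `typing.Protocol`. The `Provider` interface, the two toy providers, and `run_agent` are all hypothetical names and do not reflect how Pydantic AI actually defines its providers.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical minimal interface an agent needs from a model backend."""
    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Toy backend that returns the prompt unchanged."""
    name = "echo"

    def complete(self, prompt: str) -> str:
        return prompt


class ShoutProvider:
    """Toy backend that upper-cases the prompt."""
    name = "shout"

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Registry keyed by name: adding a backend is one new entry, nothing else changes.
PROVIDERS: dict[str, Provider] = {p.name: p for p in (EchoProvider(), ShoutProvider())}


def run_agent(provider_name: str, instructions: str, user_input: str) -> str:
    """Agent behavior lives here; the provider is just a lookup."""
    provider = PROVIDERS[provider_name]
    return provider.complete(f"{instructions}\n{user_input}")
```

Nothing in `run_agent` knows which backend it is talking to, which is the sense in which the interesting decisions have moved up the stack.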
Should you move?
New projects should start on V2 — the capability model is where the ecosystem is going. Existing V1 users should treat this as a genuine migration, not a pip install --upgrade: the configuration model changed, and that's the whole point of the release. The payoff is that agent behavior becomes something you assemble from typed, testable, reusable units, with a stable core you can trust and a Harness you can raid for the parts you'd otherwise rebuild. Whether harness-first or graph-first wins is still an open question. V2 just made the harness case a lot harder to ignore.



