From Framework to Harness

There is a tell in the description line of one of the newer agent SDKs. AWS's Strands does not call itself an orchestration framework or an agent platform. Its one-sentence pitch is: "Build an agent harness and control it end-to-end." Harness. Not engine, not framework, not autopilot. The word admits something the whole category spent two years not admitting: the thing inside is already driving. Your job is restraint and instrumentation.

That is the shift worth understanding if you are choosing tools right now. The 2024 generation of agent libraries encoded control flow — you drew the graph, defined the chain, specified which node ran after which. They were, in effect, ways of writing the program the model wasn't smart enough to write for itself. The 2026 generation inverts it. The model plans. The framework gives you a loop, a set of tools, hard limits, and a trace. The value migrated from expressing the workflow to constraining and observing the autonomy.
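A minimal sketch of that shape, assuming nothing about any particular SDK: every name here (`Harness`, `Step`, `scripted_model`) is hypothetical, and the "model" is a scripted stand-in rather than a real LLM call. The model picks the next action, while the harness owns the allow-list of tools, the step limit, and the trace.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout; this is not the API of any SDK named in this piece.

@dataclass
class Step:
    """One trace record: what the model asked for and what came back."""
    tool: str
    arg: str
    result: str

@dataclass
class Harness:
    tools: dict[str, Callable[[str], str]]  # allow-list: anything absent is denied
    max_steps: int = 5                      # hard limit on loop iterations
    trace: list[Step] = field(default_factory=list)

    def run(self, model: Callable[[list[Step]], tuple[str, str]]) -> str:
        for _ in range(self.max_steps):
            tool, arg = model(self.trace)   # the model plans the next action
            if tool == "finish":
                return arg
            if tool not in self.tools:
                result = f"denied: {tool} is not an allowed tool"
            else:
                result = self.tools[tool](arg)
            self.trace.append(Step(tool, arg, result))
        return "stopped: step limit reached"


def scripted_model(trace: list[Step]) -> tuple[str, str]:
    """Stand-in for an LLM call: chooses the next action from the trace so far."""
    if not trace:
        return ("add", "2,3")
    if len(trace) == 1:
        return ("shell", "rm -rf /")        # an action outside the allow-list
    return ("finish", f"sum is {trace[0].result}")


def add(arg: str) -> str:
    a, b = arg.split(",")
    return str(int(a) + int(b))


harness = Harness(tools={"add": add})
answer = harness.run(scripted_model)
```

Nothing in `run` says which tool comes next. The only things the harness author writes are what is permitted, when to stop, and what gets recorded.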

When the model got good at deciding what to do, the money moved to deciding what it's not allowed to do — and proving what it did.

Here is the shelf, read through that lens.

▟ strands-agents/sdk-python

Model-driven agent harness; you supply a prompt and tools, the loop and observability are the product. Bedrock/Anthropic/OpenAI/Gemini, deploys to Lambda/Fargate/EKS with OpenTelemetry built in.

★ 6k · Python/TypeScript · strands-agents/sdk-python

▟ openai/openai-agents-python

The minimalist take: agents, handoffs, guardrails, and tracing behind a small API surface. Provider-agnostic despite the name. The restraints (guardrails) and the trace are first-class; the orchestration is deliberately thin.

★ 27k · Python · openai/openai-agents-python

▟ pydantic/pydantic-ai

Type safety as the control surface. Structured outputs validated by Pydantic models, dependency injection for testable agents, and Logfire for observability. The bet: the way you constrain an agent is by constraining its types.

★ 18k · Python · pydantic/pydantic-ai
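To make "constraining its types" concrete, here is a rough sketch of the idea using only the standard library. A plain dataclass and hand-written checks stand in for a Pydantic model, and `Refund`, `parse_refund`, and `first_valid` are hypothetical names, not Pydantic AI's API. The list of candidate strings plays the role of successive model attempts.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Refund:
    order_id: str
    amount_cents: int

def parse_refund(raw: str) -> Refund:
    """Accept model output only if it is JSON with exactly the expected fields and types."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"order_id", "amount_cents"}:
        raise ValueError("unexpected fields")
    if not isinstance(data["order_id"], str):
        raise ValueError("order_id must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(data["amount_cents"], bool) or not isinstance(data["amount_cents"], int):
        raise ValueError("amount_cents must be an integer")
    if data["amount_cents"] <= 0:
        raise ValueError("amount_cents must be positive")
    return Refund(**data)

def first_valid(candidates: list[str]) -> Refund:
    """Stand-in for a retry loop: each candidate represents one model attempt."""
    errors = []
    for raw in candidates:
        try:
            return parse_refund(raw)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            errors.append(str(exc))
    raise ValueError(f"no valid output: {errors}")

result = first_valid([
    '{"order_id": "A1", "amount_cents": "12.50"}',  # wrong type: rejected
    '{"order_id": "A1", "amount_cents": 1250}',     # conforms: accepted
])
```

The downstream code only ever sees a `Refund`; whatever the model produced that did not fit the type never leaves the boundary.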

The interesting comparison is with the incumbent that defined the previous era.

▟ langchain-ai/langgraph

Self-described now as a "low-level orchestration framework for building stateful agents." Still graph-first — you express the structure — but the 2026 additions are durable execution, persistence, and human-in-the-loop checkpoints. Even the orchestration camp is migrating toward state and recoverability.

★ 35k · Python · langchain-ai/langgraph

LangGraph is not obsolete; it is the most honest version of the orchestration philosophy, and for workflows where you genuinely need to pin the control flow — regulated steps, deterministic branching — drawing the graph is the right call. But notice what its own roadmap prioritized: durability, persistence, checkpoints. Recovery and state, not richer ways to specify the path. The graph people and the harness people are converging on the same problem from opposite sides, and that problem is memory.
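What "pin the control flow, then make it recoverable" can look like, sketched with the standard library only. This is not LangGraph's API: the step functions, `PIPELINE`, and the JSON checkpoint file are all assumptions made for illustration. The path is fixed in code, state is saved after every step, and a second call resumes from the last save.

```python
import json
import os
import tempfile
from typing import Callable, Optional

State = dict

def validate(state: State) -> State:
    return {**state, "validated": True}

def approve(state: State) -> State:
    return {**state, "approved": state["amount"] < 1000}

def execute(state: State) -> State:
    return {**state, "status": "paid" if state["approved"] else "escalated"}

# The path is pinned here; no model decides the order of steps.
PIPELINE: list[tuple[str, Callable[[State], State]]] = [
    ("validate", validate),
    ("approve", approve),
    ("execute", execute),
]

def run(state: State, checkpoint_path: str, stop_after: Optional[str] = None) -> State:
    """Run the pipeline, saving after every step and resuming from the last save."""
    done: list[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        state, done = saved["state"], saved["done"]
    for name, step in PIPELINE:
        if name in done:
            continue                      # already completed in an earlier run
        state = step(state)
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump({"state": state, "done": done}, f)
        if name == stop_after:            # simulate a crash or a human-review pause
            break
    return state


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
partial = run({"amount": 250}, path, stop_after="approve")
final = run({}, path)  # resumes from the checkpoint; the empty initial state is ignored
```

The interesting code is all in the save-and-resume branch. The graph itself is three lines.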

▟ mem0ai/mem0

A universal memory layer — user, session, and agent state with semantic retrieval over a vector store. The piece every harness leaves out, because a loop with tools and a trace is still amnesiac between runs.

★ 59k · Python/TypeScript · mem0ai/mem0

Mem0 having more stars than any agent framework on this list is not an accident. It is the market pricing the gap. Once you accept that the model drives, the unsolved problems are no longer "how does it decide" but "how does it remember" and "how do I see what it did." The harness gives you the second. Memory layers are the scramble to give you the first.
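A toy illustration of the gap, under loud assumptions: `MemoryStore` is a hypothetical class, not Mem0's API, and word overlap stands in for the embedding similarity a real memory layer would compute over a vector store. The point is only that something has to outlive the loop.

```python
import json
import os
import tempfile

class MemoryStore:
    """Toy per-user memory persisted to a JSON file (hypothetical; not Mem0's API)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.items: dict[str, list[str]] = {}
        if os.path.exists(path):
            with open(path) as f:
                self.items = json.load(f)

    def add(self, user_id: str, text: str) -> None:
        self.items.setdefault(user_id, []).append(text)
        with open(self.path, "w") as f:
            json.dump(self.items, f)

    def search(self, user_id: str, query: str, limit: int = 1) -> list[str]:
        # Word overlap is a crude stand-in for semantic (embedding) similarity.
        words = set(query.lower().split())
        scored = [
            (len(words & set(memory.lower().split())), memory)
            for memory in self.items.get(user_id, [])
        ]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [memory for _, memory in ranked[:limit]]


path = os.path.join(tempfile.mkdtemp(), "memory.json")

# Run 1: the agent learns two things, then the process ends.
MemoryStore(path).add("alice", "prefers replies in French")
MemoryStore(path).add("alice", "deploys to Fargate on Fridays")

# Run 2: a fresh store reloads from disk and retrieves what is relevant.
recalled = MemoryStore(path).search("alice", "what day does she deploy to fargate")
```

Each `MemoryStore(path)` call builds a new object, which is the stand-in for a new agent run: the only continuity between them is what was written down.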

If you are starting an agent today, the practical advice falls out of the framing. Pick the thinnest harness you can tolerate — Strands or the OpenAI SDK if you want minimal surface, Pydantic AI if your safety story is types and validation. Add a real memory layer early rather than bolting one on after your agent starts repeating itself. Reach for LangGraph specifically when you need to guarantee a path, not when you're hoping the model finds one. And treat the trace as load-bearing infrastructure, not a debug afterthought: in a world where the model makes the decisions, the log of what it decided is the only thing standing between you and a system you cannot account for.

The frameworks got smaller because the models got bigger. What's left in your hands is the harness — the straps and the telemetry. Choose those deliberately. They're the part that's actually yours.

Dex Mareno
