Tools

Calculators

Four sizing questions to answer before you ship an agent: capacity, price, speed, and context. Each calculator runs in the browser on editable, sourced defaults, and every formula links to the article that explains its reasoning.

LLM serving VRAM calculator

How much GPU memory it takes to serve a model — weights + GQA-aware KV cache + overhead, and how many accelerators you need.
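
As a rough illustration of the "weights + KV cache + overhead" sum, here is a minimal Python sketch. It is not the calculator's actual code: the function names, the 10% overhead fraction, and the example model shape are all assumptions chosen for illustration. The GQA-aware part is that the cache scales with the number of key/value heads, not the number of query heads.

```python
import math


def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens,
                   concurrent_seqs, bytes_per_elem=2):
    """KV cache size in bytes for a GQA model.

    The leading 2 counts one K and one V tensor per layer. With GQA,
    kv_heads is the number of key/value heads, which is usually smaller
    than the number of query heads.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return per_token * context_tokens * concurrent_seqs


def serving_vram_gb(params_billion, bytes_per_param, layers, kv_heads,
                    head_dim, context_tokens, concurrent_seqs,
                    overhead_fraction=0.10):
    """Weights + KV cache, inflated by an assumed overhead fraction (GB = 1e9 bytes)."""
    weights = params_billion * 1e9 * bytes_per_param
    kv = kv_cache_bytes(layers, kv_heads, head_dim,
                        context_tokens, concurrent_seqs)
    return (weights + kv) * (1 + overhead_fraction) / 1e9


def accelerators_needed(total_gb, gb_per_accelerator):
    """Smallest whole number of accelerators whose combined memory fits total_gb."""
    return math.ceil(total_gb / gb_per_accelerator)


# Hypothetical 8B-parameter model in 16-bit weights: 32 layers, 8 KV heads,
# head dimension 128, serving 4 concurrent sequences of 8192 tokens each.
total = serving_vram_gb(8, 2, 32, 8, 128, 8192, 4)
print(f"{total:.1f} GB, {accelerators_needed(total, 24)} x 24 GB accelerator(s)")
```

The sketch treats overhead as a flat multiplier and ignores how sharding a model across devices changes per-device memory, so the accelerator count is a lower bound rather than a deployment plan.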

LLM API cost calculator

What a feature will cost per month — modelling the two levers that move the invoice: prompt caching and the input/output price split.
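
The two levers can be sketched in a few lines of Python. This is an illustrative model, not the calculator's implementation: the function name, the traffic figures, and all three per-million-token prices are placeholder assumptions, and it ignores any surcharge a provider may apply when writing to the cache.

```python
def monthly_cost_usd(requests_per_month, input_tokens, output_tokens,
                     cached_fraction, price_in_per_mtok,
                     price_cached_in_per_mtok, price_out_per_mtok):
    """Monthly spend given per-request token counts and per-million-token prices.

    cached_fraction is the share of input tokens billed at the (cheaper)
    cached-input rate; the rest are billed at the normal input rate.
    """
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    per_request = (uncached * price_in_per_mtok
                   + cached * price_cached_in_per_mtok
                   + output_tokens * price_out_per_mtok) / 1e6
    return requests_per_month * per_request


# Placeholder prices (USD per million tokens), not any provider's real rates.
with_cache = monthly_cost_usd(100_000, 4000, 500, 0.75, 3.00, 0.30, 15.00)
no_cache = monthly_cost_usd(100_000, 4000, 500, 0.0, 3.00, 0.30, 15.00)
print(f"with caching: ${with_cache:,.0f}/month, without: ${no_cache:,.0f}/month")
```

With these placeholder numbers, output tokens account for most of the bill even though there are eight times fewer of them, which is why the input/output split matters as much as the cache hit rate.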

LLM latency calculator

How fast an agent will feel — time-to-first-token vs throughput across the sequential turns that dominate multi-step agents.
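
A minimal sketch of why sequential turns dominate, assuming each turn pays time-to-first-token once, then streams its output at a steady rate, and optionally waits on a tool call. The function names and all example numbers are hypothetical, and real decoding throughput is rarely this uniform.

```python
def turn_latency_s(ttft_s, output_tokens, tokens_per_s):
    """One model call: time to first token plus time to stream the output."""
    return ttft_s + output_tokens / tokens_per_s


def agent_latency_s(turns, ttft_s, output_tokens_per_turn, tokens_per_s,
                    tool_time_s=0.0):
    """Sequential turns cannot overlap, so their latencies simply add up."""
    per_turn = turn_latency_s(ttft_s, output_tokens_per_turn, tokens_per_s)
    return turns * (per_turn + tool_time_s)


# Hypothetical agent: 5 turns, 0.5 s TTFT, 200 output tokens per turn
# at 50 tokens/s, and 1 s of tool execution after each turn.
print(agent_latency_s(5, 0.5, 200, 50, tool_time_s=1.0))  # 27.5
```

In this example a single call takes 4.5 s, but the user waits 27.5 s for the task, so cutting the number of turns or the tokens generated per turn helps more than shaving time-to-first-token.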

Context-window budget calculator

How much context your agent actually gets — after system prompt, tool schemas, memory, and output reserve — and how many turns before it must compact.
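
The budget arithmetic can be sketched as follows. This is an assumption-laden illustration, not the calculator's code: the function names and token counts are hypothetical, and it assumes every turn adds a fixed number of tokens to the conversation.

```python
def usable_context(window, system_prompt, tool_schemas, memory, output_reserve):
    """Tokens left for conversation history after the fixed costs are paid."""
    return window - (system_prompt + tool_schemas + memory + output_reserve)


def turns_before_compaction(usable, tokens_per_turn):
    """Whole turns that fit in the usable budget before history must be compacted."""
    if tokens_per_turn <= 0:
        raise ValueError("tokens_per_turn must be positive")
    return max(usable, 0) // tokens_per_turn


# Hypothetical agent on a 128,000-token window.
usable = usable_context(128_000, system_prompt=2_000, tool_schemas=6_000,
                        memory=4_000, output_reserve=8_000)
print(usable, turns_before_compaction(usable, 3_000))  # 108000 36
```

Here the fixed costs consume 20,000 tokens before the first user message arrives, and at 3,000 tokens per turn the agent reaches compaction after 36 turns.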