The internet keeps staging this as a cage match. "any-llm vs LiteLLM," as if you could put them on one scale and read off a winner. You can't, and the reason is the whole point: they sit on different layers of your stack. One is a Python import. The other is, when it matters, a service you run. Picking between them by counting supported providers is like choosing between a wrench and a hardware store by which has more shelves.

So let's throw out the "which is better" frame and ask the only question that actually forks your architecture: do you need a gateway?

What any-llm actually is#

Mozilla.ai's any-llm is an in-process Python SDK. You pip install it, you call one unified function, and you talk to OpenAI, Anthropic, Mistral, Ollama, and the rest behind a single typed interface that normalizes responses to OpenAI-style ChatCompletion objects. There is no server. There is no extra network hop. There is nothing to deploy, scale, or wake up for at 3am.

The interesting design choice is underneath. any-llm leverages each provider's official SDK where one exists, rather than re-implementing the provider's HTTP API itself. Mozilla.ai frames this explicitly as the differentiator — competing tools "reimplement provider interfaces rather than leveraging official SDKs," and any-llm chooses fidelity instead. When Anthropic ships a new field or changes streaming semantics, you inherit it through their own maintained client, not through a translation layer someone has to notice and patch.

What it deliberately omits: spend tracking, virtual keys, budgets, analytics. The README points you at a separate gateway project if you want that. any-llm is not pretending to be your control plane. It's a clean seam for provider portability inside one process, and it stops there on purpose.

What LiteLLM actually is#

LiteLLM is two products wearing one name. There's the Python library — completion() across 100+ providers in OpenAI format — which competes directly with any-llm and is the part everyone benchmarks. And there's the Proxy Server, the AI Gateway, which is where LiteLLM's real value lives and where the comparison stops being apples-to-apples.

The proxy is a self-hosted service that gives you the things a single import can't: cost tracking and spend logs, virtual keys with per-key and per-team budgets, rate limiting, load balancing and fallback routing across providers, response caching, request logging, and guardrails. This is platform infrastructure. If you hand model access to twenty internal teams and need to know who spent what and cut off the one that's looping, this is the layer that answers.

The architectural cost is the same one that makes it powerful: LiteLLM standardizes every provider into OpenAI's request/response shape, which means it re-implements provider behavior rather than deferring to official SDKs. That's what lets it support a long tail of providers uniformly — and it's also the surface that can drift from a provider's actual behavior when that provider ships something new.

The question isn't which tool is better. It's whether the thing in the middle of your requests should be a function call or a service with a pager.

The comparison that's actually fair#

The at-a-glance table above lines these two up on the dimensions that matter — deployment model, integration method, governance, and who has to operate the thing. Read it and the asymmetry is obvious: only one of them is a service.

If your real question is the other axis — hosted aggregator versus self-hosted infrastructure — that's a different fork, and worth its own look: see OpenRouter vs LiteLLM for the buy-versus-run decision, and LiteLLM vs Portkey vs TensorZero once you've decided you do want a gateway and need to pick one.

The crossover point, named#

Here is the line, stated plainly.

Below it: you are one application — an agent, a backend, a single service — that wants to not be married to one model vendor. You want model="claude-opus-4-8" and model="gpt-4o" to both just work, with type hints, in your process. Standing up a proxy here is a mistake you will feel slowly. You've added a service to the critical path of every request: something to deploy, scale, monitor, patch, and — per the honest comparisons — often back with Redis for caching and a database for spend tracking, plus a second instance if you'd like it to stay up. That's a lot of running infrastructure to avoid an import statement. any-llm wins this case not on features but on what it refuses to make you operate.

Above it: you are a platform team. Many apps, many teams, real money, the question "who can call which model and how much have they spent" asked by someone in finance. Here the SDK-only approach leaves you blind — a library inside each app cannot enforce a budget across all of them or revoke a key centrally. This is exactly the multi-tenant problem LiteLLM's proxy was built for, and any-llm explicitly declines to solve it.

The crossover is the moment a second team needs to share your model access under a budget you control. Before that moment, a proxy is overhead. After it, an in-process SDK is a blind spot.

The decision rule#

Stop asking which is better and ask who has to operate it.

If exactly one app makes the calls and you control its source, reach for any-llm — and enjoy that when a provider changes, the provider's own SDK absorbs it for you, with no proxy to re-implement and no service to wake for. The day a second team needs metered, keyed access to those same models, you don't "switch to LiteLLM" — you add its proxy as the gateway it was always meant to be, and let any-llm or LiteLLM's library keep talking to providers underneath it. They were never rivals. One is the seam in your code; the other is the building you put around it once more than one tenant lives there.