There is a way the OWASP Top 10 for LLM Applications gets read in a security review, and it is the wrong way. Someone opens the list, sees ten model-flavored risks (prompt injection, poisoning, hallucination, data leakage), and concludes that the takeaway is "pick a safer model." The list is then filed under "model selection," and the review moves on.

Read it a second time and the shape inverts. Most of these ten are not the model behaving badly. They are your architecture trusting the model too much. The model is rarely the vulnerability; the trust your code places in its output is.

The current list, and why the 2025 edition matters

The edition you should be working from is the 2025 list, published in November 2024 by the OWASP Gen AI Security Project. It is not cosmetic housekeeping over the original 2023 ranking. It added System Prompt Leakage (LLM07) and Vector and Embedding Weaknesses (LLM08), renamed Insecure Output Handling to Improper Output Handling (LLM05), and broadened the narrow Model Denial of Service entry into Unbounded Consumption (LLM10). Every one of those edits points the same direction: away from the model in the abstract and toward the system that wraps it.

The full ten, with the part nobody underlines — where the bug actually lives:

Tally the "where" column and the misreading becomes visible. Only three entries — LLM04, LLM09, and the model half of LLM01 — are about what the weights do when left alone. The other seven are about what your application does with what the model said.

The model is rarely the vulnerability. The trust your code places in its output is. Six of the ten OWASP risks are fixed in your architecture, not by swapping checkpoints.

Why this matters more for agents than for chatbots

A single-shot chatbot can half-ignore the integration entries. Its output goes to a human who reads it on a screen; the blast radius of a bad answer is one person's annoyance. An agent removes the human from that loop, and the integration entries stop being theoretical.

Take LLM05 Improper Output Handling. For a chatbot, unvalidated output is a comparatively minor risk, perhaps some unescaped HTML on a page. For an agent, the output is the command: it's the SQL that runs, the shell string that executes, the URL the tool fetches. OWASP draws the line precisely: LLM05 is "insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems." The moment downstream is a tool instead of a textarea, an output bug becomes a code-execution bug. This is exactly the bridge by which an indirect prompt injection escalates from "weird answer" to "privileged access to the target's environment."
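
What "validate before it acts" can look like is sketched below, under stated assumptions: the tool name `list_orders`, the three field names, the status allow-list, and the `orders` table are all hypothetical, invented for illustration. The model's output is parsed as untrusted input, checked against a fixed shape, and only then bound into a parameterized query instead of being formatted into SQL.

```python
import json
import sqlite3

# Hypothetical allow-list and field set for this sketch.
ALLOWED_STATUSES = {"open", "shipped", "cancelled"}
EXPECTED_FIELDS = {"tool", "customer_id", "status"}


def parse_tool_call(raw: str) -> dict:
    """Treat model output as untrusted input; raise ValueError on anything unexpected."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(call, dict) or set(call) != EXPECTED_FIELDS:
        raise ValueError("tool call has missing or unexpected fields")
    if call["tool"] != "list_orders":
        raise ValueError("unknown tool")
    customer_id = call["customer_id"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(customer_id, bool) or not isinstance(customer_id, int):
        raise ValueError("customer_id must be an integer")
    status = call["status"]
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError("status is not in the allow-list")
    return call


def list_orders(conn: sqlite3.Connection, raw_model_output: str) -> list:
    """Validate first, then bind values as parameters instead of formatting SQL."""
    call = parse_tool_call(raw_model_output)
    rows = conn.execute(
        "SELECT id FROM orders WHERE customer_id = ? AND status = ? ORDER BY id",
        (call["customer_id"], call["status"]),
    ).fetchall()
    return [row[0] for row in rows]
```

A model output such as `{"tool": "list_orders", "customer_id": "7 OR 1=1", "status": "open"}` is rejected at the boundary with a `ValueError`, so it never reaches the database. The same pattern applies to shell arguments and URLs: a narrow schema plus an allow-list, enforced in code the model cannot talk its way past.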

Then there is LLM06 Excessive Agency, the most agent-shaped risk on the list, since it is defined by agency itself. OWASP decomposes it into three root causes: excessive functionality (the agent can call tools it never needs), excessive permissions (those tools run with more authority than the task requires), and excessive autonomy (the agent performs irreversible operations with no human gate). Its own definition is worth quoting: Excessive Agency "enables damaging actions to be performed in response to unexpected, ambiguous or manipulated outputs from an LLM, regardless of what is causing the LLM to malfunction." Note the last clause. It does not matter why the model produced a bad tool call, whether a jailbreak, a poisoned document, or an honest hallucination. If the agency was excessive, the damage is the same.
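
Two of those three root causes can be enforced in a few lines at the dispatch layer. The sketch below is illustrative only: `ToolRegistry`, `Tool`, `ApprovalRequired`, and the two support tools are hypothetical names, not part of any framework. Tools that were never granted are denied outright (functionality), and tools marked irreversible refuse to run without a named human approver (autonomy). Scoping the credentials each tool runs with (permissions) has to happen in the tool's own backend and is outside this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when an irreversible tool is called without a named human approver."""


@dataclass(frozen=True)
class Tool:
    func: Callable[..., str]
    irreversible: bool = False


class ToolRegistry:
    """Holds only the tools granted to one agent; everything else is denied."""

    def __init__(self, granted: dict):
        self._granted = dict(granted)

    def call(self, name: str, args: dict, *, approved_by: Optional[str] = None) -> str:
        tool = self._granted.get(name)
        if tool is None:
            # Excessive functionality: the tool was never granted.
            raise PermissionError(f"tool {name!r} is not granted to this agent")
        if tool.irreversible and not approved_by:
            # Excessive autonomy: irreversible actions wait for a human.
            raise ApprovalRequired(f"tool {name!r} needs human approval")
        return tool.func(**args)


# Hypothetical tools for a support agent.
def read_ticket(ticket_id: int) -> str:
    return f"ticket {ticket_id}: (contents)"


def refund_payment(payment_id: int) -> str:
    return f"refunded payment {payment_id}"


support_agent = ToolRegistry({
    "read_ticket": Tool(read_ticket),
    "refund_payment": Tool(refund_payment, irreversible=True),
})
```

With this in place, `support_agent.call("refund_payment", {"payment_id": 9})` raises `ApprovalRequired` no matter why the model asked for it, which is the point of OWASP's last clause: the control does not need to know the cause of the bad call.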

LLM01 Prompt Injection changes character too. A chatbot mostly faces direct injection from the user it's talking to. An agent reads the open web, untrusted documents, and the outputs of other tools as a matter of routine — so indirect injection, where the hostile instruction rides in on a retrieved page, is part of the normal control flow, not an edge case. And LLM03 Supply Chain widens with every tool you bolt on: each plugin, each MCP server, each external model is one more dependency that can be poisoned or rug-pulled. Even LLM10 Unbounded Consumption has an agentic signature OWASP names outright — recursive tool calls, the loop that quietly bills you into next quarter.
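
The runaway-loop case is the easiest of these to bound in code. The following is a minimal sketch, assuming a plan/act loop in which a hypothetical `next_action` callable stands in for the model choosing a step and `execute` stands in for tool dispatch; `BudgetExceeded` and the default of eight calls are likewise illustrative choices, not recommendations from OWASP.

```python
from typing import Callable, List, Optional, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when one agent run uses up its tool-call allowance."""


def run_agent(
    next_action: Callable[[List[Tuple[str, str]]], Optional[str]],
    execute: Callable[[str], str],
    max_tool_calls: int = 8,
) -> List[Tuple[str, str]]:
    """Drive a plan/act loop, refusing to exceed max_tool_calls executions."""
    history: List[Tuple[str, str]] = []
    while True:
        action = next_action(history)  # stands in for the model choosing a step
        if action is None:             # the model signals it is done
            return history
        if len(history) >= max_tool_calls:
            # Fail closed: stop the run instead of letting the loop continue.
            raise BudgetExceeded(f"stopped after {max_tool_calls} tool calls")
        history.append((action, execute(action)))
```

The cap lives in the loop, not in the prompt, so a model that has been steered into calling the same tool forever still stops at the limit. A production version would likely also cap tokens, wall-clock time, and spend per run.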

The one mental shift the list is really asking for

If you take a single idea from the document, take this: treat the model as an untrusted user that can be socially engineered, and move your controls to the boundary around it. That reframing collapses six of the ten entries into one discipline.

It means you validate output before it acts — the same scrutiny you'd give a web form, applied to the model's tool calls. It means least privilege as the default: the agent gets the minimum tools, each scoped to the minimum permission, and irreversible actions wait behind a human gate. It means guardrails on the boundary, not faith in the weights — the injection-defense layer sits where untrusted input enters, not inside the model's good intentions. And — straight from LLM07 — it means your system prompt holds no credentials and no authorization logic, because a prompt is a thing that leaks.
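
The LLM07 point can be made concrete with a small sketch. Everything named here is an assumption for illustration: the `BILLING_API_KEY` variable, the role table, and the `dispatch` function are hypothetical. The system prompt carries behaviour only, authorization is decided in code from the authenticated caller's role, and the credential is loaded at call time so it never passes through the model's context.

```python
import os
from typing import Mapping

# The prompt carries behaviour only: no keys, no role rules, nothing worth leaking.
SYSTEM_PROMPT = "You are a billing assistant. Use the provided tools to answer questions."

# Hypothetical role table. Authorization is keyed on the authenticated caller,
# never on what the model says about the caller.
ROLE_PERMISSIONS = {
    "support": {"read_invoice"},
    "finance": {"read_invoice", "issue_credit"},
}


def dispatch(user_role: str, tool_name: str, environ: Mapping[str, str] = os.environ) -> dict:
    """Authorize against the caller's role, then load the credential at call time."""
    if tool_name not in ROLE_PERMISSIONS.get(user_role, set()):
        raise PermissionError(f"role {user_role!r} may not call {tool_name!r}")
    api_key = environ.get("BILLING_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("BILLING_API_KEY is not configured")
    # A real implementation would call the billing service with api_key here.
    return {"tool": tool_name, "role": user_role, "credential_loaded": True}
```

If the prompt leaks in full, the attacker learns the assistant's tone and nothing else. A model persuaded that "the user is an admin" still cannot issue a credit for a support-role caller, because that decision is made in `dispatch`, where the model's opinion is not an input.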

None of that is a model decision. It's an architecture decision, which is the whole point the list keeps making and the security review keeps missing. The OWASP Top 10 is not a checklist for choosing a model. It's a checklist for not trusting the one you chose.