A team I traded notes with had shipped a support agent that would, every so often, simply never answer. No crash, no timeout error, no stuck tool. The trace showed the model calling a tool, reading the result, calling a tool, reading the result — competent, relevant calls, over and over — until the run hit its step cap and died with nothing to show the user. The bug was one line in their request. They had set tool_choice: "required" at the top of the loop, reasoning that an agent should, well, use its tools. What they had actually done was tell the model it was never allowed to be finished.

That is the thing about tool_choice. It looks like a knob for making a reluctant model pick up its tools. It is really a decision about who gets to say the conversation is over — and once you see it that way, both the common bugs stop being mysterious.

The same ladder, three vocabularies#

Strip away the naming and every major API exposes the same four positions. OpenAI spells them none, auto, required, and a named function object. Anthropic uses {type: "auto"}, {type: "any"}, {type: "tool", name: ...}, and {type: "none"}. Gemini sets function_calling_config.mode to AUTO, ANY, or NONE, with an allowed_function_names list to fence in which tools ANY may reach for.

Line them up and they say the same four things:

The important word in all four is turn. tool_choice is a per-request setting, and its whole meaning is about what a single turn is permitted to be.

Why "required" traps the loop#

An agent loop is a simple machine: call the model, run whatever tool it asked for, feed the result back, call again. The loop ends on the turn the model returns text instead of a tool call. Anthropic states this exactly — the runner loops until Claude returns a message without a tool use, which is the difference between a stop_reason of tool_use and one of end_turn.

Now re-read what required does. It makes every turn a tool call by definition. Which means the one turn that would end the loop — the plain-text answer — is precisely the turn you have forbidden. The model isn't confused and it isn't looping on a bad observation; it is doing exactly what you asked, forever. OpenAI's own forum has the bug report in the title: infinite loop with tool_choice: "required".

auto asks the model "do you want to use a tool here?" required tells it "you may not stop." Only one of those is a question an agent can ever answer with "I'm done."

The fix is not a bigger step cap — that just buys more wasted tokens before the same failure. The fix is to treat tool_choice as the per-turn control it is: force the turns that genuinely must call a tool (a mandatory lookup before the model is allowed to speak), and set it back to auto the instant you want the agent to be able to finish. If you find yourself setting it once, for the whole run, you have almost certainly set it wrong.

But "auto" isn't neutral either#

The reflex after reading the above is to leave everything on auto and move on. Watch that one too, because its failure is the opposite and quieter. A model on auto with a rich enough context window will cheerfully skip the tool and answer from what it already half-remembers — the classic case of an agent that "knows" a stale fact instead of calling the tool that would ground it. auto's failure is under-calling; required's failure is never-stopping. They are two ends of the same axis, and the axis is who decides whether this turn does real work or wraps up.

Which is why the useful mental model isn't "how do I make the model use tools." It's: for this turn, may the model stop? If a lookup is non-negotiable before any answer — a policy check, a required grounding retrieval — force it. If the model should be free to conclude, hand the decision back. Getting the tool descriptions right reduces how often you need to override the model at all; tool_choice is the override for when a good description isn't enough.

The one place forcing is simply correct#

There is a use where forcing a specific tool is not a footgun but the entire point: structured output. Define one tool whose input schema is exactly the JSON you want, force it, and read the object straight out of the tool-call arguments — you never execute anything. Anthropic's guidance is to provide a single tool and set tool_choice to it. Here the "can never return text" property is a feature: you don't want prose, you want the schema.

Two caveats keep this honest. First, forcing a named tool tends to collapse the turn to a single call — OpenAI users have long noted that giving tool_choice a value switches off parallel tool calls, and Anthropic's disable_parallel_tool_use formalizes the same "exactly one" behavior. So "force this one tool, but also fan out to others" is an awkward ask on today's APIs — the same tension you weigh when choosing parallel versus sequential calls. Second, the forced-tool JSON trick is now partly obsolete: most providers ship a native structured-output mode that constrains decoding to your schema without borrowing the tool machinery. If all you wanted was valid JSON, reach for that first.

The one-line takeaway is the one that team learned the slow way: tool_choice is not a dial you set and forget. It is a per-turn answer to a single question — is the model allowed to be done yet? — and the bugs on both sides come from answering it once, globally, when the right answer changes every turn.