The pitch for outcome-based pricing is the most buyer-friendly sentence in enterprise software: you only pay when it works. Sierra, the customer-service agent company Bret Taylor built after leaving the Salesforce co-CEO chair, says it plainly — it charges clients only when its AI resolves the issue without a human, and Taylor believes this "will become the standard commercial model for intelligent agents." Intercom's Fin bills $0.99 per resolution. Salesforce's Agentforce launched at $2 per conversation. The seat is dying; the result is the new line item.
It sounds like the vendor finally has skin in the game. Read the contract and you find the vendor also holds the whistle, the stopwatch, and the scorebook.
A contingent fee where the contractor writes the invoice
Token pricing had one underrated virtue: the buyer could count. A million tokens is a million tokens. You could meter it yourself, reconcile it against the invoice, and argue from your own logs. The unit was boring and observable, which is exactly what you want in a unit.
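What "argue from your own logs" looks like in practice can be sketched in a few lines. This is a minimal illustration, not any vendor's billing API: the log fields (`month`, `input_tokens`, `output_tokens`), the invoice shape, and the 1% tolerance are all assumptions made for the example.

```python
from collections import defaultdict


def reconcile_tokens(own_log, invoice, tolerance=0.01):
    """Compare token totals from the buyer's own request log with invoiced totals.

    own_log:  iterable of dicts with hypothetical keys
              'month', 'input_tokens', 'output_tokens'.
    invoice:  dict mapping month -> tokens the vendor billed (hypothetical shape).
    tolerance: allowed relative gap before a month is flagged (assumed 1%).
    """
    # Sum what the buyer's own logs say was consumed in each month.
    metered = defaultdict(int)
    for entry in own_log:
        metered[entry["month"]] += entry["input_tokens"] + entry["output_tokens"]

    # Line each invoiced month up against the buyer's own count.
    report = {}
    for month, billed in invoice.items():
        counted = metered.get(month, 0)
        gap = billed - counted
        report[month] = {
            "counted": counted,
            "billed": billed,
            "gap": gap,
            "within_tolerance": abs(gap) <= tolerance * billed,
        }
    return report
```

The point is less the code than the fact that it can be written at all: both sides of the comparison are integers the buyer already holds.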
An outcome is not observable in that way. It is a judgment. "Resolution" is not a quantity that exists in the world to be measured; it is a verdict that someone has to render. And in every one of these contracts, the someone is the party that gets paid when the verdict comes back "resolved."
This is a contingent-fee arrangement: pay for performance, the model we use for plaintiffs' lawyers and debt collectors precisely because we can verify the outcome (a judgment, a recovered dollar). Strip out the verification and you have something stranger: a contractor who performs the work, declares whether the work succeeded, and bills you on the strength of its own declaration. The incentive isn't hidden. It's structural. Every borderline interaction is worth $0.99 if you call it a win and $0 if you call it a wash, and the same company makes the call and cashes the check.
Token pricing let the buyer count. Outcome pricing asks the buyer to trust the seller's count of how often the seller won.
What "resolved" actually means when Fin says it
You do not have to theorize about the incentive. Read Fin's own documentation on how resolutions are counted. There are two kinds.
A confirmed resolution is what you'd expect: the customer types "thanks, that worked." An assumed resolution is the interesting one. Per Fin's docs, if the customer reads Fin's answer and simply leaves the conversation without asking for more help, that counts, and bills, as a resolution.
Sit with that. The billable event is the customer's silence. A person who got their answer and a person who gave up in disgust exit the chat the same way: they close the tab. Fin's policy says it won't bill when its model detects frustration and escalates to a human, which is a real and creditable guardrail. But the default reading of "user stopped talking" is "user satisfied," and the default reading favors the vendor's ledger. The buyer is now paying for the absence of a complaint.
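To see how much of an invoice can rest on silence, here is a toy model of the two resolution types described above. It is a sketch under stated assumptions, not Fin's actual classifier: the `Conversation` fields and the three-way rule are hypothetical simplifications, and only the $0.99 price comes from the figures in this piece.

```python
from dataclasses import dataclass

PRICE_PER_RESOLUTION = 0.99  # Fin's headline price per resolution


@dataclass
class Conversation:
    customer_confirmed: bool   # hypothetical flag: customer said it worked
    escalated_to_human: bool   # hypothetical flag: handed off, so not billed


def classify(conv):
    """Toy rule: escalations are free, confirmations bill, silence also bills."""
    if conv.escalated_to_human:
        return "not_billed"
    if conv.customer_confirmed:
        return "confirmed"
    return "assumed"  # the customer simply left


def bill_breakdown(conversations):
    """Count each category and report the share of the bill resting on silence."""
    counts = {"confirmed": 0, "assumed": 0, "not_billed": 0}
    for conv in conversations:
        counts[classify(conv)] += 1
    billable = counts["confirmed"] + counts["assumed"]
    return {
        **counts,
        "total_bill": round(billable * PRICE_PER_RESOLUTION, 2),
        "assumed_share": counts["assumed"] / billable if billable else 0.0,
    }
```

The `assumed_share` figure is the one worth asking a vendor for: it is the fraction of the bill that depends on how someone else interpreted your customer closing the tab.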
The numbers, and where they mislead
| Model | Unit | Headline price | Who defines / counts the unit | Buyer can independently verify? |
|---|---|---|---|---|
| Token / usage | 1M tokens | ~$0.07–$20 (varies by model) | Neither — it's metered | Yes — logs are countable |
| Per resolution (Fin) | A "resolution" | $0.99 | Vendor (incl. assumed resolutions) | Weakly — depends on vendor's classifier |
| Per conversation (Agentforce, orig.) | A 24-hr conversation window | $2.00 | Vendor's session boundary | Partially — windows are observable |
| Per action (Agentforce Flex) | One agent action | $0.10 | Vendor's action taxonomy | Partially |
Two things the table understates.
First, the per-conversation model showed how much the boundary matters. Under Agentforce's original scheme, a conversation ran from first response until resolved or 24 hours of inactivity, whichever came first — so a customer who replied the next morning started a fresh $2 meter. Buyers couldn't forecast it, and Salesforce moved to a Flex Credits model at roughly $0.10 per action in 2025. The lesson wasn't that $2 was too high. It was that the seller's definition of "one conversation" silently set the bill.
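The boundary effect is easy to reproduce. The sketch below encodes the rule as this piece describes it (a conversation closes on resolution or after 24 hours of inactivity); it is a simplified reading for illustration, not Salesforce's billing logic, and the `(timestamp, marks_resolved)` event shape is an assumption.

```python
from datetime import timedelta

WINDOW = timedelta(hours=24)       # inactivity limit as described in the text
PRICE_PER_CONVERSATION = 2.00      # Agentforce's original headline price


def billable_conversations(events, window=WINDOW):
    """Count billing windows in one customer thread.

    events: list of (timestamp, marks_resolved) tuples, a hypothetical shape.
    A new window opens when no conversation is open, or when the gap since
    the previous message reaches `window`.
    """
    count = 0
    is_open = False
    last_seen = None
    for timestamp, marks_resolved in sorted(events):
        if not is_open or timestamp - last_seen >= window:
            count += 1
            is_open = True
        last_seen = timestamp
        if marks_resolved:
            is_open = False  # the next message starts a fresh meter
    return count


def thread_cost(events, price=PRICE_PER_CONVERSATION):
    """Bill for one thread under the per-conversation model."""
    return billable_conversations(events) * price
```

Feed it the same three messages twice, once with the middle message marked resolved and once without, and the bill for an identical exchange doubles. Nothing about the customer's problem changed; only the session boundary did.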
Second, "you only pay for outcomes" quietly implies the price per outcome is fixed and low. It is neither. Gartner projects that by 2030 the GenAI cost per resolution in customer service will exceed $3 — surpassing the cost of an offshore human agent. The unit was supposed to be the safe number. It is climbing, and it is climbing on a definition you don't own.
The countability problem is the whole problem
This is not the argument that AI bills are going up. They are, but that's a separate piece. This is narrower and, I think, more durable: the industry has moved the unit of sale from something the buyer can count to something only the seller can adjudicate, and it has done so while calling the move "alignment."
There is a clean tell for whether an outcome metric is honest: ask who would win a dispute over a borderline case, and whether you have the data to even have the dispute. With tokens, you have your own logs. With a "resolution," you have the vendor's classifier and the vendor's definition of what your customer's silence meant. Sierra's own framing concedes the soft spot — what counts as a successful resolution is a negotiated definition, and definitions drift.
So negotiate it. Make "resolution" a term in the contract, not a setting in the vendor's dashboard. Demand the raw transcripts behind assumed resolutions and sample them yourself. Cap re-bills, audit reversals, and treat the resolution rate the way you'd treat any KPI the counterparty both produces and profits from — as marketing until proven otherwise. Zylo's 2026 index found 78% of IT leaders already hit with unexpected charges from usage and AI pricing; the next surprise won't be the rate. It'll be the count.
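Sampling the transcripts yourself does not require reading all of them. A rough sketch of the audit, assuming you can export conversation IDs for assumed resolutions and have a human reviewer mark each sampled one as resolved or not: the function names are hypothetical, and the Wilson score interval is simply one standard way to put error bars on a sampled rate, not something any vendor contract prescribes.

```python
import math
import random


def sample_for_review(assumed_ids, n, seed=0):
    """Draw a reproducible random sample of assumed-resolution IDs to review."""
    rng = random.Random(seed)
    return rng.sample(assumed_ids, min(n, len(assumed_ids)))


def wilson_interval(unresolved, n, z=1.96):
    """Wilson score interval (about 95% at z=1.96) for the unresolved rate."""
    if n <= 0:
        raise ValueError("need at least one reviewed transcript")
    p = unresolved / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def disputed_range(unresolved, n, total_assumed, price=0.99):
    """Translate the sampled rate into a dollar range worth disputing."""
    low, high = wilson_interval(unresolved, n)
    return low * total_assumed * price, high * total_assumed * price
```

If a reviewer finds 30 unresolved conversations in a sample of 200, the interval gives you a defensible range for how many of the billed assumed resolutions were no such thing, and `disputed_range` turns that into a number you can put in front of the vendor.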
Pay for outcomes if you like. Just remember you're also paying the referee.



