Here is a fact about your long-running agent that almost no dashboard is showing you: the moment it summarizes its own history to stay under the token budget, it is editing its own rulebook — and it edits in one direction.

Compaction is the technique that makes long-horizon agents possible at all. When a run approaches the context-window limit, the agent condenses old turns — tool calls, file reads, dead-end reasoning — into a shorter summary and throws the originals away. We've argued before about whether an agent should compact its own context; this is the safety cost that argument was missing. Claude Code triggers this automatically at around 83.5% of the window. The gains are real and well-advertised: Anthropic reports that context editing plus a memory tool lifted agentic-search performance by 39% and cut token consumption by 84% across a 100-turn evaluation. That is the number everyone quotes.

Here is the number nobody quotes. In a June 2026 study bluntly titled Governance Decay, researchers ran 1,323 episodes in which an agent was given a policy — a rule it obeyed with perfect reliability while the rule sat in full context. Then they let the session run long enough to compact. Violation of that same rule rose from 0% to 30%, and on some models to 59%. The agent had not been argued out of the rule. It had not been jailbroken. The rule had simply not survived the summary.

The tell: it's a coin flip on the constraint itself#

The mechanism is almost insultingly simple, and it is the whole story. When the constraint survived the summarization pass, violation stayed at 0%. When the constraint was dropped from the summary, violation jumped to 38%. Compaction is not degrading the agent's judgment. It is running a lossy compressor over the one paragraph that was holding the agent's behavior in place, and whether your guardrail lives or dies is decided by a summarizer that was never told the paragraph was load-bearing.

If that were random, you could bound the risk. It is not random, and this is the one genuinely non-obvious thing worth taking away from all of this.

Summarizers keep the verbs, not the prohibitions#

A companion paper studied which constraints decay, and found a clean asymmetry. Prohibitions — "never write to the production table," "don't email outside the org" — erode under context pressure. Requirements — "always log the run," "attach the ticket number" — persist. In their runs, compliance with omission-style ("don't") constraints fell from 73% at turn 5 to 33% at turn 16, while commission-style ("do") constraints held at 100% the whole way.

Think about what a summarizer optimizes for and this stops being surprising. It keeps what is generating recent, visible activity. A "do" rule keeps producing artifacts — a log line, a ticket, a field on every record — so the summary keeps re-grounding it. A "don't" rule, when it is working, produces nothing. There is no event to reference, no recent action to compress toward. A guardrail doing its job is indistinguishable from silence, and silence is the first thing a compressor discards.

Which yields the counterintuitive shape of the failure: your agent's safety rails decay fastest precisely when the agent has been behaving. The longer nothing goes wrong, the less evidence the summarizer has that the prohibition ever mattered, and the more confidently it drops it — right up until the turn where it matters.

It gets worse in the direction you'd fear. The soft, deployment-specific policies — the stuff that isn't a universal safety norm but is the difference between your agent and a liability — decayed 8.3× faster than hard, well-known norms. And because compaction is just text-in, text-out, it's reachable: the same researchers built a Compaction-Eviction Attack, an injection that biases the summarizer toward dropping one chosen rule. Optimized, it defeated every model they tried. Your compaction step is an attack surface, not merely a reliability quirk.

What to actually do#

The good news is that the fix is embarrassingly cheap, because the problem is structural, not intellectual. The rule doesn't need to be understood better; it needs to not be in the pile you shred. The study's Constraint Pinning — keeping the governing constraints in a region compaction can't touch — restored violation to 0% at under 0.5% token overhead. That is the cheapest safety intervention in this entire beat.

Concretely, for anyone shipping a long-running agent this quarter:

The framing that got us here — compaction as a lossless checkpoint you can trust — is the bug. It is a lossy, biased, adversary-reachable rewrite of the exact tokens you were relying on to keep the agent in bounds. The 84% you saved on tokens is real. So is the 30% of the time your agent now does the thing you forbade. Measure both.