Benchmarks
Measured context savings on real payloads.
A full working session produces a lot of raw output — page snapshots, issue lists, logs, test runs. Left unmanaged, every byte lands in the context window and crowds out the actual work. context-mode keeps that raw output in a sandbox and returns only the answer.
Session-level result
Across one full session, about 315 KB of raw output collapses to about 5.4 KB in context, a saving of roughly 98.3%. In practice, that stretches the time before context fills from about 30 minutes to about 3 hours of useful work.
The savings come from a single discipline: the model writes a program that processes the data and prints only the result, so raw bytes never enter context. See Context protection for how the routing enforces this.
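As a rough illustration of that discipline, the sketch below summarizes a log and prints only the answer. It is not context-mode's own code: the log is synthetic stand-in data, and `makeSyntheticLog` and `summarizeStatuses` are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch: process raw data in the sandbox, print only the result.
// The log is synthetic stand-in data, not one of the benchmark payloads.
function makeSyntheticLog(lines: number): string {
  const rows: string[] = [];
  for (let i = 0; i < lines; i++) {
    const status = i % 50 === 0 ? 500 : i % 10 === 0 ? 404 : 200;
    rows.push(`10.0.0.${i % 255} - - "GET /page/${i} HTTP/1.1" ${status} 1024`);
  }
  return rows.join("\n");
}

// Count requests per HTTP status code and render a one-line summary.
function summarizeStatuses(rawLog: string): string {
  const counts = new Map<string, number>();
  for (const line of rawLog.split("\n")) {
    const match = line.match(/" (\d{3}) /);
    if (!match) continue;
    const status = match[1];
    counts.set(status, (counts.get(status) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([status, n]) => `${status}: ${n}`)
    .join(", ");
}

const rawLog = makeSyntheticLog(500);       // stays inside the sandbox
const summary = summarizeStatuses(rawLog);  // the only text that is printed
console.log(summary);
```

Only the short summary line is printed, so the 500 raw log lines never need to reach the context window.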
Per-task results
Each row is a real data-heavy task. Raw is the size of the output the task produces; In context is what actually reaches the window after context-mode processes it.
| Task | Raw | In context | Saved |
|---|---|---|---|
| Playwright snapshot | 56.2 KB | 299 B | 99.5% |
| GitHub issues, 20 items | 58.9 KB | 1.1 KB | 98.1% |
| Access log, 500 lines | 45.1 KB | 155 B | 99.7% |
| Analytics CSV, 500 rows | 85.5 KB | 222 B | 99.7% |
| Git log, 153 commits | 11.6 KB | 107 B | 99.1% |
| Test output, 30 suites | 6.0 KB | 337 B | 94.5% |
How the savings are calculated
The savings ratio is simple: one minus the bytes returned to context divided by the bytes processed in the sandbox.
const saved = 1 - bytesReturnedToContext / bytesProcessed;
// Playwright snapshot: 1 - 299 / 57548 ≈ 0.995 (56.2 KB = 57548 bytes)

A task that processes 56.2 KB and returns a 299-byte answer saves about 99.5% of the bytes it would otherwise have spent. The more data a task touches and the smaller its answer, the higher the ratio, which is why filtering, counting, and summarizing tasks routinely land at 99% and higher.
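The same formula can be checked against every row of the per-task table. The sketch below is illustrative only: `toBytes` and `savedPercent` are hypothetical helper names, and sizes are converted assuming 1 KB = 1024 bytes, in line with the 56.2 KB = 57548 bytes conversion used for the Playwright row.

```typescript
// Convert a size such as "56.2 KB" or "299 B" to bytes (assumes 1 KB = 1024 bytes).
function toBytes(size: string): number {
  const [value, unit] = size.split(" ");
  return unit === "KB" ? Number(value) * 1024 : Number(value);
}

// Apply saved = 1 - bytesReturnedToContext / bytesProcessed, as a percentage string.
function savedPercent(raw: string, inContext: string): string {
  const saved = 1 - toBytes(inContext) / toBytes(raw);
  return (saved * 100).toFixed(1) + "%";
}

// Task, Raw, In context: copied from the per-task results table.
const rows: [string, string, string][] = [
  ["Playwright snapshot", "56.2 KB", "299 B"],
  ["GitHub issues, 20 items", "58.9 KB", "1.1 KB"],
  ["Access log, 500 lines", "45.1 KB", "155 B"],
  ["Analytics CSV, 500 rows", "85.5 KB", "222 B"],
  ["Git log, 153 commits", "11.6 KB", "107 B"],
  ["Test output, 30 suites", "6.0 KB", "337 B"],
];

for (const [task, raw, inContext] of rows) {
  console.log(`${task}: ${savedPercent(raw, inContext)}`);
}
```

Each printed percentage should agree with the Saved column, since that column is the same ratio rounded to one decimal place.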
How to reproduce
These numbers are not synthetic — they come from ordinary tasks. Reproduce them on your own data:
Run a data-heavy task
Ask your agent to do something that produces a large payload — parse a long log, list issues, summarize a build, or capture a page snapshot.
Check your savings
In Claude Code and Cursor, run the /context-mode:ctx-stats slash command. On other hosts, ask your agent for context-mode stats so it calls the ctx_stats tool.
The report shows the bytes processed, the bytes returned to context, and the savings ratio for your session, with a per-tool breakdown.
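To see how a per-tool breakdown rolls up into one session ratio, the sketch below sums entries and applies the savings formula to the totals. The `StatsEntry` shape and `sessionSavings` function are assumptions made for illustration, since the actual ctx_stats report format is not shown here. The entries reuse the per-task table values (assuming 1 KB = 1024 bytes) as stand-ins.

```typescript
// Hypothetical entry shape; the real ctx_stats output may be structured differently.
interface StatsEntry {
  label: string;
  bytesProcessed: number;
  bytesReturned: number;
}

const kb = (n: number): number => Math.round(n * 1024); // assumes 1 KB = 1024 bytes

// Stand-in entries taken from the per-task results table.
const entries: StatsEntry[] = [
  { label: "Playwright snapshot", bytesProcessed: kb(56.2), bytesReturned: 299 },
  { label: "GitHub issues", bytesProcessed: kb(58.9), bytesReturned: kb(1.1) },
  { label: "Access log", bytesProcessed: kb(45.1), bytesReturned: 155 },
  { label: "Analytics CSV", bytesProcessed: kb(85.5), bytesReturned: 222 },
  { label: "Git log", bytesProcessed: kb(11.6), bytesReturned: 107 },
  { label: "Test output", bytesProcessed: kb(6.0), bytesReturned: 337 },
];

// Sum both byte counts, then compute the ratio on the totals (not an average of ratios).
function sessionSavings(items: StatsEntry[]) {
  const processed = items.reduce((sum, e) => sum + e.bytesProcessed, 0);
  const returned = items.reduce((sum, e) => sum + e.bytesReturned, 0);
  return { processed, returned, saved: 1 - returned / processed };
}

const totals = sessionSavings(entries);
console.log(
  `processed ${totals.processed} B, returned ${totals.returned} B, ` +
    `saved ${(totals.saved * 100).toFixed(1)}%`
);
```

The ratio is computed on summed bytes rather than by averaging per-entry percentages, so large payloads weigh more than small ones. These six rows add up to less than the session-level total, so this figure is not expected to match the session-level result.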
Your ratios will track the table above for similar tasks. Workloads with larger raw payloads and small answers save the most.