Benchmarks

A full working session produces a lot of raw output — page snapshots, issue lists, logs, test runs. Left unmanaged, every byte lands in the context window and crowds out the actual work. context-mode keeps that raw output in a sandbox and returns only the answer.

Session-level result

Across one full session, about 315 KB of raw output collapses to about 5.4 KB in context, a saving of roughly 98.3%. In practice, that extends the productive window from about 30 minutes before the context fills to about 3 hours of useful work.

The savings come from a single discipline: the model writes a program that processes the data and prints only the result, so raw bytes never enter context. See Context protection for how the routing enforces this.
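To make that discipline concrete, here is a minimal sketch of the kind of program the model might write. The sample log lines and the `summarizeStatuses` helper are hypothetical and are not part of context-mode; a real run would read the raw file inside the sandbox rather than embed it.

```typescript
// Hypothetical sample standing in for a large raw access log.
const rawLog: string = [
  '10.0.0.1 - - [01/Jan/2025:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
  '10.0.0.2 - - [01/Jan/2025:00:00:02 +0000] "GET /missing HTTP/1.1" 404 312',
  '10.0.0.1 - - [01/Jan/2025:00:00:03 +0000] "POST /api/login HTTP/1.1" 500 87',
  '10.0.0.3 - - [01/Jan/2025:00:00:04 +0000] "GET /index.html HTTP/1.1" 200 5120',
].join("\n");

// Count requests per HTTP status code without printing any raw line.
function summarizeStatuses(log: string): string {
  const counts = new Map<string, number>();
  for (const line of log.split("\n")) {
    // In Common Log Format the status code follows the quoted request.
    const match = line.match(/" (\d{3}) /);
    if (!match) continue;
    counts.set(match[1], (counts.get(match[1]) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([status, n]) => `${status}: ${n}`)
    .join(", ");
}

const summary = summarizeStatuses(rawLog);
// Only this one line is printed, so only it would reach the context window.
console.log(summary); // prints "200: 2, 404: 1, 500: 1"
```

The raw log stays in a local variable and only the short summary is printed, which is the property the savings depend on.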

Per-task results

Each row is a real data-heavy task. Raw is the size of the output the task produces; In context is what actually reaches the window after context-mode processes it.

Task	Raw	In context	Saved
Playwright snapshot	56.2 KB	299 B	99.5%
GitHub issues, 20 items	58.9 KB	1.1 KB	98.1%
Access log, 500 lines	45.1 KB	155 B	99.7%
Analytics CSV, 500 rows	85.5 KB	222 B	99.7%
Git log, 153 commits	11.6 KB	107 B	99.1%
Test output, 30 suites	6.0 KB	337 B	94.5%
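
The Saved column can be rechecked from the other two columns. The sketch below copies the sizes from the table and assumes 1 KB means 1024 bytes, which matches the 57548-byte figure used for the 56.2 KB snapshot on this page; the `Row` type and `savedPercent` helper are illustrative names only.

```typescript
// Sizes copied from the table above; KB is assumed to mean 1024 bytes.
type Row = { task: string; rawBytes: number; inContextBytes: number };
const KB = 1024;

const rows: Row[] = [
  { task: "Playwright snapshot", rawBytes: 56.2 * KB, inContextBytes: 299 },
  { task: "GitHub issues, 20 items", rawBytes: 58.9 * KB, inContextBytes: 1.1 * KB },
  { task: "Access log, 500 lines", rawBytes: 45.1 * KB, inContextBytes: 155 },
  { task: "Analytics CSV, 500 rows", rawBytes: 85.5 * KB, inContextBytes: 222 },
  { task: "Git log, 153 commits", rawBytes: 11.6 * KB, inContextBytes: 107 },
  { task: "Test output, 30 suites", rawBytes: 6.0 * KB, inContextBytes: 337 },
];

// Share of raw bytes that never reach context, as a one-decimal percentage.
function savedPercent(row: Row): string {
  return ((1 - row.inContextBytes / row.rawBytes) * 100).toFixed(1) + "%";
}

for (const row of rows) {
  console.log(`${row.task}: ${savedPercent(row)}`);
}
```

Under that assumption, each computed percentage agrees with the Saved value in the table.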

How the savings are calculated

The savings ratio is the fraction of processed bytes that never reach context: one minus the bytes returned to context divided by the bytes processed in the sandbox.

Savings ratio

const saved = 1 - bytesReturnedToContext / bytesProcessed;
// Playwright snapshot: 1 - 299 / 57548 ≈ 0.995  (56.2 KB = 57548 bytes)

A task that processes 56.2 KB and returns a 299-byte answer saves about 99.5% of the bytes it would otherwise have spent. The more data a task touches and the smaller its answer, the higher the ratio — which is why filtering, counting, and summarizing tasks routinely land at 99% and higher.
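
The effect of payload size can be seen by holding the answer size fixed and growing the raw input. The sizes in this sketch are hypothetical values chosen for illustration, not measurements, and `savedFraction` is just the formula above wrapped in a function.

```typescript
// Hypothetical sizes chosen for illustration; they are not measurements.
const answerBytes = 300;
const rawSizesKB = [6, 60, 600];

// Same formula as above: the share of processed bytes kept out of context.
function savedFraction(bytesProcessed: number, bytesReturned: number): number {
  return 1 - bytesReturned / bytesProcessed;
}

const lines = rawSizesKB.map((kb) => {
  const saved = savedFraction(kb * 1024, answerBytes);
  return `${kb} KB raw -> ${(saved * 100).toFixed(2)}% saved`;
});

console.log(lines.join("\n"));
// 6 KB raw -> 95.12% saved
// 60 KB raw -> 99.51% saved
// 600 KB raw -> 99.95% saved
```

With the answer size held constant, each tenfold increase in raw input moves the ratio closer to 100%.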

For similar tasks, expect ratios close to those in the table above. Workloads with large raw payloads and small answers save the most.

Related

Tools

The ctx_* tools that produce these savings.

Context protection

How routing keeps raw bytes out of the window.
