// The numbers

Numbers a CFO would believe.

Every figure on this page is independently verifiable from your own machine. If our reported savings don’t match yours, we refund the month — no questions.

Avg compression: 71%. Tokens dropped on a typical agent run.

Better answers: +6 pts. Pass-rate lift on the Django bug-fix slice.

Overhead: 1.4 s. Faster than the model’s own latency.

Team-of-5 saves: $4,470 / mo. Typical agentic workload · see math

Same context, every frontier model.

We re-ran the SWE-bench Verified harness against five models with the same input shape. Tetris' compression layer is model-agnostic, so the token reduction carries over to whichever model you switch to.

Model	Input rate	Tokens (before → after)	Compression	Pass@1 Δ	$/task saved
Claude Opus 4.7	$15 / Mtok	63,113 → 18,289	71.0%	+0.04	$0.6724
Claude Sonnet 4.6	$3 / Mtok	63,113 → 18,832	70.2%	+0.06	$0.1328
GPT-5.5	$5 / Mtok	63,113 → 19,107	69.7%	+0.03	$0.2200
GPT-5	$1.25 / Mtok	63,113 → 19,440	69.2%	+0.02	$0.0546
Claude Haiku 4.5	$1 / Mtok	63,113 → 19,718	68.8%	+0.01	$0.0434

All five runs use the same input session, replayed across each target model. The compression layer is model-agnostic; the dollar gain scales with model price.
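The Compression and $/task saved columns follow directly from the token counts and the input rate. A minimal sketch of that arithmetic, assuming savings are counted on input tokens only (the function names are illustrative, not part of Tetris):

```python
# Token counts and input rates copied from the table above.
ROWS = [
    # (model, input rate in $ per million tokens, tokens before, tokens after)
    ("Claude Opus 4.7", 15.00, 63_113, 18_289),
    ("Claude Sonnet 4.6", 3.00, 63_113, 18_832),
    ("GPT-5.5", 5.00, 63_113, 19_107),
    ("GPT-5", 1.25, 63_113, 19_440),
    ("Claude Haiku 4.5", 1.00, 63_113, 19_718),
]


def compression_pct(before: int, after: int) -> float:
    """Share of input tokens removed, as a percentage."""
    return 100.0 * (before - after) / before


def saved_per_task(rate_per_mtok: float, before: int, after: int) -> float:
    """Input-token dollars saved on a single task."""
    return (before - after) * rate_per_mtok / 1_000_000


for model, rate, before, after in ROWS:
    pct = compression_pct(before, after)
    saved = saved_per_task(rate, before, after)
    print(f"{model:<18} {pct:5.1f}%  ${saved:.4f}")
```

For the Opus row this gives 71.0% and $0.6724, matching the table; the compression stays within a few points across models while the dollar figure tracks the input rate.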

Same recipe, every codebase shape.

We picked four codebases that stress different strategies: a Python web app (Django), a monolithic C/CUDA file (llm.c), a Rust workspace (ripgrep), and a JS framework with deep dependency graphs (Next.js).

Codebase	Lang	Files in context	Tokens before → after	Compression	Dominant strategy
Django formsets fix	Python	312	28,419 → 12,284	56.8%	Pattern dedup + smart code packing
llm.c grad clipping	C / CUDA	14	46,229 → 11,216	75.7%	Smart code packing
ripgrep regex feature	Rust	1,848	63,113 → 18,289	71.0%	Smart code packing + relevance pruning
Next.js middleware	TypeScript	2,104	71,802 → 19,884	72.3%	Relevance-ranked import graph + dedup

See /examples for full session traces and failure-mode analysis.

Strategy ledger.

Each row is one (strategy, ratio) combination scored on the SWE-bench Verified suite. We ship every approach. The pipeline picks the right one per file at runtime.

Strategy	Ratio	Achieved	Δ pass@1	$/task	p50 latency	p95 latency
`rome_prune`	4	1.00	+0.040	$0.0075	537 ms	1238 ms
`dedupe`	4	1.01	+0.038	$0.0075	530 ms	1244 ms
`repo_graph_rank`	8	1.00	+0.036	$0.0075	575 ms	1307 ms
`ast_pack`	8	1.00	+0.014	$0.0075	586 ms	1292 ms
`truncate_head`	4	8.74	−0.046	$0.0065	470 ms	965 ms
`truncate_head`	8	12.02	−0.060	$0.0065	514 ms	994 ms

Rows with a negative Δ pass@1 (`truncate_head`) are shipped intentionally as the floor strategy, a last-resort budget squeeze. The pipeline picks them only when no semantic strategy fits the budget.
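The selection rule described above can be sketched roughly as follows. This is not Tetris' actual pipeline: the `Candidate` type, the `pick` function, the per-file token counts, and the choice to rank fitting strategies by Δ pass@1 are all assumptions made for illustration. Only the strategy names and Δ pass@1 values come from the ledger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    strategy: str        # name from the ledger, e.g. "dedupe"
    tokens_after: int    # hypothetical: output size of this strategy for one file
    delta_pass1: float   # Δ pass@1 from the ledger
    semantic: bool       # False only for the truncate_head floor


def pick(candidates: list[Candidate], budget: int) -> Optional[Candidate]:
    """Prefer semantic strategies that fit the budget; fall back to the floor."""
    fitting = [c for c in candidates if c.tokens_after <= budget]
    semantic = [c for c in fitting if c.semantic]
    pool = semantic or fitting  # floor strategies are used only if nothing semantic fits
    return max(pool, key=lambda c: c.delta_pass1, default=None)


# Token counts below are made up; the Δ pass@1 values are the ledger's.
candidates = [
    Candidate("rome_prune", 9_000, 0.040, True),
    Candidate("dedupe", 11_000, 0.038, True),
    Candidate("truncate_head", 4_000, -0.046, False),
]

print(pick(candidates, 10_000).strategy)  # rome_prune
print(pick(candidates, 5_000).strategy)   # truncate_head
print(pick(candidates, 1_000))            # None
```

With a 10,000-token budget a semantic strategy fits, so the floor is never considered; at 5,000 tokens only `truncate_head` fits, which is the last-resort case the ledger describes.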

What this saves a real team.

Conservative numbers. We assumed each developer averages 80 agent-runs a day over 22 working days a month. No team uses 100% Opus, so we modelled a 30/50/20 mix of Opus 4.7 / Sonnet 4.6 / Haiku 4.5.

Team size	Agent-runs / mo	Tokens before	Tokens after	Cost before	Cost after	Saved / mo
1 dev	1,760	111 M	32 M	$1,261	$367	$894
5 devs	8,800	555 M	161 M	$6,308	$1,838	$4,470
20 devs	35,200	2.22 B	644 M	$25,235	$7,353	$17,882
100 devs	176,000	11.10 B	3.22 B	$126,178	$36,765	$89,413

Math: tokens_in × (0.30·$15 + 0.50·$3 + 0.20·$1) / 1M. 63,113 input tokens / agent-run before, 18,289 after, applied at team scale.
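A minimal sketch of the arithmetic in the Math line, assuming the 80 runs a day, 22 working days, and 30/50/20 mix stated above. It reproduces the run counts and token totals in the table. The cost function evaluates only the input-token term as written (a blended $6.20 per Mtok), so treat its dollar output as the input-only component; the Cost columns may include terms the Math line does not list. All function names are illustrative.

```python
RUNS_PER_DEV_PER_DAY = 80
WORKING_DAYS_PER_MONTH = 22
TOKENS_BEFORE = 63_113   # input tokens per agent-run without compression
TOKENS_AFTER = 18_289    # input tokens per agent-run with compression

# (share of runs, input rate in $ per million tokens) for Opus / Sonnet / Haiku
MIX = [(0.30, 15.0), (0.50, 3.0), (0.20, 1.0)]


def blended_rate() -> float:
    """Weighted input price in $ per million tokens."""
    return sum(share * rate for share, rate in MIX)


def monthly_runs(devs: int) -> int:
    return devs * RUNS_PER_DEV_PER_DAY * WORKING_DAYS_PER_MONTH


def monthly_tokens(devs: int, tokens_per_run: int) -> int:
    return monthly_runs(devs) * tokens_per_run


def input_cost(tokens: int) -> float:
    """Input-token term only, as written in the Math line."""
    return tokens * blended_rate() / 1_000_000


for devs in (1, 5, 20, 100):
    before = monthly_tokens(devs, TOKENS_BEFORE)
    after = monthly_tokens(devs, TOKENS_AFTER)
    print(f"{devs:>3} devs  {monthly_runs(devs):>7,} runs  "
          f"{before / 1e6:,.0f} M -> {after / 1e6:,.0f} M tokens  "
          f"input-only saving ${input_cost(before - after):,.2f}")
```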

Reproduce every number on this page.

No black box. No "trust us." Clone the repo, run the harness, get the same signed results.

# 1. install tetris & the bench harness
curl -fsSL https://get.tetris.codes | sh

# 2. run every strategy × ratio combination
cargo run -p tetris-bench --release -- \
  --dataset swe-bench-verified \
  --strategies all \
  --ratios 2,4,8

# 3. verify the signed savings log byte-for-byte
tetris savings verify --against bench/out/savings.tetrislog

The CI gate fails if pass@1 regresses by more than 2 percentage points against the v0.0.36 baseline checked into bench/out/baseline/. Pricing is signed with Ed25519: rotating a price requires a key rotation, so reported savings can't drift silently.
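The regression gate described above amounts to a single threshold comparison. A minimal sketch, assuming pass@1 is stored as a fraction between 0 and 1; the function name and the example values are hypothetical, and the format of the files under bench/out/baseline/ is not modelled here:

```python
MAX_REGRESSION_PTS = 2.0  # from the text: fail on a drop of more than 2 percentage points


def gate_passes(baseline_pass1: float, current_pass1: float) -> bool:
    """Return True if the current run is within the allowed regression.

    Both arguments are pass@1 expressed as fractions in [0, 1].
    """
    regression_pts = (baseline_pass1 - current_pass1) * 100.0
    # Round away floating-point noise so a drop of exactly 2 points still passes.
    return round(regression_pts, 6) <= MAX_REGRESSION_PTS


print(gate_passes(0.50, 0.48))   # True: exactly 2 points down, not more
print(gate_passes(0.50, 0.479))  # False: 2.1 points down
print(gate_passes(0.50, 0.55))   # True: an improvement
```

The rounding step matters because `(0.50 - 0.48) * 100` is slightly above 2.0 in binary floating point, which would otherwise fail a run sitting exactly on the threshold.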

Gold patches: read-only at task time

Package registries: only allowlisted hosts

Determinism: seed = sha256(instance_id)

Provenance: every run signs savings.tetrislog
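The determinism rule in the list above (seed = sha256(instance_id)) can be sketched with the standard library. How Tetris turns the 32-byte digest into a seed is not stated, so taking the first 8 bytes as a big-endian integer is an assumption, as are the function name and the example instance ID:

```python
import hashlib
import random


def seed_for(instance_id: str) -> int:
    """Derive a stable 64-bit seed from a benchmark instance ID."""
    digest = hashlib.sha256(instance_id.encode("utf-8")).digest()
    # Assumption: truncate the digest to its first 8 bytes, big-endian.
    return int.from_bytes(digest[:8], "big")


# Example instance ID, used only for illustration.
rng_a = random.Random(seed_for("django__django-11099"))
rng_b = random.Random(seed_for("django__django-11099"))

# The same instance ID always yields the same random stream.
print(rng_a.random() == rng_b.random())  # True
```

Because the seed depends only on the instance ID, two machines replaying the same task draw identical random values without sharing any state.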
