// The numbers

Numbers a CFO would believe.

Every figure on this page is independently verifiable from your own machine. If our reported savings don’t match yours, we refund the month — no questions.

Avg compression: 71%. Tokens dropped on a typical agent run.
Better answers: +6 pts. Pass-rate lift on the Django bug-fix slice.
Overhead: 1.4 s. Faster than the model’s own latency.
Team-of-5 saves: $4,470 / mo. Typical agentic workload · see math

Same context, every frontier model.

We re-ran the SWE-bench Verified harness against five models with the same input shape. Tetris' compression layer is model-agnostic: compression lands between 68.8% and 71.0% on every model in the table below.

| Model | Input rate | Tokens (before → after) | Compression | Pass@1 Δ | $/task saved |
|---|---|---|---|---|---|
| Claude Opus 4.7 | $15 / Mtok | 63,113 → 18,289 | 71.0% | +0.04 | $0.6724 |
| Claude Sonnet 4.6 | $3 / Mtok | 63,113 → 18,832 | 70.2% | +0.06 | $0.1328 |
| GPT-5.5 | $5 / Mtok | 63,113 → 19,107 | 69.7% | +0.03 | $0.2200 |
| GPT-5 | $1.25 / Mtok | 63,113 → 19,440 | 69.2% | +0.02 | $0.0546 |
| Claude Haiku 4.5 | $1 / Mtok | 63,113 → 19,718 | 68.8% | +0.01 | $0.0434 |

All five runs replay the same input session against each target model. Because the token reduction is nearly constant, the dollar saving per task scales with the model's input price.
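The Compression and $/task saved columns can be re-derived from the token counts and input rates alone. The sketch below assumes the saving is simply (tokens removed × input rate), which reproduces the published figures to four decimal places; the function names are illustrative and not part of the Tetris tooling.

```python
# Input rate ($/Mtok), tokens before, tokens after: copied from the model table.
RUNS = {
    "Claude Opus 4.7":   (15.00, 63_113, 18_289),
    "Claude Sonnet 4.6": (3.00,  63_113, 18_832),
    "GPT-5.5":           (5.00,  63_113, 19_107),
    "GPT-5":             (1.25,  63_113, 19_440),
    "Claude Haiku 4.5":  (1.00,  63_113, 19_718),
}

def compression_pct(before: int, after: int) -> float:
    """Share of input tokens removed, as a percentage."""
    return 100.0 * (before - after) / before

def saved_per_task(rate_per_mtok: float, before: int, after: int) -> float:
    """Dollar saving per task, assuming only input tokens are billed here."""
    return (before - after) * rate_per_mtok / 1_000_000

for model, (rate, before, after) in RUNS.items():
    pct = compression_pct(before, after)
    usd = saved_per_task(rate, before, after)
    print(f"{model:18s} {pct:5.1f}%  ${usd:.4f}")
```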

Same recipe, every codebase shape.

We picked four codebases that stress different strategies: a Python web app (Django), a monolithic C/CUDA file (llm.c), a Rust workspace (ripgrep), and a JS framework with deep dependency graphs (Next.js).

| Codebase | Lang | Files in context | Tokens (before → after) | Compression | Dominant strategy |
|---|---|---|---|---|---|
| Django formsets fix | Python | 312 | 28,419 → 12,284 | 56.8% | Pattern dedup + smart code packing |
| llm.c grad clipping | C / CUDA | 14 | 46,229 → 11,216 | 75.7% | Smart code packing |
| ripgrep regex feature | Rust | 1,848 | 63,113 → 18,289 | 71.0% | Smart code packing + relevance pruning |
| Next.js middleware | TypeScript | 2,104 | 71,802 → 19,884 | 72.3% | Relevance-ranked import graph + dedup |

See /examples for full session traces and failure-mode analysis.

Strategy ledger.

Each row is one (strategy, ratio) combination scored on the SWE-bench Verified suite. We ship every approach. The pipeline picks the right one per file at runtime.

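| Strategy | Ratio | Achieved | Δ pass@1 | $/task | p50 | p95 |
|---|---|---|---|---|---|---|
| rome_prune | 4 | 1.00 | +0.040 | $0.0075 | 537 ms | 1238 ms |
| dedupe | 4 | 1.01 | +0.038 | $0.0075 | 530 ms | 1244 ms |
| repo_graph_rank | 8 | 1.00 | +0.036 | $0.0075 | 575 ms | 1307 ms |
| ast_pack | 8 | 1.00 | +0.014 | $0.0075 | 586 ms | 1292 ms |
| truncate_head | 4 | 8.74 | −0.046 | $0.0065 | 470 ms | 965 ms |
| truncate_head | 8 | 12.02 | −0.060 | $0.0065 | 514 ms | 994 ms |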

Negative pass@1 rows are intentionally shipped as the floor strategy (last-resort budget squeeze). The pipeline picks them only when no semantic strategy fits the budget.
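One way to picture the fallback rule is the sketch below. It is not the Tetris pipeline: the `Candidate` type, the `pick_strategy` function, and the `tokens_out` estimates are hypothetical and made up for illustration. Only the strategy names and Δ pass@1 values come from the ledger. The assumed rule is: prefer the semantic strategy with the best Δ pass@1 that fits the budget, and fall back to a floor strategy only when none fits.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Candidate:
    strategy: str       # strategy name from the ledger
    tokens_out: int     # hypothetical estimate of tokens after compression
    delta_pass1: float  # Δ pass@1 from the ledger
    is_floor: bool      # True for the last-resort truncate_head rows

def pick_strategy(candidates: List[Candidate], budget: int) -> Optional[Candidate]:
    """Pick a semantic strategy if one fits the budget, else a floor strategy."""
    fitting = [c for c in candidates if c.tokens_out <= budget]
    semantic = [c for c in fitting if not c.is_floor]
    if semantic:
        return max(semantic, key=lambda c: c.delta_pass1)
    floor = [c for c in fitting if c.is_floor]
    if floor:
        return max(floor, key=lambda c: c.delta_pass1)
    return None  # nothing fits, not even the floor

# Illustrative token estimates for a single file (not measured data).
CANDIDATES = [
    Candidate("rome_prune", 2_500, 0.040, False),
    Candidate("dedupe", 2_500, 0.038, False),
    Candidate("truncate_head", 1_200, -0.046, True),
    Candidate("truncate_head", 800, -0.060, True),
]

print(pick_strategy(CANDIDATES, budget=3_000))  # a semantic strategy fits
print(pick_strategy(CANDIDATES, budget=1_000))  # only a floor strategy fits
```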

What this saves a real team.

Conservative numbers. We assumed each developer averages 80 agent-runs a day, 22 working days a month. No team uses 100% Opus — we modelled a 30/50/20 mix of Opus 4.7 / Sonnet 4.6 / Haiku 4.5.

| Team size | Agent-runs / mo | Tokens before | Tokens after | Cost before | Cost after | Saved / mo |
|---|---|---|---|---|---|---|
| 1 dev | 1,760 | 111 M | 32 M | $1,261 | $367 | $894 |
| 5 devs | 8,800 | 555 M | 161 M | $6,308 | $1,838 | $4,470 |
| 20 devs | 35,200 | 2.22 B | 644 M | $25,235 | $7,353 | $17,882 |
| 100 devs | 176,000 | 11.10 B | 3.22 B | $126,178 | $36,765 | $89,413 |

Math: tokens_in × (0.30·$15 + 0.50·$3 + 0.20·$1) / 1M. 63,113 input tokens / agent-run before, 18,289 after, applied at team scale.
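The stated assumptions can be turned into a small calculator. The sketch below uses only the figures given above (80 runs a day, 22 working days, 63,113 → 18,289 input tokens per run, a 30/50/20 price mix); the function names are illustrative. It reproduces the run counts and token volumes in the table. The input-only formula works out to a blended rate of $6.20 / Mtok, which gives lower dollar figures than the table's cost columns, so treat `input_cost` as an illustration of the formula as written rather than a check on those columns.

```python
RUNS_PER_DEV_PER_DAY = 80
WORKING_DAYS_PER_MONTH = 22
TOKENS_BEFORE = 63_113   # input tokens per agent-run, uncompressed
TOKENS_AFTER = 18_289    # input tokens per agent-run, compressed

# (share of runs, input rate in $/Mtok) for Opus 4.7 / Sonnet 4.6 / Haiku 4.5
MIX = [(0.30, 15.0), (0.50, 3.0), (0.20, 1.0)]

def runs_per_month(devs: int) -> int:
    return devs * RUNS_PER_DEV_PER_DAY * WORKING_DAYS_PER_MONTH

def monthly_tokens(devs: int, tokens_per_run: int) -> int:
    return runs_per_month(devs) * tokens_per_run

def blended_rate() -> float:
    """Weighted input price in $/Mtok for the assumed model mix."""
    return sum(share * rate for share, rate in MIX)

def input_cost(tokens: int) -> float:
    """tokens_in × blended rate / 1M, exactly as the formula above is written."""
    return tokens * blended_rate() / 1_000_000

for devs in (1, 5, 20, 100):
    before = monthly_tokens(devs, TOKENS_BEFORE)
    after = monthly_tokens(devs, TOKENS_AFTER)
    print(f"{devs:3d} devs  {runs_per_month(devs):7,d} runs  "
          f"{before / 1e6:9.0f} M -> {after / 1e6:8.0f} M tokens  "
          f"input-only cost ${input_cost(before):,.0f} -> ${input_cost(after):,.0f}")
```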

Reproduce every number on this page.

No black box. No "trust us." Clone the repo, run the harness, get the same signed results.

# 1. install tetris & the bench harness
curl -fsSL https://get.tetris.codes | sh

# 2. run every strategy × ratio combination
cargo run -p tetris-bench --release -- \
  --dataset swe-bench-verified \
  --strategies all \
  --ratios 2,4,8

# 3. verify the signed savings log byte-for-byte
tetris savings verify --against bench/out/savings.tetrislog

The CI gate fails if pass@1 regresses by more than 2 percentage points against the v0.0.36 baseline checked into bench/out/baseline/. Pricing is signed with Ed25519, and changing a price requires a key rotation, so reported savings can't drift silently.
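The regression rule is simple enough to state in a few lines. This is a minimal sketch of the check as described, not the actual CI script: `gate_passes` is a hypothetical name, and pass@1 is assumed to be expressed as a fraction between 0 and 1.

```python
def gate_passes(baseline_pass1: float, current_pass1: float,
                max_regression_pts: float = 2.0) -> bool:
    """Return False when pass@1 drops by more than the allowed percentage points."""
    regression_pts = (baseline_pass1 - current_pass1) * 100.0
    return regression_pts <= max_regression_pts

print(gate_passes(0.50, 0.49))  # 1-point drop
print(gate_passes(0.50, 0.47))  # 3-point drop
```

Improvements always pass, since the regression is then negative.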

Gold patches: read-only at task time
Package registries: only allowlisted hosts
Determinism: seed = sha256(instance_id)
Provenance: every run signs savings.tetrislog
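The determinism rule, seed = sha256(instance_id), can be sketched with the standard library. How the harness maps the digest to a seed is not stated here, so interpreting the full 256-bit digest as a big-endian integer is an assumption, as are the `seed_for` name and the example instance ID.

```python
import hashlib
import random

def seed_for(instance_id: str) -> int:
    """Derive a reproducible seed from a task's instance ID."""
    digest = hashlib.sha256(instance_id.encode("utf-8")).digest()
    # Assumption: use the whole digest as one big-endian integer.
    return int.from_bytes(digest, "big")

# Example instance ID in the SWE-bench naming style (illustrative only).
rng = random.Random(seed_for("django__django-11099"))
print(rng.random())  # identical on every machine for the same instance ID
```

Because the seed depends only on the instance ID, two runs of the same task draw the same random sequence regardless of execution order.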
