Headroom — Context Compression for AI Agents
GitHub: chopratejas/headroom
PyPI: headroom-ai
npm: headroom-ai
Docs: headroom-docs.vercel.app
Model: Kompress-base (HuggingFace)
License: Apache 2.0
60–95% fewer tokens · Library · Proxy · MCP · 6 algorithms · Local-first · Reversible
Compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.
How It Works
Your agent → Headroom (locally) → LLM
Headroom sits between your agent and the LLM as a transparent compression layer:
- ContentRouter — detects content type, selects the right compressor
- SmartCrusher / CodeCompressor / Kompress-base — compress JSON, AST, or prose
- CacheAligner — stabilizes prefixes so provider KV caches actually hit
- CCR (Compress-Confirm-Retrieve) — stores originals locally; the LLM calls `headroom_retrieve` if needed
- Cross-agent memory — shared store across Claude, Codex, Gemini, auto-dedup
- `headroom learn` — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md
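The routing step in the list above (detect the content type, then pick a compressor) can be illustrated with a small sketch. This is not Headroom's implementation: `detect_content_type` and `route` are hypothetical names, the detection heuristics are deliberately simple, and "code" here only means Python source. The only thing taken from the list is the pairing of JSON, code, and prose with SmartCrusher, CodeCompressor, and Kompress-base.

```python
import ast
import json


def detect_content_type(text: str) -> str:
    """Classify text as 'json', 'code', or 'prose' (illustrative heuristic only)."""
    stripped = text.strip()
    # JSON: require an object or array so bare numbers or words are not misrouted.
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    # Code: in this sketch, Python source that parses and has a def, class, or import.
    try:
        tree = ast.parse(stripped)
    except (SyntaxError, ValueError):
        return "prose"
    code_nodes = (
        ast.FunctionDef,
        ast.AsyncFunctionDef,
        ast.ClassDef,
        ast.Import,
        ast.ImportFrom,
    )
    if any(isinstance(node, code_nodes) for node in tree.body):
        return "code"
    return "prose"


def route(text: str) -> str:
    """Return the compressor name that the list above pairs with each content type."""
    compressors = {
        "json": "SmartCrusher",
        "code": "CodeCompressor",
        "prose": "Kompress-base",
    }
    return compressors[detect_content_type(text)]


# Hypothetical inputs, one per content type.
print(route('{"status": "ok", "items": [1, 2]}'))
print(route("import os\n\ndef cwd():\n    return os.getcwd()\n"))
print(route("The deploy failed twice before the rollback succeeded."))
```

A real router would need to recognize more formats (logs, diffs, other languages); the point of the sketch is only that detection happens once, up front, and each content type goes to a specialized compressor.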
Installation
pip install "headroom-ai[all]" # Python
npm install headroom-ai # Node/TypeScript
Usage: 4 Modes
1 — Wrap a coding agent (one command)
headroom wrap claude
headroom wrap codex
headroom wrap cursor
headroom wrap aider
headroom wrap copilot
2 — Drop-in proxy (zero code changes)
headroom proxy --port 8787
# Any OpenAI-compatible client can route through it
3 — Inline library
from headroom import compress
compressed = compress(messages)
4 — MCP server
Tools: headroom_compress, headroom_retrieve, headroom_stats
headroom mcp install
Proof
Token savings on real workloads:
| Workload | Before | After | Savings |
|---|---|---|---|
| Code search (100 results) | 17,765 | 1,408 | 92% |
| SRE incident debugging | 65,694 | 5,118 | 92% |
| GitHub issue triage | 54,174 | 14,761 | 73% |
| Codebase exploration | 78,502 | 41,254 | 47% |
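The Savings column follows directly from the Before and After token counts. A short check, using only the numbers in the table (the helper name `savings_percent` is just for illustration):

```python
def savings_percent(before: int, after: int) -> int:
    """Share of tokens removed, rounded to the nearest whole percent."""
    return round(100 * (before - after) / before)


# (before, after) token counts copied from the table above.
workloads = {
    "Code search (100 results)": (17_765, 1_408),
    "SRE incident debugging": (65_694, 5_118),
    "GitHub issue triage": (54_174, 14_761),
    "Codebase exploration": (78_502, 41_254),
}

for name, (before, after) in workloads.items():
    print(f"{name}: {savings_percent(before, after)}%")
# Code search (100 results): 92%
# SRE incident debugging: 92%
# GitHub issue triage: 73%
# Codebase exploration: 47%
```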
Accuracy preserved on benchmarks:
| Benchmark | Baseline | Headroom | Delta / Compression |
|---|---|---|---|
| GSM8K (Math) | 0.870 | 0.870 | ±0.000 |
| TruthfulQA (Factual) | 0.530 | 0.560 | +0.030 |
| SQuAD v2 (QA) | — | 97% | 19% compression |
| BFCL (Tools) | — | 97% | 32% compression |
Agent Compatibility
| Agent | wrap | Notes |
|---|---|---|
| Claude Code | ✅ | --memory, --code-graph |
| Codex | ✅ | shares memory with Claude |
| Cursor | ✅ | prints config |
| Aider | ✅ | starts proxy + launches |
| Copilot CLI | ✅ | starts proxy + launches (supports subscription mode) |
| OpenClaw | ✅ | ContextEngine plugin |
Architecture
- SmartCrusher — JSON compression
- CodeCompressor — AST-aware code compression
- Kompress-base — trained text compression model (HuggingFace)
- CCR — reversible compression: originals stored locally, retrievable on demand
- CacheAligner — KV cache optimization
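The reversibility idea behind CCR (originals kept locally, retrievable on demand) can be sketched in a few lines. Everything below is an assumption made for illustration: `LocalOriginalStore` is a hypothetical in-memory class, and plain truncation stands in for the real compressors. It is not how Headroom stores or compresses content; it only shows why handing the model a key makes the compression lossless from the agent's point of view.

```python
import hashlib
from typing import Dict, Optional, Tuple


class LocalOriginalStore:
    """Hypothetical in-memory stand-in for a local store of uncompressed originals."""

    def __init__(self) -> None:
        self._originals: Dict[str, str] = {}

    def compress(self, text: str, keep_chars: int = 80) -> Tuple[str, Optional[str]]:
        """Return (stub, key). Truncation is a placeholder for a real compressor."""
        if len(text) <= keep_chars:
            return text, None  # short content passes through unchanged
        # Content-derived key, so identical inputs map to the same stored original.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._originals[key] = text
        omitted = len(text) - keep_chars
        stub = f"{text[:keep_chars]} [{omitted} chars omitted; key={key}]"
        return stub, key

    def retrieve(self, key: str) -> str:
        """Return the exact original for a key issued by compress()."""
        return self._originals[key]


# Usage: the stub goes to the LLM; the original stays local until it is asked for.
store = LocalOriginalStore()
log = "ERROR connection reset by peer\n" * 50
stub, key = store.compress(log)
print(len(log), "->", len(stub), "chars")
assert key is not None and store.retrieve(key) == log
```

In the flow described above, the retrieval step would be triggered by the LLM calling `headroom_retrieve`; in this sketch it is a direct method call.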