Headroom — Context Compression for AI Agents

GitHub: chopratejas/headroom PyPI: headroom-ai npm: headroom-ai Docs: headroom-docs.vercel.app Model: Kompress-base (HuggingFace) License: Apache 2.0

60–95% fewer tokens · Library · Proxy · MCP · 6 algorithms · Local-first · Reversible

Compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.

How It Works

Your agent → Headroom (locally) → LLM

Headroom sits between your agent and the LLM as a transparent compression layer:

  • ContentRouter — detects content type, selects the right compressor
  • SmartCrusher / CodeCompressor / Kompress-base — compress JSON, AST, or prose
  • CacheAligner — stabilizes prefixes so provider KV caches actually hit
  • CCR (Compress-Confirm-Retrieve) — stores originals locally; LLM calls headroom_retrieve if needed
  • Cross-agent memory — shared store across Claude, Codex, Gemini, auto-dedup
  • headroom learn — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md

Installation

pip install "headroom-ai[all]"   # Python
npm install headroom-ai          # Node/TypeScript

Usage: 4 Modes

1 — Wrap a coding agent (one command)

headroom wrap claude
headroom wrap codex
headroom wrap cursor
headroom wrap aider
headroom wrap copilot

2 — Drop-in proxy (zero code changes)

headroom proxy --port 8787
# Any OpenAI-compatible client can route through it

3 — Inline library

from headroom import compress
compressed = compress(messages)

4 — MCP server

Tools: headroom_compress, headroom_retrieve, headroom_stats

headroom mcp install

Proof

Token savings on real workloads:

WorkloadBeforeAfterSavings
Code search (100 results)17,7651,40892%
SRE incident debugging65,6945,11892%
GitHub issue triage54,17414,76173%
Codebase exploration78,50241,25447%

Accuracy preserved on benchmarks:

BenchmarkBaselineHeadroomDelta
GSM8K (Math)0.8700.870±0.000
TruthfulQA (Factual)0.5300.560+0.030
SQuAD v2 (QA)97%19% compression
BFCL (Tools)97%32% compression

Agent Compatibility

AgentwrapNotes
Claude Code—memory, —code-graph
Codexshares memory with Claude
Cursorprints config
Aiderstarts proxy + launches
Copilot CLIstarts proxy + launches (supports subscription mode)
OpenClawContextEngine plugin

Architecture

  • SmartCrusher — JSON compression
  • CodeCompressor — AST-aware code compression
  • Kompress-base — trained text compression model (HuggingFace)
  • CCR — reversible compression: originals stored locally, retrievable on demand
  • CacheAligner — KV cache optimization