A reference on token optimization techniques used in production LLM applications. Based on analysis of four systems: Claude Code, Cline, Codex, and OpenCode.
Token estimation heuristic: `length / 4`. No tokenizer. It is consistently accurate enough for threshold decisions at zero cost.

`QUICKSTART.md` contains copy-paste prompts you can feed to any AI coding agent to audit your codebase for token waste and apply fixes. No background reading required.
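The `length / 4` heuristic can be sketched in a few lines. This is a minimal illustration, not code taken from any of the four systems: the function names `estimate_tokens` and `exceeds_budget` are hypothetical, and measuring length in UTF-8 bytes is an assumption based on the "bytes per token" figure in the metrics table (character length is a common alternative).

```python
import math

# Heuristic constant from this guide; it is not derived from a real tokenizer.
BYTES_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count as UTF-8 byte length / 4, rounded up."""
    return math.ceil(len(text.encode("utf-8")) / BYTES_PER_TOKEN)


def exceeds_budget(text: str, budget_tokens: int) -> bool:
    """Threshold check that relies only on the estimate, never an exact count."""
    return estimate_tokens(text) > budget_tokens
```

Rounding up keeps the estimate on the conservative side for threshold checks, which is where the heuristic is meant to be used; it is not a substitute for a tokenizer when exact billing numbers matter.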
| Chapter | Topic |
|---|---|
| Context Compaction | Conversation compression — from free pruning to LLM summarization |
| Token Counting & Estimation | Fast heuristics, hybrid tracking, output budgeting |
| Prompt Caching | Cache-aware prompt design, breakpoint placement, invalidation detection |
| System Prompt Optimization | Modular assembly, variant architectures, stable-first ordering |
| Tool Output Management | Multi-layer truncation, head/tail preservation, pagination |
| Message Architecture | Normalization, tool pairing invariants, streaming data models |
| Context Window Management | Budget allocation, effective window calculation, degradation strategies |
| Multi-Agent Context | Subagent isolation, history forking, token budgets |
| Diagnostics & Observability | Token attribution, duplicate detection, cost tracking |
| Design Patterns | Cross-cutting principles and the 20 highest-impact optimizations |
Every system studied implements the same defense-in-depth model:
1. PREVENT: Tool-specific limits at the source
2. TRUNCATE: Generic caps catch what slips through
3. CACHE: Prompt caching cuts repeat costs by 10x
4. PRUNE: Clear stale results before expensive operations
5. COMPACT: LLM summarization as last resort
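Two of the cheaper layers above, TRUNCATE and PRUNE, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than any system's actual implementation: the `Message` schema, the placeholder strings, and the function names are hypothetical, and the character budget is derived from the guide's 4-bytes-per-token heuristic.

```python
from typing import TypedDict


class Message(TypedDict):
    role: str  # "user", "assistant", or "tool" (hypothetical schema)
    content: str


PRUNED_PLACEHOLDER = "[stale tool result cleared]"


def truncate_middle(text: str, max_tokens: int = 10_000, bytes_per_token: int = 4) -> str:
    """TRUNCATE: keep the head and tail of an oversized result, drop the middle."""
    max_chars = max_tokens * bytes_per_token  # approximate character budget
    if len(text) <= max_chars:
        return text
    head_len = max_chars // 2
    tail_len = max_chars - head_len
    omitted = len(text) - head_len - tail_len
    head = text[:head_len]
    tail = text[len(text) - tail_len:]
    return f"{head}\n[... {omitted} characters truncated ...]\n{tail}"


def prune_stale_tool_results(messages: list[Message], keep_recent: int = 2) -> list[Message]:
    """PRUNE: replace all but the newest `keep_recent` tool results with a placeholder."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_indices[:-keep_recent]) if keep_recent > 0 else set(tool_indices)
    # Build a new list so the caller's history is not mutated.
    return [
        {"role": m["role"], "content": PRUNED_PLACEHOLDER} if i in stale else m
        for i, m in enumerate(messages)
    ]
```

Both functions are free of LLM calls, which is why they sit ahead of summarization in the ordering: they can run on every turn, while compaction is reserved for when the cheaper layers are not enough.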
| Metric | Value |
|---|---|
| Bytes per token (heuristic) | 4 |
| Cache read vs. input cost | 10x cheaper |
| Compaction trigger | 85-93% of context window |
| Output token cap (recommended) | 8-32k |
| Tool output budget | 10k tokens per result |
| Post-compaction target | ~50k tokens |
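The figures in the table can be combined into a simple budget check. The sketch below is an assumption-laden illustration: the `ContextBudget` class and its field names are hypothetical, the defaults are picked from within the ranges above, and subtracting the output cap to get an "effective window" is one plausible reading of the Context Window Management chapter title rather than a confirmed formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBudget:
    context_window: int                   # model's total context size, in tokens
    max_output_tokens: int = 16_000       # within the 8-32k recommended cap
    compaction_threshold_pct: int = 85    # low end of the 85-93% trigger range
    post_compaction_target: int = 50_000  # approximate size to compact down to

    @property
    def effective_window(self) -> int:
        """Input room left after reserving space for the response (assumption)."""
        return self.context_window - self.max_output_tokens

    def should_compact(self, used_tokens: int) -> bool:
        """True once usage reaches the trigger percentage of the context window."""
        # Integer arithmetic avoids floating-point edge cases at the boundary.
        return used_tokens * 100 >= self.compaction_threshold_pct * self.context_window

    def tokens_to_free(self, used_tokens: int) -> int:
        """How many tokens compaction must remove to reach the target size."""
        return max(0, used_tokens - self.post_compaction_target)
```

For example, with a hypothetical 200,000-token window, `should_compact` turns true at 170,000 used tokens, and compacting from 180,000 tokens down to the target means freeing 130,000.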
Based on token management implementations in Claude Code, Cline, Codex, and OpenCode.