Head Vis Public Release
Reference implementation. This repository is not maintained and not accepting contributions. Report security issues to security@anthropic.com; we do not commit to fixes or timelines.
A visualization tool for attention heads in transformer language models. Pick a head, see its top-activating sequences across your dataset, its attention patterns, per-head metrics (induction score, previous-token score, entropy, …), and a PCA/UMAP projection of its Q/K/O/V space. Optionally decompose individual attention edges into contributing sparse-dictionary feature pairs.
The frontend is static-first — once data/ is populated it serves from disk with no backend. A backend unlocks live operations (custom prompts, live attribution compute) but is optional.
npm install
npm run build
Serve dist/ alongside a populated data/ directory (see below).
data/: using Claude
data_pipeline.py and server.py are skeletons with the data-format contracts fully specified in docstrings. The intended workflow is to hand them to Claude Code with your model and dataset:
> Here's data_pipeline.py from the headvis repo. I want to run it against gpt2 from HuggingFace using the openwebtext dataset, studying layers 5 and 8 across all heads. Implement the NotImplementedError functions and run the pipeline.
Claude reads the docstrings (which are the spec), implements the model-specific forward-pass and tokenization bits using transformers + datasets, and runs it. The output-format contracts are exact — as long as the JSON shapes match, the frontend works.
The same applies to server.py for the backend:
> Implement add_custom_sequence and project_to_umap in server.py for the same gpt2 setup. Leave the attribution endpoints stubbed (I don't have an SAE).
┌─────────────────────────────────┐
│  data_pipeline.py  (run once)   │──→  data/config.json
│                                 │     data/scatter_data.json
│  dataset scan → attention →     │     data/heads/L{l}H{h}.json
│  metrics, decile sampling,      │     data/umap/L{l}H{h}.json
│  PCA/UMAP fit                   │     data/umap/sequences.json
└─────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐          ┌──────────────────────────────┐
│  Svelte frontend  (src/)        │ ◀──────▶ │  server.py  (optional)       │
│                                 │   POST   │                              │
│  static read from data/         │          │  live attribution,           │
│  optional runtime POST          │          │  custom-prompt ingestion,    │
│                                 │          │  UMAP point projection       │
└─────────────────────────────────┘          └──────────────────────────────┘
data_pipeline.py: offline
Walks your dataset, computes attention per head, and produces the static data/ tree. Every metric is a closed-form reduction over the attention matrix: the logit_* trio needs pre-softmax QK scores (hook the attention module or recompute Q @ K.T), and everything else runs on post-softmax weights.
| function | produces | model required |
|---|---|---|
| compute_head_metrics | scatter-plot coordinates | yes (attention) |
| sample_by_decile | heads/L{l}H{h}.json sequences array | yes (attention) |
| compute_projection_cloud | umap/L{l}H{h}.json | yes (Q/K/O/V vectors) |
| compute_qk_distance_histogram, compute_top_tokens | derived stats | no (pure transforms) |
| write_* | JSON serialization | no |
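As a rough illustration of what "closed-form reduction over the attention matrix" can look like, here is a minimal pure-Python sketch of two post-softmax metrics. The function names and the exact definitions (mean per-row Shannon entropy, and mean weight placed on position q-1) are assumptions made for this example; the docstrings in data_pipeline.py remain the authoritative spec for what compute_head_metrics must return.

```python
import math


def mean_attention_entropy(attn):
    """Average Shannon entropy (in nats) over query rows.

    `attn` is a post-softmax attention matrix for one head on one sequence,
    given as a list of rows, where attn[q][k] is the weight query q puts on key k.
    """
    total = 0.0
    for row in attn:
        # Zero weights (e.g. causally masked positions) contribute nothing.
        total += -sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(attn)


def previous_token_score(attn):
    """Average weight each query places on the immediately preceding token.

    Assumed definition: mean of attn[q][q - 1] over q >= 1.
    """
    if len(attn) < 2:
        return 0.0
    return sum(attn[q][q - 1] for q in range(1, len(attn))) / (len(attn) - 1)
```

A head that always attends one position back scores 1.0 on the second metric and 0.0 on the first, which is the kind of separation the scatter plot is meant to surface.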
Framework-agnostic parts (decile bucketing, COO sparsification, histogram binning, JSON writing) are pure data transforms — implementable once and reusable. Only the attention-extraction and vector-extraction bits depend on your model framework.
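Decile bucketing is one of those pure transforms. The sketch below assumes rank-based deciles over a per-sequence activation score and uses hypothetical helper names (assign_intervals, pick_examples); the repo's sample_by_decile docstring may define the buckets differently, so treat this as one plausible reading rather than the reference behavior. It follows the 1-indexed convention where interval 10 holds the highest activations.

```python
import random


def assign_intervals(max_activations):
    """Map each sequence to a 1-indexed decile: 1 = lowest, 10 = highest.

    Assumption: deciles are rank-based, so each bucket holds roughly
    len(max_activations) / 10 sequences.
    """
    n = len(max_activations)
    if n == 0:
        return []
    # Sequence indices ordered from lowest to highest activation.
    order = sorted(range(n), key=lambda i: max_activations[i])
    intervals = [0] * n
    for rank, seq_idx in enumerate(order):
        intervals[seq_idx] = rank * 10 // n + 1
    return intervals


def pick_examples(max_activations, per_interval, seed=0):
    """Sample up to `per_interval` sequence indices from each non-empty decile."""
    groups = {}
    for seq_idx, interval in enumerate(assign_intervals(max_activations)):
        groups.setdefault(interval, []).append(seq_idx)
    rng = random.Random(seed)  # fixed seed keeps the static data/ tree reproducible
    return {
        interval: sorted(rng.sample(members, min(per_interval, len(members))))
        for interval, members in groups.items()
    }
```

With fewer than ten sequences some deciles stay empty under this scheme, which is worth keeping in mind when testing on a tiny dataset.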
server.py: runtime
Six endpoints. The frontend runs fully static without any of them.
| endpoint | unlocks | needs |
|---|---|---|
| add_custom_sequence | type any prompt, see its attention | model + tokenizer |
| delete_custom_sequence | delete button on custom prompts | file I/O only |
| project_to_umap | "where does this prompt sit in the PCA cloud?" | model + saved PCA rotation |
| save_umap_clusters | persist user-lassoed cluster labels | file I/O only |
| qk_attributions, ov_attributions | decompose an attention edge into feature pairs | sparse feature dictionary (SAE/transcoder) |
The two file-I/O-only endpoints are implementable with no model. The attribution endpoints need a trained sparse dictionary — see SAELens if you don't have one. Everything else is a single forward pass.
Side-effect contract: endpoints that produce durable results write them into data/ so the next load hits the cache. The frontend tries data/attributions/{qk,ov}/L{l}H{h}S{seq}Q{q}K{k}.json before POSTing.
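On the server side, the side-effect contract could be honored with a small read-through cache like the sketch below. The path layout comes from the pattern above; the helper names, the `compute` callback, and the shape of the cached payload are assumptions for illustration, not part of the repo's API.

```python
import json
from pathlib import Path


def attribution_cache_path(data_dir, kind, layer, head, seq, q, k):
    """Build data/attributions/{qk,ov}/L{l}H{h}S{seq}Q{q}K{k}.json."""
    if kind not in ("qk", "ov"):
        raise ValueError(f"kind must be 'qk' or 'ov', got {kind!r}")
    name = f"L{layer}H{head}S{seq}Q{q}K{k}.json"
    return Path(data_dir) / "attributions" / kind / name


def load_or_compute(data_dir, kind, layer, head, seq, q, k, compute):
    """Return the cached attribution if present, else compute and persist it.

    `compute` is a hypothetical zero-argument callable that returns a
    JSON-serializable result (for example, the live attribution pass).
    """
    path = attribution_cache_path(data_dir, kind, layer, head, seq, q, k)
    if path.exists():
        return json.loads(path.read_text())
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return result
```

Because the file lands at the same path the frontend probes before POSTing, a second request for the same edge should be served statically without reaching the backend.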
The context blob: data/server_config.json has a context field that the frontend forwards verbatim on every POST. The frontend never interprets the contents, so put whatever your implementation needs in there (model path, cache dir, etc.). The frontend runs in static-only mode when server_config.json is {} or missing.
Every JSON schema is in the docstrings of data_pipeline.py. A few that trip people up:
- attention_indices are flat COO indices: idx = q * seq_len + k. Values are parallel. Store top-K per row or above a threshold; dense matrices blow up file sizes.
- interval=10 is the highest-activation decile, interval=1 the lowest. 1-indexed. interval=-1 (URL param) means the custom-sequences view.
- [...] src/lib/sparse.js). Your max_activation should do the same.
- views key: if you only fit PCA, omit the views key from the UMAP JSON entirely; the frontend auto-selects PCA mode when views is absent.

The frontend supports URL state: ?layer=L&head=H&tab=T&interval=N. For a specific attribution: ?layer=L&head=H&attr=qk&seq=ID&qpos=P&kpos=P. See src/Index.svelte for the full param list.
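For the attention_indices layout noted above (idx = q * seq_len + k with a parallel list of values), a minimal round-trip sketch might look like this. It assumes a square seq_len by seq_len matrix given as a list of rows; the function names, the `top_k` and `threshold` parameters, and the local name `values` are hypothetical, and the actual JSON key names are whatever the data_pipeline.py docstrings specify.

```python
def to_flat_coo(attn, top_k=None, threshold=0.0):
    """Sparsify a dense attention matrix into flat COO indices plus parallel values.

    Keeps entries strictly above `threshold`; if `top_k` is set, keeps only the
    `top_k` largest of those per query row.
    """
    seq_len = len(attn)
    indices, values = [], []
    for q, row in enumerate(attn):
        kept = [(k, w) for k, w in enumerate(row) if w > threshold]
        if top_k is not None:
            kept = sorted(kept, key=lambda kw: kw[1], reverse=True)[:top_k]
            kept.sort()  # restore key order so indices stay ascending
        for k, w in kept:
            indices.append(q * seq_len + k)
            values.append(w)
    return indices, values


def from_flat_coo(indices, values, seq_len):
    """Rebuild the dense matrix; dropped entries come back as 0.0."""
    dense = [[0.0] * seq_len for _ in range(seq_len)]
    for idx, w in zip(indices, values):
        q, k = divmod(idx, seq_len)  # inverse of idx = q * seq_len + k
        dense[q][k] = w
    return dense
```

Since causal attention leaves the upper triangle at zero and most heads concentrate weight on a few keys, storing only the retained entries is what keeps the per-head JSON files small.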