forge-mcp
An MCP server for Forge — Voxell's hosted text-embedding API. It exposes Forge to any MCP client (Claude, Cursor, Cline, Windsurf, VS Code, …) as two tools:
- embed: turn text into vectors
- list_models: list available models and their dimensions

You bring a Forge API key. The server is stateless, and Voxell does not store the text you send or the vectors it returns; only usage metadata (token counts) is recorded, for billing. It does embeddings only: no storage, no search, no RAG. Those are different products.
One-click install in your editor (then replace your-key-here with a real key from
dash.voxell.ai):
Claude Code — one command:
claude mcp add forge -e FORGE_API_KEY=your-key-here -- npx -y @voxell/forge-mcp
Any other client (Claude Desktop, Cline, Windsurf, Zed, …) uses the standard mcpServers
block — see Use it below.
- ultra is the 8B model: ~75+ average task score on MTEB, currently #4 on MTEB (English), and the top usable model (the three ranked above it are research-only). turbo (0.6B) is the fast/cheap default. Pick your quality/cost point.
- Set dim to truncate (re-normalized) for ~4× smaller, cheaper vectors.
- Embed each document with input_type: "document" and each query with input_type: "query", then rank by cosine similarity.
- Set dim to truncate (Matryoshka) and trade a little accuracy for smaller, cheaper vectors.
- All of this goes through the embed tool, with no separate script.

Use it
Most MCP clients run it on demand with npx. Add this to your client's MCP config:
{
  "mcpServers": {
    "forge": {
      "command": "npx",
      "args": ["-y", "@voxell/forge-mcp"],
      "env": { "FORGE_API_KEY": "your-key-here" }
    }
  }
}
(Cursor, Claude Desktop, Cline, Windsurf, and VS Code all use this mcpServers shape.)
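If your client already has other servers configured, the forge entry has to be merged into the existing mcpServers object rather than replacing it. The sketch below shows one way to do that in Python on an in-memory config; the helper name add_forge_server is hypothetical, and where each client keeps its config file is not covered here.

```python
import json


def add_forge_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the forge entry added.

    Hypothetical helper for illustration; the entry itself mirrors the
    mcpServers block shown above.
    """
    # Round-trip through JSON as a simple deep copy of JSON-safe data,
    # so the caller's original config is left untouched.
    merged = json.loads(json.dumps(config))
    servers = merged.setdefault("mcpServers", {})
    servers["forge"] = {
        "command": "npx",
        "args": ["-y", "@voxell/forge-mcp"],
        "env": {"FORGE_API_KEY": api_key},
    }
    return merged


# An existing config with one unrelated (made-up) server entry.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_forge_server(existing, "your-key-here")
print(json.dumps(updated, indent=2))
```

Existing entries are preserved, and a config with no mcpServers key gets one created.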
embed

| arg | type | default | notes |
|---|---|---|---|
| input | string or string[] | - | text(s) to embed (required) |
| model | string | turbo | turbo (1024-d), pro (2560-d), ultra (4096-d) |
| dim | number | model default | truncate to N dimensions (Matryoshka); works on every model |
| input_type | "query" or "document" | document | use query for search queries |
Returns the vectors plus the model, dimension, and token count.
Default is turbo — the one you probably want. pro/ultra trade size and speed for more
dimensions.
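Once documents have been embedded with input_type "document" and the query with input_type "query", ranking is plain cosine similarity over the returned vectors. The sketch below uses toy 3-d vectors in place of real embed output; the function names are illustrative and not part of forge-mcp.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for what the embed tool would return.
doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.8, 0.6, 0.0]

ranking = rank(query_vec, doc_vecs)
for index, score in ranking:
    print(index, round(score, 3))
```

If the vectors are already unit-length, the two norms are 1 and the score reduces to a dot product.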
list_models

Lists the available models and their dimensions.
| env | required | default |
|---|---|---|
| FORGE_API_KEY | yes | - |
| FORGE_BASE_URL | no | https://api.voxell.ai |
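The table amounts to one required variable and one optional variable with a default. Here is a minimal sketch of that resolution logic in Python, for scripts that want to share the same settings; load_forge_settings is a hypothetical helper and is not taken from the server's source.

```python
import os

DEFAULT_BASE_URL = "https://api.voxell.ai"  # default from the table above


def load_forge_settings(env=None):
    """Resolve Forge settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("FORGE_API_KEY", "").strip()
    if not api_key:
        # FORGE_API_KEY is marked as required, so fail early.
        raise RuntimeError("FORGE_API_KEY is required")
    # FORGE_BASE_URL is optional; fall back to the documented default.
    base_url = env.get("FORGE_BASE_URL", "").strip() or DEFAULT_BASE_URL
    return {"api_key": api_key, "base_url": base_url.rstrip("/")}


settings = load_forge_settings({"FORGE_API_KEY": "your-key-here"})
print(settings["base_url"])
```

Passing a plain dict instead of reading os.environ directly keeps the function easy to test.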
Forge speaks the OpenAI embeddings API. Point any OpenAI client at Forge by changing only the base URL and API key; the embedding calls themselves stay the same, and your existing vector dimensions are preserved:
from openai import OpenAI
client = OpenAI(base_url="https://api.voxell.ai/v1", api_key="your-forge-key")
# the exact call you already make — now on a higher-ranked engine:
client.embeddings.create(model="text-embedding-3-large", input=["hello world"]) # -> 3072-d
Your OpenAI model names map to a matching-dimension Forge tier (text-embedding-3-small/
ada-002 → 1536-d, text-embedding-3-large → 3072-d), so existing vector stores slot in
unchanged. Or address Forge tiers directly — turbo | pro | ultra. Also supports dimensions
(Matryoshka, re-normalized) and encoding_format: "base64".
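To make those two options concrete, here is a client-side sketch of what they mean. It assumes the base64 payload follows the OpenAI convention of packed little-endian float32 values, which this README does not state explicitly, and it reproduces "truncate, then re-normalize" locally only for illustration; with dimensions set, the API is described as doing this for you. Both function names are hypothetical.

```python
import base64
import math
import struct


def decode_base64_embedding(payload: str) -> list:
    """Decode a base64 embedding, assuming packed little-endian float32."""
    raw = base64.b64decode(payload)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


def truncate_and_renormalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


# Round-trip a toy unit vector through the assumed wire format.
original = [0.5, 0.5, 0.5, 0.5]
payload = base64.b64encode(struct.pack("<4f", *original)).decode("ascii")

decoded = decode_base64_embedding(payload)
short = truncate_and_renormalize(decoded, 2)
print(decoded)
print(short)
```

Re-normalizing after truncation keeps the shortened vectors unit-length, so dot product and cosine similarity still agree.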
It's an upgrade on every path. Forge's smallest tier (turbo, Qwen3-Embedding-0.6B)
outranks OpenAI's largest embedding model (text-embedding-3-large) on MTEB — so there's no
drop-in that lands worse. ultra (Qwen3-Embedding-8B, ~75+ average task score, #4 on MTEB English)
is in a different league.
Why re-embedding onto Forge is worth it. Embedding is a one-way door: whatever an encoder discards at write time is gone — no reranker, longer prompt, or bigger LLM downstream reconstructs what the vectors never captured. The model you embed with sets the ceiling on everything above it. Re-embed once onto a higher-ranked engine and that ceiling rises — permanently.
MIT © Voxell, Inc.