A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
Are you the author? Sign in to claim
AI gateway written in Go. Lightweight unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, Groq, xAI & Ollama. L
A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.
Step 1: Start GoModel container
docker run --rm -p 8080:8080 \
-e LOGGING_ENABLED=true \
-e LOGGING_LOG_BODIES=true \
-e LOG_FORMAT=text \
-e LOGGING_LOG_HEADERS=true \
-e OPENAI_API_KEY="your-openai-key" \
enterpilot/gomodel
Pass only the provider credentials or base URL you need (at least one required):
docker run --rm -p 8080:8080 \
-e OPENAI_API_KEY="your-openai-key" \
-e ANTHROPIC_API_KEY="your-anthropic-key" \
-e GEMINI_API_KEY="your-gemini-key" \
-e VERTEX_PROJECT="your-gcp-project" \
-e VERTEX_LOCATION="us-central1" \
-e VERTEX_AUTH_TYPE="gcp_adc" \
-e DEEPSEEK_API_KEY="your-deepseek-key" \
-e GROQ_API_KEY="your-groq-key" \
-e OPENROUTER_API_KEY="your-openrouter-key" \
-e ZAI_API_KEY="your-zai-key" \
-e XAI_API_KEY="your-xai-key" \
-e AZURE_API_KEY="your-azure-key" \
-e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
-e AZURE_API_VERSION="2024-10-21" \
-e ORACLE_API_KEY="your-oracle-key" \
-e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
-e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
-e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
-e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
enterpilot/gomodel
⚠️ Avoid passing secrets via -e on the command line - they can leak via shell history and process lists. For production, use docker run --env-file .env to load API keys from a file instead.
Step 2: Make your first API call
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5-chat-latest",
"messages": [{"role": "user", "content": "Hello!"}]
}'
That's it! GoModel automatically detects which providers are available based on the credentials you supply.
Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.
| Provider | Credential | Example Model | Chat | /responses | Embed | Files | Batches | Passthru |
|---|---|---|---|---|---|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-5.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-20250514 | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Google Gemini | GEMINI_API_KEY | gemini-2.5-flash | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Vertex AI | VERTEX_PROJECT + VERTEX_LOCATION | google/gemini-2.5-flash | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-v4-pro | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| OpenRouter | OPENROUTER_API_KEY | google/gemini-2.5-flash | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Z.ai | ZAI_API_KEY (ZAI_BASE_URL optional) | glm-5.1 | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| xAI (Grok) | XAI_API_KEY | grok-4 | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Alibaba Cloud Model Studio (Bailian) | BAILIAN_API_KEY (BAILIAN_BASE_URL optional) | qwen3-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MiniMax | MINIMAX_API_KEY (MINIMAX_BASE_URL optional) | MiniMax-M3 | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Xiaomi MiMo | XIAOMI_API_KEY (XIAOMI_BASE_URL optional) | mimo-v2.5-pro | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| OpenCode Go | OPENCODE_GO_API_KEY (OPENCODE_GO_BASE_URL optional) | glm-5.1 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Azure OpenAI | AZURE_API_KEY + AZURE_BASE_URL (AZURE_API_VERSION optional) | gpt-5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Oracle | ORACLE_API_KEY + ORACLE_BASE_URL | openai.gpt-oss-120b | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Ollama | OLLAMA_BASE_URL | llama3.2 | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| vLLM | VLLM_BASE_URL (VLLM_API_KEY optional) | meta-llama/Llama-3.1-8B-Instruct | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Amazon Bedrock | BEDROCK_BASE_URL (region or endpoint) + AWS credentials | anthropic.claude-3-5-haiku-20241022-v1:0 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
✅ Supported ❌ Unsupported
For Z.ai's GLM Coding Plan, set ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4.
Xiaomi MiMo TTS (mimo-v2.5-tts*) and ASR (mimo-v2.5-asr) are served through
/v1/audio/speech and /v1/audio/transcriptions (translated to MiMo's
chat-completions audio dialect) as well as directly via chat completions; for
1M context append [1m] to the model ID and list it in XIAOMI_MODELS.
OpenCode Go (OpenCode Zen) routes per model — most models use OpenAI-style
/chat/completions, while /messages-only models (default qwen3.7-max,
override with OPENCODE_GO_MESSAGES_MODELS) are sent to the Anthropic-native
endpoint. Set OPENCODE_GO_API_KEY; the base URL defaults to
https://opencode.ai/zen/go/v1.
Configured model lists are available for every provider with
<PROVIDER>_MODELS, for example
OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4 or
ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3. DeepSeek defaults to
https://api.deepseek.com; set DEEPSEEK_BASE_URL only when using a compatible
proxy or alternate DeepSeek endpoint. By default,
CONFIGURED_PROVIDER_MODELS_MODE=fallback uses those lists only when upstream
/models is unavailable or empty. Set CONFIGURED_PROVIDER_MODELS_MODE=allowlist
to expose only configured models for providers that define a list, skipping
their upstream /models calls.
For vLLM, set VLLM_API_KEY only if the upstream server was started with
--api-key.
To register multiple instances of the same provider type without config.yaml,
use suffixed env vars such as OPENAI_EAST_API_KEY and
OPENAI_EAST_BASE_URL; add OPENAI_EAST_MODELS to configure that instance's
model list. This registers provider openai-east with type openai.
Vertex AI follows the same suffix pattern — VERTEX_US_PROJECT registers
provider vertex-us. Vertex project and location env vars must match the
instance prefix: for a suffixed instance such as VERTEX_US_PROJECT, also set
VERTEX_US_LOCATION and any other suffixed settings for that instance, rather
than the generic VERTEX_PROJECT / VERTEX_LOCATION. VERTEX_AUTH_TYPE
defaults to Application Default Credentials (gcp_adc).
Prerequisites: Go 1.26.4+
Create a .env file:
cp .env.template .env
Add your API keys to .env (at least one required).
Start the server:
make run
Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer - no image build):
docker compose up -d
# or: make infra
Full stack (adds GoModel + Prometheus; builds the app image):
cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
| Service | URL |
|---|---|
| GoModel API | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus | http://localhost:9090 |
docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel
| Endpoint | Method | Description |
|---|---|---|
/v1/chat/completions | POST | Chat completions (streaming supported) |
/v1/responses | POST | OpenAI Responses API |
/v1/conversations | POST | Create a conversation (gateway-managed) |
/v1/conversations/{id} | GET | Retrieve a conversation |
/v1/conversations/{id} | POST | Replace conversation metadata in full |
/v1/conversations/{id} | DELETE | Delete a conversation |
/v1/embeddings | POST | Text embeddings |
/v1/models | GET | List available models |
/v1/files | POST | Upload a file (OpenAI-compatible multipart) |
/v1/files | GET | List files |
/v1/files/{id} | GET | Retrieve file metadata |
/v1/files/{id} | DELETE | Delete a file |
/v1/files/{id}/content | GET | Retrieve raw file content |
/v1/batches | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native) |
/v1/batches | GET | List stored batches |
/v1/batches/{id} | GET | Retrieve one stored batch |
/v1/batches/{id}/cancel | POST | Cancel a pending batch |
/v1/batches/{id}/results | GET | Retrieve native batch results when available |
| Endpoint | Method | Description |
|---|---|---|
/v1/messages | POST | Anthropic Messages API through translated model routing (streaming supported) |
/v1/messages/count_tokens | POST | Heuristic Anthropic Messages input token estimate |
| Endpoint | Method | Description |
|---|---|---|
/p/{provider}/... | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |
| Endpoint | Method | Description |
|---|---|---|
/admin/dashboard | GET | Admin dashboard UI |
/admin/runtime/config | GET | Admin runtime configuration |
/admin/cache/overview | GET | Cache statistics overview |
/admin/usage/summary | GET | Aggregate token usage statistics |
/admin/usage/daily | GET | Per-period token usage breakdown |
/admin/usage/models | GET | Usage breakdown by model |
/admin/usage/user-paths | GET | Usage breakdown by user path |
/admin/usage/log | GET | Paginated usage log entries |
/admin/audit/detail | GET | Detailed audit entry information |
/admin/audit/log | GET | Paginated audit log entries |
/admin/audit/conversation | GET | Conversation thread around one audit entry |
/admin/providers/status | GET | Provider availability status |
/admin/runtime/refresh | POST | Refresh runtime configuration |
/admin/models | GET | List models with provider type |
/admin/models/categories | GET | List model categories |
/admin/model-overrides | GET | List model overrides |
/admin/model-overrides | PUT | Create/update model override |
/admin/model-overrides | DELETE | Remove model override |
/admin/auth-keys | GET | List authentication keys |
Legacy alias: Until 2026-08-09, all admin endpoints are also reachable under
/admin/api/v1/*. Legacy responses includeDeprecation: trueandSunset: Sun, 09 Aug 2026 00:00:00 GMTheaders. The endpoint formerly at/admin/api/v1/dashboard/configmoved to/admin/runtime/configon the new prefix.
| Endpoint | Method | Description |
|---|---|---|
/health | GET | Health check |
/metrics | GET | Prometheus metrics (experimental, when enabled) |
/swagger/index.html | GET | Swagger UI (when enabled) |
GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.
Key settings:
| Variable | Default | Description |
|---|---|---|
PORT | 8080 | Server port |
BASE_PATH | / | Mount the gateway under a path prefix such as /g |
GOMODEL_MASTER_KEY | (none) | API key for authentication |
USER_PATH_HEADER | X-GoModel-User-Path | Header used to read/write request user_path values |
ENABLE_PASSTHROUGH_ROUTES | true | Enable provider-native passthrough routes under /p/{provider}/... |
ALLOW_PASSTHROUGH_V1_ALIAS | true | Allow /p/{provider}/v1/... aliases while keeping /p/{provider}/... canonical |
ENABLED_PASSTHROUGH_PROVIDERS | openai,anthropic,openrouter,zai,vllm,deepseek | Comma-separated list of enabled passthrough providers |
GEMINI_API_MODE | native | Gemini AI Studio upstream mode: native or openai_compatible |
VERTEX_API_MODE | native | Vertex AI Gemini upstream mode: native or openai_compatible |
USE_GOOGLE_GEMINI_NATIVE_API | true | Legacy global Gemini mode toggle used when per-provider *_API_MODE is unset |
STORAGE_TYPE | sqlite | Storage backend (sqlite, postgresql, mongodb) |
METRICS_ENABLED | false | Enable Prometheus metrics (experimental) |
LOGGING_ENABLED | false | Enable audit logging |
DASHBOARD_LIVE_LOGS_ENABLED | true | Stream realtime dashboard log previews with bounded replay |
DASHBOARD_LIVE_LOGS_BUFFER_SIZE | 10000 | Max in-memory live events retained; increase above ~1000 msgs/sec or bursty traffic |
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT | 1000 | Max events replayed after reconnect; increase for longer reconnect windows |
DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS | 15 | SSE heartbeat interval; lower when proxies need faster liveness checks |
GUARDRAILS_ENABLED | false | Enable the configured guardrails pipeline |
Quick Start - Authentication: By default GOMODEL_MASTER_KEY is unset. Without this key, API endpoints are unprotected and anyone can call them. This is insecure for production. Strongly recommend setting a strong secret before exposing the service. Add GOMODEL_MASTER_KEY to your .env or environment for production deployments.
GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.
Hashes the full request body (path + Workflow + body) and returns a stored response on byte-identical requests. Sub-millisecond lookup. Activate by environment variables: RESPONSE_CACHE_SIMPLE_ENABLED and REDIS_URL.
Responses served from this layer carry X-Cache: HIT (exact).
Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.
Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.
Responses served from this layer carry X-Cache: HIT (semantic).
Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).
Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.
See DEVELOPMENT.md for testing, linting, and pre-commit setup.
user_path and/or API key/responses lifecycle/messages ingress and /messages/count_tokens/conversations lifecycleJoin our Discord to connect with other GoModel users.
Native macOS app to monitor Claude AI usage limits and watch your coding sessions live
干净、强大、属于你的 AI Agent 平台 --AI agents, without the clutter.
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
npx CLI installing 100+ agents, commands, hooks, and integrations in one command