Run AI agents fully autonomously on a filesystem directory — MCP servers and skills enabled, with zero human approvals or permission prompts — safely isolated in a hardened Docker container (or locally).
It also decouples agent execution from your application. Instead of embedding
agent SDKs, sandboxes, and long-running jobs inside your app, you run the runspace
server once and treat every agent run as a single HTTP request — POST a job
(editable dir, context, prompt, which agent, and the skills + MCP servers it should
have) and poll for the result. Agents are pluggable behind one uniform API via a
small FilesystemAgent protocol — Claude Code ships built-in, and others (Codex,
etc.) can be added — so your application talks to the same endpoint no matter which
agent runs underneath.
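The "POST a job and poll" flow can be sketched as a small polling helper. This is a minimal sketch, not the project's client: the `status` field and the `completed` state name are assumptions (the text only confirms that sessions can be marked failed), and `fetch_status` is a hypothetical callable that would wrap a `GET /sessions/{id}` request.

```python
import time
from typing import Callable, Mapping

# Assumed terminal states; the real server may use different names.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_session(
    fetch_status: Callable[[], Mapping[str, object]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, object]:
    """Call fetch_status until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        details = fetch_status()
        if details.get("status") in TERMINAL_STATES:
            return details
        if time.monotonic() >= deadline:
            raise TimeoutError("session did not finish in time")
        sleep(interval_s)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library, so the application side stays the same whichever agent runs underneath.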
Given an editable directory (the agent's workspace), a read-only context directory
(traces, domain knowledge, etc.), a prompt, and optionally the skills and
MCP servers to enable, runspace_agent runs an AI agent that modifies the
editable directory and then exits.
Because each session runs inside a locked-down container by default, the agent
operates unattended: it can use its full toolset — MCP servers, skills, shell,
file edits, web access — without stopping to ask for permission, since the container
boundary (not a human in the loop) is what keeps it safe. The same agent can also run
locally with --no-docker, where filesystem hooks confine it to the session
workspace.
Install by name and pick the extras you need. Container mode is optional — it
needs the [container] extra (plus Docker); without it, agents run in local
mode only.
python -m pip install "runspace-agent[all]" # everything (recommended): server + UI, Claude agent, local + container modes
python -m pip install "runspace-agent[server]" # HTTP server + Web UI (local mode; add [container] for container mode)
python -m pip install "runspace-agent[claude]" # the built-in Claude Code agent
python -m pip install "runspace-agent[container]" # container execution mode (optional; local mode needs no Docker)
python -m pip install runspace-agent # core library only — run your own agent locally; no server, no container
Mix extras as needed, e.g. python -m pip install "runspace-agent[server,claude,container]".
(Quote the brackets — some shells treat [] as a glob.) After installing with
[server] (or [all]), start it with runspace-srv — see Quick Start.
Why python -m pip and not just pip? It installs into the active interpreter/virtualenv. A bare pip may point at a different Python. The tell is "Defaulting to user installation…", which lands the package (and the runspace-srv launcher) outside your venv and off your PATH. runspace-srv is the same command for every install method; if your shell still can't find it, run python -m runspace_agent instead.
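To check which case you are in, a short diagnostic like the following can help. It is a sketch using only the standard library; the function names are illustrative and not part of runspace-agent.

```python
import shutil
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def launcher_on_path(name: str = "runspace-srv") -> bool:
    """True when the launcher script can be found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
    print("runspace-srv on PATH:", launcher_on_path())
```

Run it with the same `python` you used for `python -m pip install`; if the launcher is not on PATH, the `python -m runspace_agent` fallback mentioned above still applies.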
Only needed if you're working on runspace-agent itself (from a clone of this
repo). Normal users should install from PyPI as shown above. The -e flag does an
editable install so your code changes take effect without reinstalling.
# Create venv with Python 3.11
uv venv --python 3.11
Activate the virtual environment:
source .venv/bin/activate   # macOS/Linux
.venv\Scripts\activate      # Windows
Select the installation option based on your needs:
# Core library only
uv pip install -e .
# With Claude Code agent support
uv pip install -e ".[claude]"
# With Docker container support
uv pip install -e ".[container]"
# With FastAPI server + UI
uv pip install -e ".[server]"
# Everything
uv pip install -e ".[all]"
# Dev dependencies
uv pip install -e ".[dev]"
The fastest way to get started is with the built-in server:
python -m pip install "runspace-agent[all]"
runspace-srv
This starts the API server + React UI on port 6767. Sessions run inside a Docker
container by default, so starting the server runs a Docker pre-flight (verify
the daemon, build runspace-agent:latest if missing). To run sessions locally
on the host instead — no Docker required — start with --no-docker:
runspace-srv --no-docker
An explicit mode in a POST /run request always overrides this default; the CLI
flag only sets the default for requests that don't specify one.
Open http://localhost:6767/ui in your browser.
# Custom port (environment-only; default 6767)
RUNSPACE_PORT=9000 runspace-srv
# Enable auto-reload on code changes (off by default)
runspace-srv --watch
# Custom session TTL (default: 8 hours)
runspace-srv --session-ttl 24
See Configuration for the environment variables the server reads.
Configure the agent using ClaudeCodeOptions from the Claude Code SDK:
import asyncio
from pathlib import Path

from claude_code_sdk import ClaudeCodeOptions
from runspace_agent import RunspaceSession, run_agent
from runspace_agent.agents.claude_code import ClaudeCodeAgent

options = ClaudeCodeOptions(
    env={
        "ANTHROPIC_BASE_URL": "https://your-api-proxy.example.com",
        "ANTHROPIC_AUTH_TOKEN": "sk-...",
        "ANTHROPIC_MODEL": "claude-opus-4-8",
    },
    max_turns=50,
    # Any ClaudeCodeOptions field is supported: model, mcp_servers,
    # allowed_tools, disallowed_tools, append_system_prompt, etc.
)
agent = ClaudeCodeAgent(options=options)

session = RunspaceSession(
    editable_dir=Path("./my_project"),
    context_dir=Path("./context"),
    prompt="Improve the code based on the traces in the context directory.",
    agent=agent,
    # Pull skills into the agent at run setup via `npx skills add`.
    # Combine with preinstalled_skills / skills_dir (see Skills below).
    remote_skills=["vercel-labs/agent-skills"],
)

result = asyncio.run(run_agent(session))
print(f"Success: {result.success}, Session: {result.session_id}")
The server is configured through environment variables — nothing is read from a
.env file automatically. Set them in your shell, process manager, or container
runtime before starting runspace-srv.
| Variable | Default | Description |
|---|---|---|
| RUNSPACE_PORT | 6767 | Port the server listens on. |
| RUNSPACE_DATA_DIR | {system-temp}/runspace | The runspace home directory; it contains the sessions/ folder where every session's managed data lives. Set it to keep session data in a stable, inspectable location. See Session Storage. |
Agent credentials (e.g. ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN) are not
server config — they are passed per request via agent_settings.env on
POST /run (or ClaudeCodeOptions.env for the library). See
Agent Credentials.
If you prefer to keep these in a file, you can export it yourself before starting the server:
set -a; source my-env-file; set +a
runspace-srv
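If you launch the server from Python rather than a shell, the same idea can be sketched as below. This is an illustrative helper, not part of runspace-agent: it handles only plain KEY=VALUE lines (no shell expansion, unlike `source`), and the function names are assumptions.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any, around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def server_env(path: Path) -> dict[str, str]:
    """Current environment overlaid with the values from the file."""
    return {**os.environ, **load_env_file(path)}
```

Passing `server_env(Path("my-env-file"))` as `env=` to `subprocess.run(["runspace-srv"], ...)` would then start the server with those variables set.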
Any agent that implements the FilesystemAgent protocol can be used:
from pathlib import Path

from runspace_agent.agents.base import FilesystemAgent, Workspace, AgentResult


class MyCustomAgent:
    skills_folder_name = ".my_agent/skills"
    default_skills_dir = Path("./my_bundled_skills")  # or None
    npx_agent_name = "my-agent"  # "-a" value for `npx skills`, or None

    async def run(self, workspace: Workspace) -> AgentResult:
        # Read from workspace.context_dir
        # Modify files in workspace.editable_dir
        # Follow workspace.prompt
        return AgentResult(success=True)
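To make the shape of `run` concrete, here is a toy agent that runs without runspace-agent installed. `Workspace` and `AgentResult` below are local stand-ins with only the fields named above; the real classes may carry more, and `NoteWritingAgent` is purely hypothetical.

```python
import asyncio
import tempfile
from dataclasses import dataclass
from pathlib import Path


# Stand-ins for runspace_agent.agents.base.Workspace / AgentResult so the
# sketch is self-contained; the real classes may have additional fields.
@dataclass
class Workspace:
    editable_dir: Path
    context_dir: Path
    prompt: str


@dataclass
class AgentResult:
    success: bool


class NoteWritingAgent:
    """Toy agent: records the prompt and context file names in NOTES.md."""

    skills_folder_name = ".note_agent/skills"
    default_skills_dir = None
    npx_agent_name = None

    async def run(self, workspace: Workspace) -> AgentResult:
        context_files = sorted(p.name for p in workspace.context_dir.iterdir())
        notes = workspace.editable_dir / "NOTES.md"
        notes.write_text(
            f"Prompt: {workspace.prompt}\nContext: {', '.join(context_files)}\n"
        )
        return AgentResult(success=True)


def demo() -> str:
    """Run the toy agent against throwaway directories and return its note."""
    with tempfile.TemporaryDirectory() as tmp:
        editable = Path(tmp) / "editable"
        context = Path(tmp) / "context"
        editable.mkdir()
        context.mkdir()
        (context / "trace.log").write_text("example trace\n")
        workspace = Workspace(editable, context, "tidy up")
        result = asyncio.run(NoteWritingAgent().run(workspace))
        assert result.success
        return (editable / "NOTES.md").read_text()
```

The agent only reads from the context directory and only writes inside the editable directory, which is the contract the protocol comments describe.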
The built-in ClaudeCodeAgent uses the Claude Agent SDK to run Claude Code headlessly. Configure it by passing
a ClaudeCodeOptions object: every SDK field is forwarded automatically, except
the following fields, which the agent enforces for security (any values you set
are overridden):
| Field | Enforced value | Reason |
|---|---|---|
| permission_mode | "bypassPermissions" | Headless container, no human to approve |
| cwd | workspace directory | Sandbox boundary |
| system_prompt | Headless prompt | Prevents interactive prompts |
| hooks | Sandbox hooks | Filesystem isolation enforcement |
If you don't set these, sensible defaults are applied:
| Field | Default |
|---|---|
| allowed_tools | Read, Write, Edit, Bash, Glob, Grep, Skill, WebSearch, WebFetch, Agent, LSP |
| max_turns | 300 |
Everything else is fully configurable: model, env, mcp_servers,
append_system_prompt, allowed_tools, disallowed_tools, add_dirs,
extra_args, user, etc.
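The precedence described by the two tables (defaults, then your values, then enforced fields) can be illustrated with plain dictionaries. This is a sketch of the layering only, not the agent's actual code; `effective_options` is a hypothetical name, and the enforced `system_prompt` and `hooks` are left out because their concrete values are not given here.

```python
from typing import Any


def effective_options(user: dict[str, Any], workspace_dir: str) -> dict[str, Any]:
    """Layer options: defaults < user-supplied values < enforced values."""
    defaults = {
        "allowed_tools": [
            "Read", "Write", "Edit", "Bash", "Glob", "Grep",
            "Skill", "WebSearch", "WebFetch", "Agent", "LSP",
        ],
        "max_turns": 300,
    }
    # Only the enforced fields with a concrete documented value are shown.
    enforced = {
        "permission_mode": "bypassPermissions",
        "cwd": workspace_dir,
    }
    return {**defaults, **user, **enforced}
```

For example, passing `{"max_turns": 50, "permission_mode": "default"}` keeps `max_turns` at 50 but still ends up with `permission_mode` set to `"bypassPermissions"`.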
Skills are agent-specific tool extensions. Each FilesystemAgent declares a
skills_folder_name (e.g. .claude/skills for Claude Code) and the library
copies skills into the workspace.
There are three ways to give an agent skills, and you can use any or all:
Preinstalled skills (preinstalled_skills)
Each agent, including the built-in ClaudeCodeAgent, ships with a set of bundled skills.
Preinstalled skills are opt-in — none are included unless you select them by name. Only the names you list are loaded:
# No preinstalled skills (default)
session = RunspaceSession(...)
# Include only the ones you select
session = RunspaceSession(..., preinstalled_skills=["mcp-builder"])
Custom skills (skills_dir)
Point skills_dir at a directory that contains one subdirectory per skill
(each with its own SKILL.md):
my-skills/
├── my-skill/
│ └── SKILL.md
└── another-skill/
└── SKILL.md
session = RunspaceSession(..., skills_dir=Path("my-skills"))
Combine both — select preinstalled skills and supply your own directory. Custom skills override preinstalled ones with the same name:
session = RunspaceSession(
...,
preinstalled_skills=["mcp-builder"],
skills_dir=Path("my-skills"),
)
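The directory layout and the override rule can be sketched in a few lines. This is an illustration under assumptions, not the library's loader: `discover_skills` and `merge_skills` are hypothetical helpers, and treating a skill as "any subdirectory containing SKILL.md" follows the layout shown above.

```python
from pathlib import Path


def discover_skills(skills_dir: Path) -> dict[str, Path]:
    """Map skill name -> folder for each subdirectory that has a SKILL.md."""
    return {
        child.name: child
        for child in sorted(skills_dir.iterdir())
        if child.is_dir() and (child / "SKILL.md").is_file()
    }


def merge_skills(
    preinstalled: dict[str, Path], custom: dict[str, Path]
) -> dict[str, Path]:
    """Custom entries win when a name appears in both mappings."""
    return {**preinstalled, **custom}
```

Running `discover_skills(Path("my-skills"))` before starting a session is a cheap way to catch a skill folder that is missing its SKILL.md.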
Use GET /skills to list the available preinstalled skills via the API.
Remote skills (remote_skills)
Pull skills straight from a repo at run setup using the
skills CLI (npx skills add). remote_skills is a list
of sources, anything npx skills add accepts: an owner/repo slug, a
GitHub URL, or a repo subpath:
session = RunspaceSession(
...,
remote_skills=[
"vercel-labs/agent-skills",
"https://github.com/owner/repo",
"owner/repo/path/to/skill",
],
)
Or via the /run API:
{ "remote_skills": ["vercel-labs/agent-skills"] }
Each source is installed into the agent's skills folder scoped to the agent's
npx_agent_name (claude-code for Claude Code), i.e. roughly:
npx skills add <source> -a claude-code -s '*' -y --copy
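Built as an argument list, the command above might look like this. The flags are copied from the line above; the helper name is hypothetical, and since that command is described as approximate, the library's real invocation may differ.

```python
import shlex


def skills_add_command(source: str, agent_name: str = "claude-code") -> list[str]:
    """Argument list for installing one remote skill source for one agent."""
    return ["npx", "skills", "add", source, "-a", agent_name, "-s", "*", "-y", "--copy"]


if __name__ == "__main__":
    # shlex.join quotes the "*" so the printed line is safe to paste in a shell.
    print(shlex.join(skills_add_command("vercel-labs/agent-skills")))
```

Passing a list to `subprocess.run` (rather than a shell string) avoids the shell expanding the `*` selector.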
Notes:
- Remote skills need npx in the run environment. The container image already
ships Node, so container mode works out of the box; local mode needs
npx on the host (for container runs it's installed inside the container and never run on the host).
- remote_skills can be combined with preinstalled_skills and skills_dir.

In local mode, PreToolUse hooks restrict the agent to the session directory: it cannot read or write files outside the workspace.
In container mode, Docker provides true isolation:
- --cap-drop ALL: no Linux capabilities
- --security-opt no-new-privileges: no privilege escalation

Running the server on your own machine? Prefer container mode. It gives the agent a hard isolation boundary: it runs inside the container and the only part of your computer it can touch is the single session workspace directory, which is bind-mounted into the container. Nothing else on your PC is visible or changeable. Container mode also does not write back to your original editable_dir: it works on a copy in the workspace, so your source files stay untouched.

Local mode, by contrast, runs the agent as a process directly on your host. The PreToolUse hooks keep it inside the session directory, but it shares your machine and user, and on success it syncs results back to your original editable_dir, i.e. it does change files on your PC. Use local mode for trusted, fast iteration; use container mode when you want your machine protected from whatever the agent does.
| Mode | When to use |
|---|---|
| mode="local" | Development, debugging, fast iteration |
| mode="container" | Production: full isolation, a fresh container per run |
In container mode each run gets a brand-new container that is auto-removed
(--rm) when it exits. The container is disposable but your data is not: the agent
writes to a host directory bind-mounted at /workspace, so the editable output,
diffs, and conversation remain on the host after the container is removed (see
Sandbox). Container runs never sync back to your original
editable_dir — changes live in the session workspace and are fetched via the API.
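Put together, the flags mentioned so far suggest a `docker run` invocation along these lines. This is a sketch assembled only from the options named in this document (`--rm`, `--cap-drop ALL`, `--security-opt no-new-privileges`, a bind mount at `/workspace`); the real command very likely passes more, and `container_argv` is a hypothetical name.

```python
def container_argv(
    session_workspace: str, image: str = "runspace-agent:latest"
) -> list[str]:
    """docker run arguments for one disposable, locked-down agent container."""
    return [
        "docker", "run",
        "--rm",                                   # remove the container on exit
        "--cap-drop", "ALL",                      # no Linux capabilities
        "--security-opt", "no-new-privileges",    # no privilege escalation
        "-v", f"{session_workspace}:/workspace",  # only the session dir is visible
        image,
    ]
```

Because the only mount is the session workspace, everything the agent produces survives in that host directory after `--rm` deletes the container.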
Each POST /run selects its own execution mode with the mode field
("local" or "container"), independently of how the server was started. The CLI
flag only sets the default for requests that omit mode:
- runspace-srv (default, container): runs a Docker pre-flight at startup. It verifies the Docker daemon is running and builds the runspace-agent:latest image if it's missing. If Docker isn't available, the server fails fast at startup with a clear error, so container runs are guaranteed to work.
- runspace-srv --no-docker: skips that pre-flight and defaults requests to local.

Because the pre-flight only happens at startup, a server started with --no-docker
has not verified Docker or built the image. A request that then explicitly asks
for mode: "container" will fail at run time (the session is marked failed,
with a Docker error). If you want to serve container runs, start the server the
normal way (with Docker) rather than --no-docker.
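The selection rule reads naturally as a small function. The sketch below mirrors the behaviour described above rather than the server's code; both function names are hypothetical.

```python
from typing import Optional

VALID_MODES = ("local", "container")


def effective_mode(request_mode: Optional[str], started_with_no_docker: bool) -> str:
    """An explicit request mode wins; otherwise use the server's default."""
    if request_mode is None:
        return "local" if started_with_no_docker else "container"
    if request_mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {request_mode!r}")
    return request_mode


def container_without_preflight(
    request_mode: Optional[str], started_with_no_docker: bool
) -> bool:
    """True for the risky case: a container run on a server that skipped the Docker check."""
    return (
        started_with_no_docker
        and effective_mode(request_mode, started_with_no_docker) == "container"
    )
```

A client could use the second check to warn before sending `mode: "container"` to a server it knows was started with `--no-docker`.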
Credentials are agent-specific — there is no global API-key setting in the
runspace layer. Each FilesystemAgent decides what it needs to reach its model and
exposes it through its own configuration, which you supply per session via
agent_settings (HTTP API) or the agent's options object (Python API). Whatever you
provide is what the agent process receives in its environment.
Claude Code agent (built-in). Authenticate by setting credentials in the agent's
env: either an Anthropic API key, or a base URL + auth token for a proxy/gateway.
- agent_settings.env on POST /run:

  {
    "editable_dir": "...", "context_dir": "...", "prompt": "...",
    "agent_settings": {
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
        // or, for a proxy/gateway:
        // "ANTHROPIC_BASE_URL": "https://your-gateway.example.com",
        // "ANTHROPIC_AUTH_TOKEN": "...",
        // "ANTHROPIC_MODEL": "claude-opus-4-8"
      }
    }
  }

- ClaudeCodeOptions(env={...}) (see Python API above).
- build_claude_env() helper in runspace_agent.agents.claude_code (reads
ANTHROPIC_API_KEY / ANTHROPIC_AUTH_TOKEN / ANTHROPIC_BASE_URL / ANTHROPIC_MODEL).

Other agents. A different agent type authenticates however it requires: its own
API-key variable, a token file, a config field, etc. Expose those through the same
agent_settings (the agent reads them when it builds its options); nothing in the
runspace layer is Anthropic-specific.
Container vs local: a local-mode run may inherit auth from the host environment, but a container is a clean room — it only receives what you put in
agent_settings.env. If a container session fails immediately with a non-zero exit, missing credentials are the most common cause.
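Since missing credentials are the usual culprit, a client-side check before submitting a container run can save a round trip. This is a sketch of the "API key, or base URL plus auth token" rule stated above; `has_claude_credentials` is a hypothetical helper, not part of runspace-agent.

```python
from typing import Mapping


def has_claude_credentials(env: Mapping[str, str]) -> bool:
    """True if env holds an API key, or both a base URL and an auth token."""
    if env.get("ANTHROPIC_API_KEY"):
        return True
    return bool(env.get("ANTHROPIC_BASE_URL") and env.get("ANTHROPIC_AUTH_TOKEN"))
```

Call it on the dictionary you are about to send as `agent_settings.env`, not on `os.environ`, because a container only sees what the request carries.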
| Method | Endpoint | Description |
|---|---|---|
| POST | /run | Start a new agent session |
| GET | /sessions | List all sessions with status |
| GET | /sessions/{id} | Session details (tokens, duration, status) |
| DELETE | /sessions/{id} | Delete session and cleanup workspace |
| GET | /sessions/{id}/files | Browse workspace file tree (JSON) |
| GET | /sessions/{id}/files/{path} | View/download a specific file |
| GET | /sessions/{id}/editable.zip | Download editable dir as zip |
| GET | /sessions/{id}/diff | Unified diffs for all changed files |
| GET | /sessions/{id}/diff/{path} | Diff for a single file |
| GET | /sessions/{id}/conversation | Agent conversation trajectory (JSON) |
| GET | /sessions/{id}/summary | Agent-generated session summary |
| GET | /skills | List preinstalled/bundled skills |
| GET | /ui | Web UI (React SPA) |
| GET | /docs | Swagger interactive API docs |
| GET | /redoc | ReDoc API docs |
Sessions are cleaned up automatically after 8 hours of inactivity (configurable with --session-ttl).
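The inactivity rule amounts to a timestamp comparison. The sketch below assumes the TTL is measured in hours from the last recorded activity, as the `--session-ttl 24` example suggests; how the server actually tracks activity is not specified here, and `is_expired` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def is_expired(last_activity: datetime, now: datetime, ttl_hours: float = 8.0) -> bool:
    """True once the session has been idle for at least ttl_hours."""
    return now - last_activity >= timedelta(hours=ttl_hours)


if __name__ == "__main__":
    started = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
    checked = datetime(2025, 1, 1, 18, 0, tzinfo=timezone.utc)
    print(is_expired(started, checked))                # 9 hours idle, default TTL
    print(is_expired(started, checked, ttl_hours=24))  # same gap, 24-hour TTL
```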
Sessions are single-use by design. There is no recall or resume within the same session — each run gets a fresh session with its own context and editable directory.
To "recall" the agent (e.g., run another improvement iteration), create a new session with the updated editable directory and fresh context. This avoids the complexity of maintaining conversation history, summarization across runs, and stale state.
Sessions are automatically cleaned up after the configured TTL (default: 8
hours of inactivity, set via --session-ttl). If you want to free disk space
on the host (or container volume) immediately rather than waiting for the
auto-cleanup, delete the session explicitly after downloading the results:
# 1. Download the improved editable directory
curl -O http://localhost:6767/sessions/{session_id}/editable.zip
# 2. (Optional) Delete the session immediately to free space
# Otherwise it will be auto-deleted after the session TTL expires
curl -X DELETE http://localhost:6767/sessions/{session_id}
# 3. Next iteration: create a new session with the updated files
curl -X POST http://localhost:6767/run \
-H "Content-Type: application/json" \
-d '{"editable_dir": "./updated_skill", "context_dir": "./new_context", "prompt": "..."}'
Everything lives under a runspace home directory, with one subdirectory per
session under sessions/ (each holding its editable copy, pre-run snapshot,
context, conversation, and metadata). The sessions/ folder is what the UI scans
and what the file/diff/download endpoints read.
<home>/
└── sessions/
└── <session_id>/
├── agent_workspace/ (editable/ + context/ + skills)
├── editable_original/ (pre-run snapshot, for diffs)
└── ...
By default the home is {system-temp}/runspace (so the shared temp dir stays
namespaced). Set the optional RUNSPACE_DATA_DIR environment variable to make
that directory the home directly — it will then contain sessions/:
runspace-srv # -> {tmp}/runspace/sessions/<id>/
RUNSPACE_DATA_DIR=~/.runspace runspace-srv # -> ~/.runspace/sessions/<id>/
This is handy when you want all managed session data in one known place (e.g. to inspect, back up, or persist it across reboots) rather than scattered in temp.
See Configuration for all server environment variables.
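Locating a session's folder from a script follows directly from these rules. This is a sketch based on the layout and defaults described in this section; the helper names are hypothetical, and `tempfile.gettempdir()` is assumed to match what the document calls {system-temp}.

```python
import tempfile
from pathlib import Path
from typing import Mapping


def runspace_home(environ: Mapping[str, str]) -> Path:
    """RUNSPACE_DATA_DIR if set, otherwise {system-temp}/runspace."""
    override = environ.get("RUNSPACE_DATA_DIR")
    if override:
        return Path(override).expanduser()
    return Path(tempfile.gettempdir()) / "runspace"


def session_dir(session_id: str, environ: Mapping[str, str]) -> Path:
    """Folder holding one session's workspace, snapshot, and metadata."""
    return runspace_home(environ) / "sessions" / session_id
```

Passing `os.environ` as `environ` reproduces the server's view, provided the script runs with the same environment as `runspace-srv`.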
If you prefer to manage Docker separately:
uv pip install -e ".[server]"
uv run uvicorn runspace_agent.server.app:app --host 0.0.0.0 --port 6767
# Development (auto-reload)
uv run uvicorn runspace_agent.server.app:app --host 0.0.0.0 --port 6767 --reload
The web UI is a React + TypeScript + Tailwind CSS app built with Vite.
cd frontend
npm install
There are two ways to work with the UI:
| | Vite dev server (npm run dev) | Production build (npm run build) |
|---|---|---|
| URL | http://localhost:5173/ui | http://localhost:6767/ui |
| Hot reload | Yes — changes appear instantly | No — must rebuild manually |
| API | Proxied to localhost:6767 | Served by FastAPI directly |
| Use when | Developing the frontend | Testing the final build / production |
1. Start the backend: runspace-srv
2. Start the Vite dev server: cd frontend && npm run dev
3. Run npm run build to update the production UI at localhost:6767/ui

Note: The UI at localhost:6767/ui is a static build baked into the Python package. It does not auto-update when you edit frontend source files; you must run npm run build to rebuild it.
The runspace-agent:latest image (required for container mode) is managed for you
by the Docker pre-flight that runs at startup by default — you never have to build
it by hand:
runspace-srv # builds the image only if missing, then serves
runspace-srv --rebuild # force a rebuild even if it exists, then serves
runspace-srv --no-docker # skip Docker entirely; run sessions locally
- runspace-srv: checks whether runspace-agent:latest exists and builds it only if it's missing. If the image is already present it's reused as-is (no rebuild), so startup is fast on subsequent runs.
- runspace-srv --rebuild: forces a rebuild of the image even when it already exists. Use this after upgrading runspace-agent or changing the Dockerfile.
- runspace-srv --no-docker: skips Docker entirely (no daemon check, no image build), and sessions default to local execution on the host. A request that still asks for mode: "container" will fail; see Choosing the mode per request.

The build context is assembled from the installed package, so this works the same whether runspace-agent was installed editable, from git, or from a wheel. No repo checkout is required.
Containers auto-remove after each run. All output (conversation, diffs, files) is persisted on the host via the volume mount — the container is just a throwaway execution environment.
To clean up any leftover stopped containers:
docker container prune
uv pip install -e ".[dev]"
uv run pytest tests/
Ready-to-run examples live in examples/. Each example supports three execution modes:
| Mode (command suffix) | Requires | Description |
|---|---|---|
| server | runspace-srv running | Sends a request to the HTTP server (recommended) |
| library-container | Docker | Calls the Python library directly, runs in Docker |
| library-local | No Docker needed | Runs locally, modifies editable/ in place |
All modes require ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN environment variables.
Fixes bugs in a skill based on execution traces and domain knowledge:
# Start the server first (in a separate terminal)
runspace-srv
# Then run the example
uv run python examples/skill_improvement/run.py server
Optimizes a skillberry-store skill (airline customer service for tau-bench) using traces and evaluation criteria:
# Start the server first (in a separate terminal)
runspace-srv
# Then run the example
uv run python examples/skillberry_store_skill/run.py server
Replace server with library-container or library-local for alternative modes.
For a map of the modules and how a run flows through them, see docs/architecture.md.
Contributions are welcome. See CONTRIBUTING.md for the local setup, how to run the linter and tests, and the conventions we follow.
Licensed under the Apache License 2.0.