A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
Are you the author? Sign in to claim
Build Claude Code–style deep agents in Python: tool-calling, sandboxed execution, multi-agent teams, skills, checkpoints
The deep agent that forks itself.
Split one task into N parallel branches, let an AI judge merge the winner —
in your terminal, or in one function call. 100% type-safe. Any model. Self-hosted.
Docs · PyPI · Forking · Why · CLI · Framework · Examples
Most agents give you one shot at a task. They pick an approach, commit to it, and if it's wrong you start over.
Pydantic Deep Agents can fork mid-run. One agent.run() splits into several branches that each try a different approach in parallel — isolated filesystems, separate budgets, independent reasoning. An AI judge (or you) picks the winner, and its history becomes the run's continuation. It's git branch for an agent's thinking.
That's one feature. There are forty more — planning, multi-agent swarms, persistent memory, sandboxed execution, skills, MCP, checkpoints, cost tracking — all batteries-included, all behind a single function call, all 100% type-safe.
# Terminal AI assistant — no Python setup required
curl -fsSL https://raw.githubusercontent.com/vstorm-co/pydantic-deep/main/install.sh | bash
pydantic-deep
# Or build your own agent
pip install pydantic-deep
Claude Code can't do this. Aider can't. LangGraph and CrewAI can't. It's the reason to use pydantic-deep.
When an agent hits a fork in the road — "should I refactor this with a decorator or a context manager?" — most tools force one bet. Pydantic Deep Agents lets the run branch:
┌── branch A: "use a decorator" ── tests: 8/8 ✓ conf 0.71
agent.run("refactor auth") ──┬─┼── branch B: "use a context manager" ── tests: 6/8 ✗ conf 0.42
(shared history) │ └── branch C: "extract a base class" ── tests: 8/8 ✓ conf 0.55
│
└──► ⚖️ AI judge weighs quality + tests + consistency
→ adopts branch A, continues the run
Each branch is fully isolated: a copy-on-write filesystem overlay (reads fall through to the parent, writes stay local), its own steering message, and its own budget_usd cap. The coordinator resolves the fork with one of four acceptance modes — manual, auto, auto_with_fallback (default), or vote — and the winning branch's history is adopted as the parent run's continuation.
Framework — opt in with one flag:
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
forking=True, # gives the agent: fork_run, inspect_branches,
) # merge_or_select, diff_branches, fork_cost, terminate_branch
Or run a real test command against every branch and let exit codes decide the winner:
from pydantic_deep import LiveForkCapability
agent = create_deep_agent(
forking=LiveForkCapability(test_command="pytest -q", test_timeout_s=120),
)
# confidence = quality_spread·0.4 + test_pass_ratio·0.4 + internal_consistency·0.2
CLI — fork an in-flight conversation, watch branches stream live, merge the best:
/fork # split the current run into N parallel branches
>>A try a decorator # steer branch A
>>B use a contextmgr # steer branch B
/merge # resolve — manual picker, AI judge, or vote
Live per-branch panels stream each approach side by side; a judge screen scores them; you accept, review the diff, or decline. Configure branch count, budgets, per-branch models, and merge strategy with /fork-config.
📖 Full reference: docs/capabilities/live-fork.md
The only tool that is a terminal assistant and a Python framework and can fork its own runs — without giving up type safety or your choice of model.
| Pydantic Deep | Claude Code | Aider | LangGraph | CrewAI | |
|---|---|---|---|---|---|
| Terminal TUI assistant | ✅ | ✅ | ✅ | — | — |
| Python framework / library | ✅ | — | ~ | ✅ | ✅ |
| Live run forking + AI judge | ✅ | — | — | — | — |
| Multi-agent swarm + message bus | ✅ | ~ | — | ✅ | ✅ |
| Any model / any provider | ✅ | Anthropic | ✅ | ✅ | ✅ |
| Sandboxed Docker execution | ✅ | — | ~ | DIY | DIY |
| Persistent memory + skills | ✅ | ✅ | — | DIY | ~ |
| Type-safe structured output | ✅ | — | — | ~ | ~ |
| MCP servers | ✅ | ✅ | — | ~ | ~ |
| Self-hosted, open source | ✅ MIT | — | ✅ | ✅ | ✅ |
✅ first-class · ~ partial / via extensions · — not available · DIY you wire it yourself. Comparison reflects each project as of 2026-06; corrections welcome via PR.
agent.run() into N parallel branches with copy-on-write isolation, per-branch budgets, a test-runner hook, and four merge modes (manual / auto / auto_with_fallback / vote). Opt in with forking=True./mcp command. Plus a full CLI presentation pass: clipboard image paste, real +/- diffs, tool icons, turn summaries.fallback_model= wraps your primary in a FallbackModel chain; fires on API errors but never on auth errors. Plus a batteries-included security hook preset (default_security_hook()) and three new output styles (markdown, json-only, bullet).include_liteparse=True) — PDFs, DOCX, XLSX, PPTX, and images with optional OCR, all local.pydantic-deep run), Docker sandbox with named workspaces, browser automation via Playwright.Full history: CHANGELOG.md
Pydantic Deep Agents is an agent harness — the complete infrastructure that wraps an LLM and makes it a functional autonomous agent. The model provides intelligence; the harness provides planning, tools, memory, sandboxed execution, unlimited context, and — uniquely — the ability to fork.
| ⑂ Live run forking | Split a run into N isolated branches, each trying a different approach. AI judge or test results pick the winner. No other agent framework has this. |
| 🔧 Tool-calling | File read/write/edit, shell execution, glob, grep, web search, web fetch, browser automation — wired up and ready. |
| 🤝 Multi-agent / swarm | Spawn subagents for parallel workstreams. Shared TODO lists with claiming. Peer-to-peer message bus. Full team coordination. |
| 🧠 Persistent memory | MEMORY.md persists across sessions. Auto-injected into the system prompt. Each agent has isolated memory by default. |
| ♾️ Unlimited context | Auto-summarization when approaching the token budget. LLM-based or zero-cost sliding window. Never hits a context wall. |
| 🐳 Sandboxed execution | Docker sandbox with named workspaces. Installed packages persist between sessions. Project dir mounted at /workspace. |
| 🗂️ Plan Mode | Dedicated planner subagent asks clarifying questions and structures the work before execution begins. Headless-compatible. |
| 🔖 Checkpoints | Save conversation state at any point. Rewind to any checkpoint. Fork sessions to explore alternative approaches. |
| 📚 Skills system | Domain-specific knowledge loaded on demand from SKILL.md files. Built-in: code-review, refactor, test-writer, git-workflow, and more. |
| 📄 Document parsing | Parse PDFs, DOCX, XLSX, PPTX, and images with optional OCR via LiteParse. Runs locally — no cloud services required. |
| 🔌 MCP | Connect any Model Context Protocol server — GitHub, Figma (OAuth), Context7, DeepWiki, or custom. Import straight from Claude Code. |
| ⚡ Lifecycle hooks + security preset | Claude Code-style PRE/POST_TOOL_USE hooks. Shell or Python handlers. default_security_hook() blocks destructive commands out of the box. |
| 📐 Structured output | Type-safe Pydantic model responses via output_type. No JSON parsing. No dict["key"]. Full IDE autocomplete. |
| 🔁 Fallback models | Primary model fails? fallback_model= hops to the next in the chain — on API errors, never on auth errors. |
| 🔄 Stuck loop detection | Detects repeated identical tool calls, A-B-A-B alternating patterns, and no-op calls. Warns the model or stops the run. |
| 💰 Cost tracking | Real-time token and USD cost tracking per run and cumulative. Hard budget limits with BudgetExceededError. |
| ✨ Self-improving | /improve analyzes past sessions and proposes updates to MEMORY.md, SOUL.md, and AGENTS.md. |
| 🏷️ 100% type-safe | Pyright strict + MyPy strict. 100% test coverage. Every public API is fully typed — safe to use in production. |
Built natively on pydantic-ai — uses the Capabilities API directly, inherits all pydantic-ai streaming, multi-model support, and Pydantic validation automatically.
A Claude Code-style terminal AI assistant that works with any model and any provider — and forks.
curl -fsSL https://raw.githubusercontent.com/vstorm-co/pydantic-deep/main/install.sh | bash
No Python setup required — the script installs uv and the CLI automatically. Then:
export ANTHROPIC_API_KEY=sk-ant-...
pydantic-deep
Windows / manual:
pip install "pydantic-deep[cli]"· Update:pydantic-deep update
Works with any model that supports tool-calling:
| Provider | Example models |
|---|---|
| Anthropic | anthropic:claude-opus-4-6, claude-sonnet-4-6 |
| OpenAI | openai:gpt-5.4, gpt-4.1 |
| OpenRouter | openrouter:anthropic/claude-opus-4-6 (200+ models) |
| Google Gemini | google-gla:gemini-2.5-pro |
| Ollama (local) | ollama:qwen3, ollama:llama3.3 |
| Any OpenAI-compatible | Custom base URL via env |
Switch model anytime: pydantic-deep config set model openai:gpt-5.4 or /model in the TUI.
| Feature | |
|---|---|
| ⑂ | Live run forking — split a run into branches, stream them side by side, merge the winner |
| 💬 | Streaming chat with tool call visualization, icons, and real +/- diffs |
| 📁 | File read / write / edit, shell execution, glob, grep |
| 🤝 | Task planning, plan mode, and subagent delegation |
| 🧠 | Persistent memory and self-improvement across sessions |
| ♾️ | Context compression for unlimited conversations |
| 🔖 | Checkpoints — save, rewind, and fork any session |
| 🔌 | MCP servers via /mcp — GitHub, Figma (OAuth), and more; import from Claude Code |
| 🌐 | Web search & fetch built-in · 🖥️ browser automation via Playwright (--browser) |
| 🐳 | Docker sandbox — sandboxed execution with named workspaces |
| 💭 | Extended thinking — minimal / low / medium / high / xhigh |
| 📋 | Clipboard image paste (Ctrl+V / /paste) — multimodal prompts |
| 💰 | Real-time cost and token tracking per session |
| 🛡️ | Tool approval dialogs — approve, auto-approve, or deny per tool call |
| @ | @filename file references · !command shell passthrough |
| ✨ | /fork, /merge, /improve, /skills, /mcp, /model, /theme, /compact, and more |
# Interactive TUI (default)
pydantic-deep
pydantic-deep tui --model openrouter:anthropic/claude-opus-4-6
# Headless deep agent — benchmarks, CI/CD, scripted automation
pydantic-deep run "Fix the failing test in test_auth.py"
pydantic-deep run --task-file task.md --json
# Docker sandbox — sandboxed execution, project dir mounted at /workspace
pydantic-deep tui --sandbox docker
pydantic-deep tui --workspace ml-env # named workspace, packages persist
# Browser automation (requires pydantic-deep[browser])
pydantic-deep tui --browser
# Config & skills
pydantic-deep config set model anthropic:claude-sonnet-4-6
pydantic-deep skills list
pydantic-deep update # update to latest version
See CLI docs for the full reference.
pip install pydantic-deep
One function call gives you a production deep agent with planning, tool-calling, multi-agent delegation, persistent memory, unlimited context, forking, and cost tracking. Everything is a toggle:
from pydantic_ai_backends import StateBackend
from pydantic_deep import create_deep_agent, create_default_deps
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
forking=True, # ⑂ split a run into parallel branches + AI judge
include_todo=True, # Task planning with subtasks and dependencies
include_subagents=True, # Multi-agent swarm — delegate to subagents
include_skills=True, # Domain-specific skills from SKILL.md files
include_memory=True, # Persistent memory across sessions
include_plan=True, # Structured planning before execution
include_teams=True, # Agent teams with shared TODO lists + message bus
include_liteparse=True, # Document parsing — PDF, DOCX, XLSX + OCR
web_search=True, # Tool-calling: web search
thinking="high", # Extended thinking / reasoning effort
context_manager=True, # Unlimited context via auto-summarization
cost_tracking=True, # Token/USD budget enforcement
fallback_model="openai:gpt-5.4", # auto-retry if the primary model fails
include_checkpoints=True, # Save, rewind, and fork conversations
)
deps = create_default_deps(StateBackend())
result = await agent.run("Build a REST API for user auth", deps=deps)
Type-safe responses with Pydantic models — no JSON parsing, no dict["key"]:
from pydantic import BaseModel
class CodeReview(BaseModel):
summary: str
issues: list[str]
score: int
agent = create_deep_agent(output_type=CodeReview)
result = await agent.run("Review the auth module", deps=deps)
print(result.output.score) # fully typed
Spawn isolated subagents for parallel workstreams. Each subagent is a full deep agent with its own tool-calling, memory, and context:
agent = create_deep_agent(
subagents=[
{
"name": "researcher",
"description": "Researches topics using web search",
"instructions": "Search the web, synthesize findings, cite sources.",
},
{
"name": "code-reviewer",
"description": "Reviews code for quality, security, and performance",
"instructions": "Check for security issues, N+1 queries, missing tests...",
},
],
)
# Main agent delegates: task(description="Review auth.py", subagent_type="code-reviewer")
from pydantic_deep import create_deep_agent, default_security_hook, Hook, HookEvent
agent = create_deep_agent(
hooks=[
*default_security_hook(), # blocks destructive shell, path traversal, secret leaks
Hook(
event=HookEvent.PRE_TOOL_USE,
command="echo 'Tool: $TOOL_NAME args: $TOOL_INPUT' >> /tmp/audit.log",
),
],
)
Connect GitHub, Figma (OAuth), Context7, DeepWiki, or any custom server — auth handled for you:
from pydantic_deep import create_deep_agent, build_mcp_server, MCPServerConfig
deepwiki = build_mcp_server(
MCPServerConfig(name="deepwiki", transport="http", url="https://mcp.deepwiki.com/mcp")
)
agent = create_deep_agent(mcp_servers=[deepwiki]) # curated defaults via builtin_mcp_servers()
Pydantic Deep Agents auto-discovers and injects project-specific context into every conversation:
| File | Purpose | Who Sees It |
|---|---|---|
AGENTS.md | Project conventions, architecture, instructions | Main agent + all subagents |
CLAUDE.md | Claude Code project instructions | Main agent + all subagents |
SOUL.md | Agent personality, style, communication preferences | Main agent only |
.cursorrules | Cursor editor conventions | Main agent only |
MEMORY.md | Persistent memory — read/write/update tools | Per-agent (isolated) |
Compatible with Claude Code, Cursor, GitHub Copilot, and other agent frameworks. AGENTS.md follows the agents.md spec.
See the full API reference for all options.
A full-featured research deep agent with web UI — built entirely on Pydantic Deep Agents.
Web search (Tavily, Brave, Jina), sandboxed code execution, Excalidraw diagrams, plan mode, report export.
cd apps/deepresearch && uv sync && cp .env.example .env
uv run deepresearch # → http://localhost:8080
See apps/deepresearch/README.md for full setup.
Pydantic Deep Agents uses pydantic-ai's native Capabilities API for all cross-cutting concerns — forking, hooks, memory, skills, context files, teams, and plan mode are all first-class pydantic-ai capabilities.
Pydantic Deep Agents
+---------------------------------------------------------------------+
| |
| +----------+ +----------+ +----------+ +----------+ +---------+ |
| | Planning | |Filesystem| | Subagents| | Skills | | Teams | |
| +----+-----+ +----+-----+ +----+-----+ +----+-----+ +----+----+ |
| | | | | | |
| +------------+-----+------+------------+------------+ |
| | |
| v |
| Forking --> +------------------+ <-- Capabilities |
| Summarization --> | Deep Agent | <-- Hooks |
| Checkpointing --> | (pydantic-ai) | <-- Memory |
| Cost Tracking --> | | <-- MCP |
| +--------+---------+ |
| | |
| +-----------------+-----------------+ |
| v v v |
| +------------+ +------------+ +------------+ |
| | State | | Local | | Docker | |
| | Backend | | Backend | | Sandbox | |
| +------------+ +------------+ +------------+ |
| |
+---------------------------------------------------------------------+
Every component is a standalone package — use only what you need:
| Package | What It Does |
|---|---|
| pydantic-ai-backend | File storage, Docker sandbox, console toolset |
| pydantic-ai-todo | Task planning with subtasks and dependencies |
| subagents-pydantic-ai | Sync/async delegation, background tasks, cancellation |
| summarization-pydantic-ai | LLM summaries or zero-cost sliding window |
| pydantic-ai-shields | Cost tracking, input/output/tool blocking |
agent.run() into N parallel branches sharing history up to the fork pointBranchOverlay filesystem isolation — reads fall through to parent, writes stay localbudget_usd caps, aggregate budget enforcementmanual, auto, auto_with_fallback (default), voteJudgeAgent with structured JudgeVerdict; compute_confidence blends quality, test pass ratio, and consistencyfork_run, inspect_branches, merge_or_select, terminate_branch, diff_branches, fork_cost/fork, /merge, /fork-config, live per-branch streaming panels, judge screen, merge acceptance gatels, read_file, write_file, edit_file, glob, grep, execute — full filesystem accessnavigate, click, type_text, screenshot, execute_js, and more/improve analyzes past sessions, proposes updates to context filesafter_tool_execute before they enter historydefault_security_hook() blocks destructive commands, path traversal, secret leaksfallback_model= chains; fires on API errors, never on auth errorsoutput_typepydantic-deep run) for CI/CD, benchmarks, scripted automation/fork, /merge, /mcp, /improve, /compact, /diff, /model, /skills, /theme, and more@filename file references, !command shell passthrough, clipboard image pastegit clone https://github.com/vstorm-co/pydantic-deep.git
cd pydantic-deep
make install
make test # 100% coverage required
make all # lint + typecheck + test
See CONTRIBUTING.md. Good first issues are labeled here.
pydantic-deep is part of a broader open-source ecosystem for production AI agents:
| Project | Description | Stars |
|---|---|---|
| full-stack-ai-agent-template | Zero to production AI app in 30 minutes. FastAPI + Next.js 15, 6 AI frameworks (incl. pydantic-deep), RAG pipeline, 75+ config options. | |
| pydantic-ai-shields | Drop-in guardrails for Pydantic AI agents. 5 infra + 5 content shields. | |
| pydantic-ai-subagents | Declarative multi-agent orchestration with token tracking. | |
| pydantic-ai-summarization | Smart context compression for long-running agents. | |
| pydantic-ai-backend | Sandboxed execution for AI agents. Docker + Daytona. | |
| content-skills | Claude Code content studio — blog, social, slides, video, infographics — all brand-aware. | |
| production-stack-skills | Claude Code skills for production-grade FastAPI, PostgreSQL, Docker, and observability. |
Want the full stack? Use full-stack-ai-agent-template — it ships pydantic-deep integrated with FastAPI, Next.js, auth, WebSocket streaming, and RAG out of the box.
Browse all projects at oss.vstorm.co
If pydantic-deep saved you from wiring an agent harness by hand — give it a ⭐. It's the single biggest thing that helps the project grow.
MIT — see LICENSE
Native macOS app to monitor Claude AI usage limits and watch your coding sessions live
npx CLI installing 100+ agents, commands, hooks, and integrations in one command
干净、强大、属于你的 AI Agent 平台 --AI agents, without the clutter.
Pocket Flow: Codebase to Tutorial