One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides,
A practical operating manual for building, evaluating, securing, and shipping modern LLM agent systems.
Modern agents are not "a prompt + a tool." They are systems — with identity, memory, skills, tools, MCP integrations, guardrails, observability, evals, and a provider strategy. This handbook teaches the whole stack and ships templates, blueprints, runnable adapters, and curated examples you can adopt today.
A curated, opinionated, production-oriented handbook in seven parts:
DESIGN.md machine-readable spec

| You are… | Start at |
|---|---|
| New to agents | docs/beginners_guide.md → agent_os/README.md |
| Building a production agent | blueprints/ → checklists/production_readiness_checklist.md |
| Picking / wiring providers | providers/README.md → providers/provider_matrix.md |
| Comparing frameworks | docs/framework_comparison.md |
| Adding memory / RAG | memory/ → tutorials/rag_tutorials |
| Adding MCP | mcp/ → mcp/mcp_security.md |
| Designing Skills | skills/ → skills/skill_design_guide.md |
| Working with coding agents | coding_agents/ → coding_agents/prompts/ |
| Writing better prompts | prompt_engineering/ |
| Designing & rolling out | design_docs/ |
| Hardening safety/evals | safety/ → evals/ |
| Coding agent reading this repo | llms.txt → llm_wiki/index.md |
| Layer | Purpose | Where in this repo |
|---|---|---|
| Model / Provider | LLM choice + abstraction + routing | providers/ |
| Orchestration | Agent loops, planning, handoffs | docs/framework_comparison.md, blueprints/ |
| Tool | Function calling and external actions | agent_os/mcp_layer.md |
| MCP | Standardized external context and tools | mcp/ |
| Memory | Durable user/project/semantic memory | memory/ |
| Skills | Reusable, progressive-loading workflows | skills/ |
| Identity | Personality, mission, refusal style | agent_os/agent_identity.md, templates/ |
| Prompt | System prompt design, instruction hierarchy, defenses | prompt_engineering/ |
| Safety | Guardrails, approvals, policy | safety/ |
| Observability | Tracing, spans, cost, latency, evals | observability/, evals/ |
| Deployment | Shipping agents to production | design_docs/rollout_plan.md |
| Coding-agent harness | Claude Code, Cursor, Codex, Aider, Cline | coding_agents/ |
📖 Deep dive: agent_os/README.md
The handbook ships an LLMProvider abstraction with 24+ providers across six families. Most providers go through a single OpenAI-compatible code path; specialty / local providers are first-class.
| Provider type | Examples | Best for |
|---|---|---|
| Frontier APIs | OpenAI, Anthropic, Google Gemini | Reasoning, tool use, production agents |
| Fast inference | Groq, Cerebras, SambaNova | Low-latency workloads |
| Marketplaces | OpenRouter, Together, Fireworks, DeepInfra | Model choice and routing |
| Enterprise clouds | Azure OpenAI, AWS Bedrock, Vertex AI | Compliance, governance |
| Specialty | xAI, Perplexity, Mistral, Cohere, DeepSeek, Hugging Face, Replicate, NVIDIA NIM, MiniMax | Domain-specific |
| Local runtimes | Ollama, LM Studio, vLLM, llama.cpp | Privacy, cost control, offline dev |
Quick start:
from utilities import get_provider
from utilities.provider_router import ProviderRouter

# Use any single provider
messages = [{"role": "user", "content": "Summarize MCP."}]
out = get_provider("groq").chat(
    messages,
    model="llama-3.1-8b-instant",
)

# Or route by task class with fallback
router = ProviderRouter()
out = router.chat(messages, task_class="cheap")  # Groq → DeepSeek → Together → OpenRouter
📖 providers/README.md • providers/provider_matrix.md • providers/router_patterns.md • providers/local_models.md
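The fallback idea behind task-class routing can be sketched without any network calls. This is a minimal illustration, not the handbook's `ProviderRouter`: `FallbackRouter`, `ProviderError`, and the two stub providers are hypothetical names, and the chain is a shortened version of the one in the quick-start comment.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
ChatFn = Callable[[List[Message]], str]


class ProviderError(RuntimeError):
    """Raised by a (hypothetical) provider when a call fails."""


class FallbackRouter:
    """Try each provider in a task class's chain until one succeeds."""

    def __init__(self, providers: Dict[str, ChatFn], chains: Dict[str, List[str]]):
        self.providers = providers
        self.chains = chains

    def chat(self, messages: List[Message], task_class: str) -> str:
        errors = []
        for name in self.chains[task_class]:
            try:
                return self.providers[name](messages)
            except ProviderError as exc:
                # Remember why this provider failed, then try the next one.
                errors.append(f"{name}: {exc}")
        raise ProviderError("all providers failed: " + "; ".join(errors))


def failing_provider(messages: List[Message]) -> str:
    # Stub that simulates an outage or rate limit.
    raise ProviderError("rate limited")


def echo_provider(messages: List[Message]) -> str:
    # Stub that stands in for a working provider.
    return "echo: " + messages[-1]["content"]


router = FallbackRouter(
    providers={"groq": failing_provider, "deepseek": echo_provider},
    chains={"cheap": ["groq", "deepseek"]},
)
reply = router.chat([{"role": "user", "content": "Summarize MCP."}], task_class="cheap")
print(reply)
```

Here the first provider fails and the second answers, so the caller never sees the error. A real router would likely also distinguish retryable failures (timeouts, rate limits) from non-retryable ones (bad requests) before falling through.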
.
├── README.md • llms.txt • llms-full.txt
├── agent_os/ ← the Agent OS concept, layers, workspace examples
├── providers/ ← 24+ provider docs + adapters + router patterns
├── templates/ ← AGENTS.md / SOUL.md / MEMORY.md / SKILL.md / DESIGN_DOC / ADR / …
├── skills/ ← design guide + taxonomy + maturity model + curated catalog + 4 examples
├── memory/ ← memory taxonomy, distillation, security, examples
├── mcp/ ← MCP basics, architecture, security, server catalog, examples
├── prompt_engineering/ ← agent prompt patterns, instruction hierarchy, defenses
├── coding_agents/ ← Claude Code, Cursor, Codex, workflows, prompts, review
├── design_docs/ ← agent + technical design docs, ADR guide, design.md spec
├── safety/ ← guardrails, approvals, prompt injection, secure checklist
├── observability/ ← tracing, spans, cost/latency, dashboards
├── evals/ ← eval design, regression / tool / memory / MCP / safety / prompt
├── blueprints/ ← production architectures by use case
├── examples/ ← end-to-end runnable agent workspaces
├── checklists/ ← agent design, prod readiness, MCP security, …
├── llm_wiki/ ← LLM-friendly index, glossary, matrices, wiki pattern
├── docs/ ← framework comparison, best practices, beginners' guide
├── tutorials/ ← RAG, memory, fine-tuning, chat-with-X
├── utilities/ ← LLMProvider + router + provider_config
├── agents/ ← 100+ curated agent skeletons (preserved)
├── complete_apps/, web_apps/, notebooks/, datasets/, design/, resources/, scripts/, tests/, ecosystem/
└── .github/ ← issue / PR templates
A curated, in-repo catalog plus a clear taxonomy and maturity model:
Curated skills shipped: research-summarizer, repo-auditor, mcp-security-reviewer, agent-memory-curator, api-design-reviewer, pr-summarizer, adr-writer, incident-postmortem, sprint-planner, dataset-profiler.
A dedicated section, agent-focused:
Templates: SYSTEM_PROMPT, AGENT_PROMPT. Checklist: agent_prompt_checklist.
The handbook is itself a great surface for coding agents. Drop your favorite tool (Claude Code, Cursor, Codex, Aider, Cline) into the repo:
The guidance is tool-neutral: same AGENTS.md, same workflows, regardless of harness.
Agent + technical design docs, ADRs, reviews, rollouts, and the DESIGN.md machine-readable spec for design tokens:
Templates: DESIGN_DOC, ADR.
| Framework | Best for | Lang | MCP | Tracing |
|---|---|---|---|---|
| OpenAI Agents SDK | Production agents | Py / JS | ✅ | ✅ built-in |
| LangGraph | Stateful, branching graphs | Py / JS | ✅ | ✅ LangSmith |
| CrewAI | Role-based teams | Py | ✅ | ⚠️ via partners |
| AutoGen (AG2) | Event-driven multi-agent + HITL | Py | ⚠️ partial | ✅ |
| LlamaIndex Workflows | Data-heavy / RAG-first | Py / TS | ✅ | ✅ |
| Pydantic AI | Type-safe, FastAPI-native | Py | ✅ | ✅ Logfire |
| Smolagents | Code-execution mini-agents | Py | ⚠️ | basic |
| Semantic Kernel | .NET / enterprise / Azure | C# / Py / Java | ✅ | ✅ |
| DSPy | Programmatic prompt optimization | Py | — | ✅ |
| Strands Agents | Provider-agnostic, OpenTelemetry | Py | ✅ | ✅ OTEL |
| Vercel AI SDK | App-layer agents in Next.js | TS / JS | ✅ | ✅ |
| Google ADK | Gemini / Vertex hierarchical tools | Py | ✅ | ✅ |
📖 Full comparison + decision tree: docs/framework_comparison.md. The capability tags above are approximate; verify them against current upstream docs before relying on them.
Skill (SKILL.md + scripts + references). Use when a task is repeatable, multi-step, and benefits from progressive disclosure. → skills/
Memory (MEMORY.md, vector stores, decision logs). → memory/

A useful rule of thumb:
| If the thing is… | Use |
|---|---|
| A repeatable workflow with steps and references | Skill |
| An external system with tools to call | MCP server |
| State that should outlive the current run | Memory |
| A single function the model needs once | Plain tool |
📖 Decision matrix: skills/skill_vs_tool_vs_mcp.md
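The rule of thumb above can be written as a small decision helper. This is an illustrative sketch only: the `Need` dataclass and `choose_mechanism` function are hypothetical, and the precedence used when several conditions hold at once (workflow first, then external system, then durable state) is an assumption, not something the table specifies.

```python
from dataclasses import dataclass


@dataclass
class Need:
    """Hypothetical description of what a capability requires."""
    repeatable_workflow: bool = False  # steps + references, reused across tasks
    external_system: bool = False      # a remote system exposing tools to call
    outlives_run: bool = False         # state that must persist after this run


def choose_mechanism(need: Need) -> str:
    # Precedence is an assumption; the table does not rank the rows.
    if need.repeatable_workflow:
        return "Skill"
    if need.external_system:
        return "MCP server"
    if need.outlives_run:
        return "Memory"
    # Nothing above applies: a single function the model needs once.
    return "Plain tool"


print(choose_mechanism(Need(external_system=True)))
print(choose_mechanism(Need()))
```

In practice the categories combine (a Skill may call an MCP server and write to Memory), so treat the helper as a first-pass sorter rather than a strict classifier.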
Production agents need risk-tiered tool controls and human approval gates for high-impact actions.
| Risk level | Examples | Approval |
|---|---|---|
| Low | read-only search, summarization | none |
| Medium | drafting files, creating tickets | sometimes |
| High | sending email, modifying repos, running shell | required |
| Critical | deleting data, spending money, changing permissions | always + audit |
📖 safety/README.md • safety/prompt_injection.md • safety/secure_agent_checklist.md
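The risk tiers above map naturally onto a gate that sits in front of every tool call. The following is a minimal sketch under stated assumptions: the `TOOL_RISK` mapping, `run_tool`, and the `approve` callback are hypothetical, "sometimes" for medium risk is modeled as a configurable flag, and unknown tools are treated as critical by default.

```python
from enum import Enum
from typing import Callable, List


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Hypothetical tool-to-tier mapping, following the examples in the table.
TOOL_RISK = {
    "search": Risk.LOW,
    "create_ticket": Risk.MEDIUM,
    "send_email": Risk.HIGH,
    "delete_data": Risk.CRITICAL,
}

audit_log: List[str] = []


def needs_approval(risk: Risk, medium_needs_approval: bool = False) -> bool:
    if risk is Risk.LOW:
        return False
    if risk is Risk.MEDIUM:
        return medium_needs_approval  # "sometimes": decided by policy
    return True  # HIGH and CRITICAL always require a human


def run_tool(name: str, action: Callable[[], str], approve: Callable[[str], bool]) -> str:
    # Assumption: tools missing from the mapping get the strictest tier.
    risk = TOOL_RISK.get(name, Risk.CRITICAL)
    if needs_approval(risk) and not approve(name):
        outcome, result = "denied", f"{name}: blocked, approval denied"
    else:
        outcome, result = "executed", action()
    if risk is Risk.CRITICAL:
        audit_log.append(f"{name}:{outcome}")  # critical actions are always audited
    return result


print(run_tool("search", lambda: "3 results", approve=lambda tool: False))
print(run_tool("send_email", lambda: "sent", approve=lambda tool: False))
```

The low-risk call runs even though the approver says no, because it is never consulted; the high-risk call is blocked. Defaulting unknown tools to the strictest tier keeps a newly added tool from silently bypassing review.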
You cannot ship what you cannot measure. The handbook ships:
| File | Purpose |
|---|---|
| AGENTS.md | Repo-specific agent instructions |
| SOUL.md | Identity, voice, values, refusal style |
| MEMORY.md | Durable project + user memory index |
| USER.md | User profile and preferences |
| TOOLS.md | Allowed/restricted/approval-gated tools |
| SKILL.md | Skill spec with progressive loading |
| MCP_SERVER.md | Documenting an MCP integration |
| SYSTEM_PROMPT.md | Long-lived system prompt |
| AGENT_PROMPT.md | Per-task / per-session prompt |
| DESIGN_DOC.md | Agent / technical design doc |
| ADR.md | Architecture Decision Record |
| EVAL_PLAN.md | What you'll evaluate and how |
| GUARDRAILS.md | Policy, refusals, escalation |
| HUMAN_APPROVAL_POLICY.md | Who approves what |
| CODING_AGENT_TASK.md | Task contract for coding agents |
| REPO_MODERNIZATION_PROMPT.md | Multi-phase modernization |
| AGENT_RELEASE_CHECKLIST.md | Ship/no-ship gate |
This release merged seven external projects into the handbook. Each was adapted (not bulk-copied) into the structure above:
| Source theme | Lives in |
|---|---|
| Skills catalog + taxonomy patterns | skills/ — taxonomy, maturity, packaging, validation, awesome catalog |
| Personal-wiki / self-maintaining KB | llm_wiki/wiki_pattern.md, docs/llm_readable_docs.md |
| Agent prompt research patterns | prompt_engineering/ |
| Production coding-agent prompts + workflows | coding_agents/ — prompts, workflows, review |
| Machine-readable design specs | design_docs/design_md_spec.md, templates/DESIGN_DOC.md.template |
| ADRs + design reviews | design_docs/adr_guide.md, design_docs/design_review.md |
📖 Full migration plan: MIGRATION_AND_PROVIDER_EXPANSION_PLAN.md
The utilities/llm_provider.py module exposes a single LLMProvider interface (and a backwards-compatible complete() function). Switch via LLM_PROVIDER without touching agent code; route automatically with ProviderRouter.
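Switching providers through an environment variable can be sketched as a registry lookup behind a shared interface. This is not the contents of `utilities/llm_provider.py`: `LLMProviderLike`, `StubProvider`, `REGISTRY`, and `provider_from_env` are hypothetical names, and the stubs return canned strings instead of calling a model. Only the `LLM_PROVIDER` variable name comes from the text above.

```python
import os
from typing import Dict, List, Protocol

Message = Dict[str, str]


class LLMProviderLike(Protocol):
    """Structural interface every provider in this sketch satisfies."""

    def chat(self, messages: List[Message], model: str) -> str: ...


class StubProvider:
    """Stand-in provider that formats a reply instead of calling an API."""

    def __init__(self, name: str):
        self.name = name

    def chat(self, messages: List[Message], model: str) -> str:
        return f"[{self.name}/{model}] {messages[-1]['content']}"


# Hypothetical registry; a real one would hold configured API clients.
REGISTRY: Dict[str, LLMProviderLike] = {
    "groq": StubProvider("groq"),
    "ollama": StubProvider("ollama"),
}


def provider_from_env(default: str = "ollama") -> LLMProviderLike:
    # Agent code calls this once; changing LLM_PROVIDER changes the backend.
    name = os.environ.get("LLM_PROVIDER", default).strip().lower()
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"unknown LLM_PROVIDER {name!r}; expected one of {sorted(REGISTRY)}"
        ) from None


os.environ["LLM_PROVIDER"] = "groq"
answer = provider_from_env().chat(
    [{"role": "user", "content": "Summarize MCP."}], model="demo-model"
)
print(answer)
```

Because agent code depends only on the `chat` signature, swapping the backend is a configuration change rather than a code change. Failing loudly on an unknown name avoids silently falling back to a provider the operator did not intend.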
24+ providers across frontier / fast / marketplace / enterprise / specialty / local. See:
Contributions are very welcome — new examples, framework updates, fixes, and translations all help. Start with:
MIT — see LICENSE.
Curated & maintained by Sayed Allam (oxbshw). If this handbook helped you ship, please ⭐ the repo and open a PR with what you learned along the way.