A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
Are you the author? Sign in to claim
A minimal AI agent built from scratch — no agent framework, just Python, the OpenAI SDK, and a `while` loop. Accompanies
A minimal AI agent built without any agent framework — just Python, the OpenAI SDK, and a while loop.
This project accompanies the Medium article series by Sergey Neskoromny:
Follow me on LinkedIn and Medium for more on AI tools, mobile development, and whatever I'm currently building!
The agent takes a task, calls tools when needed, observes the results, and loops until it has a final answer. Three modes:
| Mode | What happens |
|---|---|
local | All LLM calls go to a local Ollama model — fully offline |
remote | All LLM calls go to a cloud provider (OpenAI, Anthropic, or Gemini) |
mixed | Local Ollama orchestrates the loop; it delegates complex subtasks to the remote model via ask_remote() |
git clone https://github.com/sergenes/mini-agent
cd mini-agent
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# edit .env and add your API keys
For local/mixed mode, install Ollama and pull a model that supports function calling:
ollama pull llama3.1 # good default
ollama pull qwen2.5 # strong at tool use
ollama pull mistral-nemo # lighter alternative
ollama serve
python agent.py --mode local "What is 15% of 847?"
python agent.py --mode local --local-model qwen2.5 "What's the weather in Tokyo and today's date?"
# OpenAI
python agent.py --mode remote --provider openai "Explain how ReAct agents work"
python agent.py --mode remote --provider openai --model gpt-4o "Write a detailed explanation of MCP"
# Anthropic
python agent.py --mode remote --provider anthropic "Write a haiku about Python"
python agent.py --mode remote --provider anthropic --model claude-sonnet-4-6 "..."
# Google Gemini
python agent.py --mode remote --provider gemini "Summarize the ReAct paper"
python agent.py --mode remote --provider gemini --model gemini-2.0-flash-lite "..."
python agent.py --mode mixed "What's today's date, and explain quantum entanglement in simple terms"
python agent.py --mode mixed --local-model qwen2.5 --provider anthropic "..."
In mixed mode the local model decides what to handle itself and when to call ask_remote().
Simple tool calls (calculate, get_date) stay local. Knowledge-heavy tasks go to the cloud.
python agent.py --mode local --interactive
python agent.py --mode mixed --interactive
Try providers in order; skip models that don't support structured tool calling:
python agent.py --mode local --fallback llama3.1 mistral-nemo "What is 15% of 847?"
If qwen2.5 (the default) fails or outputs text-based tool invocations, the agent retries with llama3.1, then mistral-nemo. Each attempt uses the full reliability stack from reliability.py.
This demonstrates the reliability layer surviving a real network interruption:
python agent.py --mode remote --provider openai \
"Get today's date and the weather in Tokyo."
Thinking…, turn off Wi-Fi or disconnect ethernet.WARNING LLM call attempt 1 failed (Connection error) — retrying in 2.1s
WARNING LLM call attempt 2 failed (Connection error) — retrying in 4.3s
The retry is in _RetryingProvider inside reliability.py — it wraps the provider's complete() call so the agent loop never sees the transient failure. Up to 5 attempts, base delay 2 s, doubles each retry.
python agent.py --mode remote --quiet "What is 144 * 37?"
python agent.py --mode remote --provider openai "Get today's date and the weather in Tokyo. Calculate how many days are left until New Year's Day 2027. Write a short daily briefing with all three facts to briefing.txt, then read it back and count how many words it has."
This triggers six tool calls in sequence: get_current_date → get_weather → calculate → write_file → read_file → count_words.
| Provider | Default model |
|---|---|
| openai | gpt-4o-mini |
| anthropic | claude-haiku-4-5-20251001 |
| gemini | gemini-2.0-flash |
| ollama | qwen2.5 |
mini_agent/
├── agent.py # CLI entry point — argument parsing, REPL, mode dispatch
├── core.py # The agent loop: run_agent() and run_agent_mixed()
├── reliability.py # Reliability layer: retry, circuit breaker, validation, tracing, provider fallback
├── providers.py # LLM provider abstraction (OpenAI, Anthropic, Gemini, Ollama)
├── tools.py # Tool implementations and schemas
├── ui.py # Spinner — thread-safe braille activity indicator
├── mcp_server.py # Demo MCP server (to_uppercase, count_words)
├── mcp_client.py # MCP client helper — spawns the server, calls tools via JSON-RPC
├── requirements.txt
└── .env.example
tools.pyTOOL_FUNCTIONSTOOL_SCHEMASThe web_search and get_weather tools are stubs. Replace them with real API calls (Brave Search, Tavily, OpenWeatherMap, etc.) to make the agent genuinely useful.
┌─────────────────────────────────────────┐
│ messages = [system, user_task] │
│ │
│ while True: │
│ response = llm.complete(messages) │
│ │
│ if no tool_calls: │
│ return response.content ← done │
│ │
│ for each tool_call: │
│ result = call_tool(name, args) │
│ messages.append(tool_result) │
└─────────────────────────────────────────┘
In mixed mode, ask_remote() is an extra tool the local model can call. Calling it triggers a fresh run_agent() with the remote provider.
mcp_server.py is a standalone Model Context Protocol server that exposes two tools — to_uppercase and count_words. The agent calls them transparently via mcp_client.py; from the agent's perspective they are no different from any other tool.
Verify the server starts:
python mcp_server.py
It will block waiting for JSON-RPC messages on stdin — that's expected. Press Ctrl+C to exit. Normally the agent spawns it automatically as a subprocess.
How the communication works:
agent (tools.py)
└── mcp_client.py # asyncio JSON-RPC client
└── subprocess: mcp_server.py # FastMCP server on stdio
Each tool call spawns a fresh subprocess, performs the initialize → call_tool handshake, and exits. To add your own MCP tools, define them in mcp_server.py with @mcp.tool() and register wrapper functions in tools.py following the same pattern as mcp_to_uppercase and mcp_count_words.
calculate() uses Python's eval() with empty builtins — safe enough for a demo, not for production. Replace with a proper math library (sympy, asteval) for real use.llama3.1, llama3.2, qwen2.5, mistral-nemo. Models like phi3 or deepseek-r1 may not support it reliably.google-generativeai SDK needed.anthropic SDK is only needed if you use --provider anthropic.Added reliability.py on top of the unchanged core loop. Every item is independently useful; none require changes to core.py.
LLM-level resilience
_RetryingProvider — wraps any provider's complete() with exponential backoff + jitter (up to 5 attempts). The agent loop never sees a transient network failure.Tool-level resilience
with_retry() — retries a single tool call on exception (configurable attempts and base delay)CircuitBreaker — stops calling a broken tool after N consecutive failures; auto-resets after a timeoutvalidated_call() — validates tool arguments against the JSON schema with pydantic before execution; returns an error string the model can self-correct ontraced_call() — emits a structured log line (tool name, args, result, duration) for every call regardless of outcomerun_agent_reliable() — run_agent() with all four tool layers stacked; drop-in replacementProvider fallback
run_agent_with_fallback() — tries a list of providers in order, falling back on exception or silent failure (model outputs tool invocations as plain text instead of structured tool_calls)_looks_like_failed_tool_call() — heuristic that turns the silent failure into a detectable error--fallback MODEL [MODEL …] CLI flag — e.g. --fallback llama3.1 mistral-nemoagent.py — run_agent_reliable() is now the default for local and remote modes.
Core agent loop: agent.py, core.py, providers.py, tools.py, ui.py, MCP client/server.
MIT
MCP server integration for DaVinci Resolve Studio
Run Claude Code as an MCP server so any agent can delegate coding tasks to it
Browser automation using accessibility snapshots instead of screenshots
A Jetbrains IDE IntelliJ plugin aimed to provide coding agents the ability to leverage intelliJ's indexing of the codeba