Open-source tracing & testing for GenAI apps and agents.
Monocle is built under the Linux Foundation AI & Data umbrella and designed to plug into your existing OpenTelemetry stack. With a few lines of code (or none at all), you get rich traces, CI-friendly tests, and deep visibility across LLMs, agents, tools, and vector stores.
At its core, Monocle is a GenAI-specific observability layer built on OpenTelemetry.
Because the traces are structured and consistent, they are easy for humans, dashboards, and even SRE/QA agents to consume.
With monocle-test-tools, you can assert on the traces themselves (agents invoked, tools used, token costs, error states), not just input/output pairs.
If you care about debuggability, reliability, or compliance for AI agents, Monocle is meant for you.
This repository contains the Python implementation of Monocle's tracing SDK and metamodel (monocle_apptrace).
There is also a separate package, monocle-test-tools, which provides a testing and validation framework for AI agent tracing, built on top of pytest.
```bash
pip install monocle_apptrace
```

```python
from monocle_apptrace import setup_monocle_telemetry

# Call once at startup
setup_monocle_telemetry(workflow_name="simple_math_app")
```
This wires up OpenTelemetry, configures the Monocle metamodel, and auto-instruments supported frameworks without requiring you to manually create spans.
By default, Monocle exports traces as JSON files under a local ./monocle directory:
monocle_trace_{workflow_name}_{trace_id}_{timestamp}.json
Each file contains an array of OpenTelemetry spans capturing agent runs, tool calls, and LLM interactions. Load them into any OTLP-compatible backend, or use the Okahu VS Code extension for a rich Gantt-style timeline visualization.
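For a quick look at these files without any backend, the sketch below finds the newest trace file for a workflow and counts spans by name. It assumes each file is a JSON array of span objects and that each span carries a `name` key, which is typical for OpenTelemetry span exports but is not confirmed here for Monocle's file format. `latest_trace_file` and `count_span_names` are hypothetical helper names, not part of Monocle.

```python
import json
from collections import Counter
from pathlib import Path


def latest_trace_file(trace_dir, workflow_name):
    """Return the most recently modified trace file for a workflow, or None."""
    pattern = f"monocle_trace_{workflow_name}_*.json"
    files = sorted(Path(trace_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


def count_span_names(trace_path):
    """Count spans by name; assumes a JSON array of objects with a "name" key."""
    spans = json.loads(Path(trace_path).read_text(encoding="utf-8"))
    return Counter(span.get("name", "<unnamed>") for span in spans)


if __name__ == "__main__":
    path = latest_trace_file("./monocle", "simple_math_app")
    if path is None:
        print("No trace files found yet; run the instrumented app first.")
    else:
        for name, count in count_span_names(path).most_common():
            print(f"{count:4d}  {name}")
```

Note that the glob matches on the workflow name as a prefix, so a workflow called `my` would also pick up files written for `my_app`.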
Use a scope to attach attributes like user_id, session_id, or tenant_id to every span created inside a block. Great for filtering traces by tenant or correlating a multi-step flow.
```python
from monocle_apptrace.instrumentation.common.instrumentor import (
    monocle_trace_scope,
    monocle_trace_scope_method,
)

# Context manager: scope applies to every span inside the with-block
# (my_agent stands for your application's agent object)
with monocle_trace_scope("user_id", "user-123"):
    result = my_agent.run("What's the weather in London?")


# Decorator: scope applies to every call of the function
@monocle_trace_scope_method("tenant_id", "acme-corp")
def handle_request(payload):
    ...
```
Async equivalents (amonocle_trace_scope) and full reference: Scope API docs.
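To make the scoping behaviour concrete, here is a standard-library sketch of the general idea: a context variable holds the active scopes, and anything recorded inside the block copies them. This is an illustration of the concept only, not Monocle's implementation; `trace_scope` and `record_span` are hypothetical stand-ins.

```python
import contextvars
from contextlib import contextmanager

# Active scopes for the current execution context (name -> value).
# The default dict is never mutated; each scope sets a fresh copy.
_active_scopes = contextvars.ContextVar("active_scopes", default={})


@contextmanager
def trace_scope(name, value):
    """Attach a scope to everything recorded inside the with-block."""
    token = _active_scopes.set({**_active_scopes.get(), name: value})
    try:
        yield
    finally:
        _active_scopes.reset(token)  # restore the outer scopes on exit


def record_span(span_name):
    """Stand-in for span creation: copies the active scopes onto the span."""
    return {"name": span_name, "scopes": dict(_active_scopes.get())}


with trace_scope("tenant_id", "acme-corp"):
    with trace_scope("user_id", "user-123"):
        inner = record_span("llm_call")
    outer = record_span("tool_call")
after = record_span("unscoped")

print(inner["scopes"])  # {'tenant_id': 'acme-corp', 'user_id': 'user-123'}
print(outer["scopes"])  # {'tenant_id': 'acme-corp'}
print(after["scopes"])  # {}
```

Because context variables follow the current task, the same pattern carries over to asyncio code, which is one reason a scope can apply across a multi-step flow.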
monocle-test-tools is the companion test framework that lets you write pytest-style tests that assert on traces, not just return values.
| Capability | Description |
|---|---|
| Agentic response | Did the agent produce the right kind of answer for a given input? |
| Agent invocation | Did the correct agent or sub-agent run, and delegate the right tasks? |
| Tool behavior | Were the intended tools called, with the expected parameters and outputs? |
| Inference quality & cost | Did responses match your schemas or rubrics, and stay within token/cost budgets? |
| E2E evaluations in CI/CD | Run eval-style tests as part of your pipeline using the same traces that power observability. |
```bash
pip install monocle_test_tools
```

```python
from monocle_test_tools import expected


def test_weather_agent():
    result = expected(
        input="What is the weather in London?",
        expected_output="weather report for London",
    )
    result.called_agent("weather_agent")
    result.called_tool("get_weather", agent_name="weather_agent")
    result.under_token_limit(5000)
    result.under_duration(10)
```
Load a saved Monocle trace JSON and assert against it — no live agent run, no API keys.
```python
from monocle_test_tools.span_loader import JSONSpanLoader


def test_from_saved_trace(monocle_trace_asserter):
    monocle_trace_asserter.load_spans(
        JSONSpanLoader.from_json("monocle/monocle_trace_my_app_abc123.json")
    )
    monocle_trace_asserter \
        .called_agent("summarizer_agent") \
        .contains_output("revenue")
    monocle_trace_asserter.does_not_call_tool("delete_record")
```
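As a rough picture of what an offline tool assertion amounts to, the sketch below checks a saved trace with plain Python, independent of monocle-test-tools. The attribute keys (`span.type`, `entity.1.name`) and the tool span type (`agentic.tool.invocation`) are assumptions about the trace layout; inspect a real trace file and adjust them before relying on this. All helper names are hypothetical.

```python
import json
from pathlib import Path

# Assumed attribute keys; verify against a real trace file.
TYPE_KEY = "span.type"
NAME_KEY = "entity.1.name"


def load_spans(path):
    """Load a saved trace file, assumed to be a JSON array of span objects."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def names_of_type(spans, span_type):
    """Collect entity names from spans whose assumed type attribute matches."""
    names = set()
    for span in spans:
        attrs = span.get("attributes", {})
        if attrs.get(TYPE_KEY) == span_type and NAME_KEY in attrs:
            names.add(attrs[NAME_KEY])
    return names


def check_tools(spans, required=(), forbidden=(), tool_type="agentic.tool.invocation"):
    """Return a list of violations; an empty list means the trace passes."""
    called = names_of_type(spans, tool_type)
    problems = [f"missing tool call: {t}" for t in required if t not in called]
    problems += [f"forbidden tool call: {t}" for t in forbidden if t in called]
    return problems


# Hand-written sample spans; a real run would use load_spans("<trace file>").
sample_spans = [
    {"name": "tool", "attributes": {TYPE_KEY: "agentic.tool.invocation",
                                    NAME_KEY: "get_weather"}},
    {"name": "llm", "attributes": {TYPE_KEY: "inference"}},
]

print(check_tools(sample_spans, required=["get_weather"], forbidden=["delete_record"]))
print(check_tools(sample_spans, required=["send_email"]))
```

Returning a list of violations instead of raising on the first one keeps a CI failure message complete when several expectations break at once.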
Pass a session_id so multiple turns roll up into one session, then evaluate at the level of the agentic_sessions fact (role adherence, knowledge retention, conversation completeness across turns).
```python
import pytest
from monocle_test_tools import MonocleValidator, TestCase

agent_test_cases = [
    {"test_input": ["Book a flight from SFO to Mumbai on 26 Nov."]},
    {"test_input": ["Now book a hotel near the airport for 4 nights."]},
]


@MonocleValidator().monocle_testcase(agent_test_cases)
async def test_multi_turn_session(test_case: TestCase):
    # Same session_id ties both turns to the same agentic session.
    # root_agent is the application's agent object, defined elsewhere.
    await MonocleValidator().test_agent_async(
        root_agent, "strands", test_case, session_id="travel_session_1"
    )


@pytest.mark.asyncio
async def test_session_quality(monocle_trace_asserter):
    # After the two turns above, evaluate the session as a whole
    monocle_trace_asserter.with_evaluation("okahu") \
        .check_eval(fact_name="agentic_sessions", eval_name="role_adherence",
                    expected=["excellent_adherence", "good_adherence"]) \
        .check_eval(fact_name="agentic_sessions", eval_name="knowledge_retention",
                    expected=["excellent_retention", "good_retention"]) \
        .check_eval(fact_name="agentic_sessions", eval_name="correctness",
                    expected="correct")
```
See the full test assertions reference and test tools README for detailed usage.
Monocle supports both in-app initialization and wrapper-style execution so you can choose how invasive you want tracing to be.
In-app initialization: call setup_monocle_telemetry() once at startup and let Monocle auto-instrument supported frameworks.
Wrapper-style execution: use the wrapper (the monocle_apptrace module) to trace apps without modifying the code, making it suitable for Lambda layers and platform-level integration.
This flexibility is especially useful when platform teams want to inject tracing without touching product code, or when you ship multi-tenant AI platforms.
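For the in-app path, one way to keep tracing under platform control is to gate the setup call behind an environment variable. The sketch below is a minimal illustration: the variable name `ENABLE_MONOCLE_TRACING` and the helper `init_tracing` are hypothetical, and only `setup_monocle_telemetry(workflow_name=...)` comes from the quick start above.

```python
import os


def init_tracing(workflow_name, env_var="ENABLE_MONOCLE_TRACING"):
    """Enable Monocle only when the (hypothetical) env var is truthy.

    Returns True if tracing was set up, False if it was skipped.
    """
    if os.environ.get(env_var, "").strip().lower() not in {"1", "true", "yes"}:
        return False
    # Imported lazily so the dependency is only needed when tracing is on.
    from monocle_apptrace import setup_monocle_telemetry

    setup_monocle_telemetry(workflow_name=workflow_name)
    return True
```

Calling `init_tracing("simple_math_app")` once at startup leaves the rest of the application code unchanged whether or not tracing is switched on.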
Monocle can trace the major AI coding assistants. Install once, then register hooks for the CLIs you use:
```bash
# Install the Monocle package
uv tool install monocle_apptrace

# Register hooks for whichever assistants you use
monocle-apptrace claude-setup    # Claude Code
monocle-apptrace codex-setup     # OpenAI Codex CLI
monocle-apptrace copilot-setup   # GitHub Copilot (CLI + VS Code Chat)
```
Each *-setup is interactive and asks two things:
Where to install, with two options:
Global (~/.claude/, ~/.codex/, ~/.copilot/); applies to every session on the machine.
Project (.claude/, .codex/, .github/hooks/); only sessions started inside that project are traced. Useful for trying Monocle on one repo without affecting others.
You can also skip the prompt with --global or --project:

```bash
monocle-apptrace claude-setup --project
```

How to authenticate: local storage or Okahu cloud.
Start a new session — traces flow automatically to whatever exporter you've configured (file, console, or cloud), giving you full visibility into how these assistants interact with your codebase.
| Category | Supported |
|---|---|
| Language | 🟢 Python · 🟢 TypeScript |
| Agentic frameworks | 🟢 LangGraph · 🟢 LlamaIndex · 🟢 Google ADK · 🟢 OpenAI Agent SDK · 🟢 AWS Strands · 🟢 CrewAI · 🟢 Microsoft Agent Framework |
| MCP / A2A | 🟢 FastMCP · 🟢 MCP client · 🟢 A2A client |
| Web / App | 🟢 Flask · 🟢 aiohttp · 🟢 FastAPI · 🟢 Azure Function · 🟢 AWS Lambda · 🟢 Vercel (TS) · 🟢 Microsoft Teams AI SDK · 🟢 Web/REST client · 🔜 Google Function |
| LLM frameworks | 🟢 LangChain · 🟢 LlamaIndex · 🟢 Haystack |
| Agent Runtime | 🟢 AWS Bedrock AgentCore |
| LLM inference | 🟢 OpenAI · 🟢 Azure OpenAI · 🟢 Azure AI · 🟢 Nvidia Triton · 🟢 AWS Bedrock · 🟢 AWS SageMaker · 🟢 Google Vertex · 🟢 Google Gemini · 🟢 Hugging Face · 🟢 DeepSeek · 🟢 Anthropic · 🟢 Mistral · 🟢 LiteLLM · 🔜 Azure ML |
| AI coding assistants | 🟢 Claude CLI · 🟢 OpenAI Codex CLI · 🟢 GitHub Copilot (CLI + VS Code Chat) |
| Vector stores | 🟢 FAISS · 🔜 OpenSearch · 🔜 Milvus |
| Exporters | 🟢 stdout · 🟢 file · 🟢 Memory · 🟢 Azure Blob Storage · 🟢 AWS S3 · 🟢 Okahu cloud · 🟢 OTEL collectors · 🟢 Google Cloud Storage |
Monocle is designed to play nicely with the tools you already use.
The Okahu Trace Visualizer extension reads Monocle JSON trace files and displays them in an interactive UI with timelines, JSON viewers, token counts, and error badges.
👉 Download from VS Code Marketplace
A dedicated Google ADK integration automatically instruments ADK agents, tools, and runners after you call setup_monocle_telemetry, emitting spans for agent runs, tool calls, and LLM interactions. See the Google ADK docs for setup instructions.
Okahu uses Monocle traces as a primary signal source for debugging, evaluation, and SRE-style monitoring of agentic applications.
These integrations make it easy to go from local debug → CI/CD testing → production observability without switching tracing models.
| Resource | Description |
|---|---|
| User Guide | Installation, configuration, and how traces are structured |
| Trace API | monocle_trace / amonocle_trace and low-level start_trace / stop_trace |
| Scope API | monocle_trace_scope helpers to attach scopes across spans |
| Test Assertions | Complete reference for all fluent API assertions in monocle-test-tools |
| Test Tools | Getting started with monocle-test-tools, conftest.py setup and examples |
| Evaluation API | LLM-based evaluation integration for test assertions |
| Contributing | Technical details for contributing to the project |
| Examples | Sample apps demonstrating Monocle with various frameworks |
Monocle's long-term goal is to support tracing and testing for GenAI apps built in any language, with any orchestration or agent framework, on any LLM or vector backend.
monocle-test-tools for complex, policy-driven AI systems.
You can track progress and proposals via the LF AI & Data Monocle project page and GitHub discussions.
Monocle is a community-based open source project under the Apache 2.0 license.
Please see CONTRIBUTING, CODE_OF_CONDUCT, and SECURITY for detailed guidelines.