A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
10 packages found
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking multi-modal AI agents.
Benchmark for evaluating LLM agents on smart-contract vulnerability discovery and exploitation
Jagged Frontier: LLM vulnerability detection benchmark harnesses (API + Claude Code agentic)
Rust implementation of protobuf with editions support, JSON serialization, and zero-copy views
The Power BI Modeling MCP Server brings Power BI semantic modeling capabilities to your AI agents.
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Int
Official Buildkite skills for Claude Code, Cursor, and other AI coding agents