A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
141 packages found
A collection of ICLR papers and open-source projects from past years, covering ICLR2021, ICLR2022, ICLR2023, ICLR2024, and ICLR2025.
Open survey and evidence map for AI agent evolution, self-evolving agents, memory, skills, harnesses, benchmarks, and ag
It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Represe
A repo listing papers related to LLM-based agents
Official companion repository for our survey "A Survey of the OpenClaw Ecosystem: From Platform Extensibility to Constra
A curated list of tools, papers, and datasets for applying AI to cybersecurity tasks. This list primarily focuses on mod
🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous wor
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
Save 30% token costs when using Claude Code, Codex, OpenCode for free - with open source, local semantic search. Works f
Lightweight, auditable Python code agent (~1500 LOC) — ReAct + Planner + Reflexion + Hybrid RAG, with SWE-bench Lite e
A curated list of Generative AI tools, works, models, and references
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
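NoneLinear (非线智能) - ReLE evaluation: a continuously updated capability benchmark for Chinese AI large language models, currently covering 374 models, including chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE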
~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+
[Up-to-date] A curated list of resources on graph-empowered agents and agent-facilitated graph learning (Graphs Meet Age
[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2
A Systematic Survey of Deep Research
HealthFlow: Automating electronic health record analysis via a strategically self-evolving multi-agent framework
A LangGraph-powered multi-agent deep research system featuring task planning, human-in-the-loop review, multi-source ret
[ICML2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress
MASSW is a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. MASSW includes more than 15
Awesome papers involving LLMs in Social Science.
All-in-one Web Agent framework for post-training. Start building with a few clicks!
RLAnything (ICML 2026) & AutoTool (ICML 2026), DemyAgent: Open-Source RL for LLMs and Agentic Scenarios
Claude Code usage governor: compact professional output, context slimming, tool-output filtering, telemetry, and drift g
Awesome LLM papers and repos covering a comprehensive range of topics.
Make any repo AI-first - write sustainable code from the start, or refactor a legacy codebase to prepare it for agent-dr
xLAM: A Family of Large Action Models to Empower AI Agent Systems
[NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos
A Multi-Agent Memory MCP That Connects Agents Across Systems and Machines
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation Discovery and Symbolic Regressi
Yunjue Agent: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks
Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption
PFI: Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents
AI writes code. This automates everything else · 24 plugins · 49 agents · 44 skills · for Claude Code, OpenCode, Codex,
Hivemind turns your traces into reusable skills across agents
The benchmark tasks and evaluation harness for "PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments".
Agent memory for LLMs: 30 runnable Jupyter notebooks covering conversation buffers, vector stores, knowledge graphs, epi
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code
Odyssey: Empowering Minecraft Agents with Open-World Skills
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"
63 bilingual AI marketing skills (31 VN + 31 Global) for Claude Code, OpenCode, Codex, VS Code. Marketing strategy, cont
✨✨Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models
iOS-opinionated Claude Code workflow automation: Swift 6 migration, Apollo->native SDK removal, stacked PRs, Jira integr
MobileUse: an open-source mobile GUI agent for Android phone automation, AndroidWorld/AndroidLab evaluation, hierarchica