A community-driven registry for Claude, Cursor, Windsurf, Cline & more. Not affiliated with Anthropic.
44 packages found
Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"
A repo listing papers related to LLM-based agents
[ICML2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models
Awesome papers involving LLMs in Social Science.
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural langu
🧙🏻 Code and benchmark for our Findings of ACL 2024 paper - "TimeChara: Evaluating Point-in-Time Character Hallucinatio
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
A curated list of Generative AI tools, works, models, and references
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
xLAM: A Family of Large Action Models to Empower AI Agent Systems
[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2
[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Po
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation Discovery and Symbolic Regressi
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Asynchronous LLM Agent playing games of Mafia against human players
👾 Open source implementation of the ChatGPT Code Interpreter
Towards Large Multimodal Models as Visual Foundation Agents
Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan
Odyssey: Empowering Minecraft Agents with Open-World Skills
[CVPR2024 Highlight] Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration
A Systematic Survey of Deep Research
ML-Dev-Bench is a benchmark for evaluating AI agents against various ML development tasks.
Awesome LLM papers and repos covering a comprehensive range of topics.
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
Search and explore Hugging Face models, datasets, Spaces, and documentation
AI Observability & Evaluation
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
🔴 VERY LARGE AI TOOL LIST! 🔴 Curated list of AI Tools - Updated 2026
[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
[Up-to-date] A curated list of resources on graph-empowered agents and agent-facilitated graph learning (Graphs Meet Age
LLM Agent that leverages cheminformatics tools to provide informed responses.
Yunjue Agent: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks
scAgent: No-code single-cell analysis for every biologist
[ICLR2026] The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym
[ACL 2024 Findings] MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning https://arxiv.org/
AIlice is a fully autonomous, general-purpose AI agent.
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
MR. Video: MapReduce is the Principle for Long Video Understanding
RepairAgent is an autonomous LLM-based agent for software repair.
A simple yet versatile context engineered for scalable online data collection
🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models
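非线智能 NoneLinear - ReLE Evaluation: a capability evaluation of Chinese AI large models (continuously updated): currently includes 374 large models, covering chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE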
Your own GPT-powered Personal Assistant that you can ORDER or INSTRUCT to do some task or search for something using