AI writes code. This automates everything else · 24 plugins · 49 agents · 44 skills · for Claude Code, OpenCode, Codex,
A modular runtime and orchestration system for AI agents.
24 plugins · 49 agents · 44 skills (across all repos) · 30k lines of lib code · 3,518 tests · 5 platforms
Plugins are distributed as standalone repos under the agent-sh org; agentsys is the marketplace and installer.
Built for Claude Code · Codex CLI · OpenCode · Cursor · Kiro
AI models can write code. That's not the hard part anymore. The hard part is everything around it - task selection, branch management, code review, artifact cleanup, CI, PR comments, deployment. AgentSys is the runtime that orchestrates agents to handle all of it - structured pipelines, gated phases, specialized agents, and persistent state that survives session boundaries.
Building custom skills, agents, hooks, or MCP tools? agnix is the CLI + LSP linter that catches config errors before they fail silently - real-time IDE validation, auto suggestions, auto-fix, and 423 rules for Claude Code, Codex, OpenCode, Cursor, Kiro, Copilot, Gemini CLI, Cline, Windsurf, Roo Code, Amp, and more.
An agent orchestration system - 24 plugins, 49 agents (39 file-based + 10 role-based specialists in audit-project), and 44 skills that compose into structured pipelines for software development. Each plugin lives in its own standalone repo under the agent-sh org. agentsys is the marketplace and installer that ties them together.
Each agent has a single responsibility, a specific model assignment, and defined inputs/outputs. Pipelines enforce phase gates so agents can't skip steps. State persists across sessions so work survives interruptions.
The system runs on Claude Code, OpenCode, Codex CLI, Cursor, and Kiro. Install via the marketplace or the npm installer, and the plugins are fetched automatically from their repos.
Code does code work. AI does AI work.
Certainty levels exist because not all findings are equal:
| Level | Meaning | Action |
|---|---|---|
| HIGH | Definitely a problem | Safe to auto-fix |
| MEDIUM | Probably a problem | Needs context |
| LOW | Might be a problem | Needs human judgment |
This came from testing on 1,000+ repositories.
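The table maps each certainty level to a follow-up action. A minimal sketch of that routing, assuming a hypothetical `Finding` record and `route` helper (neither name comes from agentsys):

```python
from dataclasses import dataclass
from enum import Enum


class Certainty(Enum):
    HIGH = "HIGH"      # definitely a problem
    MEDIUM = "MEDIUM"  # probably a problem
    LOW = "LOW"        # might be a problem


# Action per level, copied from the table above.
ACTIONS = {
    Certainty.HIGH: "auto-fix",
    Certainty.MEDIUM: "needs context",
    Certainty.LOW: "needs human judgment",
}


@dataclass
class Finding:
    """Hypothetical finding record; field names are illustrative."""
    path: str
    message: str
    certainty: Certainty


def route(findings):
    """Group findings by the action their certainty level allows."""
    routed = {action: [] for action in ACTIONS.values()}
    for finding in findings:
        routed[ACTIONS[finding.certainty]].append(finding)
    return routed
```

Only the `auto-fix` bucket would be safe to apply without a human or extra context in the loop.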
Structured prompts and enriched context do more for output quality than model tier. Benchmarked March 2026 on real tasks (/can-i-help and /onboard against glide-mq), measured with claude -p --output-format json. Models: Claude Opus 4 and Claude Sonnet 4.
Same task, same repo, same prompt ("I want to improve docs"):
| Configuration | Cost | Output tokens | Result quality |
|---|---|---|---|
| Opus, no agentsys | $1.10 | 2,841 | Generic recommendations, no project-specific context |
| Opus + agentsys | $1.95 | 5,879 | Specific recommendations with effort estimates, convention awareness, breaking change detection |
| Sonnet + agentsys | $0.66 | 6,084 | Comparable to Opus + agentsys: specific, actionable, project-aware |
Sonnet + agentsys produced more output with higher specificity than raw Opus - at 40% lower cost.
Once the pipeline provides structured prompts, enriched repo-intel data, and phase-gated workflows, the model does less heavy lifting. The gap between Sonnet and Opus narrows:
| Plugin | Opus | Sonnet | Savings |
|---|---|---|---|
| /onboard | $1.10 | $0.30 | 73% |
| /can-i-help | $1.34 | $0.23 | 83% |
Both models reached the same outcome quality - Sonnet just costs less to get there. The structured pipeline captures most of the gains that would otherwise require a more expensive model.
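The savings percentages follow directly from the per-run costs in the tables above. A small worked check, where `savings_pct` is an illustrative helper and not part of agentsys:

```python
def savings_pct(baseline_cost: float, new_cost: float) -> int:
    """Percentage saved relative to the baseline, rounded to a whole percent."""
    return round((baseline_cost - new_cost) / baseline_cost * 100)


# Figures taken from the tables above (USD per run).
print(savings_pct(1.10, 0.30))  # /onboard, Opus vs Sonnet: 73
print(savings_pct(1.34, 0.23))  # /can-i-help, Opus vs Sonnet: 83
print(savings_pct(1.10, 0.66))  # raw Opus vs Sonnet + agentsys: 40
```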
| Scenario | Model cost | Quality |
|---|---|---|
| Without agentsys | Need Opus for good results | Depends on model capability |
| With agentsys | Sonnet is sufficient | Pipeline handles the structure, model handles judgment |
The investment shifts from model spend to pipeline design. Better prompts, richer context, enforced phases - these compound in ways that model upgrades alone don't.
| Command | What it does |
|---|---|
| /next-task | Task workflow: discovery, implementation, PR, merge |
| /prepare-delivery | Pre-ship quality gates: deslop, review, validation, docs sync |
| /gate-and-ship | Quality gates then ship (/prepare-delivery + /ship) |
| /banthis | Durable negative memory: persist banned agent behaviors |
| /agnix | Lint agent configurations (423 rules) |
| /ship | PR creation, CI monitoring, merge |
| /deslop | Clean AI slop patterns |
| /perf | Performance investigation with baselines and profiling |
| /drift-detect | Compare plan vs implementation |
| /audit-project | Multi-agent iterative code review |
| /enhance | Plugin, agent, and prompt analyzers |
| /repo-intel | Unified static analysis - git history, AST symbols, project metadata |
| /sync-docs | Sync documentation with code changes |
| /learn | Research topics, create learning guides |
| /consult | Cross-tool AI consultation |
| /debate | Structured debate between AI tools |
| /release | Versioned release with ecosystem detection |
| /skillers | Workflow pattern learning and automation |
| /skill-curator | Create and improve reliable SKILL.md files |
| /system-prompt-curator | Create and improve autonomous agent system prompts |
| /onboard | Codebase orientation for newcomers |
| /can-i-help | Match contributor skills to project needs |
Each command works standalone. Together, they compose into end-to-end pipelines.
44 skills included across the plugins:
| Category | Skills |
|---|---|
| Workflow | discover-tasks, prepare-delivery, check-test-coverage, orchestrate-review, validate-delivery |
| Message Queues | glide-mq-migrate-bee, glide-mq-migrate-bullmq, glide-mq |
| Enhancement | enhance-agent-prompts, enhance-claude-memory, enhance-cross-file, enhance-docs, enhance-hooks, enhance-orchestrator, enhance-plugins, enhance-prompts, enhance-skills, skill-curator, system-prompt-curator |
| Performance | baseline, benchmark, code-paths, investigation-logger, perf-analyzer, profile, theory-gatherer, theory-tester |
| Cleanup | deslop, sync-docs |
| Code Review | audit-project |
| AI Collaboration | consult, debate, learn, recommend, skillers-compact |
| Onboarding | can-i-help, onboard |
| Release | release |
| Analysis | drift-analysis, repo-intel |
| Memory | banthis |
| Linting | agnix |
External skill plugins (standalone repos, installed separately):
| Category | Skills | Plugin |
|---|---|---|
| Message Queues | glide-mq, glide-mq-migrate-bullmq, glide-mq-migrate-bee | agent-sh/glidemq |
| Languages | mojo | agent-sh/mojo |
| Languages | ada-spark | agent-sh/ada-spark |
Skills are the reusable implementation units. Agents invoke skills; commands orchestrate agents. When you install a plugin, its skills become available to all agents in that session.
| Section | What's there |
|---|---|
| The Approach | Why it's built this way |
| Benchmarks | Sonnet + agentsys vs raw Opus |
| Commands | All 24 commands overview |
| Skills | 44 skills across plugins |
| Skill-Only Plugins | glide-mq and other non-command plugins |
| Command Details | Deep dive into each command |
| How Commands Work Together | Standalone vs integrated |
| Design Philosophy | The thinking behind the architecture |
| Installation | Get started |
| Research & Testing | What went into building this |
| Documentation | Links to detailed docs |
Plugins that provide skills without a / command. Installed alongside agentsys; skills become available to all agents.
Build message queues, background jobs, and workflow orchestration with glide-mq - high-performance Node.js queue on Valkey/Redis.
| Skill | What it does |
|---|---|
| glide-mq | Greenfield queue development - queues, workers, ordering, rate limiting, flows, broadcast, step jobs |
| glide-mq-migrate-bullmq | Migrate from BullMQ to glide-mq - API mapping, breaking changes, feature comparison |
| glide-mq-migrate-bee | Migrate from Bee-Queue to glide-mq - API mapping, pattern conversion |
Key features: per-key ordering, group concurrency, runtime group rate limiting (job.rateLimitGroup()), token bucket, DAG workflows, broadcast pub/sub, step jobs, deduplication, serverless producers.
Purpose: Complete task-to-production automation.
What happens when you run it:
Phase 9 uses the orchestrate-review skill to spawn parallel reviewers (code quality, security, performance, test coverage) plus conditional specialists.
Agents involved:
| Agent | Model | Role |
|---|---|---|
| task-discoverer | sonnet | Finds and ranks tasks from your source |
| worktree-manager | haiku | Creates git worktrees and branches |
| exploration-agent | sonnet | Deep codebase analysis before planning |
| planning-agent | opus | Designs step-by-step implementation plan |
| implementation-agent | opus | Writes the actual code |
| prepare-delivery:test-coverage-checker | sonnet | Validates tests exist and are meaningful |
| prepare-delivery:delivery-validator | sonnet | Final checks before shipping |
| ci-monitor | haiku | Watches CI status |
| ci-fixer | sonnet | Fixes CI failures and review comments |
| simple-fixer | haiku | Executes mechanical edits |
Cross-plugin agents:
| Agent | Plugin | Role |
|---|---|---|
| deslop-agent | deslop | Removes AI artifacts before review |
| sync-docs-agent | sync-docs | Updates documentation |
Usage:
/next-task # Start new workflow
/next-task --resume # Resume interrupted workflow
/next-task --status # Check current state
/next-task --abort # Cancel and cleanup
Purpose: Run all pre-ship quality gates without shipping. Use after completing implementation manually or outside /next-task.
What it runs (in order):
/prepare-delivery # Run all quality gates
/prepare-delivery --skip-review # Skip review loop
/prepare-delivery --skip-docs # Skip docs sync
/prepare-delivery --base=develop # Against a specific base branch
Does NOT create PRs or push - use /ship or /gate-and-ship after.
Purpose: Quality gates then ship in one command. Chains /prepare-delivery then /ship.
/gate-and-ship # Full: quality gates + ship
/gate-and-ship --skip-review # Skip review, still ship
/gate-and-ship --base=develop # Against a specific base branch
Composability:
/gate-and-ship = /prepare-delivery + /ship
Each piece runs independently - use /prepare-delivery alone to review before deciding to ship, or /ship alone if already validated.
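The composition can be pictured as plain function chaining. This is a sketch under stated assumptions: the two step functions are stand-ins that only record that they ran, and `compose` is an illustrative helper, not an agentsys API.

```python
def prepare_delivery(ctx: dict) -> dict:
    """Stand-in for the quality gates; records that they ran."""
    return {**ctx, "gates_passed": True}


def ship(ctx: dict) -> dict:
    """Stand-in for PR creation, CI monitoring, and merge."""
    return {**ctx, "shipped": True}


def compose(*steps):
    """Chain steps left to right, feeding each result into the next."""
    def run(ctx: dict) -> dict:
        for step in steps:
            ctx = step(ctx)
        return ctx
    return run


# /gate-and-ship = /prepare-delivery + /ship
gate_and_ship = compose(prepare_delivery, ship)
```

Because each step takes and returns the same shape, either one can still be called on its own.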
Purpose: Durable negative memory for repeated agent mistakes. Turn a user's "stop doing this" correction into a persistent rule in CLAUDE.md or AGENTS.md.
banthis is a tiny standalone CLI plus skill. It maintains a managed banned-behaviors section, supports project or global targets, and includes an init meta-rule so agents learn when to invoke it automatically.
What it does:
| Command | Use |
|---|---|
| banthis add "<title>" "<rule>" | Add or update a banned behavior |
| banthis list | List current bans |
| banthis show | Print the managed section |
| banthis remove "<title>" | Remove a ban |
| banthis init | Install the meta-rule that teaches agents to call banthis |
Usage:
/banthis "stop ending with vague optional follow-up offers"
banthis add "No vague endings" "Do not end with vague optional follow-up offers."
banthis init --file AGENTS.md
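To illustrate what "add or update" inside a managed section could look like, here is a minimal sketch. The section markers, the bullet format, and the `upsert_ban` name are all assumptions made for this example; the real banthis file format may differ.

```python
import re

# Hypothetical markers; the real banthis section format may differ.
BEGIN = "<!-- banned-behaviors:begin -->"
END = "<!-- banned-behaviors:end -->"


def upsert_ban(text: str, title: str, rule: str) -> str:
    """Add or update `title` inside the managed section, creating it if absent."""
    pattern = re.compile(re.escape(BEGIN) + r"\n(.*?)" + re.escape(END), re.S)
    match = pattern.search(text)
    lines = match.group(1).splitlines() if match else []
    # Drop any existing entry with the same title, then append the new one.
    prefix = f"- **{title}**:"
    lines = [ln for ln in lines if not ln.startswith(prefix)]
    lines.append(f"{prefix} {rule}")
    section = BEGIN + "\n" + "\n".join(lines) + "\n" + END
    if match:
        return text[:match.start()] + section + text[match.end():]
    return text.rstrip("\n") + "\n\n" + section + "\n"
```

Keeping all edits between two markers leaves the rest of CLAUDE.md or AGENTS.md untouched, which is the point of a managed section.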
Purpose: Lint agent configurations before they break your workflow. The first dedicated linter for AI agent configs.
agnix is a standalone open-source project that provides the validation engine. This plugin integrates it into your workflow.
The problem it solves:
Agent configurations are code. They affect behavior, security, and reliability. But unlike application code, they have no linting. You find out your SKILL.md is malformed when the agent fails. You discover your hooks have security issues when they're exploited. You realize your CLAUDE.md has conflicting rules when the AI behaves unexpectedly.
agnix catches these issues before they cause problems.
What it validates:
| Category | What It Checks |
|---|---|
| Structure | Required fields, valid YAML/JSON, proper frontmatter |
| Security | Prompt injection vectors, overpermissive tools, exposed secrets |
| Consistency | Conflicting rules, duplicate definitions, broken references |
| Best Practices | Tool restrictions, model selection, trigger phrase quality |
| Cross-Platform | Compatibility across Claude Code, Codex, OpenCode, Cursor, Kiro, Copilot, Gemini CLI, Cline, Windsurf, Roo Code, Amp, and more |
423 validation rules (129 auto-fixable) derived from:
Supported files:
| File Type | Examples |
|---|---|
| Skills | SKILL.md, */SKILL.md |
| Memory | CLAUDE.md, AGENTS.md, .github/CLAUDE.md |
| Hooks | .claude/settings.json, hooks configuration |
| MCP | *.mcp.json, MCP server configs |
| Cursor | .cursor/rules/*.mdc, .cursorrules |
| Copilot | .github/copilot-instructions.md |
| Kiro | .kiro/steering/**/*.md, .kiro/agents/*.json, .kiro/hooks/*.kiro.hook, POWER.md |
| Windsurf | .windsurf/rules/**/*.md, .windsurf/workflows/**/*.md, .windsurfrules |
| Roo Code | .roo/rules/*.md, .roo/rules-{mode}/*.md, .roomodes, .rooignore, .roorules |
| Gemini CLI | GEMINI.md, .gemini/settings.json, gemini-extension.json |
| OpenCode | opencode.json |
| Amp | .agents/checks/**/*.md, .amp/settings.json |
CI/CD Integration:
agnix outputs SARIF format for GitHub Code Scanning. Add it to your workflow:
- name: Lint agent configs
  run: agnix --format sarif > results.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
Usage:
/agnix # Validate current project
/agnix --fix # Auto-fix fixable issues
/agnix --strict # Treat warnings as errors
/agnix --target claude-code # Only Claude Code rules
/agnix --format sarif # Output for GitHub Code Scanning
Agent: agnix-agent (sonnet model)
External tool: Requires agnix CLI
npm install -g agnix # Install via npm
# or
cargo install agnix-cli # Install via Cargo
# or
brew install agnix # Install via Homebrew (macOS)
Why use agnix:
Purpose: Takes your current branch from "ready to commit" to "merged PR."
What happens when you run it:
Platform Detection:
| Type | Detected |
|---|---|
| CI | GitHub Actions, GitLab CI, CircleCI, Jenkins, Travis |
| Deploy | Railway, Vercel, Netlify, Fly.io, Render |
| Project | Node.js, Python, Rust, Go, Java |
Review Comment Handling:
Every comment gets addressed. No exceptions. The workflow categorizes comments and handles each:
If something can't be fixed, the workflow replies explaining why and resolves the thread.
Usage:
/ship # Full workflow
/ship --dry-run # Preview without executing
/ship --strategy rebase # Use rebase instead of squash
Purpose: Finds AI slop - debug statements, placeholder text, verbose comments, TODOs - and removes it.
How detection works:
Three phases run in sequence:
Phase 1: Regex Patterns (HIGH certainty)
- console.log, print(), dbg!(), println!()
- // TODO, // FIXME, // HACK

Phase 2: Multi-Pass Analyzers (MEDIUM certainty)
Phase 3: CLI Tools (LOW certainty, optional)
Languages supported: JavaScript/TypeScript, Python, Rust, Go, Java
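A rough sketch of what a Phase 1 regex pass could look like. The two patterns below only cover the examples named above and are illustrative; the shipped rule set is not shown in this document, and `scan` is a hypothetical helper.

```python
import re

# Illustrative patterns only; they cover just the examples named above.
PATTERNS = {
    "debug-statement": re.compile(r"\b(console\.log|print|dbg!|println!)\s*\("),
    "todo-comment": re.compile(r"//\s*(TODO|FIXME|HACK)\b"),
}


def scan(source: str):
    """Return (line_number, rule, certainty) for each Phase 1 regex hit."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule, "HIGH"))
    return findings
```

Exact-match patterns like these are what make a HIGH certainty label defensible: there is little room for a false positive.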
Usage:
/deslop # Report only (safe)
/deslop apply # Fix HIGH certainty issues
/deslop apply src/ 10 # Fix 10 issues in src/
Thoroughness levels:
- quick - Phase 1 only (fastest)
- normal - Phase 1 + Phase 2 (default)
- deep - All phases if tools available

Purpose: Structured performance investigation with baselines, profiling, and evidence-backed decisions.
10-phase methodology (based on recorded real performance investigation sessions):
Agents and skills:
| Component | Role |
|---|---|
| perf-orchestrator | Coordinates all phases |
| perf-theory-gatherer | Generates hypotheses from git history and code |
| perf-theory-tester | Validates hypotheses with controlled experiments |
| perf-analyzer | Synthesizes findings into recommendations |
| perf-code-paths | Maps entrypoints and likely hot paths |
| perf-investigation-logger | Structured evidence logging |
Usage:
/perf # Start new investigation
/perf --resume # Resume previous investigation
Phase flags (advanced):
/perf --phase baseline --command "npm run bench" --version v1.2.0
/perf --phase breaking-point --param-min 1 --param-max 500
/perf --phase constraints --cpu 1 --memory 1GB
/perf --phase hypotheses --hypotheses-file perf-hypotheses.json
/perf --phase optimization --change "reduce allocations"
/perf --phase decision --verdict stop --rationale "no measurable improvement"
Purpose: Compares your documentation and plans to what's actually in the code.
The problem it solves:
Your roadmap says "user authentication: done." But is it actually implemented? Your GitHub issue says "add dark mode." Is it already in the codebase? Plans drift from reality. This command finds the drift.
How it works:
JavaScript collectors gather data (fast, token-efficient)
Single Opus call performs semantic analysis
auth/, login.js, session.ts)
Why this approach:
Multi-agent collection wastes tokens on coordination. JavaScript collectors are fast and deterministic. One well-prompted LLM call does the actual analysis. Result: 77% token reduction vs multi-agent approaches.
Tested on 1,000+ repositories before release.
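The "collect deterministically, analyze once" split can be sketched as follows. The actual collectors are written in JavaScript; this Python version, the `collect_evidence` name, and the keyword-to-path matching are assumptions made purely for illustration.

```python
import json
from pathlib import Path


def collect_evidence(root: str, plan_items: dict) -> str:
    """Map each plan item to repo paths whose names contain one of its keywords.

    `plan_items` is a hypothetical {item: [lowercase keywords]} mapping. The
    JSON result is what a single analysis prompt would receive instead of
    raw file contents.
    """
    paths = sorted(p.relative_to(root).as_posix() for p in Path(root).rglob("*"))
    evidence = {
        item: [p for p in paths if any(k in p.lower() for k in keywords)]
        for item, keywords in plan_items.items()
    }
    return json.dumps(evidence, indent=2, sort_keys=True)
```

The collector never calls a model, so its output is reproducible and cheap; only the compact JSON reaches the LLM.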
Usage:
/drift-detect # Full analysis
/drift-detect --depth quick # Quick scan
Purpose: Multi-agent code review that iterates until issues are resolved.
What happens when you run it:
Up to 10 specialized role-based agents run based on your project:
| Agent | When Active | Focus Area |
|---|---|---|
| code-quality-reviewer | Always | Code quality, error handling |
| security-expert | Always | Vulnerabilities, auth, secrets |
| performance-engineer | Always | N+1 queries, memory, blocking ops |
| test-quality-guardian | Always | Coverage, edge cases, mocking |
| architecture-reviewer | If 50+ files | Modularity, patterns, SOLID |
| database-specialist | If DB detected | Queries, indexes, transactions |
| api-designer | If API detected | REST, errors, pagination |
| frontend-specialist | If frontend detected | Components, state, UX |
| backend-specialist | If backend detected | Services, domain logic |
| devops-reviewer | If CI/CD detected | Pipelines, configs, secrets |
Findings are collected and categorized by severity (critical/high/medium/low). All non-false-positive issues get fixed automatically. The loop repeats until no open issues remain.
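The "When Active" column amounts to a set of predicates over a project profile. A minimal sketch, assuming a hypothetical profile dictionary whose keys (`file_count`, `has_db`, and so on) are invented for this example:

```python
ALWAYS = [
    "code-quality-reviewer",
    "security-expert",
    "performance-engineer",
    "test-quality-guardian",
]

# (agent, predicate) pairs mirroring the "When Active" column.
CONDITIONAL = [
    ("architecture-reviewer", lambda p: p.get("file_count", 0) >= 50),
    ("database-specialist", lambda p: p.get("has_db", False)),
    ("api-designer", lambda p: p.get("has_api", False)),
    ("frontend-specialist", lambda p: p.get("has_frontend", False)),
    ("backend-specialist", lambda p: p.get("has_backend", False)),
    ("devops-reviewer", lambda p: p.get("has_ci", False)),
]


def select_reviewers(project: dict) -> list:
    """Return the reviewers to spawn for a hypothetical project profile."""
    return ALWAYS + [name for name, active in CONDITIONAL if active(project)]
```

A small project with no detected database, API, frontend, backend, or CI would get only the four always-on reviewers.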
Usage:
/audit-project # Full review
/audit-project --quick # Single pass
/audit-project --resume # Resume from queue file
/audit-project --domain security # Security focus only
/audit-project --recent # Only recent changes
Purpose: Analyzes your prompts, plugins, agents, docs, hooks, and skills for improvement opportunities.
Eight analyzers run in parallel:
| Analyzer | What it checks |
|---|---|
| plugin-enhancer | Plugin structure, MCP tool definitions, security patterns |
| agent-enhancer | Agent frontmatter, prompt quality |
| claudemd-enhancer | CLAUDE.md/AGENTS.md structure, token efficiency |
| cross-file-enhancer | Cross-file consistency (tools vs frontmatter, duplicate rules, conflicts) |
| docs-enhancer | Documentation readability, RAG optimization |
| prompt-enhancer | Prompt engineering patterns, clarity, examples |
| hooks-enhancer | Hook frontmatter, structure, safety |
| skills-enhancer | SKILL.md structure, trigger phrases |
Each finding includes:
Auto-learning: Detects obvious false positives (pattern docs, workflow gates) and saves them for future runs. Reduces noise over time without manual suppression files.
Usage:
/enhance # Run all analyzers
/enhance --focus=agent # Just agent prompts
/enhance --apply # Apply HIGH certainty fixes
/enhance --show-suppressed # Show what's being filtered
/enhance --no-learn # Analyze but don't save false positives
Purpose: Unified static analysis - git history, AST symbols, and project metadata in one plugin.
What it provides:
Output is cached at {state-dir}/repo-intel.json (external repo-intel plugin) and {state-dir}/repo-map.json (agentsys internal repo-map library). {state-dir} is .claude/, .opencode/, or .codex/ depending on your platform.
Why it matters:
Tools like /drift-detect, /onboard, /can-i-help, and planners consume this data instead of re-scanning the repo every time. 9 plugins use repo-intel data automatically.
Usage:
/repo-intel init # First-time scan
/repo-intel update # Incremental update
/repo-intel query hotspots # Most active files
/repo-intel query ownership src/ # Who owns a path
/repo-intel query bus-factor # Knowledge risk
Backed by agent-analyzer Rust binary.
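As an intuition for a query like `hotspots`, the simplest possible ranking counts how many commits touched each file. This is only a sketch: the real analysis runs in the Rust binary and may weigh other signals, and the `hotspots` function and its input shape are assumptions.

```python
from collections import Counter


def hotspots(commits, top=3):
    """Rank files by how many commits touched them.

    `commits` is a list of file-path lists, one per commit, e.g. parsed from
    `git log --name-only`. Ties are broken alphabetically for stable output.
    """
    counts = Counter(path for files in commits for path in set(files))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

Caching a result like this is what lets downstream commands skip re-scanning the history on every run.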
Purpose: Sync documentation with actual code changes - find outdated refs, update CHANGELOG, flag stale examples.
The problem it solves:
You refactor auth.js into auth/index.js. Your README still says import from './auth'. You rename a function. Three docs still reference the old name. You ship a feature. CHANGELOG doesn't mention it. Documentation drifts from code. This command finds the drift.
What it detects:
| Category | Examples |
|---|---|
| Broken references | Imports to moved/renamed files, deleted exports |
| Version mismatches | Doc says v2.0, package.json says v2.1 |
| Stale code examples | Import paths that no longer exist |
| Missing CHANGELOG | feat: and fix: commits without entries |
Auto-fixable vs flagged:
| Auto-fixable (apply mode) | Flagged for review |
|---|---|
| Version number updates | Removed exports referenced in docs |
| CHANGELOG entries for commits | Code examples needing context |
| Function renames | |
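The version-mismatch check from the first table is the easiest one to picture. A minimal sketch, assuming docs mention versions as `vMAJOR.MINOR` and that `package.json` is the source of truth; `version_mismatches` is a hypothetical helper, not the plugin's code:

```python
import json
import re


def version_mismatches(doc_text: str, package_json: str):
    """Return doc versions (major.minor) that differ from package.json's."""
    version = json.loads(package_json)["version"]
    actual = ".".join(version.split(".")[:2])
    mentioned = set(re.findall(r"\bv(\d+\.\d+)\b", doc_text))
    return sorted(v for v in mentioned if v != actual)
```

A doc that says v2.0 while `package.json` says 2.1.x would be reported as `["2.0"]`, which apply mode could then rewrite.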
Usage:
/sync-docs # Check what docs need updates (safe)
/sync-docs apply # Apply safe fixes
/sync-docs report src/ # Check docs related to src/
/sync-docs --all # Full codebase scan
Purpose: Research any topic online and create a comprehensive learning guide with RAG-optimized indexes.
What it does:
Depth levels:
| Depth | Sources | Use Case |
|---|---|---|
| brief | 10 | Quick overview |
| medium | 20 | Default, balanced |
| deep | 40 | Comprehensive |
Output structure:
agent-knowledge/
CLAUDE.md # Master index (updated each run)
AGENTS.md # Index for OpenCode/Codex
recursion.md # Topic-specific guide
resources/
recursion-sources.json # Source metadata with quality scores
Usage:
/learn recursion # Default (20 sources)
/learn react hooks --depth=deep # Comprehensive (40 sources)
/learn kubernetes --depth=brief # Quick overview (10 sources)
/learn python async --no-enhance # Skip enhancement pass
Agent: learn-agent (sonnet model)
Purpose: Get a second opinion from another AI CLI tool without leaving your current session.
What it does:
--continue)
Supported tools:
| Tool | Default Model (high) | Reasoning Control |
|---|---|---|
| Claude | claude-opus-4-6 | max-turns |
| Gemini | gemini-3.1-pro-preview | built-in |
| Codex | gpt-5.3-codex | model_reasoning_effort |
| OpenCode | (user-selected or default) | --variant |
| Copilot | (default) | none |
Usage:
/consult "Is this the right approach?" --tool=gemini --effort=high
/consult "Review for performance issues" --tool=codex
/consult "Suggest alternatives" --tool=claude --effort=max
/consult "Continue from where we left off" --continue
/consult "Explain this error" --context=diff --tool=gemini
Agent: consult-agent (sonnet model for orchestration)
Purpose: Stress-test ideas through structured multi-round debate between two AI CLI tools.
What it does:
Usage:
# Natural language
/debate codex vs gemini about microservices vs monolith
/debate with claude and codex about our auth implementation
/debate thoroughly gemini vs codex about database schema design
/debate codex vs gemini 3 rounds about event sourcing
# Explicit flags
/debate "Should we use event sourcing?" --tools=claude,gemini --rounds=3 --effort=high
/debate "Valkey vs PostgreSQL for caching" --tools=codex,opencode
# With codebase context
/debate "Is our current approach correct?" --tools=gemini,codex --context=diff
Options:
| Flag | Description |
|---|---|
| --tools=TOOL1,TOOL2 | Proposer and challenger (comma-separated) |
| --rounds=N | Number of debate rounds, 1–5 (default: 2) |
| --effort=low\|medium\|high\|max | Reasoning depth per tool call |
| --context=diff\|file=PATH\|none | Codebase context passed to both tools |
Agent: debate-orchestrator (opus model for orchestration)
Versioned release with automatic ecosystem and tooling detection
/release # Patch release (auto-discovers how this repo releases)
/release minor # Minor version bump
/release major --dry-run # Preview what would happen
The release agent discovers how your repo releases before executing:
release: target, npm release script, scripts/release.*
Supports 12+ ecosystems: npm, cargo, python, go, maven, gradle, ruby, nuget, dart, hex, packagist, swift.
Agent: release-agent (sonnet model)
Skill: release (generic fallback workflow)
Learn from your workflow patterns and suggest automations
/skillers show # Display current config and knowledge stats
/skillers compact # Analyze recent transcripts, extract patterns
/skillers compact --days=14 # Analyze older transcripts
/skillers recommend # Get automation suggestions from accumulated knowledge
Reads your Claude Code conversation transcripts, identifies recurring patterns (pain points, repeated workflows, wishes), clusters them into weighted themes, and suggests skills, hooks, or agents to automate them.
No per-turn overhead - it reads transcripts that Claude Code already saves.
Agents: skillers-compactor (sonnet), skillers-recommender (opus)
Skills: skillers-compact, recommend
Create and improve reliable SKILL.md files
/skill-curator "create a skill for reviewing background jobs"
/skill-curator --improve path/to/SKILL.md --category review
The skill curator focuses on trigger quality, clear scope, router patterns, concrete Skip unless: gates, token budgets, and agnix-ready structure across Claude Code, Codex, OpenCode, Cursor, Kiro, and similar tools.
Skill: skill-curator
Create and improve autonomous coding-agent system prompts
/system-prompt-curator "GitHub issue resolver"
/system-prompt-curator --improve path/to/prompt.md --for-orchestrator
The system prompt curator rewrites prompts around task-matched identity, phased workflow, explicit tools, evidence-based completion criteria, and realistic error recovery examples. It separates prompt guidance from harness-level checks that belong in code.
Skill: system-prompt-curator
Purpose: Get oriented in any codebase in under 3 minutes.
What happens when you run it:
74% fewer tokens than manual onboarding. Validated on 100 repos across JS/TS, Rust, Go, Python, C/C++, Java, and Deno.
Depth levels:
| Level | Time | Data |
|---|---|---|
| quick | ~2s | Manifest + README + structure |
| normal | ~5s | + CLAUDE.md/AGENTS.md + CI + repo-intel |
| deep | ~15s | + repo-intel AST symbols |
Supported manifests: package.json, Cargo.toml, go.mod, pyproject.toml, deno.json, CMakeLists.txt, meson.build, setup.py, pom.xml, build.gradle. Detects monorepos (npm/pnpm/lerna/Cargo workspaces, Python libs/, Deno workspaces).
Usage:
/onboard # Current repo
/onboard /path/to/repo # Specific repo
/onboard --depth=deep # Include AST data
Agent: onboard-agent (sonnet model)
Purpose: Match a contributor's skills to specific areas where they can help.
What happens when you run it:
Matching:
| Developer profile | Gets recommended |
|---|---|
| New to stack | Good-first areas with clear patterns |
| Experienced | Hard problems in pain-point areas |
| Test-focused | Test gaps in frequently-changed files |
| Bug-focused | Bugspot files + relevant open issues |
| Docs-focused | Stale documentation with code examples |
Usage:
/can-i-help # Current repo
/can-i-help /path/to/repo # Specific repo
/can-i-help --depth=deep # Include AST data
Agent: can-i-help-agent (sonnet model)
Standalone use:
/deslop apply # Just clean up your code
/sync-docs # Just check if docs need updates
/prepare-delivery # Run all quality gates (no ship)
/ship # Just ship this branch
/gate-and-ship # Quality gates + ship in one command
/audit-project # Just review the codebase
Composable delivery chain:
/prepare-delivery = quality gates only (deslop, review, validation, docs)
/ship = PR + CI + merge only
/gate-and-ship = /prepare-delivery + /ship
/next-task = full workflow (discovery → implementation → /prepare-delivery → /ship)
Full integrated workflow:
When you run /next-task, it orchestrates everything:
/next-task picks task → explores codebase → plans implementation
↓
implementation-agent writes code
↓
deslop-agent + prepare-delivery:test-coverage-checker + /simplify (parallel)
↓
review loop iterates until approved
↓
prepare-delivery:delivery-validator checks requirements
↓
sync-docs-agent syncs documentation
↓
/ship creates PR → monitors CI → merges
The workflow tracks state so you can resume from any point.
Frontier models write good code. That's solved. What's not solved:
1. One agent, one job, done extremely well
Same principle as good code: single responsibility. The exploration-agent explores. The implementation-agent implements. Phase 9 spawns multiple focused reviewers. No agent tries to do everything. Specialized agents, each with narrow scope and clear success criteria.
2. Pipeline with gates, not a monolith
Same principle as DevOps. Each step must pass before the next begins. Can't push before review. Can't merge before CI passes. Hooks enforce this - agents literally cannot skip phases.
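The gating idea can be sketched as a tiny state machine. Note the assumption: agentsys enforces gates through hooks, whereas this example models the same rule in-process with a hypothetical `Pipeline` class.

```python
class GateError(RuntimeError):
    """Raised when a phase is attempted before its prerequisite passed."""


class Pipeline:
    def __init__(self, phases):
        self.phases = list(phases)  # ordered phase names
        self.passed = []            # phases completed so far

    def run(self, phase, check):
        """Run `check` for `phase` only if every earlier phase has passed."""
        if len(self.passed) == len(self.phases):
            raise GateError("pipeline already complete")
        expected = self.phases[len(self.passed)]
        if phase != expected:
            raise GateError(f"cannot run {phase!r} before {expected!r}")
        if not check():
            raise GateError(f"{phase!r} did not pass")
        self.passed.append(phase)
```

With phases `["review", "push", "ci", "merge"]`, calling `run("push", ...)` first raises `GateError`, which mirrors "can't push before review".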
3. Tools do tool work, agents do agent work
If static analysis, regex, or a shell command can do it, don't ask an LLM. Pattern detection uses pre-indexed regex. File discovery uses glob. Platform detection uses file existence checks. The LLM only handles what requires judgment.
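Platform detection by file existence needs no model at all. A minimal sketch, using well-known marker files for a few of the platforms listed earlier; the marker list and the `detect` function are assumptions, not the detector's actual rules.

```python
from pathlib import Path

# Well-known marker files; the detector's actual list is not shown here.
MARKERS = {
    "GitHub Actions": ".github/workflows",
    "GitLab CI": ".gitlab-ci.yml",
    "CircleCI": ".circleci/config.yml",
    "Jenkins": "Jenkinsfile",
    "Travis": ".travis.yml",
    "Node.js": "package.json",
    "Rust": "Cargo.toml",
    "Go": "go.mod",
}


def detect(root: str) -> list:
    """Return every platform whose marker path exists under `root`."""
    base = Path(root)
    return [name for name, marker in MARKERS.items() if (base / marker).exists()]
```

The check is deterministic and costs a handful of filesystem calls, which is the argument for keeping it out of the LLM's hands.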
4. Agents don't need to know how tools work
The slop detector returns findings with certainty levels. The agent doesn't need to understand the three-phase pipeline, the regex patterns, or the analyzer heuristics. Good tool design means the consumer doesn't need implementation details.
5. Build tools where tools don't exist
Many tasks lack existing tools. JavaScript collectors for drift-detect. Multi-pass analyzers for slop detection. The result: agents receive structured data, not raw problems to figure out.
6. Research-backed prompt engineering
Documented techniques that measurably improve results:
7. Validate plan and results, not every step
Approve the plan. See the results. The middle is automated. One plan approval unlocks autonomous execution through implementation, review, cleanup, and shipping.
8. Right model for the task
Match model capability to task complexity:
Quality compounds. Poor exploration → poor plan → poor implementation → review cycles. Early phases deserve the best model.
9. Persistent state survives sessions
Two JSON files track everything: what task, what phase. Sessions can die and resume. Multiple sessions run in parallel on different tasks using separate worktrees.
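A sketch of how JSON state can survive a session dying mid-write. The file layout, field names, and both helper functions are assumptions for illustration; only the idea of JSON-backed task and phase tracking comes from the text above.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state atomically so a killed session never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def load_state(path: str, default: dict) -> dict:
    """Return the saved state, or a copy of `default` when none exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return dict(default)
```

A resumed session would call `load_state` first and continue from the recorded phase; separate worktrees keep parallel sessions from sharing one state file.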
10. Delegate everything automatable
Agents don't just write code. They:
If it can be specified, it can be delegated.
11. Orchestrator stays high-level
The main workflow orchestrator doesn't read files, search code, or write implementations. It launches specialized agents and receives their outputs. Keeps the orchestrator's context window available for coordination rather than filled with file contents.
12. Composable, not monolithic
Every command works standalone. /deslop cleans code without needing /next-task. /ship merges PRs without needing the full workflow. Pieces compose together, but each piece is useful on its own.
/plugin marketplace add agent-sh/agentsys
/plugin install next-task@agentsys
/plugin install ship@agentsys
npm install -g agentsys && agentsys
Interactive installer for Claude Code, OpenCode, Codex CLI, Cursor, and Kiro.
# Non-interactive install
agentsys --tool claude # Single tool
agentsys --tool cursor # Cursor (project-scoped skills + commands)
agentsys --tool kiro # Kiro (project-scoped steering + skills + agents)
agentsys --tools "claude,opencode" # Multiple tools
agentsys --development # Dev mode (bypasses marketplace)
Required:
For GitHub workflows:
gh) authenticated
For GitLab workflows:
glab) authenticated
For /repo-intel:
For /agnix:
npm install -g agnix, cargo install agnix-cli, or brew install agnix)
Local diagnostics (optional):
npm run detect # Platform detection (CI, deploy, project type)
npm run verify # Tool availability + versions
The system is built on research, not guesswork.
Knowledge base (agent-docs/): 8,000 lines of curated documentation from Anthropic, OpenAI, Google, and Microsoft covering:
Testing:
Methodology:
/perf investigation phases based on recorded real performance investigation sessions

| Topic | Link |
|---|---|
| Installation | docs/INSTALLATION.md |
| Cross-Platform Setup | docs/CROSS_PLATFORM.md |
| Usage Examples | docs/USAGE.md |
| Architecture | docs/ARCHITECTURE.md |
| Workflow | Link |
|---|---|
| /next-task Flow | docs/workflows/NEXT-TASK.md |
| /ship Flow | docs/workflows/SHIP.md |
| Topic | Link |
|---|---|
| Slop Patterns | docs/reference/SLOP-PATTERNS.md |
| Agent Reference | docs/reference/AGENTS.md |
MIT License | Made by Avi Fenesh