SkillOrchestra: Learning to Route Agents via Skill Transfer
This repository contains the code for SkillOrchestra, a framework for skill-aware orchestration of compound AI systems. Instead of learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off.
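The deployment-time selection described above can be illustrated with a minimal sketch. It is not the repository's implementation: the `AgentProfile` class, the `route` function, the skill names, and the scoring rule (demand-weighted average competence minus `lambda_cost` times cost) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    """Hypothetical per-agent record: competence per skill and a cost estimate."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # estimated cost per call (arbitrary units)


def route(skill_demands: dict[str, float],
          agents: list[AgentProfile],
          lambda_cost: float = 0.5) -> AgentProfile:
    """Return the agent with the highest (performance - lambda_cost * cost).

    Performance is the demand-weighted average of the agent's competence;
    skills the agent has no estimate for count as 0.0 (an assumption).
    """
    total = sum(skill_demands.values()) or 1.0

    def utility(agent: AgentProfile) -> float:
        perf = sum(w * agent.competence.get(s, 0.0)
                   for s, w in skill_demands.items()) / total
        return perf - lambda_cost * agent.cost

    return max(agents, key=utility)


# Toy pool: a cheap agent and a stronger, more expensive one.
agents = [
    AgentProfile("small", {"lookup": 0.8, "multi_hop": 0.4}, cost=0.1),
    AgentProfile("large", {"lookup": 0.9, "multi_hop": 0.8}, cost=0.6),
]
demands = {"multi_hop": 1.0}  # inferred skill demands for the current query

print(route(demands, agents, lambda_cost=0.1).name)  # large
print(route(demands, agents, lambda_cost=1.0).name)  # small
```

Raising `lambda_cost` in this toy setup shifts the choice from the stronger agent to the cheaper one, which is the kind of explicit performance-cost trade-off the description refers to.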
# All environments (pipeline + SGLang + retriever)
./scripts/setup/run.sh
# Or individually:
./scripts/setup/env.sh --pipeline # so_env: pipeline, exploration, learning, selection, testing
./scripts/setup/env.sh --sglang # sglang_env: model serving via SGLang
./scripts/setup/retriever.sh # retriever: Qwen3-Embedding, FAISS
Copy .env.example to .env and fill in your values:
cp .env.example .env
- OPENAI_API_KEY or OPENAI_GATEWAY_KEY: for LLM calls during learning
- INDEX_DIR: for FRAMES, path to retrieval index (e.g. wiki.index, wiki.jsonl)
- HF_HOME: optional, Hugging Face cache

Models must be running before the pipeline can explore, select, or test.
conda activate sglang_env
./scripts/serve/serve_routing.sh
conda activate sglang_env
./scripts/serve/serve_orchestration.sh # the retriever uses a separate retriever environment
Ensure INDEX_DIR is set for the retriever. Edit config/eval_config.json to match your model host/ports if not using localhost.
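Since the pipeline needs the servers to be up first, a quick reachability check can save a failed run. The sketch below is an assumption-laden helper, not part of the repository: the endpoint names and ports in `ENDPOINTS` are placeholders, and it only tests that something accepts TCP connections on each host/port, not that the right model is loaded.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Placeholder endpoints; replace with the host/port pairs from your own config.
ENDPOINTS = {
    "router-model": ("localhost", 30000),
    "retriever": ("localhost", 8000),
}


def report(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each endpoint name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, (host, port) in endpoints.items()}


if __name__ == "__main__":
    for name, ok in report(ENDPOINTS).items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```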
Activate the pipeline environment and run:
conda activate so_env
# Full pipeline: explore → learn → select → test
python scripts/pipeline.py model-routing \
--dataset nq_validation_qwen \
--output-dir output/nq \
--test-dataset nq_test_qwen \
--phases explore,learn,select,test \
--exploration-samples 40 \
--train-samples 20 \
--val-samples 20
# Use existing exploration data (skip explore)
python scripts/pipeline.py model-routing \
--dataset nq_validation_qwen \
--exploration-data /path/to/inference_results.jsonl \
--output-dir output/nq \
--phases learn,select,test \
--test-dataset nq_test_qwen
# Full pipeline: explore → learn → select → test
python scripts/pipeline.py frames \
--output-dir output/frames \
--eval-script orchestration/eval_frames.py \
--test-samples data/frames_test.jsonl \
--phases explore,learn,select,test \
--exploration-samples 40 \
--train-samples 20 \
--val-samples 20
- explore: Run all pool models on the dataset; collect execution traces
- learn: Learn a skill handbook from traces (skills, competence, cost) with refinement
- select: Generate candidate handbooks and select Pareto-optimal with live validation
- test: Evaluate the selected handbook on the test set

Use --phases explore, --phases learn,select, etc. to run subsets.
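To make "Pareto-optimal" in the select phase concrete, here is a small sketch of Pareto filtering over candidates scored on accuracy and cost. The `Candidate` type, the field names, and the numbers are invented for illustration; how the repository scores and validates candidate handbooks is not shown here.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate handbook with validation metrics."""
    name: str
    accuracy: float  # higher is better
    cost: float      # lower is better


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as accurate and at
    least as cheap, and strictly better on one of the two.
    """
    def dominated(c: Candidate) -> bool:
        return any(
            o.accuracy >= c.accuracy and o.cost <= c.cost
            and (o.accuracy > c.accuracy or o.cost < c.cost)
            for o in cands
        )

    return [c for c in cands if not dominated(c)]


candidates = [
    Candidate("A", 0.70, 1.0),
    Candidate("B", 0.75, 2.0),
    Candidate("C", 0.72, 2.5),  # dominated by B: less accurate and more expensive
    Candidate("D", 0.80, 4.0),
]

front = pareto_front(candidates)
print([c.name for c in front])  # ['A', 'B', 'D']
```

Choosing a single handbook from the front still requires a preference between accuracy and cost, which is where a trade-off weight would come in.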
- config/pool_config.json: add/remove models, set ports, pricing, and host env vars. Use --pool-models to override at runtime.
- config/models.py: map agent IDs to your model endpoints.
- config/eval_config.json: set ip_addr and port for each model when using remote/vLLM servers.
- --llm-model, --skill-id-model, --max-refinement-rounds, --max-merge-credits, etc. to tune the learning pipeline.
- --lambda-cost for performance-cost trade-off; --routing-strategy for routing behavior (e.g. weighted_avg, analyze_router_decides).

If you find this work helpful, please consider giving a ⭐ and citing our paper 😊
@misc{wang2026skillorchestra,
title={SkillOrchestra: Learning to Route Agents via Skill Transfer},
author={Jiayu Wang and Yifei Ming and Zixuan Ke and Shafiq Joty and Aws Albarghouthi and Frederic Sala},
year={2026},
eprint={2602.19672},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2602.19672},
}
We are here to help! Any questions? Please open an issue and cc Jiayu Wang (milawang@cs.wisc.edu).