py-xiaozhi is a lightweight, cross-platform, multimodal AI interaction framework built on Python's async architecture. It supports real-time voice streaming, vision-language tasks, and IoT device control. It runs on Windows, macOS, and Linux desktops as well as ARM embedded platforms (Raspberry Pi, Horizon Robotics RDK, Jetson Nano), connecting Large Language Models to physical hardware out of the box.
Evolved from the xiaozhi-esp32 firmware project. Officially adopted by D-Robotics (xiaozhi-in-rdk) as an upstream dependency.

Zero to Xiaozhi Client (Video Tutorial)
py-xiaozhi/
├── main.py # Application entry point
├── src/
│ ├── activation/ # Device activation
│ ├── audio_codecs/ # Audio codecs
│ ├── audio_processing/ # Wake word detection
│ ├── bootstrap/ # Application bootstrap & dependency injection
│ ├── constants/ # Constants
│ ├── core/ # Core infrastructure (event bus, state management, task management, etc.)
│ ├── logging/ # Logging subsystem
│ ├── mcp/ # MCP tool system
│ │ ├── mcp_server.py # MCP server
│ │ └── tools/ # Tool modules (music/camera/screenshot/app/weather/volume)
│ ├── plugins/ # Plugin system (audio, UI, MCP, wake word, shortcuts)
│ ├── protocols/ # Communication protocols (WebSocket/MQTT)
│ ├── ui/ # User interface
│ │ ├── gui/ # PySide6 + QML graphical interface
│ │ ├── cli/ # Command line interface
│ │ └── gpio/ # GPIO embedded interface
│ └── utils/ # Utility functions
├── libs/ # Third-party native libraries
│ ├── libopus/ # Opus audio codec library
│ └── webrtc_apm/ # WebRTC audio processing module
├── models/ # Wake word models
├── assets/ # Static resources
├── scripts/ # Auxiliary scripts
├── documents/ # VitePress documentation site
├── pyproject.toml # Project configuration
└── build.json # Build configuration
# Clone project
git clone https://github.com/huangjunsen0406/py-xiaozhi.git
cd py-xiaozhi
# Base install (CLI / GPIO mode)
uv sync # Recommended (uv users)
# or: pip install -e . # pip users
# GUI mode (extra: PySide6 + qasync)
uv sync --extra gui # Recommended (uv users)
# or: pip install -e '.[gui]' # pip users
# Full development environment (GUI + test / packaging tools)
uv sync --extra gui --group dev
# Code formatting
./format_code.sh
# Run program - GUI mode (default; requires gui extra)
python main.py
# Run program - CLI mode (base install is enough)
python main.py --mode cli
# Specify communication protocol
python main.py --protocol websocket # WebSocket (default)
python main.py --protocol mqtt # MQTT protocol
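As a rough illustration of how the `--mode` and `--protocol` flags above could map onto a protocol abstraction, here is a minimal sketch. It is not the project's actual `main.py`: the class names, the `connect` method, and the helper functions are hypothetical, and only the flag names, the two mode values shown above, and the stated defaults (GUI mode, WebSocket) come from this README.

```python
import argparse
from abc import ABC, abstractmethod


class Protocol(ABC):
    """Hypothetical transport interface; the real base class may look different."""

    @abstractmethod
    async def connect(self) -> bool:
        """Open the connection and report whether it succeeded."""


class WebsocketProtocol(Protocol):
    async def connect(self) -> bool:
        return True  # placeholder: a real client would open a WebSocket here


class MqttProtocol(Protocol):
    async def connect(self) -> bool:
        return True  # placeholder: a real client would connect to a broker here


# Maps the value of --protocol to an implementation class.
PROTOCOLS = {"websocket": WebsocketProtocol, "mqtt": MqttProtocol}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the two flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="py-xiaozhi launcher (sketch)")
    parser.add_argument("--mode", choices=["gui", "cli"], default="gui")
    parser.add_argument("--protocol", choices=sorted(PROTOCOLS), default="websocket")
    return parser


def create_protocol(name: str) -> Protocol:
    """Instantiate the protocol class registered under the given name."""
    return PROTOCOLS[name]()
```

With this layout, supporting another transport would mean adding one subclass and one dictionary entry, while the argument parser picks up the new choice automatically.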
async/await syntax, avoid blocking operations
ConfigManager for unified configuration access
src/mcp/tools/ directory
Protocol abstract base class
src/plugins/

                        +----------------+
                        |                |
                        v                |
+------+  Wake/Button   +------------+   |   +------------+
| IDLE |  ----------->  | CONNECTING | --+-> | LISTENING  |
+------+                +------------+       +------------+
   ^                                               |
   |                                               | Voice Recognition Complete
   |          +------------+                       v
   +--------- |  SPEAKING  | <---------------------+
     Playback +------------+
     Complete
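The labeled transitions in the diagram can be read as a small lookup table. The following is a minimal sketch under stated assumptions, not the project's state management code: the event names (`wake`, `connected`, `recognition_complete`, `playback_complete`) and the `StateMachine` class are hypothetical, the CONNECTING to LISTENING edge is unlabeled in the diagram so its event name is a guess, and the unlabeled loop at the top of the diagram is left out.

```python
from enum import Enum


class DeviceState(Enum):
    IDLE = "idle"
    CONNECTING = "connecting"
    LISTENING = "listening"
    SPEAKING = "speaking"


# (current state, event) -> next state, read off the diagram.
# Event names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    (DeviceState.IDLE, "wake"): DeviceState.CONNECTING,
    (DeviceState.CONNECTING, "connected"): DeviceState.LISTENING,
    (DeviceState.LISTENING, "recognition_complete"): DeviceState.SPEAKING,
    (DeviceState.SPEAKING, "playback_complete"): DeviceState.IDLE,
}


class StateMachine:
    """Tracks the current state and rejects events the table does not allow."""

    def __init__(self) -> None:
        self.state = DeviceState.IDLE

    def fire(self, event: str) -> DeviceState:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table makes invalid sequences, such as a playback event arriving while the device is idle, fail loudly instead of silently corrupting the state.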
bug, feature, docs, refactor, or maintenance

In no particular order
Xiaoxia zhh827 SmartArduino-Li Honggang HonestQiao vonweller Sun Weigong isamu2025 Rain120 kejily Radio bilibili Jun Cyber Intelligence
Whether it's API resources, device compatibility testing, or financial support, every contribution makes the project more complete.