DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
This is the code repository for the paper "DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents".
Paper: https://arxiv.org/pdf/2410.14803
Website and Demo: https://ai-agents-2030.github.io/DistRL/
We will release the Model Weights and Dataset later.
Supported Training Module as detailed in the paper:
Agent Support:
Android-in-the-Wild Task Sets:
DDP Multi-GPU Training Support: Multi-GPU training is provided through accelerate. If you're operating with a single GPU, this feature can be disabled. Running AutoUI with the DistRL algorithm requires only 15GB of GPU memory; multi-GPU support is provided should you wish to experiment with larger setups.
Please check the requirements.txt file for all necessary dependencies.
Create Necessary Directories: Set up the required directories as specified in the configuration .yaml files (e.g., Tmp path, agg_traj path, save_path, etc.).
Update Tokens: Replace placeholders with your actual tokens in the configuration files in scripts/config:
huggingface_token
wandb_token
gemini_token
Review and Enhance Prompts: Clear, well-structured prompts are essential for the evaluator to assess task completion accurately. Precise and detailed prompts guide the model toward more reliable evaluations. Demonstration examples are provided in data/environment/android/prompts.txt for your reference. Please adjust data/environment/android/evaluate.py based on our hints and comments.
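The directory and token steps above can be checked before a first run. The following is a minimal sketch, not part of the repository: it assumes the configuration has already been loaded into a plain dictionary, and the helper names `create_dirs` and `unset_tokens` are hypothetical. Only `save_path` and the three token names come from this README; any other path keys, and the placeholder strings used in your config files, must be read from the .yaml files themselves.

```python
import os
import tempfile
from pathlib import Path

# Token keys named in this README; the placeholder values used in the
# actual config files are not known here, so the caller supplies them.
TOKEN_KEYS = ("huggingface_token", "wandb_token", "gemini_token")


def create_dirs(config, path_keys):
    """Create the directory for every listed key that has a value in config."""
    created = []
    for key in path_keys:
        value = config.get(key)
        if not value:
            continue  # key missing or empty: nothing to create
        path = Path(value).expanduser()
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def unset_tokens(config, placeholders=()):
    """Return the token keys that are empty or still hold a placeholder."""
    missing = []
    for key in TOKEN_KEYS:
        value = str(config.get(key) or "").strip()
        if not value or value in placeholders:
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Hypothetical config values, for illustration only.
    demo_root = tempfile.mkdtemp()
    demo_config = {
        "save_path": os.path.join(demo_root, "logs", "worker"),
        "huggingface_token": "",
        "wandb_token": "",
        "gemini_token": "",
    }
    print(create_dirs(demo_config, ["save_path"]))
    print(unset_tokens(demo_config))
```

Running a check like this first surfaces a missing directory or a forgotten token immediately, rather than partway through a training run.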
To set up the Android environment that DistRL interacts with, refer to the environment setup instructions. Before moving on, you should be able to view this script.
Weights & Biases is a tool for tracking machine learning experiments. To integrate Wandb into our framework:
Install the package: pip install wandb
Log in by running wandb login and entering your API key when prompted.
Replace wandb_token in the configuration files with your API key.
For more detailed instructions, refer to the Wandb Quickstart Guide.
The main entry point of the program is the run.py script. You can specify different experiments by passing the configuration name. Configuration files are located in the scripts/config/ directory.
Setup Steps:
Set Up Conda Environment:
Install Miniconda.
Create a new environment named distrl:
conda create -n distrl python=3.8
conda activate distrl
Clone the Repository:
Clone the repository and check out the master branch:
git clone <repository_url>
cd <repository_directory>
git checkout master
Install the package:
pip install -e .
Set Up the Environment:
Test the Setup:
Review the configuration files multimachine/default.yaml and multimachine/worker.yaml.
Make sure the save_path defined in the config file multimachine/worker.yaml exists; it defaults to /home/<usrname>/logs/worker.
Run the run.py script with the worker configuration to test:
CUDA_VISIBLE_DEVICES=0 python scripts/run.py --config-path config/multimachine --config-name worker +thread_id=0
Run from host machine:
accelerate launch --config_file config/accelerate_config/default_config.yaml scripts/run.py --config-path config/multimachine --config-name host
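The worker and host commands above differ only in a few arguments, so they can be assembled programmatically when scripting a test run. The sketch below is illustrative and not part of the repository: `worker_command` and `host_command` are hypothetical helpers that reproduce the two command lines shown in this README. Giving each worker a distinct `thread_id` is an assumption; the README only shows `+thread_id=0`.

```python
import shlex


def worker_command(thread_id, config_path="config/multimachine", config_name="worker"):
    """Build the argv list for one worker run, mirroring the README command."""
    return [
        "python", "scripts/run.py",
        "--config-path", config_path,
        "--config-name", config_name,
        f"+thread_id={thread_id}",  # override appended as in the README example
    ]


def host_command(
    accelerate_config="config/accelerate_config/default_config.yaml",
    config_path="config/multimachine",
    config_name="host",
):
    """Build the argv list for the host run launched through accelerate."""
    return [
        "accelerate", "launch",
        "--config_file", accelerate_config,
        "scripts/run.py",
        "--config-path", config_path,
        "--config-name", config_name,
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be inspected first.
    for thread_id in range(2):  # assumption: one distinct thread_id per worker
        print(shlex.join(worker_command(thread_id)))
    print(shlex.join(host_command()))
```

The lists can be passed directly to `subprocess.run`, with `CUDA_VISIBLE_DEVICES` set through its `env` argument to pin a worker to a specific GPU.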
All content of this work is under Apache License v2.0, including codebase, data, and model checkpoints.
Consider citing our paper!
@article{wang2024distrl,
title={Distrl: An asynchronous distributed reinforcement learning framework for on-device control agents},
author={Wang, Taiyi and Wu, Zhihao and Liu, Jianheng and Hao, Jianye and Wang, Jun and Shao, Kun},
journal={arXiv preprint arXiv:2410.14803},
year={2024}
}