ICML 2026 · Plug-and-play long-term memory for LLM agents
PlugMem is a plug-and-play long-term memory system for LLM agents. Instead of storing and retrieving raw interaction histories, PlugMem organizes experience into compact, reusable knowledge units, so agents can recall what matters for decision-making with minimal context overhead.
The module is task-agnostic by design and can be integrated into existing agent pipelines with minimal effort, serving as a general memory backbone for diverse environments such as dialogue agents, knowledge-intensive QA, and web automation.
For more details, please see the full paper: https://arxiv.org/abs/2603.03296
[2026-05] 🚀 Plugin release — PlugMem now ships as installable plugins for AI coding agents.
Integrations available for OpenClaw and Claude Code (see plugin branch).
Highlights: inspect your memory graph, test retrieval interactively, and replay past agent sessions.
[2026-05] 🏆 New SOTA on LongMemEval & HotpotQA — With light task adaptation, PlugMem reaches 90.2 Acc on LongMemEval and 79.1 F1 / 91.1% LLM-Judge Acc on HotpotQA (multi-hop), both state-of-the-art results. Because the framework is task-agnostic, it can serve as a drop-in backbone for other work on these benchmarks. → Step-by-step reproduction guide
[2026-04] 🎉 PlugMem accepted to ICML 2026!
# init PlugMem memory graph
mg = MemoryGraph()
# init memory sequence
mem = Memory(...)
mem.append(...)
mem.close()
# insert memory sequence into memory graph
mg.insert(mem)
# retrieve memory and perform reasoning on retrieved nodes
mg.retrieve_and_reason(...)
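To make the call order above concrete, here is a toy, self-contained sketch of the same lifecycle: append to a sequence, close it, insert it into a graph, then retrieve. `ToyMemory`, `ToyMemoryGraph`, their method signatures, and the word-overlap ranking are all hypothetical stand-ins invented for illustration; they are not PlugMem's API. Unlike PlugMem, the toy stores steps verbatim rather than structuring them into knowledge units.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical stand-in for a memory sequence: an append-only list of steps."""

    steps: list[str] = field(default_factory=list)
    closed: bool = False

    def append(self, step: str) -> None:
        if self.closed:
            raise RuntimeError("cannot append to a closed memory sequence")
        self.steps.append(step)

    def close(self) -> None:
        # Mark the sequence as complete so it can be inserted into a graph.
        self.closed = True


class ToyMemoryGraph:
    """Hypothetical stand-in for a memory graph: keeps inserted steps as flat entries."""

    def __init__(self) -> None:
        self.entries: list[str] = []

    def insert(self, mem: ToyMemory) -> None:
        if not mem.closed:
            raise ValueError("close the memory sequence before inserting it")
        self.entries.extend(mem.steps)

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        # Rank entries by how many lowercase words they share with the query
        # (an assumption for this toy; not PlugMem's retrieval method).
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(e.lower().split())), e) for e in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for score, entry in scored[:top_k] if score > 0]


mem = ToyMemory()
mem.append("clicked the orders tab to find the refund form")
mem.append("search box accepts product names only")
mem.close()

graph = ToyMemoryGraph()
graph.insert(mem)
print(graph.retrieve("where is the refund form"))
```

The toy enforces one ordering rule visible in the snippet above: a sequence is closed before it is inserted. Whether PlugMem itself enforces this is not stated here.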
PlugMem ships as installable plugins for OpenClaw and Claude Code (see plugin branch), with a built-in Memory Inspector UI for visualizing the memory graph, browsing individual memories, testing retrieval, and replaying agent trajectories.
Graph view: explore the full memory graph across semantic, procedural, and episodic nodes
Browse view: inspect, filter, and manage individual memory entries
Clone AgentOccam and WebArena under src/ and follow their installation docs to set up the environment.
Use openai==2.6.1.
# under src/
cd src
# clone modified AgentOccam
git clone https://github.com/jizej/AgentOccam
# clone WebArena
git clone https://github.com/web-arena-x/webarena
# Enable ScriptBrowserEnv to run under an async loop (if needed)
cp src/webarena_patch/envs.py src/webarena/browser_env/envs.py
# Enable OPENAI_API_KEY + AZURE_ENDPOINT for trajectory evaluation (if needed)
cp src/webarena_patch/openai_utils.py src/webarena/llms/providers/openai_utils.py
export OPENAI_API_KEY=<your_openai_api_key>
export AZURE_ENDPOINT=<your_azure_endpoint>
export DIR_PATH="/<your_path_to_PlugMem>/data"
export QWEN_BASE_URL="http://<your_qwen_host>:8000/v1"
export EMBEDDING_BASE_URL="http://<your_embedding_host>:8001/v1/embeddings"
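A quick way to catch a missing export before starting a long run is to check the variables from Python. The sketch below is only a convenience check, not part of PlugMem: the variable names come from the export lines above, while the helper `missing_env_vars` is hypothetical, and treating all five variables as required is an assumption (the API key and Azure endpoint are marked "if needed").

```python
import os

# Names taken from the export lines above. Treating every one of them as
# required is an assumption made for this sketch.
EXPECTED_VARS = [
    "OPENAI_API_KEY",
    "AZURE_ENDPOINT",
    "DIR_PATH",
    "QWEN_BASE_URL",
    "EMBEDDING_BASE_URL",
]


def missing_env_vars(env=None):
    """Return the expected variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All expected environment variables are set.")
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to test in isolation.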
cd host_local_inference
# Qwen (vLLM) server
bash vllm_deploy.sh
# NV-Embed-v2 server
bash nv_embed_v2_deploy.sh
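The mkdir command that follows creates the memory directories under DIR_PATH from the shell. If the setup is being scripted in Python instead, the same layout could be created as sketched below. The subdirectory names are taken from that command; the helper name `create_memory_dirs` is hypothetical, and the example runs against a temporary directory so it does not touch real data.

```python
import tempfile
from pathlib import Path

# Subdirectory names copied from the mkdir command in this guide.
SUBDIRS = ["episodic_memory", "semantic_memory", "procedural_memory", "tag", "subgoal"]


def create_memory_dirs(base_dir):
    """Create each memory subdirectory under base_dir and return their paths."""
    base = Path(base_dir)
    for name in SUBDIRS:
        # parents/exist_ok mirror the behavior of `mkdir -p`.
        (base / name).mkdir(parents=True, exist_ok=True)
    return [base / name for name in SUBDIRS]


# Demonstrate against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_memory_dirs(tmp):
        print(path.name, path.is_dir())
```

In a real setup the call would presumably be `create_memory_dirs(os.environ["DIR_PATH"])`, after importing `os` and exporting DIR_PATH as shown earlier.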
mkdir -p "$DIR_PATH/episodic_memory" \
"$DIR_PATH/semantic_memory" \
"$DIR_PATH/procedural_memory" \
"$DIR_PATH/tag" \
"$DIR_PATH/subgoal"
cd src/eval/webarena
python eval_agentoccam.py
eval_agentoccam.py:
--config: Path to the YAML config file (required).
--replay-trajectory/--no-replay-trajectory: Replay a saved trajectory before evaluation.
--trajectory-dir: Directory containing trajectory JSON files for replay.
--load_memory_graph/--no-load_memory_graph: Load a persisted memory graph from disk.
--refresh-embeddings/--no-refresh-embeddings: Refresh embeddings when loading the memory graph.
--read-only-memory/--no-read-only-memory: Use the memory graph without inserting new memories.
--disable-memory-graph/--no-disable-memory-graph: Turn off all memory-graph operations.
cd src/eval/longmemeval
python eval_longmemeval_all.py
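The paired `--flag/--no-flag` options listed for eval_agentoccam.py match the pattern that Python's `argparse.BooleanOptionalAction` (Python 3.9+) produces. The sketch below shows how such a command line could be declared and parsed. It is an illustration under assumptions, not the script's actual source: the flag names come from the option list, but the defaults (all `False` or `None`) and the parser structure are guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="eval_agentoccam.py")
    parser.add_argument("--config", required=True, help="Path to the YAML config file.")
    parser.add_argument("--trajectory-dir", default=None,
                        help="Directory containing trajectory JSON files for replay.")
    # BooleanOptionalAction registers both --name and --no-name for each flag.
    for flag in (
        "--replay-trajectory",
        "--load_memory_graph",
        "--refresh-embeddings",
        "--read-only-memory",
        "--disable-memory-graph",
    ):
        parser.add_argument(flag, action=argparse.BooleanOptionalAction, default=False)
    return parser


args = build_parser().parse_args(
    ["--config", "config.yaml", "--load_memory_graph", "--no-refresh-embeddings", "--read-only-memory"]
)
print(args.config, args.load_memory_graph, args.refresh_embeddings, args.read_only_memory)
```

With this pattern, a combination such as `--load_memory_graph --read-only-memory` would correspond to evaluating against a persisted graph without writing new memories to it, as the option descriptions suggest.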
cd src/eval/hotpotqa
# It may take several hours to structure memory for hotpotqa_corpus.json.
python build.py
# Rebuild the memory graph from the structuring results and run the test
python eval_hotpotqa_all.py
If you use our code or data, or otherwise find our work helpful, please cite our paper:
@misc{yang2026plugmemtaskagnosticpluginmemory,
title={PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents},
author={Ke Yang and Zixi Chen and Xuan He and Jize Jiang and Michel Galley and Chenglong Wang and Jianfeng Gao and Jiawei Han and ChengXiang Zhai},
year={2026},
eprint={2603.03296},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2603.03296},
}