Agentic AI memory with Ebbinghaus forgetting curve decay. +16pp better recall than Mem0 on LoCoMo.
Every session, your AI assistant starts from zero. It asks the same questions, forgets your preferences, re-learns your stack. There is no memory between conversations.
YourMemory fixes that with a one-command install that plugs into Claude, Cursor, Cline, Windsurf, or any MCP client. It gives your AI a persistent memory layer modelled on human cognition:
Zero infrastructure required. DuckDB by default, Postgres for teams.
Three external datasets, all scripts open source and reproducible. Full methodology in BENCHMARKS.md.
The hardest standard benchmark for long-term memory systems. Each question is backed by ~53 conversation sessions; the model must retrieve the right one(s) from the haystack.
| Metric | Score |
|---|---|
| Recall@5 (any gold session in top-5) | 89.4% |
| Recall-all@5 (all gold sessions in top-5) | 84.8% |
| nDCG@5 (ranking quality) | 87.4% |
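To make the difference between the three metrics concrete, here is a minimal sketch that computes them for a single question using the standard binary-relevance definitions. The function names and the sample session IDs are hypothetical, and the nDCG variant actually used for the reported numbers is whatever BENCHMARKS.md specifies, which may differ from this one.

```python
import math

def recall_any_at_k(ranked, gold, k=5):
    """1.0 if at least one gold session appears in the top-k, else 0.0."""
    return 1.0 if any(s in gold for s in ranked[:k]) else 0.0

def recall_all_at_k(ranked, gold, k=5):
    """1.0 only if every gold session appears in the top-k."""
    return 1.0 if set(gold) <= set(ranked[:k]) else 0.0

def ndcg_at_k(ranked, gold, k=5):
    """Binary-relevance nDCG: each hit is discounted by log2(rank + 1)."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, s in enumerate(ranked[:k]) if s in gold)
    ideal_hits = min(len(gold), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# Hypothetical retrieval result: gold sessions land at ranks 2 and 4.
ranked = ["s3", "s1", "s9", "s2", "s7"]
gold = {"s1", "s2"}
scores = (
    recall_any_at_k(ranked, gold),
    recall_all_at_k(ranked, gold),
    ndcg_at_k(ranked, gold),
)
```

Both recall variants are satisfied in this example, while nDCG is penalised because the gold sessions are not at the very top of the ranking.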
By question type (Recall@5):
| Question Type | Recall@5 | n |
|---|---|---|
| single-session-assistant | 98.2% | 56 |
| knowledge-update | 96.2% | 78 |
| multi-session | 95.5% | 133 |
| single-session-preference | 90.0% | 30 |
| temporal-reasoning | 84.2% | 133 |
| single-session-user | 72.9% | 70 |
Conversations spanning weeks to months. Every system ingests the same session summaries in the same order.
| System | Recall@5 | 95% CI |
|---|---|---|
| YourMemory (BM25 + vector + graph + decay) | 59% | 56–61% |
| Zep Cloud | 28% | 26–30% |
| Supermemory | 31%* | 28–33% |
| Mem0 | 18%* | 16–20% |
2× better recall than Zep Cloud across all 10 samples.

\* Supermemory and Mem0 exhausted free-tier quotas mid-benchmark; their scores are computed over the full 1,534 pairs, counting unfinished samples as 0.
| System | BOTH_FOUND@5 |
|---|---|
| YourMemory (vector + BM25 + entity graph) | 71.5% |
| YourMemory (no entity edges) | 59.5% |
Entity graph edges add +12 pp — they traverse from Fact 1 to Fact 2 even when Fact 2 has low embedding similarity to the query.
Writeup: I built memory decay for AI agents using the Ebbinghaus forgetting curve
Supports Python 3.11–3.14. No Docker, no database setup. All memory stored locally in ~/.yourmemory/.
| Behavior | Detail |
|---|---|
| Activation | Requires a one-time token. Visit yourmemoryai.xyz, enter your email, verify with a 6-digit code, and copy your token. |
| Global rule injection | yourmemory-setup writes memory instructions into ~/.cursor/rules/memory.mdc and other detected AI client config files (Claude, VS Code, etc.) so the assistant can call memory tools automatically. You can remove these files at any time. |
| MCP tool behavior | The recall_memory tool can be called by your AI assistant when persistent context would help. The assistant decides when to call it based on the request. |
| Telemetry | A UUID (no personal data) is sent on first setup only. Opt out: YOURMEMORY_TELEMETRY=off |
Activation steps:
```bash
pip install yourmemory
yourmemory-register <your-token>
yourmemory-setup
```
Requirement (local model): YourMemory extracts memories with a local model via Ollama. Install Ollama and start it; yourmemory-setup then pulls the default model (qwen2.5:7b, ~4.7 GB) automatically. To use a lighter model you already have, set YOURMEMORY_OLLAMA_MODEL (e.g. llama3.2:3b) before setup.

Backend: yourmemory-setup asks whether to use DuckDB (zero setup, default) or Postgres (shared/production; you provide a DATABASE_URL, and the pgvector extension is required).
Two built-in browser UIs — no extra setup, start automatically with the MCP server.
http://localhost:3033/ui

A full read/write view of everything stored in memory.
| What you see | Details |
|---|---|
| Stats bar | Total · Strong ≥50% · Fading 5–50% · Near prune <10% |
| Agent tabs | All / User / per-agent views |
| Memory cards | Content · strength bar · category · recall count · last accessed |
| Filters | Category (fact / strategy / assumption / failure) · Sort by strength, recency, recall |
Pass ?user=<id> to pre-load a specific user: http://localhost:3033/ui?user=sachit
http://localhost:3033/graph

An interactive force-directed map of how memories connect.
http://localhost:3033/graph?memoryId=42&userId=sachit&depth=2
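If you want to generate these links from a script, a small sketch follows. It only assembles the query strings shown above; the helper names `ui_url` and `graph_url` are hypothetical and not part of YourMemory.

```python
from urllib.parse import urlencode

BASE = "http://localhost:3033"  # default port from the examples above

def ui_url(user=None):
    """Dashboard link, optionally pre-loading one user."""
    if user is None:
        return f"{BASE}/ui"
    query = urlencode({"user": user})
    return f"{BASE}/ui?{query}"

def graph_url(memory_id, user_id, depth=2):
    """Graph view link centred on one memory."""
    query = urlencode({"memoryId": memory_id, "userId": user_id, "depth": depth})
    return f"{BASE}/graph?{query}"

link = graph_url(42, "sachit")
```

`urlencode` also escapes user IDs that contain spaces or other reserved characters.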
The only memory system that can answer questions without making any LLM API call.
```bash
yourmemory ask "what database does this project use"
# → YourMemory uses DuckDB locally and Postgres in production.
yourmemory ask "what port does the dashboard run on"
# → 3033
yourmemory ask "how do I fix a kubernetes deployment"
# → Not enough memory context to answer without Claude.
```
When memory is strong enough, it answers instantly — zero tokens, zero cloud cost, zero latency. When it isn't, it declines cleanly rather than hallucinating.
| Query | Mem0 / Zep / LangMem | YourMemory |
|---|---|---|
| "What port does the server run on?" | Full LLM API call | Instant, $0 |
| "What database does this project use?" | Full LLM API call | Instant, $0 |
| "How do I fix a k8s deployment?" | Full LLM API call | Declines → Claude |
| Privacy | Query sent to cloud | Never leaves your machine |
MCP tools are called at the AI's discretion. The API proxy removes that uncertainty — it intercepts every LLM call, injects relevant memories automatically, and handles store_memory / update_memory without any model configuration.
Start the YourMemory server (yourmemory), then point your LLM client at localhost:3033:
```python
# OpenAI SDK: route requests through the local proxy
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:3033/proxy/openai",
)

# Memory is injected automatically; no other changes needed
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What database do I use?"}],
)
```

```python
# Anthropic SDK: same idea, different proxy path
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-ant-...",
    base_url="http://localhost:3033/proxy/anthropic",
)

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What database do I use?"}],
)
```
Pass X-YourMemory-User to isolate memory per person:
```python
client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:3033/proxy/openai",
    default_headers={"X-YourMemory-User": "sachit"},
)
```
On every request the proxy:
- Exposes store_memory and update_memory as tools; the model calls them when it learns something new.

Streaming note: recall injection works for all requests. Tool call interception (store/update) works for non-streaming requests only; streaming requests pass through, and the tools execute on the next turn.
Three tools, called by your AI automatically.
| Tool | When your AI calls it | What it does |
|---|---|---|
| recall_memory(query, current_path?) | Start of every task | Surfaces memories ranked by similarity × decay strength; spatial boost for path-matched memories |
| store_memory(content, importance, category?, context_paths?) | After learning something new | Embeds, deduplicates, stores with decay; tags optional file/dir paths |
| update_memory(id, new_content, importance) | When a stored fact is outdated | Re-embeds and replaces; logs old content to audit trail |
```python
# Store with spatial context
store_memory(
    "Sachit prefers tabs over spaces in Python",
    importance=0.9,
    category="fact",
    context_paths=["/projects/backend"],
)

# Next session: spatial boost fires when working in that directory
recall_memory("Python formatting", current_path="/projects/backend")
# → {"content": "Sachit prefers tabs over spaces in Python", "strength": 0.87}
```
| Category | Half-life | Best for |
|---|---|---|
| strategy | ~38 days | Patterns that worked, architectural decisions |
| fact | ~24 days | Preferences, identity, stable knowledge |
| assumption | ~19 days | Inferred context, uncertain beliefs |
| failure | ~11 days | Errors, wrong approaches, environment-specific issues |
Memory strength decays exponentially. Importance and recall frequency slow that decay:
```text
effective_λ  = base_λ × (1 − importance × 0.8)
strength     = clamp(importance × e^(−effective_λ × active_days) × (1 + recall_count × 0.2), 0, 1)
hybrid_score = 0.4 × bm25_norm + 0.6 × cosine_similarity
```
active_days counts only days the user was active — vacations don't cause memory loss. Memories below strength 0.05 are pruned automatically every 24 hours.
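The formulas above translate almost directly into code. The sketch below is a plain transcription, not YourMemory's actual implementation; the function names are hypothetical, and the `base_lambda` value in the example is a made-up illustration, since the per-category base rates are not listed here.

```python
import math

PRUNE_THRESHOLD = 0.05  # from the text: weaker memories are pruned

def effective_lambda(base_lambda, importance):
    """Higher importance shrinks the decay rate (by up to 80%)."""
    return base_lambda * (1 - importance * 0.8)

def strength(importance, base_lambda, active_days, recall_count=0):
    """Exponential decay over active days, boosted by recalls, clamped to [0, 1]."""
    decayed = importance * math.exp(
        -effective_lambda(base_lambda, importance) * active_days
    )
    boosted = decayed * (1 + recall_count * 0.2)
    return max(0.0, min(1.0, boosted))

def hybrid_score(bm25_norm, cosine_similarity):
    """Round 1 relevance: 40% keyword match, 60% embedding similarity."""
    return 0.4 * bm25_norm + 0.6 * cosine_similarity

def should_prune(current_strength):
    return current_strength < PRUNE_THRESHOLD

# Hypothetical base rate of 0.1 per active day, for illustration only.
fresh = strength(importance=0.9, base_lambda=0.1, active_days=0)
stale = strength(importance=0.2, base_lambda=0.1, active_days=60)
```

Importance acts twice: it sets the starting strength and it slows the decay, so a low-importance memory both starts weaker and fades faster.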
Session wrap-up: recalled memory IDs are tracked per session. When a session goes idle (30 min default), those memories get a recall_count boost. Set YOURMEMORY_SESSION_IDLE to change the window.
Recall throttling: identical (user, query) pairs are cached within a configurable window. Set YOURMEMORY_RECALL_COOLDOWN (seconds, default 0 = off).
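One way to picture recall throttling is a small time-bounded cache keyed on (user, query). This is a sketch under stated assumptions: the `RecallCache` class and its method names are hypothetical, and the real cache may key, store, or expire entries differently.

```python
import time

class RecallCache:
    """Caches recall results per (user, query) for `cooldown_seconds`."""

    def __init__(self, cooldown_seconds=0.0, clock=time.monotonic):
        self.cooldown = cooldown_seconds  # 0 means throttling is off
        self.clock = clock                # injectable for testing
        self._entries = {}                # (user, query) -> (timestamp, results)

    def get_or_compute(self, user, query, compute):
        if self.cooldown <= 0:
            return compute()
        key = (user, query)
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.cooldown:
            return hit[1]  # identical query inside the window: reuse
        results = compute()
        self._entries[key] = (now, results)
        return results
```

With the default of 0 every call runs the full search; a non-zero window trades a little freshness for fewer repeated lookups.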
Retrieval runs in two rounds:
Round 1 — Hybrid search: cosine similarity + BM25 keyword scoring, returns top-k candidates above threshold.
Round 2 — Graph expansion: BFS traversal from Round 1 seeds surfaces memories that share context but not vocabulary — connected via semantic or entity edges.
```text
recall("Python backend")
  Round 1 → [1] Python/MongoDB    (sim=0.61)
            [2] DuckDB/spaCy      (sim=0.19)
  Round 2 → [5] Docker/Kubernetes (sim=0.29, below cut-off; surfaced via shared entity "backend")
```
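The two rounds can be sketched with plain dictionaries. This is an illustration of the idea rather than the project's code: the `recall` function, the threshold, and the sample scores are all hypothetical, and the graph is modelled as a simple adjacency dict instead of NetworkX.

```python
from collections import deque

def recall(similarities, graph, threshold=0.4, top_k=5, max_depth=1):
    """similarities: {memory_id: score}; graph: {memory_id: [neighbour ids]}.

    Returns (round1_seeds, round2_expansions).
    """
    # Round 1: top-k candidates whose score clears the threshold.
    above = [m for m, s in similarities.items() if s >= threshold]
    seeds = sorted(above, key=lambda m: similarities[m], reverse=True)[:top_k]

    # Round 2: breadth-first expansion from the seeds along graph edges.
    seen = set(seeds)
    expanded = []
    queue = deque((m, 0) for m in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                expanded.append(neighbour)
                queue.append((neighbour, depth + 1))
    return seeds, expanded

# Memory 5 scores below the threshold but is linked to memory 1.
sims = {1: 0.61, 2: 0.45, 5: 0.29}
edges = {1: [5], 5: [1]}
seeds, expanded = recall(sims, edges)
```

Memory 5 never passes Round 1 on its own score; it is returned only because an edge connects it to a seed.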
Chain-aware pruning: a decayed memory is kept alive if any graph neighbour is above the prune threshold. Related memories age together.
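A minimal sketch of the chain-aware rule, assuming an undirected adjacency dict and the 0.05 prune threshold. The function name `prunable` is hypothetical.

```python
def prunable(strengths, graph, threshold=0.05):
    """Return ids that are below the threshold and have no stronger neighbour.

    strengths: {memory_id: strength}; graph: {memory_id: [neighbour ids]}.
    """
    result = []
    for memory_id, s in strengths.items():
        if s >= threshold:
            continue  # still strong enough on its own
        neighbours = graph.get(memory_id, ())
        if any(strengths.get(n, 0.0) > threshold for n in neighbours):
            continue  # kept alive by a neighbour above the threshold
        result.append(memory_id)
    return result

strengths = {1: 0.60, 2: 0.03, 3: 0.02}
graph = {1: [2], 2: [1], 3: []}
to_prune = prunable(strengths, graph)
```

Memory 2 has decayed as far as memory 3, but only memory 3 is removed because it has no strong neighbour.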
Before storing, YourMemory checks whether the new memory is about the same entity as the nearest existing one:

```text
"Sachit uses DuckDB"       vs   "YourMemory uses DuckDB"
 subject: Sachit                 subject: YourMemory
 → different entities → stored separately ✓

"YourMemory uses DuckDB"   vs   "YourMemory stores data in DuckDB"
 subject: YourMemory             subject: YourMemory
 → same entity → merged ✓
```
Subject comparison embeds the first two tokens of each sentence — no hardcoded word lists, generalises to any language.
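The shape of that check can be sketched as follows. Important assumptions: `toy_embed` is a character-trigram stand-in used only so the example runs without a model (YourMemory uses real sentence embeddings), the 0.6 threshold is tuned to that toy embedding, and all function names are hypothetical.

```python
import math
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def subject(sentence, n_tokens=2):
    """The first two whitespace-separated tokens, as described above."""
    return " ".join(sentence.split()[:n_tokens])

def same_entity(new, existing, threshold=0.6, embed=toy_embed):
    """True when the two subjects embed close enough to be merged."""
    return cosine(embed(subject(new)), embed(subject(existing))) >= threshold

merge = same_entity("YourMemory uses DuckDB", "YourMemory stores data in DuckDB")
keep_separate = not same_entity("Sachit uses DuckDB", "YourMemory uses DuckDB")
```

Passing a different `embed` callable swaps the toy vectors for real ones without changing the comparison logic.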
Multiple agents can share one YourMemory instance — each with isolated private memories and controlled access to shared context.
```python
from src.services.api_keys import register_agent

result = register_agent(
    agent_id="coding-agent",
    user_id="sachit",
    can_read=["shared", "private"],
    can_write=["shared", "private"],
)
# → result["api_key"] is ym_xxxx (shown once only)

# Agent stores a private failure memory
store_memory(
    "Staging uses self-signed cert — skip SSL verify",
    importance=0.7, category="failure",
    api_key="ym_xxxx", visibility="private",
)

# Recalls shared + its own private memories; other agents see shared only
recall_memory("staging SSL", api_key="ym_xxxx")
```
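The access rule described in the last comment can be sketched as a small filter. The `Memory` dataclass and the `visible_to` function are hypothetical names for illustration; only the `visibility`, `agent_id`, and `can_read` concepts come from the example above.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    content: str
    agent_id: str
    visibility: str  # "shared" or "private"

def visible_to(memory, agent_id, can_read):
    """Shared memories are visible to any agent allowed to read them;
    private memories are visible only to the agent that stored them."""
    if memory.visibility not in can_read:
        return False
    if memory.visibility == "private":
        return memory.agent_id == agent_id
    return True

memories = [
    Memory("Staging uses self-signed cert", "coding-agent", "private"),
    Memory("Project deploys on Fridays", "ops-agent", "shared"),
]
readable = [m.content for m in memories
            if visible_to(m, "ops-agent", ["shared", "private"])]
```

Here the ops agent sees the shared memory but not the coding agent's private one.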
| Component | Role |
|---|---|
| DuckDB | Default vector DB — zero setup, native cosine similarity |
| NetworkX | Default graph backend — persists at ~/.yourmemory/graph.pkl |
| sentence-transformers | Local embeddings (multi-qa-mpnet-base-dot-v1, 768 dims) |
| spaCy | Local NLP for deduplication and entity extraction |
| APScheduler | Automatic 24h decay and pruning job |
| PostgreSQL + pgvector | Optional — for teams or large datasets |
| Neo4j | Optional graph backend |
```text
Claude / Cline / Cursor / Any MCP client
    │
    ├── recall_memory(query, current_path?, api_key?)
    │     └── throttle check → embed → hybrid search (Round 1)
    │           → graph BFS expansion (Round 2)
    │           → score = sim × strength
    │           → spatial boost (+0.08) if current_path matches context_paths
    │           → temporal boost (+0.25) if query has time window expression
    │           → session tracking → recall_count bump on session end
    │
    ├── store_memory(content, importance, category?, context_paths?, api_key?)
    │     └── question? → reject
    │         subject-aware dedup → same entity? merge/reinforce : new
    │         embed() → INSERT → index_memory() → graph node + edges
    │         record_activity(user_id) → active days log
    │
    └── update_memory(id, new_content, importance)
          └── log old content → memory_history (audit trail)
              embed(new_content) → UPDATE → refresh graph node
```

```text
Vector DB (Round 1)              Graph DB (Round 2)
DuckDB (default)                 NetworkX (default)
memories.duckdb                  graph.pkl
├── embedding FLOAT[768]         ├── nodes: memory_id, strength
├── importance FLOAT             └── edges: sim × verb_weight ≥ 0.4
├── recall_count INTEGER
├── context_paths JSON           Neo4j (opt-in)
├── created_at TIMESTAMP         └── bolt://localhost:7687
├── visibility VARCHAR
├── agent_id VARCHAR
user_activity (active days log)
memory_history (supersession audit)
```
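The scoring steps in the recall path (sim × strength, plus the two boosts) can be sketched as below. Two things are assumptions: that a path "matches" when the current path equals a tagged path or lies beneath it, and that the time-window detection has already been reduced to a boolean. The function names are hypothetical.

```python
from pathlib import PurePosixPath

SPATIAL_BOOST = 0.08   # from the diagram
TEMPORAL_BOOST = 0.25  # from the diagram

def path_matches(current_path, context_paths):
    """Assumption: match if current_path is a tagged path or is inside one."""
    if not current_path:
        return False
    current = PurePosixPath(current_path)
    for tagged in context_paths:
        tagged_path = PurePosixPath(tagged)
        if current == tagged_path or tagged_path in current.parents:
            return True
    return False

def recall_score(sim, strength, current_path=None, context_paths=(),
                 has_time_window=False):
    """Base relevance times decay strength, plus additive boosts."""
    score = sim * strength
    if path_matches(current_path, context_paths):
        score += SPATIAL_BOOST
    if has_time_window:
        score += TEMPORAL_BOOST
    return score

base = recall_score(0.5, 0.8)
boosted = recall_score(0.5, 0.8, "/projects/backend/api", ["/projects/backend"])
```

Because the boosts are additive, they matter most for memories whose sim × strength product is small.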
PRs are welcome. See CONTRIBUTORS.md for contributors who have already improved YourMemory.
Copyright 2026 Sachit Misra — Licensed under CC-BY-NC-4.0.
Free for: personal use, education, academic research, open-source projects. Not permitted: commercial use without a separate written agreement.
Commercial licensing: mishrasachit1@gmail.com