Cognitive memory for AI agents. Pure Rust, <1ms recall, 2.7MB, zero cloud. Patent Pending.
AuraSDK turns fragile prompt-only agents into auditable, memory-aware, production-ready systems
Deterministic · No fine-tuning · No cloud training · <1ms recall · ~3 MB
Your AI model is smart. But it forgets everything after every conversation.
AuraSDK is a local cognitive runtime that runs alongside any frozen model. It gives agents durable memory, explainability, governed correction, bounded recall reranking, and bounded self-adaptation through experience — all locally, without fine-tuning or cloud training.
pip install aura-memory
from aura import Aura, Level
brain = Aura("./agent_memory")
brain.enable_full_cognitive_stack() # activate all four bounded reranking overlays
# store what happens
brain.store("User always deploys to staging first", level=Level.Domain, tags=["workflow"])
brain.store("Staging deploy prevented 3 production incidents", level=Level.Domain, tags=["workflow"])
# recall — local retrieval with optional bounded cognitive reranking
context = brain.recall("deployment decision") # <1ms, no API call
# inspect advisory hints produced from stored evidence
hints = brain.get_surfaced_policy_hints()
# → [{"action": "Prefer", "domain": "workflow", "description": "deploy to staging first"}]
No API keys. No embeddings required. No cloud. The model stays the same — the cognitive layer becomes more structured, more inspectable, and more useful over time.
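One way to use the recalled context is to prepend it to the prompt that goes to the frozen model. The helper below is a minimal sketch, not part of AuraSDK: `build_prompt` and `fake_recall` are hypothetical names, and it assumes `recall` returns plain text that can be pasted into a prompt.

```python
from typing import Callable


def build_prompt(recall: Callable[[str], str], question: str) -> str:
    """Prepend recalled memory to a question before it is sent to the model."""
    context = recall(question)  # assumption: recall returns prompt-ready text
    if not context:
        return question
    return f"Relevant memory:\n{context}\n\nQuestion: {question}"


# Stand-in for brain.recall so the sketch runs without aura installed.
def fake_recall(query: str) -> str:
    return "User always deploys to staging first"


prompt = build_prompt(fake_recall, "How should we deploy this release?")
print(prompt)
```

In a real agent you would pass `brain.recall` in place of `fake_recall`.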
⭐ If AuraSDK is useful to you, a GitHub star helps us get funding to continue development from Kyiv.
| | Aura | Mem0 | Zep | Cognee | Letta/MemGPT |
|---|---|---|---|---|---|
| Architecture | 5-layer cognitive engine | Vector + LLM | Vector + LLM | Graph + LLM | LLM orchestration |
| Derived cognitive layers without LLM | Yes — Belief→Concept→Causal→Policy | No | No | No | No |
| Advisory policy hints from experience | Yes — bounded and non-executing | No | No | No | No |
| Learns from agent's own responses | Yes — bounded, auditable, no fine-tuning | No | No | No | No |
| Salience weighting | Yes — what matters persists longer | No | No | No | No |
| Contradiction governance | Yes — explicit, operator-visible | No | No | No | No |
| LLM required | No | Yes | Yes | Yes | Yes |
| Recall latency | <1ms | ~200ms+ | ~200ms | LLM-bound | LLM-bound |
| Works offline | Fully | Partial | No | No | With local LLM |
| Cost per operation | $0 | API billing | Credit-based | LLM + DB cost | LLM cost |
| Binary size | ~3 MB | ~50 MB+ | Cloud service | Heavy (Neo4j+) | Python pkg |
| Memory decay & promotion | Built-in | Via LLM | Via LLM | No | Via LLM |
| Trust & provenance | Built-in | No | No | No | No |
| Encryption at rest | ChaCha20 + Argon2 | No | No | No | No |
| Language | Rust | Python | Proprietary | Python | Python |
Fine-tuning costs thousands of dollars and weeks of work. RAG requires embeddings and a vector database. Context windows are expensive per token.
Aura gives you a third path: a local cognitive runtime that accumulates structured experience between conversations — free, local, sub-millisecond.
| | GPT-4o-mini + Aura | GPT-4 alone |
|---|---|---|
| Week 1 | Average answers | Average answers |
| Week 4 | Recalls your workflow | Still forgets everything |
| | Surfaces patterns you repeat | Same cost per token |
| | Exposes explainability + correction | No improvement |
| | Boundedly adapts from experience | No durable learning |
| | $0 compute cost | Still billing per call |
The model stays the same. The cognitive layer gets stronger. That's Aura.
Benchmarked on 1,000 records (Windows 10 / Ryzen 7):
| Operation | Latency | vs Mem0 |
|---|---|---|
| Store | 0.09 ms | ~same |
| Recall (structured) | 0.74 ms | ~270× faster |
| Recall (cached) | 0.48 µs | ~400,000× faster |
| Maintenance cycle | 1.1 ms | No equivalent |
Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.
Aura's full cognitive recall pipeline is active and bounded:
Record → Belief (±5%) → Concept (±4%) → Causal (±3%) → Policy (±2%)
Enable everything in one call:
brain.enable_full_cognitive_stack() # activates all four bounded reranking phases
brain.disable_full_cognitive_stack() # back to raw RRF baseline
Or configure individual phases:
brain.set_belief_rerank_mode("limited") # belief-aware ranking
brain.set_concept_surface_mode("limited") # concept annotations + bounded concept reranking
brain.set_causal_rerank_mode("limited") # causal chain boost
brain.set_policy_rerank_mode("limited") # policy hint shaping
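To make "raw RRF baseline" and "bounded" concrete, here is a minimal sketch of reciprocal rank fusion followed by capped per-layer adjustments. It is an illustration under assumptions, not Aura's implementation: the function names are hypothetical, `k=60` is the conventional RRF constant rather than a documented Aura setting, and combining the layers multiplicatively is a guess. Only the caps (±5%, ±4%, ±3%, ±2%) come from the pipeline line above.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, record_id in enumerate(ranking, start=1):
            scores[record_id] += 1.0 / (k + rank)
    return dict(scores)


# Caps taken from the pipeline line; how Aura combines them is an assumption.
CAPS = {"belief": 0.05, "concept": 0.04, "causal": 0.03, "policy": 0.02}


def apply_bounded_overlays(base_score, signals):
    """Scale a base score by per-layer signals, each clamped to its cap."""
    score = base_score
    for layer, cap in CAPS.items():
        adjustment = max(-cap, min(cap, signals.get(layer, 0.0)))
        score *= 1.0 + adjustment
    return score


base = rrf_fuse([["a", "b", "c"], ["b", "a", "c"]])
# Oversized signals are clamped: belief to +5%, policy to -2%.
boosted = apply_bounded_overlays(base["c"], {"belief": 0.9, "policy": -0.5})
print(base["c"], boosted)
```

Under these assumptions no overlay can move a score by more than its cap, so the baseline ordering changes only between records whose fused scores are already close.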
Higher layers also expose advisory surfaced output:
- get_surfaced_concepts(): stable concept abstractions over repeated beliefs
- get_surfaced_causal_patterns(): learned cause→effect patterns
- get_surfaced_policy_hints(): advisory recommendations (Prefer / Avoid / Warn)

Aura also ships operator-facing and plasticity-facing surfaces:

- explain_recall()
- explain_record()
- provenance_chain()
- explainability_bundle()
- capture_experience()
- ingest_experience_batch()
- mark_record_salience()
- get_high_salience_records()
- get_salience_summary()
- get_reflection_summaries()
- get_latest_reflection_digest()
- get_reflection_digest()
- get_belief_instability_summary()
- get_contradiction_clusters()
- get_contradiction_review_queue()

Aura organizes memories into 4 levels across 2 tiers. Important memories persist, trivial ones decay naturally:
CORE TIER (slow decay — weeks to months)
Identity [0.99] Who the user is. Preferences. Personality.
Domain [0.95] Learned facts. Domain knowledge.
COGNITIVE TIER (fast decay — hours to days)
Decisions [0.90] Choices made. Action items.
Working [0.80] Current tasks. Recent context.
SEMANTIC TYPES (modulate decay & promotion)
fact Default knowledge record.
decision More persistent than a standard fact. Promotes earlier.
preference Long-lived user or agent preference.
contradiction Preserved longer for conflict analysis.
trend Time-sensitive pattern tracked over repeated activation.
serendipity Cross-domain discovery record.
One call runs the lifecycle — decay, promotion, consolidation, and archival:
report = brain.run_maintenance() # background memory maintenance
Core Cognitive Runtime
- Semantic types (fact, decision, trend, preference, contradiction, serendipity) that influence memory behavior and insights
- namespace="sandbox" keeps test data invisible to production recall

Trust & Safety
- recorded, retrieved, inferred, generated

Adaptive Memory
- brain.feedback(id, useful=True) boosts useful memories, weakens noise
- brain.supersede(old_id, new_content) with full version chains
- brain.snapshot("v1") / brain.rollback("v1") / brain.diff("v1","v2")
- export_context() / import_context() with trust metadata

Enterprise & Integrations
- store_image() / store_audio_transcript() with media provenance
- /metrics endpoint with 10+ business-level counters and histograms
- telemetry feature flag with OTLP export and 17 instrumented spans
- StorageBackend trait abstraction (FsBackend + MemoryBackend)

Advisory Cognitive Overlays

Explainability & Governed Adaptation
- explain_recall(), explain_record(), provenance_chain(), explainability_bundle()

Cognitive Guidance
from aura import Aura, TrustConfig
brain = Aura("./data")
tc = TrustConfig()
tc.source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}
brain.set_trust_config(tc)
# User facts always rank higher than scraped data in recall
brain.store("User is vegan", channel="user")
brain.store("User might like steak restaurants", channel="web_scrape")
results = brain.recall_structured("food preferences", top_k=5)
# -> "User is vegan" scores higher, always
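The effect of `source_trust` can be pictured as a weight applied to each candidate's relevance. The sketch below is one plausible reading, not Aura's scoring formula: `trust_weighted` is a hypothetical helper, the relevance numbers are made up for illustration, and multiplying relevance by trust is an assumption.

```python
SOURCE_TRUST = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}


def trust_weighted(candidates, source_trust, default_trust=0.5):
    """Sort (text, relevance, channel) tuples by relevance * channel trust."""
    scored = [
        (text, relevance * source_trust.get(channel, default_trust))
        for text, relevance, channel in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative relevance values; the scraped record starts out more relevant.
candidates = [
    ("User might like steak restaurants", 0.9, "web_scrape"),
    ("User is vegan", 0.7, "user"),
]
ranked = trust_weighted(candidates, SOURCE_TRUST)
print(ranked[0][0])
```

With plain multiplication a scraped record could still win if its raw relevance were more than twice the user record's, so the "always" guarantee above presumably rests on something stricter than this sketch.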
from aura import Aura
brain = Aura("./data")
# Plug in any embedding function: OpenAI, Ollama, sentence-transformers, etc.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
brain.set_embedding_fn(lambda text: model.encode(text).tolist())
# Now "login problems" matches "Authentication failed" via semantic similarity
brain.store("Authentication failed for user admin")
results = brain.recall_structured("login problems", top_k=5)
Without embeddings, Aura continues to use its local recall pipeline - still fast, still effective.
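The embedding hook above takes a callable that maps text to a list of floats. For offline tests where you do not want to download a model, a deterministic stand-in with the same shape can be handy. This is a sketch under assumptions: `hashed_embedding` is a hypothetical helper, it is a hashed bag-of-words and carries no semantic meaning, and whether Aura expects a fixed dimension or normalized vectors is not stated in this README.

```python
import hashlib
import math


def hashed_embedding(text, dim=64):
    """Deterministic bag-of-words vector; NOT semantic, for offline tests only."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
    return [x / norm for x in vec]


def cosine(a, b):
    # Non-empty inputs yield unit vectors, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


same = cosine(hashed_embedding("login problems"), hashed_embedding("login problems"))
print(round(same, 6))
```

Because it only matches identical tokens, it will not link "login problems" to "Authentication failed"; use a real embedding model when you need that behavior.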
brain = Aura("./secret_data", password="my-secure-password")
brain.store("Top secret information")
assert brain.is_encrypted() # ChaCha20-Poly1305 + Argon2id
brain = Aura("./data")
# Decisions are treated as higher-value memory
brain.store("Use PostgreSQL over MySQL", semantic_type="decision", tags=["db"])
# Preferences persist longer than generic working notes
brain.store("User prefers dark mode", semantic_type="preference", tags=["ui"])
# Contradictions are preserved for conflict analysis
brain.store("User said vegan but ordered steak", semantic_type="contradiction")
# Search by semantic type
decisions = brain.search(semantic_type="decision")
# Cross-domain insights surface higher-level patterns
insights = brain.insights(phase=2)
# Example:
# [{'insight_type': 'preference_pattern', 'description': 'Preference cluster around ui', ...}]
brain = Aura("./data")
brain.store("Real preference: dark mode", namespace="default")
brain.store("Test: user likes light mode", namespace="sandbox")
# Recall only sees "default" namespace — sandbox is invisible
results = brain.recall_structured("user preference", top_k=5)
Use this when you need inspection-only analytics across isolated namespaces without changing recall behavior.
brain = Aura("./data")
digest = brain.cross_namespace_digest(
    namespaces=["default", "sandbox"],
    top_concepts_limit=3,
)
# Top concepts per namespace
print(digest["namespaces"][0]["top_concepts"])
# Pairwise overlap
print(digest["pairs"][0]["shared_tags"])
print(digest["pairs"][0]["shared_concept_signatures"])
print(digest["pairs"][0]["shared_causal_signatures"])
HTTP server:
GET /cross-namespace-digest?namespaces=default,sandbox&top_concepts_limit=3
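If you call the endpoint from a script, building the query string programmatically avoids quoting mistakes. A minimal sketch using only the standard library; `digest_url` is a hypothetical helper and the base URL is a placeholder, since this README does not state the server's host or port.

```python
from urllib.parse import urlencode


def digest_url(base_url, namespaces, top_concepts_limit=3):
    """Build the digest URL with a comma-separated namespace list."""
    query = urlencode(
        {
            "namespaces": ",".join(namespaces),
            "top_concepts_limit": top_concepts_limit,
        },
        safe=",",  # keep commas literal, matching the request line above
    )
    return f"{base_url.rstrip('/')}/cross-namespace-digest?{query}"


# "http://localhost:8000" is a placeholder, not a documented default.
url = digest_url("http://localhost:8000", ["default", "sandbox"])
print(url)
```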
MCP tool:
{
  "tool": "cross_namespace_digest",
  "arguments": {
    "namespaces": ["default", "sandbox"],
    "top_concepts_limit": 3
  }
}
The digest is read-only. It does not bypass namespace isolation in recall and does not feed training or inference by default.
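Conceptually, the pairwise part of the digest is a set intersection computed per namespace pair. The sketch below illustrates that idea for tags only. It is not Aura's code: `shared_tags_by_pair` and the `"namespaces"` key are hypothetical, and only the `shared_tags` field name comes from the example above.

```python
from itertools import combinations


def shared_tags_by_pair(tags_by_namespace):
    """Return the sorted shared tags for every unordered pair of namespaces."""
    pairs = []
    for left, right in combinations(sorted(tags_by_namespace), 2):
        shared = set(tags_by_namespace[left]) & set(tags_by_namespace[right])
        pairs.append({"namespaces": [left, right], "shared_tags": sorted(shared)})
    return pairs


pairs = shared_tags_by_pair({
    "default": ["ui", "workflow", "db"],
    "sandbox": ["ui", "test"],
})
print(pairs)
```

Nothing is copied between namespaces here; the function only reports overlap, which mirrors the read-only behavior described above.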
For richer operator-facing workflows, see examples/V3_OPERATOR_WORKFLOWS.md.
Aura can also observe model output and feed bounded experience back into the cognitive substrate, without retraining the model.
from aura import Aura
brain = Aura("./data")
brain.set_plasticity_mode("limited")
capture = brain.capture_experience(
    prompt="How should we deploy this release?",
    retrieved_context=[],
    model_response="Deploy to staging first, then verify health checks before production.",
    session_id="deploy-session-1",
    source="model_inference",
)
brain.ingest_experience_batch([capture])
brain.run_maintenance() # queued experience enters the normal cognitive pipeline
This stays bounded and operator-visible.
Recent operator HTTP endpoints:
- GET /explain-record
- GET /explain-recall
- GET /explainability-bundle
- GET /correction-log
- GET /cross-namespace-digest
- GET /memory-health
- GET /belief-instability
- GET /policy-lifecycle
- GET /correction-review-queue
- GET /suggested-corrections
- GET /namespace-governance-status

The killer use case: an agent that remembers your preferences after a week offline, with zero API calls.
See examples/personal_assistant.py for the full runnable script.
from aura import Aura, Level
brain = Aura("./assistant_memory")
# Day 1: User tells the agent about themselves
brain.store("User is vegan", level=Level.Identity, tags=["diet"])
brain.store("User loves jazz music", level=Level.Identity, tags=["music"])
brain.store("User works 10am-6pm", level=Level.Identity, tags=["schedule"])
brain.store("Discuss quarterly report tomorrow", level=Level.Working, tags=["task"])
# Simulate a week passing — run maintenance cycles
for _ in range(7):
    brain.run_maintenance()  # decay + reflect + consolidate + archive
# Day 8: What does the agent remember?
context = brain.recall("user preferences and personality")
# -> Still remembers: vegan, jazz, schedule (Identity, strength ~0.93)
# -> "quarterly report" decayed heavily (Working, strength ~0.21)
Identity persists. Tasks fade. Important patterns get promoted. Like a real brain.
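The strengths quoted in the comments are consistent with a simple multiplicative model in which each maintenance cycle multiplies a record's strength by the bracketed factor for its level. The sketch below works through that arithmetic. Treating the bracketed numbers as per-cycle retention factors is an interpretation, and `strength_after` is a hypothetical helper; real maintenance also promotes, consolidates, and archives records, which this ignores.

```python
# Bracketed values from the level table, read as per-cycle retention factors.
RETENTION = {"Identity": 0.99, "Domain": 0.95, "Decisions": 0.90, "Working": 0.80}


def strength_after(level, cycles, initial=1.0):
    """Strength left after a number of maintenance cycles, decay only."""
    return initial * RETENTION[level] ** cycles


for level in RETENTION:
    print(f"{level:<9} {strength_after(level, 7):.2f}")
```

Seven cycles give 0.99^7 ≈ 0.93 for Identity and 0.80^7 ≈ 0.21 for Working, matching the figures in the example.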
Give any MCP-compatible AI persistent, self-organizing memory:
pip install aura-memory
Claude Desktop — Settings → Developer → Edit Config:
{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "C:\\Users\\YOUR_NAME\\aura_brain"]
    }
  }
}
Cursor / VS Code — .cursor/mcp.json or .vscode/mcp.json:
{
  "servers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "./aura_brain"],
      "type": "stdio"
    }
  }
}
macOS / Linux path:
python -m aura mcp ~/aura_brain
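Hand-editing the JSON is where Windows backslashes and `~` paths tend to go wrong, because MCP clients typically launch the command without a shell, so `~` is not expanded. The sketch below generates the Claude Desktop entry with an absolute path and correct escaping. The helper names are hypothetical. The command and arguments are the ones shown above.

```python
import json
from pathlib import Path


def aura_mcp_entry(brain_dir):
    """Build the "aura" server entry with the same command and args as above."""
    return {"command": "python", "args": ["-m", "aura", "mcp", str(brain_dir)]}


def claude_desktop_config(brain_dir):
    """Wrap the entry in the "mcpServers" layout used by Claude Desktop."""
    return {"mcpServers": {"aura": aura_mcp_entry(brain_dir)}}


# Path.home() yields an absolute path on every platform, so no "~" is needed.
brain_dir = Path.home() / "aura_brain"
print(json.dumps(claude_desktop_config(brain_dir), indent=2))
```

Merge the printed block into your existing config rather than overwriting the file, in case other servers are already registered.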
Once connected, Claude automatically has 11 tools:
| Tool | Purpose |
|---|---|
| recall | Retrieve relevant memories before answering |
| recall_structured | Get memories with scores and metadata |
| store | Save a fact, note, or context |
| store_code | Save a code snippet at Domain level |
| store_decision | Save a decision with reasoning |
| search | Filter memories by level or tags |
| insights | Memory health stats |
| consolidate | Merge similar records |
| get | Fetch a specific record by ID |
| delete | Remove a record by ID |
| maintain | Run a full maintenance cycle |
After connecting, tell Claude: "Before answering, always recall relevant context from memory. After our conversation, store key facts."
If cargo test intermittently fails on Windows with LNK1104 for target\debug\deps\aura-...exe, a stale test process is usually holding the file open. Run:
powershell -ExecutionPolicy Bypass -File .\scripts\cleanup_windows_test_lock.ps1
Then rerun the test command.
Aura includes a standalone web dashboard for visual memory management. Download from GitHub Releases.
./aura-dashboard ./my_brain --port 8000
Features: Analytics · Memory Explorer with filtering · Recall Console with live scoring · Batch ingest
| Platform | Binary |
|---|---|
| Windows x64 | aura-dashboard-windows-x64.exe |
| Linux x64 | aura-dashboard-linux-x64 |
| macOS ARM | aura-dashboard-macos-arm64 |
| macOS x64 | aura-dashboard-macos-x64 |
| Integration | Description | Link |
|---|---|---|
| Ollama | Fully local AI assistant, no API key needed | ollama_agent.py |
| LangChain | Drop-in Memory class + prompt injection | langchain_agent.py |
| LlamaIndex | Chat engine with persistent memory recall | llamaindex_agent.py |
| OpenAI Agents | Dynamic instructions with persistent memory | openai_agents.py |
| Claude SDK | System prompt injection + tool use patterns | claude_sdk_agent.py |
| CrewAI | Tool-based recall/store for crew agents | crewai_agent.py |
| AutoGen | Memory protocol implementation | autogen_agent.py |
| FastAPI | Per-user memory middleware with namespace isolation | fastapi_middleware.py |
FFI (C/Go/C#): aura.h · go/main.go · csharp/Program.cs
More examples: basic_usage.py · encryption.py · agent_memory.py · edge_device.py · maintenance_daemon.py · research_bot.py
Aura uses a Rust core with Python bindings and a local-first memory runtime.
Publicly documented concepts are:
Higher cognitive layers may be present in the SDK as bounded reranking overlays and advisory inspection surfaces. They are not default runtime decision-making or behavior control.
The public repository documents the user-facing behavior and integration surface. Detailed internal architecture, tuning, and research notes are intentionally not published.
Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines, or check the open issues.
⭐ If Aura saves you time, a GitHub star helps others discover it and helps us continue development.
Built in Kyiv, Ukraine 🇺🇦 — including during power outages.
Solo developer project. If you find this useful, your star means more than you think.