Persistent semantic memory server for AI assistants via Model Context Protocol (MCP)
Cognio is a Model Context Protocol (MCP) server that provides persistent semantic memory for AI assistants. Unlike ephemeral chat history, Cognio stores context permanently and enables semantic search across conversations.
Built for:
git clone https://github.com/0xReLogic/Cognio.git
cd Cognio
docker-compose up -d
Server runs at http://localhost:8080
The MCP server automatically configures supported AI clients on first start:
Supported Clients:
Quick Setup:
Run the auto-setup script to configure all clients at once:
cd mcp-server
npm run setup
This generates MCP configs for all 9 supported clients automatically.
Manual Configuration:
See mcp-server/README.md for client-specific MCP configuration examples.
On first run, Cognio auto-generates cognio.md in your workspace with a usage guide for AI tools.
# Save a memory
curl -X POST http://localhost:8080/memory/save \
-H "Content-Type: application/json" \
-d '{"text": "Docker allows running apps in containers", "project": "LEARNING"}'
# Search memories
curl "http://localhost:8080/memory/search?q=containers"
Or use it naturally in your AI client:
"Search my memories for Docker information"
"Remember this: FastAPI is a modern Python web framework"
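The two curl calls above can also be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names (`build_save_request`, `build_search_url`, `send`) are hypothetical and not part of Cognio, and the response schema is not assumed.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # default address from the Quick Start


def build_save_request(text, project, base_url=BASE_URL):
    """Build the POST /memory/save request shown in the curl example."""
    body = json.dumps({"text": text, "project": project}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/memory/save",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_url(query, base_url=BASE_URL):
    """Build the GET /memory/search URL, URL-encoding the query text."""
    return f"{base_url}/memory/search?" + urllib.parse.urlencode({"q": query})


def send(request_or_url):
    """Send a request and decode the JSON reply (its fields are not assumed here)."""
    with urllib.request.urlopen(request_or_url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a Cognio server running, `send(build_save_request("Docker allows running apps in containers", "LEARNING"))` saves a memory and `send(build_search_url("containers"))` searches for it.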
Access the interactive memory dashboard:
http://localhost:8080/ui
Features:
The dashboard auto-detects the API server, so it works on localhost, Docker containers, and remote deployments.
When using the MCP server, you have access to 11 specialized tools:
| Tool | Description |
|---|---|
| save_memory | Save text with optional project/tags (auto-tagging enabled) |
| search_memory | Semantic search with project filtering |
| list_memories | List memories with pagination and filters |
| get_memory_stats | Get storage statistics and insights |
| archive_memory | Soft delete a memory (recoverable) |
| delete_memory | Permanently delete a memory by ID |
| export_memories | Export memories to JSON or Markdown |
| summarize_text | Summarize long text (extractive or LLM-based) |
| set_active_project | Set active project context (auto-applies to all operations) |
| get_active_project | View currently active project |
| list_projects | List all available projects from database |
Active Project Workflow:
1. list_projects() → See: Helios-LoadBalancer (45), Cognio-Memory (23), ...
2. set_active_project("Helios-LoadBalancer")
3. save_memory("Cache TTL is 300s") → Auto-saves to Helios-LoadBalancer
4. search_memory("cache settings") → Auto-searches in Helios-LoadBalancer only
5. list_memories() → Lists only Helios-LoadBalancer memories
Project Isolation:
Always specify a project name OR use set_active_project to keep memories organized and prevent mixing contexts between different workspaces.
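The scoping rule in the workflow above can be pictured with a toy in-memory model. This is only an illustration of the described behavior, not Cognio's implementation: the class `ProjectScopedMemory` is hypothetical, and raising an error when no project is available is a choice made for this sketch.

```python
class ProjectScopedMemory:
    """Toy model of active-project scoping (illustrative, not Cognio's code)."""

    def __init__(self):
        self._memories = []  # list of (project, text) tuples
        self._active = None  # currently active project, if any

    def set_active_project(self, name):
        self._active = name

    def _resolve(self, project):
        # An explicit project wins; otherwise fall back to the active one.
        resolved = project or self._active
        if resolved is None:
            raise ValueError("specify a project or call set_active_project first")
        return resolved

    def save_memory(self, text, project=None):
        resolved = self._resolve(project)
        self._memories.append((resolved, text))
        return resolved

    def list_memories(self, project=None):
        resolved = self._resolve(project)
        return [text for p, text in self._memories if p == resolved]

    def list_projects(self):
        # Map each project name to its memory count.
        counts = {}
        for p, _ in self._memories:
            counts[p] = counts.get(p, 0) + 1
        return counts
```

Once `set_active_project("Helios-LoadBalancer")` has been called, `save_memory("Cache TTL is 300s")` lands in that project and `list_memories()` returns only its entries, while a call with an explicit `project=` argument bypasses the active project.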
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /memory/save | Save new memory |
| GET | /memory/search | Semantic/Hybrid search |
| GET | /memory/list | List memories with filters |
| DELETE | /memory/{id} | Delete memory by ID |
| POST | /memory/bulk-delete | Bulk delete by project |
| GET | /memory/stats | Get statistics |
| GET | /memory/export | Export memories |
| POST | /memory/summarize | Summarize long text |
Interactive docs: http://localhost:8080/docs
Environment variables (see .env.example):
Copy the example and edit your local overrides:
cp .env.example .env
# Database
DB_PATH=./data/memory.db
# Embeddings
EMBED_MODEL=all-MiniLM-L6-v2
EMBED_DEVICE=cpu
EMBEDDING_CACHE_PATH=./data/embedding_cache.pkl
# API
API_HOST=0.0.0.0
API_PORT=8080
# Optional API key for auth
API_KEY=your-secret-key
# Search
DEFAULT_SEARCH_LIMIT=5
SIMILARITY_THRESHOLD=0.4
HYBRID_ENABLED=true
HYBRID_MODE=rerank # candidate | rerank
HYBRID_ALPHA=0.6 # 0..1, higher = more semantic
HYBRID_RERANK_TOPK=100 # rerank candidate pool size
# LEANN vector search (optional)
LEANN_ENABLED=false
LEANN_INDEX_PATH=./data/leann/memories.leann
LEANN_BACKEND=hnsw
LEANN_LAZY_BUILD=true
LEANN_RECOMPUTE_ON_SEARCH=true
LEANN_WARMUP_ON_START=false
# Summarization
SUMMARIZATION_ENABLED=true
SUMMARIZATION_METHOD=abstractive # extractive | abstractive
SUMMARIZATION_EMBED_MODEL=all-MiniLM-L6-v2
# Auto-tagging (Optional)
AUTOTAG_ENABLED=true
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-key
GROQ_MODEL=openai/gpt-oss-120b
# OPENAI_API_KEY=your-openai-api-key
# OPENAI_MODEL=gpt-4o-mini
# Performance
MAX_TEXT_LENGTH=10000
BATCH_SIZE=32
SUMMARIZE_THRESHOLD=50
# Logging
LOG_LEVEL=info
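As an illustration of how the search variables above might be consumed, here is a minimal sketch that reads them from the process environment with the listed values as fallbacks. The names `SearchSettings`, `load_search_settings`, and `_env_bool` are hypothetical; the actual logic lives in src/config.py and may differ.

```python
import os
from dataclasses import dataclass


def _env_bool(name, default):
    """Parse a boolean variable such as HYBRID_ENABLED=true."""
    return os.environ.get(name, str(default)).strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class SearchSettings:
    default_limit: int
    similarity_threshold: float
    hybrid_enabled: bool
    hybrid_mode: str
    hybrid_alpha: float


def load_search_settings():
    """Read the search-related variables, falling back to the values listed above."""
    alpha = float(os.environ.get("HYBRID_ALPHA", "0.6"))
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("HYBRID_ALPHA must be between 0 and 1")
    mode = os.environ.get("HYBRID_MODE", "rerank")
    if mode not in {"candidate", "rerank"}:
        raise ValueError("HYBRID_MODE must be 'candidate' or 'rerank'")
    return SearchSettings(
        default_limit=int(os.environ.get("DEFAULT_SEARCH_LIMIT", "5")),
        similarity_threshold=float(os.environ.get("SIMILARITY_THRESHOLD", "0.4")),
        hybrid_enabled=_env_bool("HYBRID_ENABLED", True),
        hybrid_mode=mode,
        hybrid_alpha=alpha,
    )
```

Validating `HYBRID_ALPHA` and `HYBRID_MODE` at load time surfaces a typo in .env at startup rather than at the first search.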
Auto-Tagging Models:
- openai/gpt-oss-120b - High quality
- gpt-4o-mini - OpenAI, fast and cheap
- llama-3.3-70b-versatile - Groq, balanced
- llama-3.1-8b-instant - Groq, fastest

See .env.example for all available options and recommendations.
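To make the search settings more concrete, the sketch below shows one common way a weight like HYBRID_ALPHA and a cutoff like SIMILARITY_THRESHOLD can be combined. It is an assumption, not Cognio's confirmed algorithm: the blend `alpha * semantic + (1 - alpha) * keyword`, the term-overlap keyword score, and applying the threshold to the semantic score before blending are all choices made for this illustration, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms present in the text (a stand-in lexical score)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.6, threshold=0.4, limit=5):
    """Rank (text, vector) pairs by alpha * semantic + (1 - alpha) * keyword."""
    scored = []
    for text, vec in docs:
        semantic = cosine_similarity(query_vec, vec)
        if semantic < threshold:
            continue  # assumed: weak semantic matches are dropped before blending
        score = alpha * semantic + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(text, round(score, 3)) for score, text in scored[:limit]]
```

Under this reading, raising `alpha` toward 1 makes the ranking depend almost entirely on embedding similarity, which matches the "higher = more semantic" comment in the configuration.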
cognio/
├── src/ # Core application
│ ├── main.py # FastAPI app
│ ├── config.py # Environment config
│ ├── models.py # Data schemas
│ ├── database.py # SQLite operations
│ ├── embeddings.py # Semantic search
│ ├── memory.py # Memory CRUD
│ ├── autotag.py # Auto-tagging
│ └── utils.py # Helpers
│
├── mcp-server/ # MCP integration
│ ├── index.js # MCP server
│ └── package.json # Dependencies
│
├── scripts/ # Utilities
│ ├── setup-clients.js # Auto-config AI clients
│ ├── backup.sh # Database backup
│ └── migrate.py # Schema migrations
│
├── tests/ # Test suite
├── docs/ # Documentation
└── examples/ # Usage examples
# Install dependencies
poetry install
# Run tests
pytest
# Start development server
uvicorn src.main:app --reload
| Operation | Time | Notes |
|---|---|---|
| Save memory | ~20ms | Including embedding |
| Search (1k memories) | ~15ms | Semantic similarity |
| Search (10k memories) | ~50ms | Still fast |
| Model load | ~3s | One-time on startup |
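Figures like these depend on hardware, model, and dataset size, so it can be useful to measure them locally. The helper below is a minimal timing sketch; `measure_ms` is a hypothetical name, and the callable you pass in (for example, a function that performs one search request) is up to you.

```python
import statistics
import time


def measure_ms(fn, repeats=20):
    """Return the median wall-clock time of fn() in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

The median is used rather than the mean so that a single slow call, such as the first request after model load, does not dominate the result.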
MIT License - see LICENSE
Built for better AI conversations