Cognio

Persistent semantic memory server for AI assistants via Model Context Protocol (MCP)

Cognio is a Model Context Protocol (MCP) server that provides persistent semantic memory for AI assistants. Unlike ephemeral chat history, Cognio stores context permanently and enables semantic search across conversations.

Built for:

Personal knowledge base that grows over time
Multi-project context management
Research notes and learning journal
Conversation history with semantic retrieval

Features

Semantic Search: Find memories by meaning using sentence-transformers
LEANN Vector Search (Optional): Lazy-built index with on-demand recomputation to reduce startup memory
Multilingual Support: Search in 100+ languages seamlessly
Persistent Storage: SQLite-based storage that survives across sessions
Project Organization: Organize memories by project and tags
Auto-Tagging: Automatic tag generation via an LLM (OpenAI, Groq, etc.)
Text Summarization: Extractive and abstractive summarization for long texts
MCP Integration: One-click setup for VS Code, Claude, Cursor, and more
RESTful API: Standard HTTP API with OpenAPI documentation
Export Capabilities: Export to JSON or Markdown format
Docker Support: Simple deployment with docker-compose
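
To make "find memories by meaning" concrete, here is a minimal sketch of similarity ranking over embedding vectors. The toy 3-dimensional vectors stand in for sentence-transformers embeddings, and `rank_by_similarity` is a hypothetical helper rather than Cognio's actual code. The 0.4 threshold and limit of 5 mirror the SIMILARITY_THRESHOLD and DEFAULT_SEARCH_LIMIT values listed under Configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, memories, threshold=0.4, limit=5):
    """Return (text, score) pairs at or above the threshold, best first.

    `memories` is a list of (text, vector) pairs (hypothetical shape).
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in memories]
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


# Toy vectors; real embeddings would come from a sentence-transformers model.
memories = [
    ("Docker allows running apps in containers", [0.9, 0.1, 0.0]),
    ("FastAPI is a modern Python web framework", [0.1, 0.9, 0.1]),
    ("Sourdough needs a long proof", [0.0, 0.0, 1.0]),
]
results = rank_by_similarity([1.0, 0.2, 0.0], memories)
```

With these toy vectors only the Docker memory clears the threshold; the other two fall below 0.4 and are dropped.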

Quick Start

1. Start the Server

git clone https://github.com/0xReLogic/Cognio.git
cd Cognio
docker-compose up -d

Server runs at http://localhost:8080
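
Containers can take a few seconds to load the embedding model, so a script may want to wait before sending requests. The sketch below polls the `/health` endpoint listed under API Endpoints. Treating an HTTP 200 as "healthy" is an assumption, and `wait_until_healthy` and `http_probe` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(base_url="http://localhost:8080", attempts=30,
                       delay=1.0, probe=http_probe, sleep=time.sleep):
    """Poll GET /health until it succeeds or the attempts run out.

    Returns the number of tries it took, or None if the server never answered.
    """
    url = base_url.rstrip("/") + "/health"
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)
    return None
```

Calling `wait_until_healthy()` after `docker-compose up -d` blocks for at most about 30 seconds with the defaults shown. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.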

2. Auto-Configure AI Clients

The MCP server automatically configures supported AI clients on first start:

Supported Clients:

Claude Desktop
Claude Code (CLI)
VS Code (GitHub Copilot)
Cursor
Continue.dev
Cline
Windsurf
Kiro
Gemini CLI

Quick Setup:

Run the auto-setup script to configure all clients at once:

cd mcp-server
npm run setup

This generates MCP configs for all 9 supported clients automatically.

Manual Configuration:

See mcp-server/README.md for client-specific MCP configuration examples.

On first run, Cognio auto-generates cognio.md in your workspace with a usage guide for AI tools.

3. Test It

# Save a memory
curl -X POST http://localhost:8080/memory/save \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker allows running apps in containers", "project": "LEARNING"}'

# Search memories
curl "http://localhost:8080/memory/search?q=containers"
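
The same two calls can be made from Python using only the standard library. This is a sketch that mirrors the curl commands above: the endpoint paths, the `text` and `project` fields, and the `q` parameter come from those commands, while the helper names and the assumption that responses are JSON are illustrative.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"


def build_save_request(text, project=None, base_url=BASE_URL):
    """Prepare POST /memory/save with a JSON body."""
    payload = {"text": text}
    if project is not None:
        payload["project"] = project
    return urllib.request.Request(
        base_url + "/memory/save",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_request(query, base_url=BASE_URL):
    """Prepare GET /memory/search?q=... with the query URL-encoded."""
    query_string = urllib.parse.urlencode({"q": query})
    return urllib.request.Request(
        base_url + "/memory/search?" + query_string, method="GET"
    )


def send(request, timeout=10):
    """Send a prepared request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `send(build_save_request("Docker allows running apps in containers", "LEARNING"))` followed by `send(build_search_request("containers"))` reproduces the curl session.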

Or use naturally in your AI client:

"Search my memories for Docker information"
"Remember this: FastAPI is a modern Python web framework"

4. Web UI Dashboard

Access the interactive memory dashboard:

http://localhost:8080/ui

Features:

Browse and search all memories
Add/edit memories with markdown preview
View statistics and insights
Organize by project and tags
Bulk operations (select, delete)
Dark/light theme toggle
Works locally and in Docker

The dashboard auto-detects the API server, so it works on localhost, Docker containers, and remote deployments.

Documentation

API Reference - Complete endpoint documentation
Examples - Usage patterns and integrations
Quickstart - Installation and configuration

MCP Tools

When using the MCP server, you have access to 11 specialized tools:

Tool	Description
`save_memory`	Save text with optional project/tags (auto-tagging enabled)
`search_memory`	Semantic search with project filtering
`list_memories`	List memories with pagination and filters
`get_memory_stats`	Get storage statistics and insights
`archive_memory`	Soft delete a memory (recoverable)
`delete_memory`	Permanently delete a memory by ID
`export_memories`	Export memories to JSON or Markdown
`summarize_text`	Summarize long text (extractive or LLM-based)
`set_active_project`	Set active project context (auto-applies to all operations)
`get_active_project`	View currently active project
`list_projects`	List all available projects from database
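
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a message for `save_memory`. The envelope follows the MCP specification, but the argument names `text` and `project` are assumptions borrowed from the REST example in Quick Start, and `tool_call_message` is a hypothetical helper.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call_message(tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


message = tool_call_message(
    "save_memory",
    # Assumed argument names; check the tool schema your client reports.
    {"text": "Cache TTL is 300s", "project": "Helios-LoadBalancer"},
)
wire_format = json.dumps(message)
```

In practice the AI client builds and sends these messages itself, so the sketch is mainly useful for debugging or for writing a custom client.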

Active Project Workflow:

1. list_projects() → See: Helios-LoadBalancer (45), Cognio-Memory (23), ...
2. set_active_project("Helios-LoadBalancer")
3. save_memory("Cache TTL is 300s") → Auto-saves to Helios-LoadBalancer
4. search_memory("cache settings") → Auto-searches in Helios-LoadBalancer only
5. list_memories() → Lists only Helios-LoadBalancer memories

Project Isolation:
Always specify a project name or call set_active_project first. Either approach keeps memories organized and prevents contexts from different workspaces from mixing.
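
The scoping rule can be illustrated with a toy in-memory store. This is a sketch of the idea only: `MemoryStore` is hypothetical, and raising an error when no project is known is a design choice for the example, not necessarily how Cognio behaves.

```python
class MemoryStore:
    """Toy stand-in used only to illustrate active-project scoping."""

    def __init__(self):
        self.active_project = None
        self._memories = []  # list of (project, text) pairs

    def set_active_project(self, name):
        self.active_project = name

    def _resolve(self, project):
        # An explicit project wins; otherwise fall back to the active one.
        resolved = project or self.active_project
        if resolved is None:
            raise ValueError("pass a project or call set_active_project first")
        return resolved

    def save_memory(self, text, project=None):
        resolved = self._resolve(project)
        self._memories.append((resolved, text))
        return resolved

    def list_memories(self, project=None):
        resolved = self._resolve(project)
        return [text for proj, text in self._memories if proj == resolved]

    def list_projects(self):
        counts = {}
        for proj, _ in self._memories:
            counts[proj] = counts.get(proj, 0) + 1
        return counts


store = MemoryStore()
store.set_active_project("Helios-LoadBalancer")
store.save_memory("Cache TTL is 300s")  # goes to the active project
store.save_memory("Embeddings are cached on disk", project="Cognio-Memory")
```

After these calls, `store.list_memories()` returns only the Helios-LoadBalancer entry, because the explicit `project` argument on the second save overrides the active project for that call alone.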

API Endpoints

Method	Endpoint	Description
GET	`/health`	Health check
POST	`/memory/save`	Save new memory
GET	`/memory/search`	Semantic/Hybrid search
GET	`/memory/list`	List memories with filters
DELETE	`/memory/{id}`	Delete memory by ID
POST	`/memory/bulk-delete`	Bulk delete by project
GET	`/memory/stats`	Get statistics
GET	`/memory/export`	Export memories
POST	`/memory/summarize`	Summarize long text

Interactive docs: http://localhost:8080/docs
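
For scripting against several endpoints, the table above can be kept as data. The sketch below maps short names to the listed methods and paths and fills in the `{id}` placeholder. The short names and `build_request` are hypothetical, and request bodies and query parameters are left out because the table does not specify them.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"

# (method, path template) pairs copied from the endpoint table.
ENDPOINTS = {
    "health": ("GET", "/health"),
    "save": ("POST", "/memory/save"),
    "search": ("GET", "/memory/search"),
    "list": ("GET", "/memory/list"),
    "delete": ("DELETE", "/memory/{id}"),
    "bulk_delete": ("POST", "/memory/bulk-delete"),
    "stats": ("GET", "/memory/stats"),
    "export": ("GET", "/memory/export"),
    "summarize": ("POST", "/memory/summarize"),
}


def build_request(name, base_url=BASE_URL, **path_params):
    """Prepare a request for a named endpoint, URL-quoting path parameters."""
    method, template = ENDPOINTS[name]
    quoted = {key: urllib.parse.quote(str(value), safe="")
              for key, value in path_params.items()}
    return urllib.request.Request(base_url + template.format(**quoted),
                                  method=method)


delete_request = build_request("delete", id="abc123")
```

Here `abc123` is a made-up ID; real IDs come from the save or list responses.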

Configuration

Copy the example file and edit your local overrides:

cp .env.example .env

Environment variables (see .env.example):

# Database
DB_PATH=./data/memory.db

# Embeddings
EMBED_MODEL=all-MiniLM-L6-v2
EMBED_DEVICE=cpu
EMBEDDING_CACHE_PATH=./data/embedding_cache.pkl

# API
API_HOST=0.0.0.0
API_PORT=8080
# Optional API key for auth
API_KEY=your-secret-key

# Search
DEFAULT_SEARCH_LIMIT=5
SIMILARITY_THRESHOLD=0.4
HYBRID_ENABLED=true
HYBRID_MODE=rerank        # candidate | rerank
HYBRID_ALPHA=0.6          # 0..1, higher = more semantic
HYBRID_RERANK_TOPK=100    # rerank candidate pool size

# LEANN vector search (optional)
LEANN_ENABLED=false
LEANN_INDEX_PATH=./data/leann/memories.leann
LEANN_BACKEND=hnsw
LEANN_LAZY_BUILD=true
LEANN_RECOMPUTE_ON_SEARCH=true
LEANN_WARMUP_ON_START=false

# Summarization
SUMMARIZATION_ENABLED=true
SUMMARIZATION_METHOD=abstractive   # extractive | abstractive
SUMMARIZATION_EMBED_MODEL=all-MiniLM-L6-v2

# Auto-tagging (Optional)
AUTOTAG_ENABLED=true
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-key
GROQ_MODEL=openai/gpt-oss-120b
# OPENAI_API_KEY=your-openai-api-key
# OPENAI_MODEL=gpt-4o-mini

# Performance
MAX_TEXT_LENGTH=10000
BATCH_SIZE=32
SUMMARIZE_THRESHOLD=50

# Logging
LOG_LEVEL=info
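
As an illustration of how settings like these are typically consumed, the sketch below reads a handful of them from the environment, falling back to the values shown above. The `Settings` class and `load_settings` are hypothetical and cover only a subset of the variables; Cognio's own config.py may differ.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy strings such as 'true' or '1'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    db_path: str
    api_port: int
    default_search_limit: int
    similarity_threshold: float
    hybrid_enabled: bool
    hybrid_alpha: float


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    settings = Settings(
        db_path=env.get("DB_PATH", "./data/memory.db"),
        api_port=int(env.get("API_PORT", "8080")),
        default_search_limit=int(env.get("DEFAULT_SEARCH_LIMIT", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.4")),
        hybrid_enabled=_as_bool(env.get("HYBRID_ENABLED", "true")),
        hybrid_alpha=float(env.get("HYBRID_ALPHA", "0.6")),
    )
    # The sample file documents HYBRID_ALPHA as a value in 0..1.
    if not 0.0 <= settings.hybrid_alpha <= 1.0:
        raise ValueError("HYBRID_ALPHA must be between 0 and 1")
    return settings


settings = load_settings({"API_PORT": "9090", "HYBRID_ENABLED": "false"})
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test and makes overrides explicit.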

Auto-Tagging Models:

openai/gpt-oss-120b - Groq, high quality
gpt-4o-mini - OpenAI, fast and cheap
llama-3.3-70b-versatile - Groq, balanced
llama-3.1-8b-instant - Groq, fastest

See .env.example for all available options and recommendations.
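
The HYBRID_ALPHA comment above ("higher = more semantic") suggests a weighted blend of a semantic score and a keyword score. One common way to do that is a convex combination applied to a candidate pool, sketched below. Whether Cognio combines scores exactly this way is an assumption, and `blend_scores` and `rerank` are hypothetical names.

```python
def blend_scores(semantic, keyword, alpha=0.6):
    """Weighted mix of two scores; alpha=1.0 means purely semantic."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rerank(candidates, alpha=0.6, top_k=100, limit=5):
    """Rerank (text, semantic_score, keyword_score) tuples.

    The pool is the top_k candidates by semantic score, which are then
    re-sorted by the blended score and cut to `limit` results.
    """
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    blended = [(text, blend_scores(sem, kw, alpha)) for text, sem, kw in pool]
    blended.sort(key=lambda item: item[1], reverse=True)
    return blended[:limit]


# Made-up scores for illustration.
candidates = [
    ("semantic match, weak keywords", 0.80, 0.10),
    ("decent match, strong keywords", 0.70, 0.90),
]
hybrid_order = rerank(candidates, alpha=0.6)
semantic_order = rerank(candidates, alpha=1.0)
```

With alpha at 0.6 the keyword-heavy candidate ranks first; with alpha at 1.0 the ordering falls back to semantic score alone.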

Project Structure

cognio/
├── src/                # Core application
│   ├── main.py         # FastAPI app
│   ├── config.py       # Environment config
│   ├── models.py       # Data schemas
│   ├── database.py     # SQLite operations
│   ├── embeddings.py   # Semantic search
│   ├── memory.py       # Memory CRUD
│   ├── autotag.py      # Auto-tagging
│   └── utils.py        # Helpers
│
├── mcp-server/         # MCP integration
│   ├── index.js        # MCP server
│   └── package.json    # Dependencies
│
├── scripts/            # Utilities
│   ├── setup-clients.js  # Auto-config AI clients
│   ├── backup.sh       # Database backup
│   └── migrate.py      # Schema migrations
│
├── tests/              # Test suite
├── docs/               # Documentation
└── examples/           # Usage examples

Development

# Install dependencies
poetry install

# Run tests
pytest

# Start development server
uvicorn src.main:app --reload

Tech Stack

Backend: Python 3.11+, FastAPI, Uvicorn
Database: SQLite with JSON support
Embeddings: sentence-transformers (paraphrase-multilingual-mpnet-base-v2, 768-dim)
MCP Server: Node.js, @modelcontextprotocol/sdk
Auto-Tagging: LLM API (Groq or OpenAI)
Testing: pytest, pytest-asyncio, pytest-cov
Deployment: Docker, docker-compose

Performance

Operation	Time	Notes
Save memory	~20ms	Including embedding
Search (1k memories)	~15ms	Semantic similarity
Search (10k memories)	~50ms	Still fast
Model load	~3s	One-time on startup

License

MIT License - see LICENSE
