👨🏻💻 Meet Lumina – my personal AI assistant powered by hybrid RAG with Pinecone vector search and Neo4j graph travers
David Nguyen's Personal AI Assistant - Lumina is a full-stack web application that allows users to ask questions about David Nguyen, as well as any other topics, and receive instant, personalized responses powered by state‑of‑the‑art AI & RAG. Users can log in to save their conversation history or continue as guests. The app uses modern technologies and provides a sleek, responsive user interface with intuitive UX and lots of animations. 🚀
[!IMPORTANT] Currently, the app is deployed live on Vercel at: https://lumina-david.vercel.app/. Feel free to check it out!
The backend (with Swagger docs) is also deployed on Vercel at: https://ai-assistant-chatbot-server.vercel.app/.
Alternatively, the backup app is deployed live on Netlify at: https://lumina-ai-chatbot.netlify.app/.
[!TIP] Go straight to https://lumina-david.vercel.app/chat if you want to chat with the AI right away!
/passkeys (list/add/nickname/revoke). Backed by @simplewebauthn/server v9 with a TTL-indexed challenges collection consumed exactly once. Email + password remains as a fallback.
ToastProvider surfaces auth, passkey, and API errors in non-blocking snackbars instead of alert() dialogs.
Promise.allSettled, and a file-backed static resume fallback is used when live retrieval backends fail.
markdown formatting for rich text.
The project follows a modern, full-stack architecture with clear separation of concerns across three main layers:
Frontend Layer: A React application built with TypeScript and Material-UI (MUI) that provides:
Backend Layer: An Express.js server written in TypeScript that handles:
AI/ML Layer: Hybrid RAG (Retrieval-Augmented Generation) implementation that includes:
For detailed architecture documentation, including component diagrams, data flows, and deployment strategies, see ARCHITECTURE.md.
graph TB
subgraph "Client Layer"
Browser[Web Browser]
React[React Application]
end
subgraph "API Gateway"
LB[Load Balancer / CDN]
end
subgraph "Application Layer"
API[Express.js API Server]
Auth[Authentication Service]
Chat[Chat Service]
Conv[Conversation Service]
end
subgraph "AI/ML Layer"
RAG[RAG Pipeline]
Gemini[Google Gemini AI]
Embed[Embedding Service]
end
subgraph "Data Layer"
MongoDB[(MongoDB)]
Pinecone[(Pinecone Vector DB)]
Neo4j[(Neo4j Graph DB)]
end
Browser --> React
React --> LB
LB --> API
API --> Auth
API --> Chat
API --> Conv
Chat --> RAG
RAG --> Embed
RAG --> Gemini
RAG --> Pinecone
RAG --> Neo4j
Auth --> MongoDB
Conv --> MongoDB
Chat --> MongoDB
style React fill:#4285F4
style API fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style Neo4j fill:#008CC1
style Gemini fill:#4285F4
Hybrid retrieval from Pinecone and Neo4j in parallel, followed by intelligent merging, augmentation with conversation history, and response generation with Google Gemini AI. One failing retrieval path never blocks the other, and if live retrieval backends fail, Lumina can fall back to static resume context loaded from local manifest/files.
sequenceDiagram
participant User
participant Frontend
participant Backend
participant Pinecone
participant Neo4j
participant Gemini
participant MongoDB
User->>Frontend: Send chat message
Frontend->>Backend: POST /api/chat/auth
Backend->>MongoDB: Fetch conversation history
MongoDB-->>Backend: Previous messages
Note over Backend,Neo4j: Retrieval Phase (Parallel)
par Parallel Retrieval
Backend->>Pinecone: Vector similarity search
Pinecone-->>Backend: Top-K vector matches
and
Backend->>Neo4j: Extract query entities + graph traversal
Neo4j-->>Backend: Top-K graph matches
end
Backend->>Backend: Merge & deduplicate results
Note over Backend,Gemini: Augmentation Phase
Backend->>Backend: Build augmented context
Backend->>Gemini: Send enriched prompt
Note over Gemini: Generation Phase
Gemini->>Gemini: Generate response
Gemini-->>Backend: AI response + citations
Backend->>MongoDB: Save message & sources
MongoDB-->>Backend: Saved
Backend-->>Frontend: Return AI response
Frontend-->>User: Display response
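The retrieval phase in the diagram above can be sketched in TypeScript as follows. This is a minimal illustration, not the code in knowledgeBase.ts: the `RetrievedChunk` shape, the `Retriever` type, and the function names are hypothetical, and it assumes the static fallback is used only when both live backends fail.

```typescript
// Hypothetical chunk shape; the real project types may differ.
interface RetrievedChunk {
  id: string;
  text: string;
  score: number;
  source: "vector" | "graph" | "static";
}

type Retriever = (query: string) => Promise<RetrievedChunk[]>;

// Keep the highest-scoring copy of each chunk id, then return the top K by score.
function mergeAndDedupe(chunks: RetrievedChunk[], topK: number): RetrievedChunk[] {
  const best = new Map<string, RetrievedChunk>();
  for (const chunk of chunks) {
    const seen = best.get(chunk.id);
    if (!seen || chunk.score > seen.score) best.set(chunk.id, chunk);
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

async function hybridRetrieve(
  query: string,
  vectorSearch: Retriever,
  graphSearch: Retriever,
  staticFallback: () => RetrievedChunk[],
  topK = 5,
): Promise<RetrievedChunk[]> {
  // allSettled never rejects, so one failing backend cannot block the other.
  const settled = await Promise.allSettled([vectorSearch(query), graphSearch(query)]);
  const live: RetrievedChunk[] = [];
  let anySucceeded = false;
  for (const result of settled) {
    if (result.status === "fulfilled") {
      anySucceeded = true;
      live.push(...result.value);
    }
  }
  // Assumption: static context is used only when no live backend answered.
  return anySucceeded ? mergeAndDedupe(live, topK) : staticFallback().slice(0, topK);
}
```

Both retrievers are expected to be async functions, so a failure surfaces as a rejected promise rather than a synchronous throw.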
flowchart LR
subgraph "Frontend"
UI[User Interface]
State[State Management]
API_Client[API Client]
end
subgraph "Backend API"
Routes[Route Handlers]
Middleware[Auth Middleware]
Services[Business Logic]
end
subgraph "Data Sources"
MongoDB[(MongoDB)]
Pinecone[(Pinecone)]
Neo4j[(Neo4j)]
Gemini[Gemini API]
end
UI --> State
State --> API_Client
API_Client -.HTTP/REST.-> Routes
Routes --> Middleware
Middleware --> Services
Services --> MongoDB
Services --> Pinecone
Services --> Neo4j
Services --> Gemini
MongoDB -.Data.-> Services
Pinecone -.Vectors.-> Services
Neo4j -.Graph.-> Services
Gemini -.AI Response.-> Services
Services -.JSON.-> Routes
Routes -.Response.-> API_Client
API_Client --> State
State --> UI
style UI fill:#4285F4
style Services fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style Neo4j fill:#008CC1
style Gemini fill:#4285F4
[!NOTE] These diagrams provide a high-level overview of the system architecture. For detailed component interactions, database schemas, deployment strategies, and security architecture, please refer to ARCHITECTURE.md.
For comprehensive architecture documentation, please see ARCHITECTURE.md.
Clone the repository:
git clone https://github.com/hoangsonww/AI-Assistant-Chatbot.git
cd AI-Assistant-Chatbot/server
Install dependencies:
npm install
Environment Variables:
Create a .env file in the server folder with the following (adjust values as needed):
PORT=5000
MONGODB_URI=mongodb://localhost:27017/ai-assistant
JWT_SECRET=your_jwt_secret_here
GOOGLE_AI_API_KEY=your_google_ai_api_key_here
PINECONE_API_KEY=your_pinecone_api_key_here
PINECONE_INDEX_NAME=lumina-index
# Neo4j AuraDB (optional — enables graph RAG)
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USERNAME=your_username
NEO4J_PASSWORD=your_password
NEO4J_DATABASE=your_database
# Passkeys (WebAuthn)
# RP_ID is the apex domain that the browser binds the passkey to (no scheme,
# no port). Use "localhost" for local development. EXPECTED_ORIGIN is a
# comma-separated list of every front-end origin that may register or sign
# in. Credentials are domain-bound, so changing RP_ID later invalidates all
# previously-registered passkeys.
WEBAUTHN_RP_ID=localhost
WEBAUTHN_RP_NAME=Lumina AI
WEBAUTHN_EXPECTED_ORIGIN=http://localhost:3000
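The rules in the passkey comments above (RP_ID without scheme or port, EXPECTED_ORIGIN as a comma-separated list) can be checked at startup. The sketch below is an illustration only: `parseWebAuthnEnv` and `WebAuthnConfig` are hypothetical names, not part of the server code. It also enforces the WebAuthn requirement that each origin's host equals the RP ID or is a subdomain of it.

```typescript
interface WebAuthnConfig {
  rpId: string;
  rpName: string;
  expectedOrigins: string[];
}

// Parse and validate the three WEBAUTHN_* variables from an env-like record.
function parseWebAuthnEnv(env: Record<string, string | undefined>): WebAuthnConfig {
  const rpId = (env.WEBAUTHN_RP_ID ?? "").trim();
  // RP_ID is a bare domain: reject schemes, ports, and paths early.
  if (rpId === "" || /[:\/]/.test(rpId)) {
    throw new Error(`WEBAUTHN_RP_ID must be a bare domain, got "${rpId}"`);
  }

  const expectedOrigins = (env.WEBAUTHN_EXPECTED_ORIGIN ?? "")
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  if (expectedOrigins.length === 0) {
    throw new Error("WEBAUTHN_EXPECTED_ORIGIN must list at least one origin");
  }

  for (const origin of expectedOrigins) {
    // An origin is scheme + host + optional port, with no path.
    const match = /^https?:\/\/([^\/:]+)(?::\d+)?$/.exec(origin);
    if (!match) throw new Error(`Not a valid origin: "${origin}"`);
    const host = match[1];
    if (host !== rpId && !host.endsWith("." + rpId)) {
      throw new Error(`Origin "${origin}" is not covered by RP ID "${rpId}"`);
    }
  }

  return { rpId, rpName: (env.WEBAUTHN_RP_NAME ?? rpId).trim(), expectedOrigins };
}
```

In a Node process this would be called as `parseWebAuthnEnv(process.env)`, so a misconfigured deployment fails at boot instead of at the first passkey ceremony.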
Run the server in development mode:
npm run dev
This uses nodemon with ts-node to watch for file changes.
Navigate to the client folder:
cd ../client
Install dependencies:
npm install
Run the frontend development server:
npm start
The app will run on http://localhost:3000 (or any other port you've specified in the .env file's PORT key).
Install necessary Node.js packages:
npm install
Ingest knowledge into Pinecone with the CLI (run from server/):
npm run knowledge:repl
Or run a single upsert command (use --external-id to update later):
npm run knowledge:upsert -- \
--title "Resume 2025" \
--file ./knowledge/resume.txt \
--type resume \
--tags "resume,profile" \
--external-id "resume-2025"
(Optional) Set up Neo4j graph database for hybrid retrieval:
Create a Neo4j AuraDB instance at https://console.neo4j.io
Add NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD, NEO4J_DATABASE to server/.env
Rebuild the knowledge graph:
npm run knowledge:graph:rebuild
Check graph status:
npm run knowledge:graph:status
Use the REPL to edit or delete sources (edit <id>, delete <id>) as your profile changes.
Ensure you ingest at least one knowledge source before using the chatbot so responses can be grounded and cited.
For detailed instructions on managing knowledge (adding, updating, deleting), see UPDATE_KNOWLEDGE.md.
The knowledge base supports manifest-based batch sync, making it straightforward to add, update, or delete knowledge sources in bulk. The manifest file (server/knowledge/manifest.json) declaratively describes all knowledge files and their metadata, enabling one-command synchronization via npm run knowledge:sync. The same manifest/file set also powers the static resume fallback used during live retrieval backend failures, so fallback knowledge is easy to maintain without code changes. For the full guide covering single-file upserts, batch sync, graph rebuilds, and deletion workflows, see UPDATE_KNOWLEDGE.md.
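To make the idea of declarative sync concrete, here is a sketch of how a manifest could be compared against what is already indexed. It is not the logic behind npm run knowledge:sync: the `ManifestEntry` fields are assumptions modelled on the knowledge:upsert flags, and `contentHash` and `planSync` are hypothetical.

```typescript
// Hypothetical manifest entry; see UPDATE_KNOWLEDGE.md for the real schema.
interface ManifestEntry {
  externalId: string;
  title: string;
  file: string;
  type: string;
  tags: string[];
  contentHash: string; // assumed: a hash of the file contents
}

interface SyncPlan {
  toCreate: string[];
  toUpdate: string[];
  toDelete: string[];
}

// Compare the manifest with already-indexed sources (externalId -> contentHash).
function planSync(manifest: ManifestEntry[], indexed: Map<string, string>): SyncPlan {
  const plan: SyncPlan = { toCreate: [], toUpdate: [], toDelete: [] };
  const declared = new Set<string>();

  for (const entry of manifest) {
    declared.add(entry.externalId);
    const knownHash = indexed.get(entry.externalId);
    if (knownHash === undefined) {
      plan.toCreate.push(entry.externalId); // new source
    } else if (knownHash !== entry.contentHash) {
      plan.toUpdate.push(entry.externalId); // content changed
    }
  }

  // Anything indexed but no longer declared in the manifest is removed.
  indexed.forEach((_hash, externalId) => {
    if (!declared.has(externalId)) plan.toDelete.push(externalId);
  });

  return plan;
}
```

Keying on a stable external id is what makes repeated syncs idempotent: unchanged entries produce an empty plan.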
The application is currently deployed on Vercel with the following setup:
graph TB
subgraph "Client Devices"
Browser[Web Browser]
Mobile[Mobile Browser]
end
subgraph "CDN Layer"
Vercel[Vercel Edge Network]
Netlify[Netlify CDN - Backup]
end
subgraph "Frontend Deployment"
FrontendVercel[React App on Vercel]
FrontendNetlify[React App on Netlify]
StaticAssets[Static Assets]
end
subgraph "Backend Deployment"
BackendVercel[Express API on Vercel]
ServerlessFunctions[Serverless Functions]
end
subgraph "External Services"
MongoDB[(MongoDB Atlas)]
Pinecone[(Pinecone Vector DB)]
GeminiAPI[Google Gemini AI API]
end
subgraph "CI/CD Pipeline"
GitHub[GitHub Repository]
GitHubActions[GitHub Actions]
AutoDeploy[Auto Deploy on Push]
end
subgraph "Monitoring & Analytics"
VercelAnalytics[Vercel Analytics]
Logs[Application Logs]
end
Browser --> Vercel
Mobile --> Vercel
Vercel --> FrontendVercel
Netlify --> FrontendNetlify
FrontendVercel --> StaticAssets
FrontendVercel --> BackendVercel
FrontendNetlify --> BackendVercel
BackendVercel --> ServerlessFunctions
ServerlessFunctions --> MongoDB
ServerlessFunctions --> Pinecone
ServerlessFunctions --> GeminiAPI
GitHub --> GitHubActions
GitHubActions --> AutoDeploy
AutoDeploy --> Vercel
AutoDeploy --> Netlify
BackendVercel --> VercelAnalytics
BackendVercel --> Logs
FrontendVercel --> VercelAnalytics
style Browser fill:#4285F4
style Vercel fill:#000000
style FrontendVercel fill:#61DAFB
style BackendVercel fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style GeminiAPI fill:#4285F4
style GitHub fill:#181717
Run the entire application stack locally using Docker:
# Build and start all services
docker-compose up --build
# Or run in detached mode
docker-compose up -d
# Stop all services
docker-compose down
This will start:
Frontend: http://localhost:3000
Backend: http://localhost:5000
MongoDB: localhost:27017
For production-grade AWS deployment with high availability and scalability:
# Navigate to infrastructure directory
cd terraform/
# Initialize Terraform
terraform init
# Review deployment plan
terraform plan
# Deploy infrastructure
terraform apply
# Or use provided scripts
cd ../aws/scripts/
./deploy-production.sh
AWS Infrastructure includes:
See aws/README.md and terraform/README.md for detailed deployment instructions.
Landing Page:
The landing page provides an overview of the app’s features and two main actions: Create Account (for new users) and Continue as Guest.
Authentication:
Users can sign up, log in, and reset their password. Authenticated users can save and manage their conversation history.
Chatting:
The main chat area allows users to interact with the AI assistant. The sidebar displays saved conversations (for logged-in users) and allows renaming and searching.
Theme:
Toggle between dark and light mode via the navbar. The chosen theme is saved in local storage and persists across sessions.
Lumina features real-time streaming responses that make conversations feel more natural and engaging. Instead of waiting for the complete response, you'll see the AI's thoughts appear word-by-word as they're generated.
The streaming implementation uses Server-Sent Events (SSE) and WebSockets (optional) to deliver AI responses in real-time:
sequenceDiagram
participant User
participant Frontend
participant Backend
participant Gemini AI
User->>Frontend: Send message
Frontend->>Frontend: Show "Processing..."
Frontend->>Backend: POST /api/chat/auth/stream
Backend->>Gemini AI: Request streaming response
loop For each chunk
Gemini AI-->>Backend: Stream text chunk
Backend-->>Frontend: SSE: chunk data
Frontend->>Frontend: Append to message bubble
Frontend->>User: Display growing text + cursor
end
Gemini AI-->>Backend: Stream complete
Backend->>Backend: Save to database
Backend-->>Frontend: SSE: done event
Frontend->>Frontend: Finalize message
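On the client, the loop in this diagram amounts to folding stream events into the state of one message bubble. The reducer below is a sketch of that idea rather than the code in ChatArea.tsx; the `StreamEvent` union mirrors the documented event names, while the field names and status values are assumptions.

```typescript
// Event payload fields are hypothetical; only the event names come from the docs.
type StreamEvent =
  | { type: "conversationId"; id: string }
  | { type: "chunk"; text: string }
  | { type: "done" }
  | { type: "error"; message: string };

interface StreamingMessage {
  conversationId?: string;
  text: string;
  status: "processing" | "streaming" | "complete" | "failed";
  error?: string;
}

// State shown while the request is in flight and no chunk has arrived yet.
const initialMessage: StreamingMessage = { text: "", status: "processing" };

// Pure reducer: returns a new state for each incoming event.
function applyStreamEvent(state: StreamingMessage, event: StreamEvent): StreamingMessage {
  switch (event.type) {
    case "conversationId":
      return { ...state, conversationId: event.id };
    case "chunk":
      // Append the new text to the growing message bubble.
      return { ...state, text: state.text + event.text, status: "streaming" };
    case "done":
      return { ...state, status: "complete" };
    case "error":
      return { ...state, status: "failed", error: event.message };
  }
}
```

Because the reducer is pure, it fits React's `useReducer` and can be unit-tested without a network connection.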
Authenticated Streaming:
POST /api/chat/auth/stream
Content-Type: application/json
Authorization: Bearer <token>
{
"message": "Your question here",
"conversationId": "optional-conversation-id",
"editIndex": "optional-int — truncates conversation history at this index before sending"
}
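The `editIndex` field can be pictured as a slice over the stored history. The sketch below assumes the message at `editIndex` and everything after it are dropped before the edited message is appended; the actual boundary used by the server is not specified here, and `truncateAtEditIndex` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Assumption: editIndex is the position of the message being edited,
// so that message and all later ones are discarded.
function truncateAtEditIndex(history: ChatMessage[], editIndex?: number): ChatMessage[] {
  if (editIndex === undefined) return history; // no edit: keep full history
  if (!Number.isInteger(editIndex) || editIndex < 0 || editIndex > history.length) {
    throw new RangeError(`editIndex ${editIndex} is outside 0..${history.length}`);
  }
  return history.slice(0, editIndex);
}
```

Rejecting out-of-range values, rather than clamping them, keeps a stale client from silently wiping a conversation.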
Guest Streaming:
POST /api/chat/guest/stream
Content-Type: application/json
{
"message": "Your question here",
"guestId": "optional-guest-id",
"editIndex": "optional-int — truncates conversation history at this index before sending"
}
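Both endpoints are POST requests, so the browser's `EventSource` (which only issues GET) cannot consume them directly; a client typically reads the response body and splits it into events itself. The parser below is a sketch that assumes the server emits standard `event:` / `data:` fields separated by blank lines. `SseParser` is a hypothetical name, not the implementation in api.ts.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parse one frame (the text between two blank lines) into an event.
function parseFrame(frame: string): SseEvent | null {
  let event = "message"; // default event name in the SSE format
  let sawEventField = false;
  const data: string[] = [];
  for (const line of frame.split("\n")) {
    if (line === "" || line.startsWith(":")) continue; // comments / keep-alives
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "event") {
      event = value;
      sawEventField = true;
    } else if (field === "data") {
      data.push(value);
    }
  }
  // More lenient than EventSource, which drops frames that carry no data.
  return data.length > 0 || sawEventField ? { event, data: data.join("\n") } : null;
}

// Incremental parser: feed it text as it arrives, get back complete events.
class SseParser {
  private buffer = "";

  push(text: string): SseEvent[] {
    this.buffer = (this.buffer + text).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const parsed = parseFrame(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (parsed) events.push(parsed);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

Buffering matters because network reads do not align with event boundaries: a single event may arrive split across several reads.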
The SSE stream sends different event types:
conversationId/guestId: Sent at the start with the conversation identifier
chunk: Each piece of text as it's generated from the AI
done: Signals that streaming is complete
error: Indicates an error occurred during streaming
If a connection fails during streaming:
The retry logic uses exponential backoff to avoid overwhelming the server while providing a smooth user experience.
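A minimal sketch of exponential backoff follows. The base delay, cap, and retry count are illustrative assumptions, since the actual values used by the client are not stated here; `backoffDelayMs` and `withRetry` are hypothetical names.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying up to `maxRetries` times with growing pauses in between.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // out of retries: surface the error
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

With these defaults the pauses are 500 ms, 1 s, and 2 s. Production clients often add random jitter so that many clients do not retry in lockstep.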
challengeId.
email to scope the prompt; omit it for discoverable (usernameless) login.
/login).
flowchart TB
Start([User Visits App]) --> CheckAuth{Has Valid<br/>Token?}
CheckAuth -->|Yes| Dashboard[Access Dashboard]
CheckAuth -->|No| Landing[Landing Page]
Landing --> Choice{User Choice}
Choice -->|Sign Up| SignupForm[Signup Form]
Choice -->|Login| LoginForm[Login Form]
Choice -->|Guest| GuestChat[Guest Chat Mode]
SignupForm --> ValidateSignup{Valid<br/>Credentials?}
ValidateSignup -->|No| SignupError[Show Error]
SignupError --> SignupForm
ValidateSignup -->|Yes| CreateUser[Create User in MongoDB]
CreateUser --> GenerateToken[Generate JWT Token]
LoginForm --> ValidateLogin{Valid<br/>Credentials?}
ValidateLogin -->|No| LoginError[Show Error]
LoginError --> LoginForm
ValidateLogin -->|Yes| VerifyPassword[Verify Password with bcrypt]
VerifyPassword -->|Invalid| LoginError
VerifyPassword -->|Valid| GenerateToken
GenerateToken --> StoreToken[Store Token in LocalStorage]
StoreToken --> Dashboard
Dashboard --> Protected[Protected Routes]
Protected --> ConvHistory[Conversation History]
Protected --> SavedChats[Saved Chats]
Protected --> Settings[User Settings]
GuestChat --> TempStorage[Temporary Storage]
TempStorage --> LimitedFeatures[Limited Features]
Dashboard --> Logout{Logout?}
Logout -->|Yes| ClearToken[Clear Token]
ClearToken --> Landing
style Start fill:#4285F4
style Dashboard fill:#34A853
style GuestChat fill:#FBBC04
style GenerateToken fill:#EA4335
style CreateUser fill:#34A853
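For the passkey flow, the "consumed exactly once" property of challenges can be illustrated with an in-memory store. This is a stand-in for the TTL-indexed MongoDB collection, not the project's code: `ChallengeStore` is hypothetical, and in MongoDB an atomic find-and-delete would play the role of `consume`.

```typescript
interface StoredChallenge {
  challenge: string;
  expiresAt: number; // epoch milliseconds
}

class ChallengeStore {
  private entries = new Map<string, StoredChallenge>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  issue(challengeId: string, challenge: string): void {
    this.entries.set(challengeId, { challenge, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the challenge at most once; unknown, reused, or expired ids yield null.
  consume(challengeId: string): string | null {
    const entry = this.entries.get(challengeId);
    if (!entry) return null;
    // Delete before checking expiry so the id can never be replayed.
    this.entries.delete(challengeId);
    return entry.expiresAt > this.now() ? entry.challenge : null;
  }
}
```

Single-use challenges are what prevent a captured WebAuthn response from being replayed against the verify endpoint.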
flowchart LR
subgraph User["👤 User Actions"]
NewChat[Start New Chat]
LoadChat[Load Existing Chat]
SearchChat[Search Conversations]
RenameChat[Rename Conversation]
DeleteChat[Delete Conversation]
end
subgraph Frontend["⚛️ React Frontend"]
ChatUI[Chat Interface]
Sidebar[Conversation Sidebar]
SearchBar[Search Bar]
end
subgraph API["🔌 Express API"]
ConvRoutes[api/conversations Route]
AuthMiddleware{JWT Auth}
end
subgraph Database["🗄️ MongoDB"]
ConvCollection[(Conversations Collection)]
UserCollection[(Users Collection)]
end
subgraph Operations["📊 CRUD Operations"]
Create[Create]
Read[Read]
Update[Update]
Delete[Delete]
end
NewChat --> ChatUI
LoadChat --> Sidebar
SearchChat --> SearchBar
RenameChat --> Sidebar
DeleteChat --> Sidebar
ChatUI --> ConvRoutes
Sidebar --> ConvRoutes
SearchBar --> ConvRoutes
ConvRoutes --> AuthMiddleware
AuthMiddleware -->|Valid Token| Operations
AuthMiddleware -->|Invalid Token| ErrorAuth[401 Unauthorized]
Create --> ConvCollection
Read --> ConvCollection
Update --> ConvCollection
Delete --> ConvCollection
ConvCollection -.User Reference.-> UserCollection
ConvCollection --> ConvRoutes
ConvRoutes --> Frontend
style ChatUI fill:#4285F4
style ConvCollection fill:#47A248
style AuthMiddleware fill:#EA4335
style Operations fill:#34A853
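The JWT Auth gate in this diagram reduces to two steps: pull the token out of the Authorization header, then verify it. The sketch below keeps verification pluggable so it stays self-contained; `authorize` and `extractBearerToken` are hypothetical names, and in the real middleware the `verify` callback would wrap a JWT library call using JWT_SECRET.

```typescript
type AuthResult =
  | { ok: true; userId: string }
  | { ok: false; status: 401; reason: string };

// Extract the token from an "Authorization: Bearer <token>" header value.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match ? match[1] : null;
}

// `verify` returns the user id for a valid token, or null otherwise.
function authorize(
  header: string | undefined,
  verify: (token: string) => string | null,
): AuthResult {
  const token = extractBearerToken(header);
  if (token === null) {
    return { ok: false, status: 401, reason: "missing or malformed Authorization header" };
  }
  const userId = verify(token);
  if (userId === null) {
    return { ok: false, status: 401, reason: "invalid or expired token" };
  }
  return { ok: true, userId };
}
```

Returning a result object instead of throwing lets the route layer map failures to the 401 response shown in the diagram.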
editIndex to truncate conversation history for message-edit branching.
editIndex for conversation branching.
editIndex for message-edit branching.
editIndex for conversation branching.
AI-Assistant-Chatbot/
├── docker-compose.yml
├── openapi.yaml
├── README.md
├── ARCHITECTURE.md
├── UPDATE_KNOWLEDGE.md
├── LICENSE
├── Jenkinsfile
├── package.json
├── tsconfig.json
├── .env
├── shell/ # Shell scripts for app setups
├── terraform/ # Infrastructure as Code (Terraform)
├── aws/ # AWS deployment configurations
├── img/ # Images and screenshots
├── agentic_ai/ # Multi-agent AI pipeline with MCP client integration
├── mcp_server/ # Standalone MCP server (30+ tools, resources, prompts)
├── client/ # Frontend React application
│ ├── package.json
│ ├── tsconfig.json
│ ├── docker-compose.yml
│ ├── Dockerfile
│ └── src/
│ ├── App.tsx
│ ├── index.tsx
│ ├── theme.ts
│ ├── globals.css
│ ├── index.css
│ ├── dev/
│ │ ├── palette.tsx
│ │ ├── previews.tsx
│ │ ├── index.ts
│ │ └── useInitial.ts
│ ├── services/
│ │ └── api.ts # API client with streaming support
│ ├── types/
│ │ ├── conversation.d.ts
│ │ └── user.d.ts
│ ├── components/
│ │ ├── Navbar.tsx
│ │ ├── Sidebar.tsx
│ │ ├── ChatArea.tsx # Main chat interface with streaming
│ │ └── CopyIcon.tsx
│ ├── styles/
│ │ └── (various style files)
│ └── pages/
│ ├── LandingPage.tsx
│ ├── Home.tsx
│ ├── Login.tsx
│ ├── Signup.tsx
│ ├── NotFoundPage.tsx
│ ├── ForgotPassword.tsx
│ └── Terms.tsx
└── server/ # Backend Express application
├── package.json
├── tsconfig.json
├── Dockerfile
├── docker-compose.yml
├── knowledge/
│ ├── manifest.json # Declarative manifest for batch knowledge sync
│ ├── son-nguyen-profile.txt
│ ├── son-nguyen-honors-awards.txt
│ ├── son-nguyen-publications.txt
│ ├── son-nguyen-projects.txt
│ └── son-nguyen-skills.txt
└── src/
├── server.ts
├── models/
│ ├── Conversation.ts
│ ├── GuestConversation.ts
│ ├── KnowledgeSource.ts
│ └── User.ts
├── routes/
│ ├── auth.ts
│ ├── conversations.ts
│ ├── chat.ts # Authenticated chat with streaming
│ └── guest.ts # Guest chat with streaming
├── services/
│ ├── geminiService.ts # AI service with hybrid RAG + streaming
│ ├── geminiEmbeddings.ts # Embedding generation
│ ├── knowledgeBase.ts # Chunking, embeddings, vector+graph retrieval
│ ├── pineconeClient.ts # Pinecone vector DB client
│ ├── neo4jClient.ts # Neo4j graph DB client
│ ├── graphKnowledge.ts # Graph entity extraction & retrieval
│ └── staticResumeFallback.ts # File-backed fallback retrieval context
├── types/
│ └── graph.ts # Graph entity & relationship types
├── scripts/
│ └── knowledgeCli.ts # CLI + REPL ingestion
├── utils/
│ └── (utility functions)
├── middleware/
│ └── auth.ts
└── public/
└── favicon.ico
Lumina includes a standalone MCP server (mcp_server/) that exposes 30+ tools, 7 resources, and 6 prompts through the standardized Model Context Protocol. Any MCP-compatible client — Claude Desktop, ChatGPT, Cursor, VS Code Copilot — can connect and use Lumina's capabilities.
| Category | Tools | Description |
|---|---|---|
| Pipeline | 5 | Run, monitor, cancel agentic AI pipelines |
| Knowledge | 4 | Search and retrieve RAG knowledge base documents |
| Code | 3 | Search code, analyze files, explore project structure |
| File | 5 | Read, write, list, search files |
| Web | 2 | Fetch URLs, extract structured content |
| Data | 3 | Parse CSV/JSON, transform data |
| Git | 4 | Status, log, diff, blame operations |
| System | 6 | Health checks, metrics, environment diagnostics |
# Install MCP server dependencies
pip install -r mcp_server/requirements.txt
# Run with stdio transport (for Claude Desktop, Cursor, VS Code)
python -m mcp_server
# Run with SSE transport (for remote/network access)
python -m mcp_server --transport sse --port 8080
Add to your claude_desktop_config.json:
{
"mcpServers": {
"lumina": {
"command": "python",
"args": ["-m", "mcp_server"],
"cwd": "/path/to/AI-RAG-Assistant-Chatbot"
}
}
}
📖 See mcp_server/README.md for the complete tool reference, configuration guide, and integration examples.
Lumina includes a multi-agent AI pipeline implemented in Python (agentic_ai/). The pipeline uses LangGraph for agent orchestration and connects to the standalone MCP server as an MCP client, giving every agent access to 30+ real tools through the Model Context Protocol.
Key capabilities:
The pipeline is located in the agentic_ai/ directory and is optional for the main assistant.
[!TIP] For more information on the Agentic AI pipeline, please refer to the agentic_ai/README.md file.
To run the application using Docker, simply run docker-compose up in the root directory of the project. This will start both the backend and frontend services as defined in the docker-compose.yml file.
Why Dockerize?
There is an OpenAPI specification file (openapi.yaml) in the root directory that describes the API endpoints, request/response formats, and authentication methods. This can be used to generate client SDKs or documentation.
To view the API documentation, import the openapi.yaml file into a tool such as Swagger UI or Postman, or open the /docs endpoint of the deployed backend.
This project includes a GitHub Actions workflow for continuous integration and deployment. The workflow is defined in the .github/workflows/workflow.yml file and includes steps to:
This workflow ensures that every commit and pull request is tested and deployed automatically, providing a robust CI/CD pipeline.
Please ensure you have the necessary secrets configured in your GitHub repository for deployment (e.g., Vercel and Netlify tokens). Also, feel free to customize the workflow under .github/workflows/workflow.yml to suit your needs.
This project includes unit and integration tests with Jest for both the frontend and backend. To run the tests:
Frontend:
Navigate to the client directory and run:
npm test
Backend:
Navigate to the server directory and run:
npm test
git checkout -b feature/your-feature-name
git commit -m 'Add some feature'
git push origin feature/your-feature-name
This project is licensed under the MIT License.
Thank you for checking out the AI Assistant Project! If you have any questions, suggestions, or feedback, feel free to reach out. Happy coding! 🚀