Merlin

Self-hosted AI code review for GitHub, GitLab, Bitbucket, Azure DevOps, and Gitea.

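Merlin runs inside your CI pipeline, reviews pull request diffs with the AI provider of your choice, and posts inline comments directly on the PR. Merlin itself is self-hosted: the diff is sent only to the AI provider you configure, and with a local provider such as Ollama nothing leaves your infrastructure.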

PR opened
    │
    ▼
CI pipeline triggers Merlin
    │
    ├── Fetch PR diff from platform API
    ├── (optional) Search RAG index for relevant codebase context
    ├── Send diff + context to AI provider
    └── Post inline review comments back to the PR
              │
              ▼
        github-actions[bot] commented:
        🔴 [Critical] SQL injection via unsanitized input ...

Features
Prerequisites
Installation
Quick Start — 5 Minutes
Platform Integration
- GitHub Actions
- GitLab CI
- Bitbucket Pipelines
- Azure DevOps
- Gitea Actions
Permissions & Bot Identity
AI Providers
- Anthropic Claude
- OpenAI GPT-4o
- Google Gemini
- AWS Bedrock
- Azure OpenAI
- Claude Code CLI
- Groq
- Together AI
- DeepSeek
- Mistral AI
- OpenRouter
- Ollama (local)
RAG — Context-Aware Reviews
Custom Rules Engine
Adaptive Feedback Learning
Slash Commands
Webhook & Bot Mode
Autonomous Agent
Configuration Reference
Environment Variables
CLI Reference
Troubleshooting
Architecture
Building from Source
Local Development — Git Hooks
Contributing
License

Features

Category	Details
AI providers	Anthropic Claude, OpenAI, Google Gemini, AWS Bedrock, Azure OpenAI, Groq, Together AI, DeepSeek, Mistral AI, OpenRouter, Ollama (local), Claude Code CLI
VCS platforms	GitHub, GitLab, Bitbucket, Azure DevOps, Gitea — auto-detected from CI environment
Slash commands	22 commands triggered from PR comments (`@merlin /review`) or CLI (`merlin run /spec`)
Custom rules engine	`.merlin-rules.yaml` — regex patterns, natural-language directives, and path-scoped rules
Adaptive feedback	Learns from 👍/👎 reactions to suppress noisy comment patterns over time
PR architecture diagrams	`/diagram` generates Mermaid diagrams showing module relationships and data flow
RAG pipeline	Index your codebase; reviews include semantically relevant file context
Bot mode	Persistent webhook server that reacts to PR comment events automatically
Autonomous agent	ReAct-loop agent with Slack, Discord, and CLI channels
Security focus	Files ranked by security sensitivity; dedicated `/security` scan for secrets + OWASP
Reflect & Review	Optional second AI pass to filter false positives and refine severity
Local mode	`merlin review --diff <file>` for offline testing without a VCS platform
Zero lock-in	Swap AI providers, vector stores, or VCS platforms via a single config line

Prerequisites

You need access to at least one AI provider, for example:

Provider	What you need
Anthropic Claude (recommended)	`ANTHROPIC_API_KEY` from console.anthropic.com
OpenAI	`OPENAI_API_KEY` from platform.openai.com
Google Gemini	`GEMINI_API_KEY` from Google AI Studio
AWS Bedrock	AWS credentials with Bedrock access
Azure OpenAI	Azure OpenAI resource + deployment
Ollama	Local Ollama install — no API key
Claude Code CLI	Claude Code subscription — no API key

Your VCS platform token (GITHUB_TOKEN, CI_JOB_TOKEN, etc.) is provided automatically by CI — no manual setup needed.

Installation

Pick the method that fits your workflow. All methods produce the same binary.

Option 1 — One-line installer (recommended)

# Linux / macOS
curl -fsSL \
  https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.sh \
  | sh

# Windows (PowerShell)
irm https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.ps1 | iex

The installer auto-detects your OS and architecture and places the binary in /usr/local/bin (or %LOCALAPPDATA%\Programs\merlin on Windows).

Option 2 — Docker image

docker pull ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

The image is multi-arch (linux/amd64, linux/arm64) and uses a fully static musl binary, so it avoids libc issues on Alpine-based runners.

Option 3 — Pre-built binary

Download the binary for your platform from the latest release:

Platform	Binary
Linux x86_64 (glibc)	`merlin-linux-amd64`
Linux x86_64 (musl / static)	`merlin-linux-amd64-musl`
Linux arm64 (glibc)	`merlin-linux-arm64`
Linux arm64 (musl / static)	`merlin-linux-arm64-musl`
macOS Intel	`merlin-darwin-amd64`
macOS Apple Silicon	`merlin-darwin-arm64`
Windows x86_64	`merlin-windows-amd64.exe`

Use the -musl binaries on Alpine Linux or any musl-based distro.
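
The table above can be read as a lookup from operating system, CPU architecture, and libc to an asset name. The sketch below shows one way to express that lookup. The function name `pick_asset` and the alias table are illustrative assumptions and are not taken from the Merlin installer.

```python
from typing import Dict, Tuple

# Asset names copied from the release table above.
ASSETS: Dict[Tuple[str, str, bool], str] = {
    ("linux", "amd64", False): "merlin-linux-amd64",
    ("linux", "amd64", True): "merlin-linux-amd64-musl",
    ("linux", "arm64", False): "merlin-linux-arm64",
    ("linux", "arm64", True): "merlin-linux-arm64-musl",
    ("darwin", "amd64", False): "merlin-darwin-amd64",
    ("darwin", "arm64", False): "merlin-darwin-arm64",
    ("windows", "amd64", False): "merlin-windows-amd64.exe",
}

# Common spellings reported by platform.machine(), mapped to release names.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def pick_asset(system: str, machine: str, musl: bool = False) -> str:
    """Return the release asset name for an OS / architecture pair."""
    os_name = system.lower()
    arch = ARCH_ALIASES.get(machine.lower(), "")
    # musl builds exist only for Linux, so the flag is ignored elsewhere.
    key = (os_name, arch, musl and os_name == "linux")
    if key not in ASSETS:
        raise ValueError(f"no pre-built binary for {system}/{machine}")
    return ASSETS[key]
```

On a real machine the two arguments would typically come from `platform.system()` and `platform.machine()` in the Python standard library.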

Option 4 — Build from source

# Requires Rust 1.85+
cargo install --git https://github.com/Arunachalamkalimuthu/merlin-ai-code-review

Quick Start — 5 Minutes

This is the minimum setup to get Merlin reviewing PRs on GitHub with Anthropic Claude.

Step 1 — Add your API key as a repository secret

In your repository: Settings → Secrets and variables → Actions → New repository secret

Name: ANTHROPIC_API_KEY
Value: your key from console.anthropic.com

Step 2 — Create the workflow file

Create .github/workflows/merlin-review.yml in your repository:

name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Step 3 — Open a pull request

Merlin will automatically review the diff and post inline comments. Comments appear as github-actions[bot] — no bot account needed.

That's it. For other platforms or advanced configuration, read on.

Platform Integration

GitHub Actions

Two equivalent approaches — choose whichever fits your stack.

Option A — Docker container (simplest)

# .github/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read        # required for actions/checkout
  pull-requests: write  # required to read diff and post comments

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Option B — Binary install (with RAG index caching)

# .github/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Cache RAG index
        uses: actions/cache@v4
        with:
          path: merlin-rag.jsonl
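          # No restore-keys: an index restored from an older key would pass the
          # test -f check below and never be rebuilt after the sources change.
          key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}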

      - name: Install Merlin
        run: |
          curl -fsSL \
            https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.sh \
            | sh

      - name: Build RAG index (first run only)
        run: test -f merlin-rag.jsonl || merlin rag index .
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      - name: Run Merlin Review
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Important: The permissions block is mandatory. Without pull-requests: write, GitHub returns 403 Forbidden when Merlin tries to fetch the PR diff or post comments.

Secrets to configure

Secret	Required	Purpose
`ANTHROPIC_API_KEY`	Yes (if using Anthropic)	AI review provider
`OPENAI_API_KEY`	Only for RAG embeddings	Codebase indexing

GITHUB_TOKEN is provided automatically — do not create it manually.

GitLab CI

# .gitlab-ci.yml
merlin-review:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
  stage: review   # "review" must be listed under the pipeline's top-level stages:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  variables:
    GITLAB_TOKEN: $CI_JOB_TOKEN       # automatic — no setup needed
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
  script:
    - merlin review

With RAG index caching:

merlin-review:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
  stage: review   # "review" must be listed under the pipeline's top-level stages:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  cache:
    key: merlin-rag-$CI_DEFAULT_BRANCH
    paths:
      - merlin-rag.jsonl
  variables:
    GITLAB_TOKEN: $CI_JOB_TOKEN
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
    OPENAI_API_KEY: $OPENAI_API_KEY
  script:
    - test -f merlin-rag.jsonl || merlin rag index .
    - merlin review

CI/CD variables to configure (Settings → CI/CD → Variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

CI_JOB_TOKEN is injected automatically by GitLab. Comments appear as the GitLab project bot.

See .gitlab-ci.yml.example for all RAG embedding and vector store combinations:

Setup	Embedder	Store	Extra requirements
A — Recommended	OpenAI	Local JSONL (cached)	`OPENAI_API_KEY`
B — Self-hosted	OpenAI	Qdrant (GitLab service)	`OPENAI_API_KEY`
C — Managed cloud	OpenAI	Pinecone	`OPENAI_API_KEY` + `PINECONE_API_KEY`
D — Fully private	Ollama (GitLab service)	Local JSONL	Privileged runner
E — No RAG	—	—	Nothing extra

Bitbucket Pipelines

# bitbucket-pipelines.yml
pipelines:
  pull-requests:
    '**':
      - step:
          name: Merlin AI Review
          image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
          script:
            - merlin review
          variables:
            BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN   # automatic — no setup needed
            ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY

With RAG index caching:

pipelines:
  pull-requests:
    '**':
      - step:
          name: Merlin AI Review
          image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
          caches:
            - merlin-rag
          script:
            - test -f merlin-rag.jsonl || merlin rag index .
            - merlin review
          variables:
            BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN
            ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
            OPENAI_API_KEY: $OPENAI_API_KEY

definitions:
  caches:
    merlin-rag:
      key:
        files:
          - src/**
      path: merlin-rag.jsonl

Repository variables to configure (Repository settings → Pipelines → Repository variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

BITBUCKET_STEP_TOKEN is created automatically per step. Comments appear as the Pipelines build service user — no bot account needed.

Azure DevOps

# azure-pipelines.yml
trigger: none

pr:
  branches:
    include:
      - '*'

pool:
  vmImage: ubuntu-latest

container:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

steps:
  - checkout: self
    fetchDepth: 0

  - script: merlin review
    displayName: Merlin AI Review
    env:
      AZURE_DEVOPS_TOKEN: $(System.AccessToken)
      ANTHROPIC_API_KEY: $(ANTHROPIC_API_KEY)
      SYSTEM_TEAMFOUNDATIONCOLLECTIONURI: $(System.TeamFoundationCollectionUri)
      SYSTEM_TEAMPROJECT: $(System.TeamProject)
      BUILD_REPOSITORY_NAME: $(Build.Repository.Name)
      BUILD_SOURCEBRANCH: $(Build.SourceBranch)
      SYSTEM_PULLREQUEST_PULLREQUESTID: $(System.PullRequest.PullRequestId)

One-time pipeline setup:

In the Azure DevOps pipeline editor, click ⋮ → Triggers → YAML → Get sources and tick "Allow scripts to access the OAuth token". This exposes $(System.AccessToken) to the script without requiring a PAT.

Pipeline variables to configure (Pipelines → Edit → Variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

Comments appear as Project Collection Build Service ({org}) — no bot account needed.

Gitea Actions

# .gitea/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}   # automatic (Gitea 1.21+)
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Secrets to configure (Repository Settings → Secrets):

Secret	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider

secrets.GITEA_TOKEN is created automatically by Gitea Actions (v1.21+). Comments appear as gitea-actions[bot] — no bot account needed.

Permissions & Bot Identity

All platforms: bot identity is automatic

Every platform provides a built-in CI token. Merlin uses it to post comments as a bot — no manual bot account or GitHub App required.

Platform	Token to use	Comments appear as	Extra setup
GitHub Actions	`secrets.GITHUB_TOKEN`	`github-actions[bot]`	Add `permissions` block (see below)
GitLab CI	`CI_JOB_TOKEN`	GitLab project bot	None
Bitbucket Pipelines	`BITBUCKET_STEP_TOKEN`	Pipelines build service	None
Azure DevOps	`System.AccessToken`	Project Collection Build Service	Enable OAuth token in pipeline settings
Gitea Actions	`secrets.GITEA_TOKEN`	`gitea-actions[bot]`	None (Gitea 1.21+)

Comments appearing under your personal account? You are passing a Personal Access Token (PAT) instead of the platform's automatic token. Switch to the token in the table above and the bot identity is restored automatically.
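
To make the table above concrete, the sketch below maps a CI environment to a platform name and to the token variable that the workflow examples in this README pass to Merlin. The marker variables (`GITHUB_ACTIONS`, `GITLAB_CI`, `BITBUCKET_BUILD_NUMBER`, `TF_BUILD`, `GITEA_ACTIONS`) are assumptions about what each CI system exports, and the function is illustrative rather than Merlin's actual detection logic.

```python
from typing import List, Mapping, Optional, Tuple

# (platform, marker variable assumed to be set by that CI system,
#  token variable used in the workflow examples in this README)
PLATFORMS: List[Tuple[str, str, str]] = [
    # Gitea is checked first on the assumption that its runner may also set
    # GITHUB_ACTIONS for compatibility.
    ("gitea", "GITEA_ACTIONS", "GITEA_TOKEN"),
    ("github", "GITHUB_ACTIONS", "GITHUB_TOKEN"),
    ("gitlab", "GITLAB_CI", "GITLAB_TOKEN"),
    ("bitbucket", "BITBUCKET_BUILD_NUMBER", "BITBUCKET_TOKEN"),
    ("azure-devops", "TF_BUILD", "AZURE_DEVOPS_TOKEN"),
]


def detect_platform(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (platform, token variable) for the first marker found in env."""
    for name, marker, token_var in PLATFORMS:
        if env.get(marker):
            return name, token_var
    return None
```

Calling `detect_platform(os.environ)` inside a CI job would then tell you which token variable should hold the platform's automatic token rather than a PAT.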

GitHub: required permissions block

GitHub defaults to a read-only token. Add this block at the workflow level or the API returns 403 Forbidden:

permissions:
  contents: read        # needed by actions/checkout
  pull-requests: write  # needed to read the PR diff and post inline comments

GitHub: custom named bot (optional)

To post as "Merlin AI Reviewer[bot]" instead of github-actions[bot]:

1. Go to GitHub Settings → Developer Settings → GitHub Apps → New GitHub App.
2. Set permissions: Pull requests: Read & write, Contents: Read. Disable webhooks.
3. Install the app on your repository.
4. Store the App ID and private key as secrets (MERLIN_APP_ID, MERLIN_APP_PRIVATE_KEY).
5. Generate a token in the workflow:

permissions:
  contents: read
  pull-requests: write

jobs:
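  merlin-review:
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest   # provides the merlin binary
    steps: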
      - name: Generate bot token
        id: app-token
        uses: actions/create-github-app-token@v1
        with:
          app-id: ${{ secrets.MERLIN_APP_ID }}
          private-key: ${{ secrets.MERLIN_APP_PRIVATE_KEY }}

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ steps.app-token.outputs.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

This step is optional. github-actions[bot] works out of the box with zero configuration.

AI Providers

Merlin auto-detects which provider to use based on the environment variables present, or you can pin one in merlin.toml.

Provider	`provider` value	Key env var	Notes
Anthropic Claude	`anthropic`	`ANTHROPIC_API_KEY`	Default
OpenAI	`openai`	`OPENAI_API_KEY`	GPT-4o, GPT-4o-mini
Google Gemini	`gemini`	`GEMINI_API_KEY`	Gemini 1.5 Pro / Flash
AWS Bedrock	`bedrock`	`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`	Claude on Bedrock
Azure OpenAI	`azure-openai`	`AZURE_OPENAI_API_KEY`	Custom deployment
Ollama	`ollama`	(none)	Local, fully private
Claude Code CLI	`claude-code`	`CLAUDE_CODE_TOKEN`	No API key needed
Groq	`groq`	`GROQ_API_KEY`	Llama 3, Mixtral — ultra-fast
Together AI	`together-ai`	`TOGETHER_API_KEY`	100+ open-source models
DeepSeek	`deep-seek`	`DEEPSEEK_API_KEY`	DeepSeek Coder / Chat
Mistral AI	`mistral`	`MISTRAL_API_KEY`	Mistral, Codestral
OpenRouter	`open-router`	`OPENROUTER_API_KEY`	Gateway to 200+ models
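
Auto-detection can be pictured as a scan over the table above: the first provider whose key variables are all present wins, unless a provider is pinned in merlin.toml. The sketch below illustrates that idea. The precedence order is an assumption (the README only states that Anthropic is the default), and `detect_provider` is a hypothetical name.

```python
from typing import List, Mapping, Optional, Tuple

# (provider value, variables that must all be set), taken from the table above.
# The ordering is an assumed precedence, with the documented default first.
PROVIDER_KEYS: List[Tuple[str, Tuple[str, ...]]] = [
    ("anthropic", ("ANTHROPIC_API_KEY",)),
    ("openai", ("OPENAI_API_KEY",)),
    ("gemini", ("GEMINI_API_KEY",)),
    ("bedrock", ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")),
    ("azure-openai", ("AZURE_OPENAI_API_KEY",)),
    ("claude-code", ("CLAUDE_CODE_TOKEN",)),
    ("groq", ("GROQ_API_KEY",)),
    ("together-ai", ("TOGETHER_API_KEY",)),
    ("deep-seek", ("DEEPSEEK_API_KEY",)),
    ("mistral", ("MISTRAL_API_KEY",)),
    ("open-router", ("OPENROUTER_API_KEY",)),
]


def detect_provider(env: Mapping[str, str], pinned: Optional[str] = None) -> Optional[str]:
    """Return the pinned provider, else the first one whose keys are all set."""
    if pinned:
        return pinned
    for provider, keys in PROVIDER_KEYS:
        if all(env.get(key) for key in keys):
            return provider
    return None  # Ollama needs no key, so it has to be pinned explicitly
```

With this ordering, a job that sets both `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` (the latter for RAG embeddings) would still review with Anthropic. If your setup depends on a particular provider, pinning it in merlin.toml removes the ambiguity.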

Anthropic Claude (default)

# merlin.toml
[ai]
provider   = "anthropic"
model      = "claude-sonnet-4-6"   # or claude-opus-4-6, claude-haiku-4-5-20251001
max_tokens = 4096

export ANTHROPIC_API_KEY=sk-ant-...
merlin review

Get a key at console.anthropic.com.

OpenAI GPT-4o

[ai]
provider = "openai"
model    = "gpt-4o"   # or gpt-4o-mini, gpt-4-turbo

export OPENAI_API_KEY=sk-...
merlin review

OPENAI_API_KEY also powers RAG embeddings when embedder = "openai" — one key for both.

Google Gemini

[ai]
provider = "gemini"
model    = "gemini-1.5-pro"   # or gemini-2.0-flash, gemini-1.5-flash

export GEMINI_API_KEY=AIza...
merlin review

Get a key from Google AI Studio.

AWS Bedrock

[ai]
provider        = "bedrock"
model           = "anthropic.claude-sonnet-4-6-20250514-v1:0"
bedrock_region  = "us-east-1"

export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...   # optional, for temporary credentials
merlin review

The IAM role/user needs the bedrock:InvokeModel permission for the chosen model ARN.

Azure OpenAI

[ai]
provider                 = "azure-openai"
model                    = "gpt-4o"
azure_openai_endpoint    = "https://my-resource.openai.azure.com"
azure_openai_deployment  = "my-gpt4o-deployment"

export AZURE_OPENAI_API_KEY=...
merlin review

Ollama (local — fully private)

No API key required. All processing stays on your machine.

[ai]
provider        = "ollama"
model           = "llama3.1"   # any model pulled with `ollama pull`
ollama_base_url = "http://localhost:11434"

ollama serve
ollama pull llama3.1
merlin review

Good local models for code review: codellama, deepseek-coder, qwen2.5-coder.

Claude Code CLI

For teams with a Claude Code subscription — no ANTHROPIC_API_KEY needed.

[ai]
provider = "claude-code"
model    = "claude-sonnet-4-6"

# Developer machine (interactive)
claude auth login

# CI (headless)
claude auth login --token $CLAUDE_CODE_TOKEN
merlin review

Set CLAUDE_CODE_TOKEN as a CI secret. The token is obtained from your Claude Code account settings.

Groq

Ultra-fast open-source inference. Llama 3.3 70B reviews typically complete in under 3 seconds.

[ai]
provider = "groq"
model    = "llama-3.3-70b-versatile"   # or mixtral-8x7b-32768, gemma2-9b-it

export GROQ_API_KEY=gsk_...
merlin review

Get a free key at console.groq.com. Recommended models for code review:

Model	Context	Speed
`llama-3.3-70b-versatile`	128k	Fast
`mixtral-8x7b-32768`	32k	Very fast
`llama-3.1-8b-instant`	128k	Fastest

Together AI

Access to 100+ open-source models including Llama, Mistral, Qwen, and DBRX.

[ai]
provider = "together-ai"
model    = "meta-llama/Llama-3.3-70B-Instruct-Turbo"

export TOGETHER_API_KEY=...
merlin review

Get a key at api.together.ai. Recommended models:

Model	Notes
`meta-llama/Llama-3.3-70B-Instruct-Turbo`	Best quality
`meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo`	Smaller, faster
`mistralai/Mixtral-8x7B-Instruct-v0.1`	Strong coding ability
`Qwen/Qwen2.5-Coder-32B-Instruct`	Specialised for code

DeepSeek

Strong coding-focused models with competitive pricing.

[ai]
provider = "deep-seek"
model    = "deepseek-coder"   # or deepseek-chat

export DEEPSEEK_API_KEY=...
merlin review

Get a key at platform.deepseek.com.

Mistral AI

Mistral's own models plus Codestral, which is fine-tuned for code tasks.

[ai]
provider = "mistral"
model    = "codestral-latest"   # or mistral-large-latest, open-mistral-nemo

export MISTRAL_API_KEY=...
merlin review

Get a key at console.mistral.ai. codestral-latest is recommended for code review tasks.

OpenRouter

A unified gateway to 200+ models from OpenAI, Anthropic, Meta, Mistral, Google, and more — including many free tiers.

[ai]
provider = "open-router"
model    = "meta-llama/llama-3.3-70b-instruct"   # or any model on openrouter.ai/models

export OPENROUTER_API_KEY=sk-or-...
merlin review

Get a key at openrouter.ai. Use any model slug from the model list. OpenRouter is useful for:

- Accessing models not available in your region
- Comparing multiple providers with a single API key
- Free-tier access to powerful open-source models

RAG — Context-Aware Reviews

RAG (Retrieval-Augmented Generation) indexes your codebase into a vector store. When reviewing a PR, Merlin retrieves the most relevant files and injects them into the AI prompt — giving the reviewer full context beyond the diff alone.

When to use RAG

- Large codebases where a diff touches shared utilities or interfaces
- Reviews that need to understand how changed code is called elsewhere
- Finding similar patterns or security issues across the repo

Setup in merlin.toml

[rag]
enabled          = true
embedder         = "openai"              # "openai" for CI, "ollama" for local
store            = "local"              # see vector store table below
embed_model      = "text-embedding-3-small"
collection       = "merlin"
top_k            = 5                    # number of relevant chunks to inject
min_score        = 0.70                 # similarity threshold (0.0–1.0)
chunk_lines      = 80                   # lines per indexed chunk
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]
local_path       = "merlin-rag.jsonl"  # local store file path
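
The `chunk_lines`, `top_k`, and `min_score` settings interact as follows: files are split into fixed-size chunks at index time, and at review time only the best-scoring chunks above the threshold are injected. The sketch below illustrates this with cosine similarity, which is an assumption, since the README does not name the similarity metric. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def split_into_chunks(lines: Sequence[str], chunk_lines: int = 80) -> List[List[str]]:
    """Split a file's lines into consecutive chunks of at most chunk_lines."""
    return [list(lines[i:i + chunk_lines]) for i in range(0, len(lines), chunk_lines)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 5,
    min_score: float = 0.70,
) -> List[Tuple[str, float]]:
    """Return up to top_k (chunk id, score) pairs scoring at least min_score."""
    scored = [(chunk_id, cosine(query, vector)) for chunk_id, vector in chunks]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

Raising `min_score` trades recall for precision: fewer chunks reach the prompt, but the ones that do are closer to the diff being reviewed.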

Embedding backends

Embedder	Best for	Model	Requires
`openai`	CI/CD — any runner	`text-embedding-3-small`	`OPENAI_API_KEY`
`ollama`	Local dev — free, fully private	`nomic-embed-text`	`ollama serve`

Vector stores

Store	Setup	Best for
`local`	None — single JSONL file, cache in CI	Small/medium repos, CI
`memory`	None — ephemeral	Testing
`qdrant`	`docker run -p 6333:6333 qdrant/qdrant`	Production, self-hosted
`chroma`	`docker run -p 8000:8000 chromadb/chroma`	Open-source alternative
`pinecone`	Managed cloud account	Zero-ops managed

Local development (Ollama + local store)

# merlin.toml
[rag]
enabled  = true
embedder = "ollama"
store    = "local"

ollama pull nomic-embed-text
merlin rag index .         # index once (a few seconds for most repos)
merlin review              # reviews now include codebase context

CI/CD (OpenAI + cached JSONL)

# merlin.toml
[rag]
enabled     = true
embedder    = "openai"
embed_model = "text-embedding-3-small"
store       = "local"

Add to your GitHub Actions workflow:

- name: Cache RAG index
  uses: actions/cache@v4
  with:
    path: merlin-rag.jsonl
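    # No restore-keys: an index restored from an older key would pass the
    # test -f check below and never be rebuilt after the sources change.
    key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}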

- name: Build RAG index (first run only)
  run: test -f merlin-rag.jsonl || merlin rag index .
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

Indexing a typical 10k-file repo costs around $0.10 in OpenAI embedding credits and only re-runs when source files change.
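
As a rough sanity check on that figure, embedding cost is simply total tokens multiplied by the per-token price. The sketch below uses two assumptions that are not stated in this README: an average of 500 tokens per indexed file, and a price of $0.02 per million tokens for `text-embedding-3-small`. Check current pricing and your own repository's file sizes before relying on the result.

```python
def embedding_cost_usd(files: int, avg_tokens_per_file: int, usd_per_million_tokens: float) -> float:
    """Estimate the one-off cost of embedding a repository."""
    total_tokens = files * avg_tokens_per_file
    return total_tokens / 1_000_000 * usd_per_million_tokens


# 10,000 files x 500 tokens = 5,000,000 tokens (both inputs are assumptions)
estimate = embedding_cost_usd(10_000, 500, 0.02)
```

Under those assumptions the estimate lands at about ten cents. Repositories with larger files scale linearly from there.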

Production (Qdrant persistent store)

# merlin.toml
[rag]
enabled    = true
embedder   = "openai"
store      = "qdrant"
qdrant_url = "http://localhost:6333"
# qdrant_api_key = ""   # required for Qdrant Cloud

The index persists in Qdrant between CI runs — no file caching step needed.

Custom Rules Engine

Define team-specific review rules in .merlin-rules.yaml. Rules can be regex patterns matched against diffs, natural-language directives injected into the AI prompt, or both — optionally scoped to specific file paths.

Example `.merlin-rules.yaml`

rules:
  # Regex pattern rule — flags matching code in the diff
  - name: no-unwrap
    pattern: "unwrap\\(\\)"
    severity: high
    message: "Avoid unwrap() in production code — use ? or expect() with context"

  # Natural-language directive — injected into the AI system prompt
  - name: require-error-handling
    directive: "All public API functions must handle errors explicitly with Result"

  # Path-scoped rule — only applies to files matching the glob
  - name: auth-review
    path_match: "src/auth/**"
    directive: "Flag any changes to authentication logic as Critical severity"

  # Combined: regex + path scope
  - name: no-sql-string-concat
    pattern: "format!.*SELECT|format!.*INSERT|format!.*UPDATE"
    path_match: "src/db/**"
    severity: critical
    message: "Never build SQL with string formatting — use parameterised queries"

Configuration

# merlin.toml
[review]
rules_file = ".merlin-rules.yaml"   # default path

Rules are loaded automatically on every review run. Regex pattern matches are prepended to the AI context as hints, and directive rules are appended to the system prompt as numbered team rules.
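
The behaviour described above can be sketched as two small functions: one that turns regex matches into hints, and one that numbers the directives that apply to a file. This is an illustration and not Merlin's implementation. The `Rule` class, the default severity, and the use of `fnmatchcase` for `path_match` globs are all assumptions.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence


@dataclass
class Rule:
    name: str
    pattern: Optional[str] = None      # regex matched against diff lines
    directive: Optional[str] = None    # natural-language instruction
    path_match: Optional[str] = None   # glob that scopes the rule
    severity: str = "medium"           # assumed default
    message: str = ""


def applies(rule: Rule, path: str) -> bool:
    """A rule without path_match applies everywhere."""
    return rule.path_match is None or fnmatchcase(path, rule.path_match)


def regex_hints(rules: Sequence[Rule], path: str, added_lines: Sequence[str]) -> List[str]:
    """Collect one hint per added line that matches a regex rule."""
    hints = []
    for rule in rules:
        if rule.pattern is None or not applies(rule, path):
            continue
        compiled = re.compile(rule.pattern)
        for number, line in enumerate(added_lines, start=1):
            if compiled.search(line):
                hints.append(f"[{rule.severity}] {rule.name} (line {number}): {rule.message}")
    return hints


def numbered_directives(rules: Sequence[Rule], path: str) -> List[str]:
    """Number the directives that apply to this path, in file order."""
    texts = [r.directive for r in rules if r.directive and applies(r, path)]
    return [f"{i}. {text}" for i, text in enumerate(texts, start=1)]
```

Note that the YAML string "unwrap\\(\\)" from the example file corresponds to the regex `unwrap\(\)` once the YAML escapes are resolved.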

Adaptive Feedback Learning

Merlin learns from your team's reactions to review comments. Over time, comment patterns that are consistently rejected (👎) are auto-suppressed, reducing noise without manual rule authoring.

How it works

1. React to review comments with 👍 (accept) or 👎 (reject).
2. Merlin records each reaction keyed by a category:title pattern.
3. Once a pattern has at least 5 events and a rejection rate above 70%, it is suppressed.
4. Run /feedback to see the current learning status and suppressed patterns.
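
The suppression rule above can be written as a short function. The sketch below assumes reactions are available as (pattern, accepted) pairs. The function and constant names are illustrative and are not taken from Merlin's source.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

MIN_EVENTS = 5              # "5+ events"
REJECTION_THRESHOLD = 0.70  # ">70% rejection rate"


def suppressed_patterns(events: Iterable[Tuple[str, bool]]) -> Set[str]:
    """Return the category:title patterns that meet both suppression conditions.

    Each event is (pattern, accepted), where accepted is True for a thumbs-up
    and False for a thumbs-down.
    """
    total: Counter = Counter()
    rejected: Counter = Counter()
    for pattern, accepted in events:
        total[pattern] += 1
        if not accepted:
            rejected[pattern] += 1
    return {
        pattern
        for pattern, count in total.items()
        if count >= MIN_EVENTS and rejected[pattern] / count > REJECTION_THRESHOLD
    }
```

Both conditions must hold. A pattern rejected 4 times out of 4 is kept because it has too few events, and one rejected exactly 7 times out of 10 is kept because 70% is not strictly above the threshold.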

Configuration

# merlin.toml
[review]
feedback_learning = true                     # enable adaptive filtering
feedback_path     = ".merlin-feedback.jsonl"  # where to store feedback data

Viewing status

merlin run /feedback
# or from a PR comment:
@merlin /feedback

The feedback file uses a simple JSONL format. Commit it to your repo so the entire team shares what Merlin has learned, or add it to .gitignore for per-environment learning.

Slash Commands

Trigger commands from a PR comment using @merlin /command, or run them directly in CI with merlin run /command.

Command	What it does	Output
`/review`	Full code review with inline comments	Inline comments + summary
`/spec`	Generate a technical specification	Updates PR description
`/describe`	Auto-generate PR title and description	Updates PR description
`/ask <question>`	Q&A about the PR diff	PR comment
`/improve`	Inline code suggestion blocks	PR suggestion comments
`/generate_labels`	Auto-label based on diff content and size	PR labels
`/update_changelog`	Prepend an entry to CHANGELOG.md	File commit
`/add_doc`	Generate missing docstrings	PR suggestion comments
`/similar_issue`	Find related open issues	PR comment table
`/test`	Generate unit tests for changed code	PR comment
`/explain`	Plain-language walkthrough of the PR	PR comment
`/security`	Dedicated security scan (secrets + OWASP Top 10)	Inline + summary report
`/approve`	AI-assisted review verdict	PR review submission
`/commit_message`	Generate 3 conventional commit message options	PR comment
`/docs [mode]`	Generate docs (`readme`/`api`/`adr`/`module`/`wiki`/`auto`)	PR comment or file commit
`/snyk`	Scan changed dependencies against Snyk database	PR comment
`/coverage`	Analyse test coverage for changed files	PR comment
`/link_jira`	Find and link related Jira issues	PR comment
`/link_linear`	Find and link related Linear issues	PR comment
`/triage`	Find similar open issues on CodeTriage	PR comment
`/diagram`	Generate a Mermaid architecture diagram of the PR changes	PR comment
`/feedback`	Show adaptive feedback learning status and suppressed patterns	PR comment
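
For PR comments, dispatch starts with recognising the `@merlin /command` form and separating the command from its arguments. The sketch below shows one way to do that with shell-style quoting. The parsing rules are assumptions, since the README does not specify how Merlin tokenises comments, and `parse_command` is a hypothetical name.

```python
import shlex
from typing import List, Optional, Tuple

MENTION = "@merlin"


def parse_command(comment: str) -> Optional[Tuple[str, List[str]]]:
    """Return (command, arguments) for the first '@merlin /command' line."""
    for raw_line in comment.splitlines():
        line = raw_line.strip()
        if not line.startswith(MENTION):
            continue
        try:
            # shlex keeps a quoted question together as a single argument.
            tokens = shlex.split(line[len(MENTION):])
        except ValueError:
            return None  # unbalanced quotes
        if tokens and tokens[0].startswith("/"):
            return tokens[0], tokens[1:]
    return None
```

A comment that merely mentions Merlin without a slash command returns `None`, so ordinary discussion on the PR is ignored.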

Examples

# Run from CI
merlin run /review
merlin run /security
merlin run /ask "Is this change safe to deploy on Friday?"
merlin run /docs readme

# Trigger from a PR comment (requires webhook/bot mode)
@merlin /review
@merlin /ask "What is the performance impact of this change?"
@merlin /spec

Webhook & Bot Mode

Bot mode runs Merlin as a persistent HTTP server. It listens for PR comment events and automatically dispatches slash commands — no CI trigger needed.

merlin webhook --port 8080

Configure your VCS platform to send webhook events to:

Platform	Webhook URL	Event type
GitHub	`http://your-host:8080/webhook/github`	`issue_comment`
GitLab	`http://your-host:8080/webhook/gitlab`	Note Hook

Securing the webhook

# GitHub HMAC secret
export MERLIN_GITHUB_SECRET=your-secret
merlin webhook --port 8080

# GitLab token
export MERLIN_GITLAB_SECRET=your-token
merlin webhook --port 8080

Set the same secret in your platform's webhook settings under "Secret token".
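
The two platforms use the shared secret differently: GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header, while GitLab sends the token itself in the `X-Gitlab-Token` header. The sketch below shows how a receiver could check each. It illustrates the platform conventions and is not Merlin's own code, and the function names are hypothetical.

```python
import hashlib
import hmac


def verify_github(secret: str, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 value of the form 'sha256=<hex digest>'."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = f"sha256={digest}"
    # compare_digest runs in constant time to avoid leaking match length.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


def verify_gitlab(secret: str, token_header: str) -> bool:
    """Check an X-Gitlab-Token value, which carries the secret token itself."""
    return hmac.compare_digest(secret.encode("utf-8"), token_header.encode("utf-8"))
```

The GitHub check must run on the raw bytes of the request body, before any JSON parsing, because re-serialising the payload can change the bytes and invalidate the signature.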

Running as a service

# docker-compose.yml
services:
  merlin:
    image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    command: webhook --port 8080
    ports:
      - "8080:8080"
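    environment:
      # Values are interpolated from the host environment or a .env file next
      # to this compose file, so no secret is hard-coded here.
      GITHUB_TOKEN: ${GITHUB_TOKEN}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      MERLIN_GITHUB_SECRET: ${MERLIN_GITHUB_SECRET}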
    restart: unless-stopped

Autonomous Agent

The agent runs a ReAct (Reason + Act) loop — it plans, executes slash commands as tools, and iterates until the task is complete.
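
A ReAct loop alternates between a reasoning step that picks the next action and an acting step that runs a tool and records the observation. The sketch below shows the control flow with a bounded iteration count. The planner here is a plain callable standing in for the AI model, and all names are hypothetical rather than Merlin's internals.

```python
from typing import Callable, Dict, List, Tuple

# A step is ("act", tool_name, argument) or ("finish", final_answer, "").
Step = Tuple[str, str, str]


def run_agent(
    task: str,
    plan: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 10,
) -> str:
    """Run a reason/act loop until the planner finishes or the limit is hit."""
    history = [f"task: {task}"]
    for _ in range(max_iterations):
        kind, name, argument = plan(history)  # reason: choose the next step
        if kind == "finish":
            return name
        tool = tools.get(name)
        observation = tool(argument) if tool else f"unknown tool {name}"
        history.append(f"action: {name} {argument}")       # act
        history.append(f"observation: {observation}")      # observe
    return "stopped: iteration limit reached"
```

The iteration cap plays the role of `max_iterations` in the agent configuration: it guarantees termination even if the planner never decides it is finished.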

CLI mode (interactive REPL)

merlin agent
# > Summarise the open PRs and flag any that touch auth code
# > Review PR #42 with a focus on SQL injection risks

Single-shot (CI-friendly)

merlin agent --task "Review PR #42 and post a summary comment"

Slack

export SLACK_BOT_TOKEN=xoxb-...
merlin agent --channel slack --port 8090

The agent listens on port 8090 for Slack Events API calls. Configure your Slack app's Event Subscriptions URL to http://your-host:8090.

Discord

export DISCORD_BOT_TOKEN=...
export DISCORD_CHANNEL_ID=...
merlin agent --channel discord

Agent configuration

# merlin.toml
[agent]
max_iterations      = 10          # max ReAct loop steps
max_memory_messages = 50          # context window for conversation history
memory_file         = ".merlin-memory.jsonl"   # persist memory across runs
default_channel     = "cli"
port                = 8090

Configuration Reference

Copy config.example.toml to merlin.toml in your repo root. All fields are optional — Merlin works with zero configuration.

# merlin.toml

[ai]
# Provider: "anthropic" | "openai" | "gemini" | "bedrock" | "azure-openai" | "ollama" | "claude-code"
#         | "groq" | "together-ai" | "deep-seek" | "mistral" | "open-router"
provider    = "anthropic"
model       = "claude-sonnet-4-6"
max_tokens  = 4096
temperature = 0.2   # lower = more consistent; 0.0–1.0

# Provider-specific options (uncomment as needed)
# ollama_base_url          = "http://localhost:11434"
# azure_openai_endpoint    = "https://my-resource.openai.azure.com"
# azure_openai_deployment  = "my-deployment"
# bedrock_region           = "us-east-1"

[review]
# Review categories to focus on
focus        = ["bugs", "security", "style", "performance"]

# Max inline comments per run (prevents PR spam)
max_comments = 30

# Lines per diff chunk sent to the AI
chunk_lines  = 200

# Second AI pass to filter false positives (slower but more accurate)
reflect      = false

# Custom rules file (regex patterns + natural-language directives)
rules_file   = ".merlin-rules.yaml"

# Adaptive feedback learning — suppresses consistently rejected comment patterns
feedback_learning = false
feedback_path     = ".merlin-feedback.jsonl"

[platform]
# Auto-detected from CI env vars. Override only if needed:
# type = "github"   # "github" | "gitlab" | "bitbucket" | "azure-devops" | "gitea"

[rag]
enabled          = false
embedder         = "openai"              # "openai" | "ollama"
store            = "local"              # "local" | "memory" | "qdrant" | "chroma" | "pinecone"
embed_model      = "text-embedding-3-small"
collection       = "merlin"
top_k            = 5
min_score        = 0.70
chunk_lines      = 80
local_path       = "merlin-rag.jsonl"
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]

# Qdrant
# qdrant_url     = "http://localhost:6333"
# qdrant_api_key = ""

# ChromaDB
# chroma_url = "http://localhost:8000"

# Pinecone
# pinecone_host    = "https://my-index.svc.us-east1.pinecone.io"
# pinecone_api_key = ""   # or set PINECONE_API_KEY env var

[agent]
max_iterations      = 10
max_memory_messages = 50
# memory_file       = ".merlin-memory.jsonl"
default_channel     = "cli"
port                = 8090
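
Because every field is optional, a partial merlin.toml only overrides the keys it names. The Python sketch below shows one way such layering can work, assuming the values in the reference above are the built-in defaults. DEFAULTS (a small excerpt) and merge_config are hypothetical names, and the dicts stand in for already-parsed TOML tables.

```python
import copy

# Assumed defaults, copied from the reference above (only a few keys shown).
DEFAULTS = {
    "ai": {"provider": "anthropic", "max_tokens": 4096, "temperature": 0.2},
    "review": {"max_comments": 30, "chunk_lines": 200, "reflect": False},
    "rag": {"enabled": False, "top_k": 5, "min_score": 0.70},
}


def merge_config(defaults, overrides):
    """Return a copy of defaults with overrides applied table by table."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Equivalent to a merlin.toml that sets only `reflect = true` under [review].
config = merge_config(DEFAULTS, {"review": {"reflect": True}})
print(config["review"])  # {'max_comments': 30, 'chunk_lines': 200, 'reflect': True}
```

The other [review] keys and the untouched tables keep their default values, which is the behaviour a zero-configuration setup relies on.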

Environment Variables

All secrets are read from environment variables — never put them in merlin.toml.

AI providers

Variable	Provider
`ANTHROPIC_API_KEY`	Anthropic Claude
`OPENAI_API_KEY`	OpenAI (review and/or RAG embeddings)
`GEMINI_API_KEY`	Google Gemini
`AZURE_OPENAI_API_KEY`	Azure OpenAI
`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`	AWS Bedrock
`AWS_SESSION_TOKEN`	AWS Bedrock (temporary credentials)
`CLAUDE_CODE_TOKEN`	Claude Code CLI headless auth

VCS platforms

Variable	Platform	Notes
`GITHUB_TOKEN`	GitHub	Auto-provided by Actions
`GITLAB_TOKEN`	GitLab	Use `$CI_JOB_TOKEN` in CI
`BITBUCKET_TOKEN`	Bitbucket	Use `$BITBUCKET_STEP_TOKEN` in CI
`AZURE_DEVOPS_TOKEN`	Azure DevOps	Use `$(System.AccessToken)` in CI
`GITEA_TOKEN`	Gitea	Auto-provided by Gitea Actions 1.21+

Integrations

Variable	Purpose
`PINECONE_API_KEY`	Pinecone vector store
`SNYK_TOKEN`	Snyk dependency scanning (`/snyk` command)
`JIRA_TOKEN`	Jira issue linking (`/link_jira` command)
`LINEAR_API_KEY`	Linear issue linking (`/link_linear` command)
`SLACK_BOT_TOKEN`	Slack agent channel
`DISCORD_BOT_TOKEN`	Discord agent channel
`DISCORD_CHANNEL_ID`	Discord channel to post in
`MERLIN_GITHUB_SECRET`	HMAC secret for GitHub webhook verification
`MERLIN_GITLAB_SECRET`	Token for GitLab webhook verification
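
The two webhook secrets are used differently by the platforms. GitHub signs each delivery with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header; GitLab sends the shared token itself in X-Gitlab-Token. How Merlin performs the check is not documented here, so the Python sketch below simply follows those two schemes; the function names are hypothetical.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest takes constant time, so it does not leak how much matched.
    return hmac.compare_digest(expected, signature_header)


def verify_gitlab_token(secret: str, token_header: str) -> bool:
    """GitLab sends the shared token verbatim in X-Gitlab-Token."""
    return hmac.compare_digest(secret, token_header)


body = b'{"action": "created"}'
good = "sha256=" + hmac.new(b"your-secret", body, hashlib.sha256).hexdigest()
print(verify_github_signature("your-secret", body, good))         # True
print(verify_github_signature("your-secret", body, "sha256=00"))  # False
```

The signature must be computed over the raw request bytes; re-serialising the parsed JSON first can change the bytes and make a valid delivery fail the check.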

CLI Reference

# ── Review ────────────────────────────────────────────────────────────────────
merlin review                                   # CI review (auto-detects platform)
merlin review --diff path/to/changes.diff       # local review, no platform posting
merlin review --diff changes.diff --output json # machine-readable output

# ── Slash commands ────────────────────────────────────────────────────────────
merlin run /review
merlin run /spec
merlin run /describe
merlin run /ask "Is this change thread-safe?"
merlin run /improve
merlin run /generate_labels
merlin run /update_changelog
merlin run /add_doc
merlin run /similar_issue
merlin run /test
merlin run /explain
merlin run /security
merlin run /approve
merlin run /commit_message
merlin run /snyk
merlin run /coverage
merlin run /link_jira
merlin run /link_linear
merlin run /triage
merlin run /diagram
merlin run /feedback
merlin run /docs              # auto-detect best doc type
merlin run /docs readme       # generate README section
merlin run /docs api          # generate API reference
merlin run /docs adr          # generate Architecture Decision Record
merlin run /docs module       # generate module docstrings
merlin run /docs wiki         # generate wiki page

# ── RAG index ─────────────────────────────────────────────────────────────────
merlin rag index .                              # index current directory
merlin rag index src/                           # index a subdirectory
merlin rag search "auth bypass"                 # semantic search
merlin rag search "SQL injection" -k 10         # return up to 10 results
merlin rag count                                # number of indexed documents
merlin rag clear                                # delete all indexed data

# ── Webhook server ────────────────────────────────────────────────────────────
merlin webhook --port 8080

# ── Autonomous agent ──────────────────────────────────────────────────────────
merlin agent                                    # CLI REPL
merlin agent --channel slack                    # Slack Events API on --port 8090
merlin agent --channel discord                  # Discord bot
merlin agent --task "summarise the open PRs"    # single-shot, CI-friendly

# ── Debug ─────────────────────────────────────────────────────────────────────
merlin parse-diff path/to/changes.diff          # show parsed file structure + priority

Troubleshooting

`merlin: not found` (exit code 127) in CI

The shell cannot find or execute the binary. This happens when:

- You run a glibc build of Merlin on Alpine or another musl-based image. The file exists, but the glibc loader it depends on does not, so the shell still reports not found. Use the image from GHCR, which ships a static musl binary.
- You use the binary install approach and the install script failed or was skipped. Check that it completed successfully before the merlin review step.

# Correct container approach — uses musl binary, works on any runner
container:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

`403 Forbidden` from GitHub API

The workflow is missing its permissions block. Add:

permissions:
  contents: read
  pull-requests: write

Comments appear under my username instead of a bot

You are passing a Personal Access Token (PAT) instead of the platform's automatic CI token. Use ${{ secrets.GITHUB_TOKEN }} (GitHub), $CI_JOB_TOKEN (GitLab), or $BITBUCKET_STEP_TOKEN (Bitbucket). See Permissions & Bot Identity.

`Platform API error: HTTP 401 Unauthorized`

The VCS token is missing or invalid. Ensure the correct env var is set:

- GitHub: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- GitLab: GITLAB_TOKEN: $CI_JOB_TOKEN
- Bitbucket: BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN
- Azure DevOps: AZURE_DEVOPS_TOKEN: $(System.AccessToken), with the OAuth token enabled in the pipeline settings
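
A missing token is easier to diagnose before Merlin runs than from a 401 afterwards. The Python sketch below is a hypothetical pre-flight check, not part of Merlin: TOKEN_VARS repeats the variable names listed above (plus GITEA_TOKEN from the Environment Variables section), and check_token is an illustrative helper.

```python
# Token variable expected for each platform.
TOKEN_VARS = {
    "github": "GITHUB_TOKEN",
    "gitlab": "GITLAB_TOKEN",
    "bitbucket": "BITBUCKET_TOKEN",
    "azure-devops": "AZURE_DEVOPS_TOKEN",
    "gitea": "GITEA_TOKEN",
}


def check_token(platform, env):
    """Return an error message, or None when the token variable is non-empty."""
    name = TOKEN_VARS.get(platform)
    if name is None:
        return f"unknown platform: {platform}"
    if not env.get(name, "").strip():
        return f"{name} is not set; expect HTTP 401 from {platform}"
    return None


print(check_token("gitlab", {"GITHUB_TOKEN": "abc"}))
print(check_token("github", {"GITHUB_TOKEN": "abc"}))
```

In a pipeline, pass os.environ as env and fail the step when a message is returned. The check only confirms the variable is present; an expired or under-scoped token still has to be caught from the API response.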

RAG index is rebuilt on every CI run

The cache key or cache path is misconfigured. Ensure the path matches local_path in merlin.toml (default: merlin-rag.jsonl) and the key pattern covers your source directories.
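
A cache key only helps if it is derived from the files that feed the index. The Python sketch below illustrates the idea; it is not how any CI system computes its keys, and rag_cache_key, the glob patterns, and the 16-character truncation are assumptions chosen for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def rag_cache_key(root, patterns=("src/**/*", "lib/**/*"), prefix="merlin-rag-"):
    """Hash the matched files so the key changes only when sources change."""
    root = Path(root)
    files = sorted(
        {p for pattern in patterns for p in root.glob(pattern) if p.is_file()}
    )
    digest = hashlib.sha256()
    for path in files:
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return prefix + digest.hexdigest()[:16]


with tempfile.TemporaryDirectory() as root:
    src = Path(root, "src")
    src.mkdir()
    (src / "auth.rs").write_text("fn login() {}")
    before = rag_cache_key(root)
    Path(root, "README.md").write_text("docs only")        # outside the patterns
    print(rag_cache_key(root) == before)                    # True
    (src / "auth.rs").write_text("fn login() { check(); }")
    print(rag_cache_key(root) == before)                    # False
```

If the key changes on every run even though no source file changed, the pattern is probably matching generated files or nothing at all.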

AI provider returns an error

- Check that the correct API key environment variable is set and that the key has not expired.
- Verify the model name is valid for your provider and region. Bedrock model IDs carry a version suffix (for example -v1:0), and not every model is available in every region.
- For Ollama, ensure ollama serve is running and the model has been pulled with ollama pull <model>.

Architecture

CLI (clap)
  ├── ReviewEngine
  │     ├── PlatformClient  (GitHub | GitLab | Bitbucket | Azure DevOps | Gitea)
  │     │     ├── get_diff()
  │     │     ├── post_inline_comment()
  │     │     └── post_summary()
  │     ├── DiffParser → Vec<FileDiff>
  │     │     └── prioritize_diffs()   (token-aware, security-ranked)
  │     ├── AiProvider  (Anthropic | OpenAI | Claude Code | Gemini | Bedrock | Azure OpenAI | Ollama)
  │     │     └── review(ReviewContext) → Vec<ReviewComment>
  │     └── RagPipeline  (optional)
  │           ├── Embedder
  │           │     ├── OllamaEmbedder   (local dev — free, private)
  │           │     └── OpenAiEmbedder   (CI/CD — any runner)
  │           └── VectorStore  (local | memory | qdrant | chroma | pinecone)
  │                 └── search() → Vec<RetrievedDoc> → injected into AI prompt
  │
  ├── RulesEngine  (custom review rules)
  │     └── .merlin-rules.yaml → regex patterns + directives → AI prompt injection
  │
  ├── FeedbackStore  (adaptive learning)
  │     └── .merlin-feedback.jsonl → accept/reject signals → auto-suppress noisy patterns
  │
  ├── ToolRouter  (slash commands)
  │     ├── /spec, /review, /describe, /ask, /improve, /diagram
  │     ├── /generate_labels, /update_changelog, /add_doc, /similar_issue
  │     ├── /test, /explain, /security, /approve, /commit_message, /feedback
  │     ├── /docs, /snyk, /coverage, /link_jira, /link_linear, /triage
  │     └── Webhook server (axum) — dispatches commands from PR comments
  │
  └── AgentRuntime  (ReAct loop)
        ├── AgentMemory  (ring-buffer + optional JSONL persistence)
        ├── AgentTools   (all slash commands + post_comment + get_pr_info + rag_search)
        └── AgentChannel
              ├── CliChannel     (stdin/stdout REPL)
              ├── SlackChannel   (axum webhook + chat.postMessage)
              └── DiscordChannel (REST polling + message reply)
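
The diagram describes prioritize_diffs() as token-aware and security-ranked without giving details. The Python sketch below is one possible reading, not Merlin's algorithm: the SENSITIVE_HINTS list, the four-characters-per-token estimate, and the skip-if-too-large rule are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical hints; the README does not say how Merlin ranks files.
SENSITIVE_HINTS = ("auth", "crypto", "secret", "password", "token", "sql")


@dataclass
class FileDiff:
    path: str
    patch: str


def estimate_tokens(text):
    """Rough estimate: about four characters per token."""
    return max(1, len(text) // 4)


def prioritize_diffs(diffs, token_budget):
    """Order files by sensitivity, then keep those that fit the budget."""
    def sensitivity(diff):
        return sum(hint in diff.path.lower() for hint in SENSITIVE_HINTS)

    ranked = sorted(diffs, key=sensitivity, reverse=True)   # stable for ties
    selected, used = [], 0
    for diff in ranked:
        cost = estimate_tokens(diff.patch)
        if used + cost > token_budget:
            continue            # too large for what is left; try the next file
        selected.append(diff)
        used += cost
    return selected


diffs = [
    FileDiff("docs/guide.md", "x" * 400),
    FileDiff("src/auth/login.rs", "x" * 400),
    FileDiff("src/util.rs", "x" * 400),
]
print([d.path for d in prioritize_diffs(diffs, token_budget=200)])
# ['src/auth/login.rs', 'docs/guide.md']
```

Each patch costs about 100 tokens in this example, so a budget of 200 admits the auth file first and then one of the remaining files in their original order.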

Platform auto-detection — Merlin reads CI environment variables to detect the active platform automatically:

CI system	Detection variable
GitHub Actions	`GITHUB_ACTIONS=true`
GitLab CI	`GITLAB_CI=true`
Bitbucket Pipelines	`BITBUCKET_PIPELINE_UUID`
Azure DevOps	`TF_BUILD=True`
Gitea Actions	`GITEA_ACTIONS=true`
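
The table translates directly into a lookup over the environment. The Python sketch below is illustrative only: detect_platform and DETECTION_RULES are hypothetical names, and the case-insensitive comparison is an assumption. Gitea is checked before GitHub on the assumption that a Gitea runner may also set GITHUB_ACTIONS for compatibility; the README does not state the order Merlin uses.

```python
# (platform, variable, expected value or None when presence is enough)
DETECTION_RULES = [
    ("gitea", "GITEA_ACTIONS", "true"),
    ("github", "GITHUB_ACTIONS", "true"),
    ("gitlab", "GITLAB_CI", "true"),
    ("bitbucket", "BITBUCKET_PIPELINE_UUID", None),
    ("azure-devops", "TF_BUILD", "True"),
]


def detect_platform(env):
    """Return the first platform whose detection variable matches, else None."""
    for platform, name, expected in DETECTION_RULES:
        value = env.get(name)
        if value is None:
            continue
        if expected is None or value.lower() == expected.lower():
            return platform
    return None


print(detect_platform({"GITLAB_CI": "true"}))                                 # gitlab
print(detect_platform({"GITEA_ACTIONS": "true", "GITHUB_ACTIONS": "true"}))  # gitea
print(detect_platform({}))                                                    # None
```

When detection returns nothing, for example on a self-hosted runner with a stripped environment, setting type under [platform] in merlin.toml is the documented override.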

Building from Source

# Prerequisites: Rust 1.85+ (see rust-toolchain.toml)

# Clone
git clone https://github.com/Arunachalamkalimuthu/merlin-ai-code-review
cd merlin-ai-code-review

# Development build
cargo build

# Release binary
cargo build --release
./target/release/merlin --version

# Run tests
cargo test

# Lint
cargo clippy --all-targets --all-features -- -D warnings

# Format
cargo fmt --all

# Docker (local build — uses glibc binary)
docker build -t merlin .

# Docker (CI image — uses musl static binary, requires pre-built dist/)
docker build -f Dockerfile.ci -t merlin-ci .

Local Development — Git Hooks

The repository ships pre-commit and pre-push hooks that mirror the CI checks exactly. Install them once after cloning:

bash scripts/install-hooks.sh

To also install the required CLI tools (typos, cargo-audit, cargo-deny):

bash scripts/install-hooks.sh --tools

What each hook runs

pre-commit — fast checks, runs on every git commit (~5–10 s):

Check	Command	What it catches
Format	`cargo fmt --all --check`	Unformatted code
Lint	`cargo clippy --all-targets --all-features -- -D warnings`	Warnings, bad patterns
Spell	`typos`	Typos in source, docs, config

pre-push — security checks, runs on every git push (~15–30 s):

Check	Command	What it catches
CVE scan	`cargo audit --deny warnings`	Known vulnerabilities in dependencies
Policy	`cargo deny check`	Licence violations, banned crates, duplicate deps

Skipping hooks

git commit --no-verify   # skip pre-commit checks
git push   --no-verify   # skip pre-push checks

Use --no-verify only for genuine emergencies. CI will catch any skipped issues.

Contributing

See CONTRIBUTING.md for the full guide — bug reports, feature requests, development setup, coding standards, commit conventions, and walkthroughs for adding a new AI provider, VCS platform, slash command, vector store, or agent channel.

License

MIT — see LICENSE.

Merlin

Self-hosted AI code review for GitHub, GitLab, Bitbucket, Azure DevOps, and Gitea.

Merlin runs inside your CI pipeline, reviews pull request diffs with the AI provider of your choice, and posts inline comments directly on the PR. No code ever leaves your infrastructure.

hljs language-css

PR opened
    │
    ▼
CI pipeline triggers Merlin
    │
    ├── Fetch PR diff from platform API
    ├── (optional) Search RAG index for relevant codebase context
    ├── Send diff + context to AI provider
    └── Post inline review comments back to the PR
              │
              ▼
        github-actions[bot] commented:
        🔴 [Critical] SQL injection via unsanitized input ...

Features
Prerequisites
Installation
Quick Start — 5 Minutes
Platform Integration
- GitHub Actions
- GitLab CI
- Bitbucket Pipelines
- Azure DevOps
- Gitea Actions
Permissions & Bot Identity
AI Providers
- Anthropic Claude
- OpenAI GPT-4o
- Google Gemini
- AWS Bedrock
- Azure OpenAI
- Claude Code CLI
- Groq
- Together AI
- DeepSeek
- Mistral AI
- OpenRouter
- Ollama (local)
RAG — Context-Aware Reviews
Custom Rules Engine
Adaptive Feedback Learning
Slash Commands
Webhook & Bot Mode
Autonomous Agent
Configuration Reference
Environment Variables
CLI Reference
Troubleshooting
Architecture
Building from Source
Local Development — Git Hooks
Contributing
License

Features

Category	Details
AI providers	Anthropic Claude, OpenAI, Google Gemini, AWS Bedrock, Azure OpenAI, Groq, Together AI, DeepSeek, Mistral AI, OpenRouter, Ollama (local), Claude Code CLI
VCS platforms	GitHub, GitLab, Bitbucket, Azure DevOps, Gitea — auto-detected from CI environment
Slash commands	22 commands triggered from PR comments (`@merlin /review`) or CLI (`merlin run /spec`)
Custom rules engine	`.merlin-rules.yaml` — regex patterns, natural-language directives, and path-scoped rules
Adaptive feedback	Learns from 👍/👎 reactions to suppress noisy comment patterns over time
PR architecture diagrams	`/diagram` generates Mermaid diagrams showing module relationships and data flow
RAG pipeline	Index your codebase; reviews include semantically relevant file context
Bot mode	Persistent webhook server that reacts to PR comment events automatically
Autonomous agent	ReAct-loop agent with Slack, Discord, and CLI channels
Security focus	Files ranked by security sensitivity; dedicated `/security` scan for secrets + OWASP
Reflect & Review	Optional second AI pass to filter false positives and refine severity
Local mode	`merlin review --diff <file>` for offline testing without a VCS platform
Zero lock-in	Swap AI providers, vector stores, or VCS platforms via a single config line

Prerequisites

You need one of the following to provide AI reviews:

Provider	What you need
Anthropic Claude (recommended)	`ANTHROPIC_API_KEY` from console.anthropic.com
OpenAI	`OPENAI_API_KEY` from platform.openai.com
Google Gemini	`GEMINI_API_KEY` from Google AI Studio
AWS Bedrock	AWS credentials with Bedrock access
Azure OpenAI	Azure OpenAI resource + deployment
Ollama	Local Ollama install — no API key
Claude Code CLI	Claude Code subscription — no API key

Your VCS platform token (GITHUB_TOKEN, CI_JOB_TOKEN, etc.) is provided automatically by CI — no manual setup needed.

Installation

Pick the method that fits your workflow. All methods produce the same binary.

Option 1 — One-line installer (recommended)

hljs language-bash

# Linux / macOS
curl -fsSL \
  https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.sh \
  | sh

hljs language-powershell

# Windows (PowerShell)
irm https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.ps1 | iex

The installer auto-detects your OS and architecture and places the binary in /usr/local/bin (or %LOCALAPPDATA%\Programs\merlin on Windows).

Option 2 — Docker image

hljs language-bash

docker pull ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

The image is multi-arch (linux/amd64, linux/arm64) and uses a fully-static musl binary — no libc issues on Alpine-based runners.

Option 3 — Pre-built binary

Download the binary for your platform from the latest release:

Platform	Binary
Linux x86_64 (glibc)	`merlin-linux-amd64`
Linux x86_64 (musl / static)	`merlin-linux-amd64-musl`
Linux arm64 (glibc)	`merlin-linux-arm64`
Linux arm64 (musl / static)	`merlin-linux-arm64-musl`
macOS Intel	`merlin-darwin-amd64`
macOS Apple Silicon	`merlin-darwin-arm64`
Windows x86_64	`merlin-windows-amd64.exe`

Use the -musl binaries on Alpine Linux or any musl-based distro.

Option 4 — Build from source

hljs language-bash

# Requires Rust 1.85+
cargo install --git https://github.com/Arunachalamkalimuthu/merlin-ai-code-review

Quick Start — 5 Minutes

This is the minimum setup to get Merlin reviewing PRs on GitHub with Anthropic Claude.

Step 1 — Add your API key as a repository secret

In your repository: Settings → Secrets and variables → Actions → New repository secret

Name: ANTHROPIC_API_KEY
Value: your key from console.anthropic.com

Step 2 — Create the workflow file

Create .github/workflows/merlin-review.yml in your repository:

hljs language-yaml

name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Step 3 — Open a pull request

Merlin will automatically review the diff and post inline comments. Comments appear as github-actions[bot] — no bot account needed.

That's it. For other platforms or advanced configuration, read on.

Platform Integration

GitHub Actions

Two equivalent approaches — choose whichever fits your stack.

Option A — Docker container (simplest)

hljs language-yaml

# .github/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read        # required for actions/checkout
  pull-requests: write  # required to read diff and post comments

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Option B — Binary install (with RAG index caching)

hljs language-yaml

# .github/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Cache RAG index
        uses: actions/cache@v4
        with:
          path: merlin-rag.jsonl
          key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}
          restore-keys: merlin-rag-

      - name: Install Merlin
        run: |
          curl -fsSL \
            https://github.com/Arunachalamkalimuthu/merlin-ai-code-review/releases/latest/download/install.sh \
            | sh

      - name: Build RAG index (first run only)
        run: test -f merlin-rag.jsonl || merlin rag index .
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      - name: Run Merlin Review
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: merlin review

Important: The permissions block is mandatory. Without pull-requests: write, GitHub returns 403 Forbidden when Merlin tries to fetch the PR diff or post comments.

Secrets to configure

Secret	Required	Purpose
`ANTHROPIC_API_KEY`	Yes (if using Anthropic)	AI review provider
`OPENAI_API_KEY`	Only for RAG embeddings	Codebase indexing

GITHUB_TOKEN is provided automatically — do not create it manually.

GitLab CI

hljs language-yaml

# .gitlab-ci.yml
merlin-review:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
  stage: review
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  variables:
    GITLAB_TOKEN: $CI_JOB_TOKEN       # automatic — no setup needed
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
  script:
    - merlin review

With RAG index caching:

hljs language-yaml

merlin-review:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
  stage: review
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  cache:
    key: merlin-rag-$CI_DEFAULT_BRANCH
    paths:
      - merlin-rag.jsonl
  variables:
    GITLAB_TOKEN: $CI_JOB_TOKEN
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
    OPENAI_API_KEY: $OPENAI_API_KEY
  script:
    - test -f merlin-rag.jsonl || merlin rag index .
    - merlin review

CI/CD variables to configure (Settings → CI/CD → Variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

CI_JOB_TOKEN is injected automatically by GitLab. Comments appear as the GitLab project bot.

See .gitlab-ci.yml.example for all RAG embedding and vector store combinations:

Setup	Embedder	Store	Extra requirements
A — Recommended	OpenAI	Local JSONL (cached)	`OPENAI_API_KEY`
B — Self-hosted	OpenAI	Qdrant (GitLab service)	`OPENAI_API_KEY`
C — Managed cloud	OpenAI	Pinecone	`OPENAI_API_KEY` + `PINECONE_API_KEY`
D — Fully private	Ollama (GitLab service)	Local JSONL	Privileged runner
E — No RAG	—	—	Nothing extra

Bitbucket Pipelines

hljs language-yaml

# bitbucket-pipelines.yml
pipelines:
  pull-requests:
    '**':
      - step:
          name: Merlin AI Review
          image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
          script:
            - merlin review
          variables:
            BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN   # automatic — no setup needed
            ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY

With RAG index caching:

hljs language-yaml

pipelines:
  pull-requests:
    '**':
      - step:
          name: Merlin AI Review
          image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
          caches:
            - merlin-rag
          script:
            - test -f merlin-rag.jsonl || merlin rag index .
            - merlin review
          variables:
            BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN
            ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
            OPENAI_API_KEY: $OPENAI_API_KEY

definitions:
  caches:
    merlin-rag:
      key:
        files:
          - src/**
      path: merlin-rag.jsonl

Repository variables to configure (Repository settings → Pipelines → Repository variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

BITBUCKET_STEP_TOKEN is created automatically per step. Comments appear as the Pipelines build service user — no bot account needed.

Azure DevOps

hljs language-yaml

# azure-pipelines.yml
trigger: none

pr:
  branches:
    include:
      - '*'

pool:
  vmImage: ubuntu-latest

container:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

steps:
  - checkout: self
    fetchDepth: 0

  - script: merlin review
    displayName: Merlin AI Review
    env:
      AZURE_DEVOPS_TOKEN: $(System.AccessToken)
      ANTHROPIC_API_KEY: $(ANTHROPIC_API_KEY)
      SYSTEM_TEAMFOUNDATIONCOLLECTIONURI: $(System.TeamFoundationCollectionUri)
      SYSTEM_TEAMPROJECT: $(System.TeamProject)
      BUILD_REPOSITORY_NAME: $(Build.Repository.Name)
      BUILD_SOURCEBRANCH: $(Build.SourceBranch)
      SYSTEM_PULLREQUEST_PULLREQUESTID: $(System.PullRequest.PullRequestId)

One-time pipeline setup:

Pipeline variables to configure (Pipelines → Edit → Variables):

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider
`OPENAI_API_KEY`	Only for RAG	Codebase embeddings

Comments appear as Project Collection Build Service ({org}) — no bot account needed.

Gitea Actions

hljs language-yaml

# .gitea/workflows/merlin-review.yml
name: Merlin AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  merlin-review:
    name: Merlin AI Review
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}   # automatic (Gitea 1.21+)
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

Secrets to configure (Repository Settings → Secrets):

Secret	Required	Purpose
`ANTHROPIC_API_KEY`	Yes	AI review provider

secrets.GITEA_TOKEN is created automatically by Gitea Actions (v1.21+). Comments appear as gitea-actions[bot] — no bot account needed.

Permissions & Bot Identity

All platforms: bot identity is automatic

Every platform provides a built-in CI token. Merlin uses it to post comments as a bot — no manual bot account or GitHub App required.

Platform	Token to use	Comments appear as	Extra setup
GitHub Actions	`secrets.GITHUB_TOKEN`	`github-actions[bot]`	Add `permissions` block (see below)
GitLab CI	`CI_JOB_TOKEN`	GitLab project bot	None
Bitbucket Pipelines	`BITBUCKET_STEP_TOKEN`	Pipelines build service	None
Azure DevOps	`System.AccessToken`	Project Collection Build Service	Enable OAuth token in pipeline settings
Gitea Actions	`secrets.GITEA_TOKEN`	`gitea-actions[bot]`	None (Gitea 1.21+)

Comments appearing under your personal account? You are passing a Personal Access Token (PAT) instead of the platform's automatic token. Switch to the token in the table above and the bot identity is restored automatically.

GitHub: required permissions block

GitHub defaults to a read-only token. Add this block at the workflow level or the API returns 403 Forbidden:

hljs language-yaml

permissions:
  contents: read        # needed by actions/checkout
  pull-requests: write  # needed to read the PR diff and post inline comments

GitHub: custom named bot (optional)

To post as "Merlin AI Reviewer[bot]" instead of github-actions[bot]:

Go to GitHub Settings → Developer Settings → GitHub Apps → New GitHub App
Set permissions: Pull requests: Read & write, Contents: Read. Disable webhooks.
Install the app on your repository.
Store the App ID and private key as secrets (MERLIN_APP_ID, MERLIN_APP_PRIVATE_KEY).
Generate a token in the workflow:

hljs language-yaml

permissions:
  contents: read
  pull-requests: write

jobs:
  merlin-review:
    runs-on: ubuntu-latest
    steps:
      - name: Generate bot token
        id: app-token
        uses: actions/create-github-app-token@v1
        with:
          app-id: ${{ secrets.MERLIN_APP_ID }}
          private-key: ${{ secrets.MERLIN_APP_PRIVATE_KEY }}

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Merlin Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ steps.app-token.outputs.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: merlin review

This step is optional. github-actions[bot] works out of the box with zero configuration.

AI Providers

Merlin auto-detects which provider to use based on the environment variables present, or you can pin one in merlin.toml.

Provider	`provider` value	Key env var	Notes
Anthropic Claude	`anthropic`	`ANTHROPIC_API_KEY`	Default
OpenAI	`openai`	`OPENAI_API_KEY`	GPT-4o, GPT-4o-mini
Google Gemini	`gemini`	`GEMINI_API_KEY`	Gemini 1.5 Pro / Flash
AWS Bedrock	`bedrock`	`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`	Claude on Bedrock
Azure OpenAI	`azure-openai`	`AZURE_OPENAI_API_KEY`	Custom deployment
Ollama	`ollama`	(none)	Local, fully private
Claude Code CLI	`claude-code`	`CLAUDE_CODE_TOKEN`	No API key needed
Groq	`groq`	`GROQ_API_KEY`	Llama 3, Mixtral — ultra-fast
Together AI	`together-ai`	`TOGETHER_API_KEY`	100+ open-source models
DeepSeek	`deep-seek`	`DEEPSEEK_API_KEY`	DeepSeek Coder / Chat
Mistral AI	`mistral`	`MISTRAL_API_KEY`	Mistral, Codestral
OpenRouter	`open-router`	`OPENROUTER_API_KEY`	Gateway to 200+ models

Anthropic Claude (default)

hljs language-toml

# merlin.toml
[ai]
provider   = "anthropic"
model      = "claude-sonnet-4-6"   # or claude-opus-4-6, claude-haiku-4-5-20251001
max_tokens = 4096

hljs language-bash

export ANTHROPIC_API_KEY=sk-ant-...
merlin review

Get a key at console.anthropic.com.

OpenAI GPT-4o

hljs language-toml

[ai]
provider = "openai"
model    = "gpt-4o"   # or gpt-4o-mini, gpt-4-turbo

hljs language-bash

export OPENAI_API_KEY=sk-...
merlin review

OPENAI_API_KEY also powers RAG embeddings when embedder = "openai" — one key for both.

Google Gemini

hljs language-toml

[ai]
provider = "gemini"
model    = "gemini-1.5-pro"   # or gemini-2.0-flash, gemini-1.5-flash

hljs language-bash

export GEMINI_API_KEY=AIza...
merlin review

Get a key from Google AI Studio.

AWS Bedrock

hljs language-toml

[ai]
provider        = "bedrock"
model           = "anthropic.claude-sonnet-4-6-20250514-v1:0"
bedrock_region  = "us-east-1"

hljs language-bash

export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...   # optional, for temporary credentials
merlin review

The IAM role/user needs the bedrock:InvokeModel permission for the chosen model ARN.

Azure OpenAI

hljs language-toml

[ai]
provider                 = "azure-openai"
model                    = "gpt-4o"
azure_openai_endpoint    = "https://my-resource.openai.azure.com"
azure_openai_deployment  = "my-gpt4o-deployment"

hljs language-bash

export AZURE_OPENAI_API_KEY=...
merlin review

Ollama (local — fully private)

No API key required. All processing stays on your machine.

hljs language-toml

[ai]
provider        = "ollama"
model           = "llama3.1"   # any model pulled with `ollama pull`
ollama_base_url = "http://localhost:11434"

hljs language-bash

ollama serve
ollama pull llama3.1
merlin review

Good local models for code review: codellama, deepseek-coder, qwen2.5-coder.

Claude Code CLI

For teams with a Claude Code subscription — no ANTHROPIC_API_KEY needed.

hljs language-toml

[ai]
provider = "claude-code"
model    = "claude-sonnet-4-6"

hljs language-bash

# Developer machine (interactive)
claude auth login

# CI (headless)
claude auth login --token $CLAUDE_CODE_TOKEN
merlin review

Set CLAUDE_CODE_TOKEN as a CI secret. The token is obtained from your Claude Code account settings.

Groq

Ultra-fast open-source inference. Llama 3.3 70B reviews typically complete in under 3 seconds.

hljs language-toml

[ai]
provider = "groq"
model    = "llama-3.3-70b-versatile"   # or mixtral-8x7b-32768, gemma2-9b-it

hljs language-bash

export GROQ_API_KEY=gsk_...
merlin review

Get a free key at console.groq.com. Recommended models for code review:

Model	Context	Speed
`llama-3.3-70b-versatile`	128k	Fast
`mixtral-8x7b-32768`	32k	Very fast
`llama-3.1-8b-instant`	128k	Fastest

Together AI

Access to 100+ open-source models including Llama, Mistral, Qwen, and DBRX.

hljs language-toml

[ai]
provider = "together-ai"
model    = "meta-llama/Llama-3.3-70B-Instruct-Turbo"

hljs language-bash

export TOGETHER_API_KEY=...
merlin review

Get a key at api.together.ai. Recommended models:

Model	Notes
`meta-llama/Llama-3.3-70B-Instruct-Turbo`	Best quality
`meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo`	Smaller, faster
`mistralai/Mixtral-8x7B-Instruct-v0.1`	Strong coding ability
`Qwen/Qwen2.5-Coder-32B-Instruct`	Specialised for code

DeepSeek

Strong coding-focused models with competitive pricing.

hljs language-toml

[ai]
provider = "deep-seek"
model    = "deepseek-coder"   # or deepseek-chat

hljs language-bash

export DEEPSEEK_API_KEY=...
merlin review

Get a key at platform.deepseek.com.

Mistral AI

Mistral's own models plus Codestral, which is fine-tuned for code tasks.

hljs language-toml

[ai]
provider = "mistral"
model    = "codestral-latest"   # or mistral-large-latest, open-mistral-nemo

hljs language-bash

export MISTRAL_API_KEY=...
merlin review

Get a key at console.mistral.ai. codestral-latest is recommended for code review tasks.

OpenRouter

A unified gateway to 200+ models from OpenAI, Anthropic, Meta, Mistral, Google, and more — including many free tiers.

hljs language-toml

[ai]
provider = "open-router"
model    = "meta-llama/llama-3.3-70b-instruct"   # or any model on openrouter.ai/models

hljs language-bash

export OPENROUTER_API_KEY=sk-or-...
merlin review

Get a key at openrouter.ai. Use any model slug from the model list. OpenRouter is useful for:

- Accessing models not available in your region
- Comparing multiple providers with a single API key
- Free-tier access to powerful open-source models

RAG — Context-Aware Reviews

When to use RAG

- Large codebases where a diff touches shared utilities or interfaces
- Reviews that need to understand how changed code is called elsewhere
- Finding similar patterns or security issues across the repo

Setup in merlin.toml

hljs language-toml

[rag]
enabled          = true
embedder         = "openai"              # "openai" for CI, "ollama" for local
store            = "local"              # see vector store table below
embed_model      = "text-embedding-3-small"
collection       = "merlin"
top_k            = 5                    # number of relevant chunks to inject
min_score        = 0.70                 # similarity threshold (0.0–1.0)
chunk_lines      = 80                   # lines per indexed chunk
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]
local_path       = "merlin-rag.jsonl"  # local store file path
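
How `chunk_lines`, `top_k`, and `min_score` interact can be sketched in a few lines of Python. This is an illustration only, not Merlin's implementation: the function names, the `(vector, chunk)` index layout, the toy two-dimensional vectors standing in for real embeddings, and the choice of cosine similarity are all assumptions.

```python
import math


def chunk_lines(text: str, size: int = 80) -> list[str]:
    """Split a file into consecutive chunks of at most `size` lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, indexed, top_k=5, min_score=0.70):
    """Return up to `top_k` (score, chunk) pairs scoring at least `min_score`.

    `indexed` is a list of (vector, chunk_text) pairs (hypothetical layout).
    """
    scored = [(cosine(query_vec, vec), chunk) for vec, chunk in indexed]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy vectors stand in for real embeddings.
index = [
    ([1.0, 0.0], "fn verify_token(...)"),
    ([0.8, 0.6], "fn refresh_session(...)"),
    ([0.0, 1.0], "fn render_footer(...)"),
]
hits = retrieve([1.0, 0.0], index, top_k=2)
```

Raising `min_score` trades recall for precision: fewer chunks reach the prompt, but the ones that do are closer to the diff. Raising `top_k` does the opposite and costs more prompt tokens.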

Embedding backends

Embedder	Best for	Model	Requires
`openai`	CI/CD — any runner	`text-embedding-3-small`	`OPENAI_API_KEY`
`ollama`	Local dev — free, fully private	`nomic-embed-text`	`ollama serve`

Vector stores

Store	Setup	Best for
`local`	None — single JSONL file, cache in CI	Small/medium repos, CI
`memory`	None — ephemeral	Testing
`qdrant`	`docker run -p 6333:6333 qdrant/qdrant`	Production, self-hosted
`chroma`	`docker run -p 8000:8000 chromadb/chroma`	Open-source alternative
`pinecone`	Managed cloud account	Zero-ops managed

Local development (Ollama + local store)

hljs language-toml

# merlin.toml
[rag]
enabled  = true
embedder = "ollama"
store    = "local"

hljs language-bash

ollama pull nomic-embed-text
merlin rag index .         # index once (a few seconds for most repos)
merlin review              # reviews now include codebase context

CI/CD (OpenAI + cached JSONL)

hljs language-toml

# merlin.toml
[rag]
enabled     = true
embedder    = "openai"
embed_model = "text-embedding-3-small"
store       = "local"

Add to your GitHub Actions workflow:

hljs language-yaml

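- name: Cache RAG index
  id: rag-cache
  uses: actions/cache@v4
  with:
    path: merlin-rag.jsonl
    key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}
    restore-keys: merlin-rag-

# cache-hit is 'true' only on an exact key match, so a stale index
# restored through restore-keys is still rebuilt.
- name: Build RAG index (on cache miss or changed sources)
  if: steps.rag-cache.outputs.cache-hit != 'true'
  run: merlin rag index .
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}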

Indexing a typical 10k-file repo costs around $0.10 in OpenAI embedding credits and only re-runs when source files change.
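
The cache key is what ties re-indexing to source changes: `hashFiles` produces a digest of the matched files, so any edit yields a new key. The principle can be sketched in Python as below. The helper name `source_digest` is hypothetical, and the digest it produces is not the value GitHub computes; only the idea of a content-derived key is the same.

```python
import hashlib
import tempfile
from pathlib import Path


def source_digest(root: Path, patterns=("src/**/*", "lib/**/*")) -> str:
    """Hash the relative paths and contents of all files matching `patterns`."""
    files = sorted({p for pat in patterns for p in root.glob(pat) if p.is_file()})
    digest = hashlib.sha256()
    for path in files:
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "main.rs").write_text("fn main() {}\n")
    before = source_digest(root)
    unchanged = source_digest(root)  # same files, same digest
    (root / "src" / "main.rs").write_text("fn main() { run(); }\n")
    after = source_digest(root)      # content changed, digest changes

reindex_needed = before != after
```

If your sources live outside `src/` and `lib/`, the patterns passed to `hashFiles` need to cover those directories too, otherwise edits there will not produce a new key.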

Production (Qdrant persistent store)

hljs language-toml

# merlin.toml
[rag]
enabled    = true
embedder   = "openai"
store      = "qdrant"
qdrant_url = "http://localhost:6333"
# qdrant_api_key = ""   # required for Qdrant Cloud

The index persists in Qdrant between CI runs — no file caching step needed.

Custom Rules Engine

Example `.merlin-rules.yaml`

hljs language-yaml

rules:
  # Regex pattern rule — flags matching code in the diff
  - name: no-unwrap
    pattern: "unwrap\\(\\)"
    severity: high
    message: "Avoid unwrap() in production code — use ? or expect() with context"

  # Natural-language directive — injected into the AI system prompt
  - name: require-error-handling
    directive: "All public API functions must handle errors explicitly with Result"

  # Path-scoped rule — only applies to files matching the glob
  - name: auth-review
    path_match: "src/auth/**"
    directive: "Flag any changes to authentication logic as Critical severity"

  # Combined: regex + path scope
  - name: no-sql-string-concat
    pattern: "format!.*SELECT|format!.*INSERT|format!.*UPDATE"
    path_match: "src/db/**"
    severity: critical
    message: "Never build SQL with string formatting — use parameterised queries"

Configuration

hljs language-toml

# merlin.toml
[review]
rules_file = ".merlin-rules.yaml"   # default path

Rules are loaded automatically on every review run. Regex pattern matches are prepended to the AI context as hints, and directive rules are appended to the system prompt as numbered team rules.
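
A rough Python sketch of that flow follows. It is not Merlin's code: the dictionary form of the rules, the helper names, and the hint format are assumptions, and `fnmatch` is only an approximation of glob matching because its `*` also crosses `/`.

```python
import re
from fnmatch import fnmatch

# Hypothetical in-memory form of a few rules from the YAML example above.
RULES = [
    {"name": "no-unwrap", "pattern": r"unwrap\(\)", "severity": "high",
     "message": "Avoid unwrap() in production code"},
    {"name": "require-error-handling",
     "directive": "All public API functions must handle errors explicitly with Result"},
    {"name": "auth-review", "path_match": "src/auth/**",
     "directive": "Flag any changes to authentication logic as Critical severity"},
]


def applies(rule: dict, path: str) -> bool:
    """A rule without path_match applies everywhere; otherwise the glob must match."""
    glob = rule.get("path_match")
    return glob is None or fnmatch(path, glob)


def pattern_hints(rules, path, added_lines):
    """Collect one hint per (rule, line) where a regex rule matches an added line."""
    hints = []
    for rule in rules:
        if "pattern" not in rule or not applies(rule, path):
            continue
        regex = re.compile(rule["pattern"])
        for number, line in added_lines:
            if regex.search(line):
                hints.append(f"[{rule['severity']}] {path}:{number} {rule['message']}")
    return hints


def numbered_directives(rules, path):
    """Render the directives that apply to `path` as a numbered list."""
    active = [r["directive"] for r in rules if "directive" in r and applies(r, path)]
    return "\n".join(f"{i}. {text}" for i, text in enumerate(active, start=1))


hints = pattern_hints(RULES, "src/db/pool.rs", [(12, "let c = pool.get().unwrap();")])
team_rules = numbered_directives(RULES, "src/auth/login.rs")
```

Here `hints` would be placed ahead of the diff in the model context, while `team_rules` would be appended to the system prompt. A file outside `src/auth/` receives only the first directive.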

Adaptive Feedback Learning

Merlin learns from your team's reactions to review comments. Over time, comment patterns that are consistently rejected (👎) are auto-suppressed, reducing noise without manual rule authoring.

How it works

1. React to review comments with 👍 (accept) or 👎 (reject)
2. Merlin records each reaction, keyed by a category:title pattern
3. Once a pattern has at least 5 recorded reactions and a rejection rate above 70%, comments matching it are suppressed
4. Run /feedback to see the current learning status and the suppressed patterns
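
The suppression threshold can be expressed as a small Python function. This is a sketch under assumptions: the event layout and the function name are hypothetical, and only the two thresholds (5 events, more than 70% rejected) come from the description above.

```python
from collections import Counter


def suppressed_patterns(events, min_events=5, max_rejection=0.70):
    """Return patterns with at least `min_events` reactions and a rejection
    rate strictly above `max_rejection`.

    `events` is an iterable of (pattern, accepted) pairs, where `pattern`
    is a "category:title" string and `accepted` is True for a thumbs-up
    and False for a thumbs-down.
    """
    totals, rejects = Counter(), Counter()
    for pattern, accepted in events:
        totals[pattern] += 1
        if not accepted:
            rejects[pattern] += 1
    return {
        pattern
        for pattern, count in totals.items()
        if count >= min_events and rejects[pattern] / count > max_rejection
    }


events = (
    [("style:line-too-long", False)] * 4 + [("style:line-too-long", True)]  # 4 of 5 rejected
    + [("security:sql-injection", True)] * 6                                 # never rejected
    + [("perf:clone-in-loop", False)] * 3                                    # only 3 events
)
result = suppressed_patterns(events)
```

Only `style:line-too-long` is suppressed here: it has enough events and an 80% rejection rate. The performance pattern is rejected every time but has too few events to count.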

Configuration

hljs language-toml

# merlin.toml
[review]
feedback_learning = true                     # enable adaptive filtering
feedback_path     = ".merlin-feedback.jsonl"  # where to store feedback data

Viewing status

hljs language-bash

merlin run /feedback
# or from a PR comment:
@merlin /feedback

The feedback file uses a simple JSONL format. Commit it to your repo so the entire team benefits from shared learning, or add it to .gitignore for per-environment learning.

Slash Commands

Trigger commands from a PR comment using @merlin /command, or run them directly in CI with merlin run /command.

Command	What it does	Output
`/review`	Full code review with inline comments	Inline comments + summary
`/spec`	Generate a technical specification	Updates PR description
`/describe`	Auto-generate PR title and description	Updates PR description
`/ask <question>`	Q&A about the PR diff	PR comment
`/improve`	Inline code suggestion blocks	PR suggestion comments
`/generate_labels`	Auto-label based on diff content and size	PR labels
`/update_changelog`	Prepend an entry to CHANGELOG.md	File commit
`/add_doc`	Generate missing docstrings	PR suggestion comments
`/similar_issue`	Find related open issues	PR comment table
`/test`	Generate unit tests for changed code	PR comment
`/explain`	Plain-language walkthrough of the PR	PR comment
`/security`	Dedicated security scan (secrets + OWASP Top 10)	Inline + summary report
`/approve`	AI-assisted review verdict	PR review submission
`/commit_message`	Generate 3 conventional commit message options	PR comment
`/docs [mode]`	Generate docs (`readme`/`api`/`adr`/`module`/`wiki`/`auto`)	PR comment or file commit
`/snyk`	Scan changed dependencies against Snyk database	PR comment
`/coverage`	Analyse test coverage for changed files	PR comment
`/link_jira`	Find and link related Jira issues	PR comment
`/link_linear`	Find and link related Linear issues	PR comment
`/triage`	Find similar open issues on CodeTriage	PR comment
`/diagram`	Generate a Mermaid architecture diagram of the PR changes	PR comment
`/feedback`	Show adaptive feedback learning status and suppressed patterns	PR comment

Examples

hljs language-bash

# Run from CI
merlin run /review
merlin run /security
merlin run /ask "Is this change safe to deploy on Friday?"
merlin run /docs readme

# Trigger from a PR comment (requires webhook/bot mode)
@merlin /review
@merlin /ask "What is the performance impact of this change?"
@merlin /spec
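
To make the comment syntax concrete, here is a minimal Python sketch of how a `@merlin /command` line could be pulled out of a PR comment. The function name and the tokenising rules (shell-style quoting via `shlex`) are assumptions for illustration; Merlin's actual parser may behave differently.

```python
import shlex

MENTION = "@merlin"


def parse_command(comment: str):
    """Return (command, args) for the first '@merlin /command ...' line, else None."""
    for line in comment.splitlines():
        line = line.strip()
        if not line.startswith(MENTION):
            continue
        try:
            tokens = shlex.split(line[len(MENTION):])
        except ValueError:
            # Unbalanced quotes: skip the line instead of guessing.
            continue
        if tokens and tokens[0].startswith("/") and len(tokens[0]) > 1:
            return tokens[0][1:], tokens[1:]
    return None


ask = parse_command('@merlin /ask "What is the performance impact of this change?"')
docs = parse_command("Thanks!\n@merlin /docs readme")
none = parse_command("Looks good to me")
```

Quoting the question keeps it as a single argument, which is why the `/ask` examples above wrap the question in double quotes.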

Webhook & Bot Mode

Bot mode runs Merlin as a persistent HTTP server. It listens for PR comment events and automatically dispatches slash commands — no CI trigger needed.

hljs language-bash

merlin webhook --port 8080

Configure your VCS platform to send webhook events to:

Platform	Webhook URL	Event type
GitHub	`http://your-host:8080/webhook/github`	`issue_comment`
GitLab	`http://your-host:8080/webhook/gitlab`	Note Hook

Securing the webhook

hljs language-bash

# GitHub HMAC secret
export MERLIN_GITHUB_SECRET=your-secret
merlin webhook --port 8080

# GitLab token
export MERLIN_GITLAB_SECRET=your-token
merlin webhook --port 8080

Set the same value in your platform's webhook settings: the "Secret" field on GitHub, or "Secret token" on GitLab.
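
For reference, the two platforms verify differently. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header, while GitLab sends the configured token verbatim in `X-Gitlab-Token`. The Python sketch below shows both checks. The function names are hypothetical and this is not Merlin's implementation, only the general technique a receiver would use.

```python
import hashlib
import hmac


def verify_github(secret: str, body: bytes, signature_header) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode(), (signature_header or "").encode())


def verify_gitlab(secret: str, token_header) -> bool:
    """Check GitLab's X-Gitlab-Token header, which carries the secret itself."""
    return hmac.compare_digest(secret.encode(), (token_header or "").encode())


# Simulate what GitHub would send for a given payload and secret.
body = b'{"action": "created"}'
good_signature = "sha256=" + hmac.new(b"your-secret", body, hashlib.sha256).hexdigest()
```

The signature must be computed over the exact bytes received. Re-serialising the parsed JSON before hashing will usually produce a mismatch.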

Running as a service

hljs language-yaml

# docker-compose.yml
services:
  merlin:
    image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest
    command: webhook --port 8080
    ports:
      - "8080:8080"
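    environment:
      # Interpolated from the host shell or an adjacent .env file; avoid committing real secrets.
      GITHUB_TOKEN: ${GITHUB_TOKEN}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      MERLIN_GITHUB_SECRET: ${MERLIN_GITHUB_SECRET}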
    restart: unless-stopped

Autonomous Agent

The agent runs a ReAct (Reason + Act) loop — it plans, executes slash commands as tools, and iterates until the task is complete.
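
The shape of such a loop can be sketched in Python. Everything here is illustrative: `plan` stands in for the model call, the tuple protocol and the stub tool are assumptions, and only the bounded plan-act-observe structure reflects the description above.

```python
def run_agent(task, plan, tools, max_iterations=10):
    """Run a reason-then-act loop until `plan` finishes or the step budget runs out.

    `plan(task, history)` stands in for the model call. It returns either
    ("finish", answer) or ("act", tool_name, argument).
    """
    history = []
    for _ in range(max_iterations):
        decision = plan(task, history)
        if decision[0] == "finish":
            return decision[1], history
        _, tool_name, argument = decision
        observation = tools[tool_name](argument)
        history.append((tool_name, argument, observation))
    return None, history  # budget exhausted without a final answer


# Scripted planner and stub tool so the example runs without a model.
def scripted_plan(task, history):
    if not history:
        return ("act", "review", "PR #42")
    return ("finish", f"Posted summary after {len(history)} tool call(s)")


tools = {"review": lambda target: f"2 findings in {target}"}
answer, steps = run_agent("Review PR #42 and post a summary comment", scripted_plan, tools)
```

The iteration cap matters because a planner that never decides to finish would otherwise loop forever. With the cap, the caller gets `None` back and can report that the task was not completed.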

CLI mode (interactive REPL)

hljs language-bash

merlin agent
# > Summarise the open PRs and flag any that touch auth code
# > Review PR #42 with a focus on SQL injection risks

Single-shot (CI-friendly)

hljs language-bash

merlin agent --task "Review PR #42 and post a summary comment"

Slack

hljs language-bash

export SLACK_BOT_TOKEN=xoxb-...
merlin agent --channel slack --port 8090

The agent listens on port 8090 for Slack Events API calls. Configure your Slack app's Event Subscriptions URL to http://your-host:8090.

Discord

hljs language-bash

export DISCORD_BOT_TOKEN=...
export DISCORD_CHANNEL_ID=...
merlin agent --channel discord

Agent configuration

hljs language-toml

# merlin.toml
[agent]
max_iterations      = 10          # max ReAct loop steps
max_memory_messages = 50          # context window for conversation history
memory_file         = ".merlin-memory.jsonl"   # persist memory across runs
default_channel     = "cli"
port                = 8090
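
The effect of `max_memory_messages` can be pictured as a bounded buffer that drops the oldest message once full. The Python sketch below uses a hypothetical `RingMemory` class and an assumed `role`/`content` message shape; the on-disk format of the real memory file is not specified here.

```python
import json
from collections import deque


class RingMemory:
    """Keep only the most recent `max_messages` messages (illustrative only)."""

    def __init__(self, max_messages: int = 50):
        self.messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        # When the deque is full, appending silently evicts the oldest entry.
        self.messages.append({"role": role, "content": content})

    def to_jsonl(self) -> str:
        """Serialise one JSON object per line, the general shape of a JSONL file."""
        return "\n".join(json.dumps(m) for m in self.messages)


memory = RingMemory(max_messages=3)
for i in range(5):
    memory.add("user", f"message {i}")
```

After five additions with a capacity of three, only messages 2, 3, and 4 remain. A larger limit preserves more conversation history at the cost of a longer prompt on each step.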

Configuration Reference

Copy config.example.toml to merlin.toml in your repo root. All fields are optional — Merlin works with zero configuration.

hljs language-toml

# merlin.toml

[ai]
# Provider: "anthropic" | "openai" | "claude-code" | "gemini" | "bedrock" | "azure-openai" | "ollama"
#         | "groq" | "together-ai" | "deep-seek" | "mistral" | "open-router"
provider    = "anthropic"
model       = "claude-sonnet-4-6"
max_tokens  = 4096
temperature = 0.2   # lower = more consistent; 0.0–1.0

# Provider-specific options (uncomment as needed)
# ollama_base_url          = "http://localhost:11434"
# azure_openai_endpoint    = "https://my-resource.openai.azure.com"
# azure_openai_deployment  = "my-deployment"
# bedrock_region           = "us-east-1"

[review]
# Review categories to focus on
focus        = ["bugs", "security", "style", "performance"]

# Max inline comments per run (prevents PR spam)
max_comments = 30

# Lines per diff chunk sent to the AI
chunk_lines  = 200

# Second AI pass to filter false positives (slower but more accurate)
reflect      = false

# Custom rules file (regex patterns + natural-language directives)
rules_file   = ".merlin-rules.yaml"

# Adaptive feedback learning — suppresses consistently rejected comment patterns
feedback_learning = false
feedback_path     = ".merlin-feedback.jsonl"

[platform]
# Auto-detected from CI env vars. Override only if needed:
# type = "github"   # "github" | "gitlab" | "bitbucket" | "azure-devops" | "gitea"

[rag]
enabled          = false
embedder         = "openai"              # "openai" | "ollama"
store            = "local"              # "local" | "memory" | "qdrant" | "chroma" | "pinecone"
embed_model      = "text-embedding-3-small"
collection       = "merlin"
top_k            = 5
min_score        = 0.70
chunk_lines      = 80
local_path       = "merlin-rag.jsonl"
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]

# Qdrant
# qdrant_url     = "http://localhost:6333"
# qdrant_api_key = ""

# ChromaDB
# chroma_url = "http://localhost:8000"

# Pinecone
# pinecone_host    = "https://my-index.svc.us-east1.pinecone.io"
# pinecone_api_key = ""   # or set PINECONE_API_KEY env var

[agent]
max_iterations      = 10
max_memory_messages = 50
# memory_file       = ".merlin-memory.jsonl"
default_channel     = "cli"
port                = 8090

Environment Variables

All secrets are read from environment variables — never put them in merlin.toml.

AI providers

Variable	Provider
`ANTHROPIC_API_KEY`	Anthropic Claude
`OPENAI_API_KEY`	OpenAI (review and/or RAG embeddings)
`GEMINI_API_KEY`	Google Gemini
`AZURE_OPENAI_API_KEY`	Azure OpenAI
`AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`	AWS Bedrock
`AWS_SESSION_TOKEN`	AWS Bedrock (temporary credentials)
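`CLAUDE_CODE_TOKEN`	Claude Code CLI headless auth
`GROQ_API_KEY`	Groq
`TOGETHER_API_KEY`	Together AI
`DEEPSEEK_API_KEY`	DeepSeek
`MISTRAL_API_KEY`	Mistral AI
`OPENROUTER_API_KEY`	OpenRouter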

VCS platforms

Variable	Platform	Notes
`GITHUB_TOKEN`	GitHub	Auto-provided by Actions
`GITLAB_TOKEN`	GitLab	Use `$CI_JOB_TOKEN` in CI
`BITBUCKET_TOKEN`	Bitbucket	Use `$BITBUCKET_STEP_TOKEN` in CI
`AZURE_DEVOPS_TOKEN`	Azure DevOps	Use `$(System.AccessToken)` in CI
`GITEA_TOKEN`	Gitea	Auto-provided by Gitea Actions 1.21+

Integrations

Variable	Purpose
`PINECONE_API_KEY`	Pinecone vector store
`SNYK_TOKEN`	Snyk dependency scanning (`/snyk` command)
`JIRA_TOKEN`	Jira issue linking (`/link_jira` command)
`LINEAR_API_KEY`	Linear issue linking (`/link_linear` command)
`SLACK_BOT_TOKEN`	Slack agent channel
`DISCORD_BOT_TOKEN`	Discord agent channel
`DISCORD_CHANNEL_ID`	Discord channel to post in
`MERLIN_GITHUB_SECRET`	HMAC secret for GitHub webhook verification
`MERLIN_GITLAB_SECRET`	Token for GitLab webhook verification

CLI Reference

hljs language-bash

# ── Review ────────────────────────────────────────────────────────────────────
merlin review                                   # CI review (auto-detects platform)
merlin review --diff path/to/changes.diff       # local review, no platform posting
merlin review --diff changes.diff --output json # machine-readable output

# ── Slash commands ────────────────────────────────────────────────────────────
merlin run /review
merlin run /spec
merlin run /describe
merlin run /ask "Is this change thread-safe?"
merlin run /improve
merlin run /generate_labels
merlin run /update_changelog
merlin run /add_doc
merlin run /similar_issue
merlin run /test
merlin run /explain
merlin run /security
merlin run /approve
merlin run /commit_message
merlin run /snyk
merlin run /coverage
merlin run /link_jira
merlin run /link_linear
merlin run /triage
merlin run /diagram
merlin run /feedback
merlin run /docs              # auto-detect best doc type
merlin run /docs readme       # generate README section
merlin run /docs api          # generate API reference
merlin run /docs adr          # generate Architecture Decision Record
merlin run /docs module       # generate module docstrings
merlin run /docs wiki         # generate wiki page

# ── RAG index ─────────────────────────────────────────────────────────────────
merlin rag index .                              # index current directory
merlin rag index src/                           # index a subdirectory
merlin rag search "auth bypass"                 # semantic search
merlin rag search "SQL injection" -k 10         # return up to 10 results
merlin rag count                                # number of indexed documents
merlin rag clear                                # delete all indexed data

# ── Webhook server ────────────────────────────────────────────────────────────
merlin webhook --port 8080

# ── Autonomous agent ──────────────────────────────────────────────────────────
merlin agent                                    # CLI REPL
merlin agent --channel slack                    # Slack Events API on --port 8090
merlin agent --channel discord                  # Discord bot
merlin agent --task "summarise the open PRs"    # single-shot, CI-friendly

# ── Debug ─────────────────────────────────────────────────────────────────────
merlin parse-diff path/to/changes.diff          # show parsed file structure + priority

Troubleshooting

`merlin: not found` (exit code 127) in CI

The binary cannot be found in PATH. This happens when:

- Using the Docker container approach on Alpine: the glibc binary won't run on musl. Use the image from GHCR, which ships the correct musl static binary.
- Using the binary install approach: check that the install script ran successfully before the merlin review step.

hljs language-yaml

# Correct container approach — uses musl binary, works on any runner
container:
  image: ghcr.io/arunachalamkalimuthu/merlin-ai-code-review:latest

`403 Forbidden` from GitHub API

Missing permissions block in the workflow. Add:

hljs language-yaml

permissions:
  contents: read
  pull-requests: write

Comments appear under my username instead of a bot

`Platform API error: HTTP 401 Unauthorized`

The VCS token is missing or invalid. Ensure the correct env var is set:

- GitHub: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- GitLab: GITLAB_TOKEN: $CI_JOB_TOKEN
- Bitbucket: BITBUCKET_TOKEN: $BITBUCKET_STEP_TOKEN
- Azure DevOps: AZURE_DEVOPS_TOKEN: $(System.AccessToken) (the OAuth token must also be enabled in the pipeline settings)

RAG index is rebuilt on every CI run

The cache key or cache path is misconfigured. Ensure the path matches local_path in merlin.toml (default: merlin-rag.jsonl) and the key pattern covers your source directories.

AI provider returns an error

- Check that the correct API key environment variable is set and not expired.
- Verify the model name is correct for your provider/region (especially for Bedrock where model IDs include region-specific suffixes).
- For Ollama, ensure ollama serve is running and the model has been pulled with ollama pull <model>.

Architecture

hljs language-scss

CLI (clap)
  ├── ReviewEngine
  │     ├── PlatformClient  (GitHub | GitLab | Bitbucket | Azure DevOps | Gitea)
  │     │     ├── get_diff()
  │     │     ├── post_inline_comment()
  │     │     └── post_summary()
  │     ├── DiffParser → Vec<FileDiff>
  │     │     └── prioritize_diffs()   (token-aware, security-ranked)
  │     ├── AiProvider  (Anthropic | OpenAI | Claude Code | Gemini | Bedrock | Azure OpenAI | Ollama)
  │     │     └── review(ReviewContext) → Vec<ReviewComment>
  │     └── RagPipeline  (optional)
  │           ├── Embedder
  │           │     ├── OllamaEmbedder   (local dev — free, private)
  │           │     └── OpenAiEmbedder   (CI/CD — any runner)
  │           └── VectorStore  (local | memory | qdrant | chroma | pinecone)
  │                 └── search() → Vec<RetrievedDoc> → injected into AI prompt
  │
  ├── RulesEngine  (custom review rules)
  │     └── .merlin-rules.yaml → regex patterns + directives → AI prompt injection
  │
  ├── FeedbackStore  (adaptive learning)
  │     └── .merlin-feedback.jsonl → accept/reject signals → auto-suppress noisy patterns
  │
  ├── ToolRouter  (slash commands)
  │     ├── /spec, /review, /describe, /ask, /improve, /diagram
  │     ├── /generate_labels, /update_changelog, /add_doc, /similar_issue
  │     ├── /test, /explain, /security, /approve, /commit_message, /feedback
  │     ├── /docs, /snyk, /coverage, /link_jira, /link_linear, /triage
  │     └── Webhook server (axum) — dispatches commands from PR comments
  │
  └── AgentRuntime  (ReAct loop)
        ├── AgentMemory  (ring-buffer + optional JSONL persistence)
        ├── AgentTools   (all slash commands + post_comment + get_pr_info + rag_search)
        └── AgentChannel
              ├── CliChannel     (stdin/stdout REPL)
              ├── SlackChannel   (axum webhook + chat.postMessage)
              └── DiscordChannel (REST polling + message reply)

Platform auto-detection — Merlin reads CI environment variables to detect the active platform automatically:

CI system	Detection variable
GitHub Actions	`GITHUB_ACTIONS=true`
GitLab CI	`GITLAB_CI=true`
Bitbucket Pipelines	`BITBUCKET_PIPELINE_UUID`
Azure DevOps	`TF_BUILD=True`
Gitea Actions	`GITEA_ACTIONS=true`
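
The table above translates into a short lookup. The Python sketch below is an assumption-laden illustration, not Merlin's detection code: the function name, the case-insensitive comparison, and the check order are all choices made for the example. In particular, Gitea's runner is reported to set `GITHUB_ACTIONS` as well for compatibility, which is why this sketch checks Gitea first; treat that ordering as an assumption.

```python
import os

# (platform, variable, expected value or None for "any non-empty value")
DETECTORS = [
    ("gitea", "GITEA_ACTIONS", "true"),
    ("github", "GITHUB_ACTIONS", "true"),
    ("gitlab", "GITLAB_CI", "true"),
    ("bitbucket", "BITBUCKET_PIPELINE_UUID", None),
    ("azure-devops", "TF_BUILD", "True"),
]


def detect_platform(env=None, override=None):
    """Return the platform name, honouring an explicit override first."""
    if override:
        return override
    env = os.environ if env is None else env
    for platform, variable, expected in DETECTORS:
        value = env.get(variable, "")
        if value and (expected is None or value.lower() == expected.lower()):
            return platform
    return None


gitlab = detect_platform({"GITLAB_CI": "true"})
azure = detect_platform({"TF_BUILD": "True"})
forced = detect_platform({"GITLAB_CI": "true"}, override="gitea")
```

The `override` argument mirrors the `type` key in the `[platform]` table: when it is set, the environment is not consulted at all.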

Building from Source

hljs language-bash

# Prerequisites: Rust 1.85+ (see rust-toolchain.toml)

# Clone
git clone https://github.com/Arunachalamkalimuthu/merlin-ai-code-review
cd merlin-ai-code-review

# Development build
cargo build

# Release binary
cargo build --release
./target/release/merlin --version

# Run tests
cargo test

# Lint
cargo clippy --all-targets --all-features -- -D warnings

# Format
cargo fmt --all

# Docker (local build — uses glibc binary)
docker build -t merlin .

# Docker (CI image — uses musl static binary, requires pre-built dist/)
docker build -f Dockerfile.ci -t merlin-ci .

Local Development — Git Hooks

The repository ships pre-commit and pre-push hooks that mirror the CI checks exactly. Install them once after cloning:

hljs language-bash

bash scripts/install-hooks.sh

To also install the required CLI tools (typos, cargo-audit, cargo-deny):

hljs language-bash

bash scripts/install-hooks.sh --tools

What each hook runs

pre-commit — fast checks, runs on every git commit (~5–10 s):

Check	Command	What it catches
Format	`cargo fmt --all --check`	Unformatted code
Lint	`cargo clippy --all-targets --all-features -- -D warnings`	Warnings, bad patterns
Spell	`typos`	Typos in source, docs, config

pre-push — security checks, runs on every git push (~15–30 s):

Check	Command	What it catches
CVE scan	`cargo audit --deny warnings`	Known vulnerabilities in dependencies
Policy	`cargo deny check`	Licence violations, banned crates, duplicate deps

Skipping hooks

hljs language-bash

git commit --no-verify   # skip pre-commit checks
git push   --no-verify   # skip pre-push checks

Use --no-verify only for genuine emergencies. CI will catch any skipped issues.

Contributing

License

MIT — see LICENSE.
