Large File MCP Server

A Model Context Protocol (MCP) server for intelligent handling of large files with smart chunking, navigation, and streaming capabilities.

📚 Full Documentation | API Reference | Examples

Features

Smart Chunking - Automatically determines optimal chunk size based on file type
Intelligent Navigation - Jump to specific lines with surrounding context
Powerful Search - Regex support with context lines before/after matches
File Analysis - Comprehensive metadata and statistical analysis
Memory Efficient - Stream large files (up to MAX_FILE_SIZE, 10GB by default) without loading them fully into memory
Performance Optimized - Built-in LRU caching for frequently accessed chunks
Type Safe - Written in TypeScript with strict typing
Cross-Platform - Works on Windows, macOS, and Linux

Installation

```bash
npm install -g @willianpinho/large-file-mcp
```

Or use directly with npx:

```bash
npx @willianpinho/large-file-mcp
```

Quick Start

Claude Code CLI

Add the MCP server using the CLI:

```bash
# Add for current project only (local scope)
claude mcp add --transport stdio --scope local large-file-mcp -- npx -y @willianpinho/large-file-mcp

# Add globally for all projects (user scope)
claude mcp add --transport stdio --scope user large-file-mcp -- npx -y @willianpinho/large-file-mcp
```

Verify installation:

```bash
claude mcp list
claude mcp get large-file-mcp
```

Remove if needed:

```bash
# Remove from local scope
claude mcp remove large-file-mcp -s local

# Remove from user scope
claude mcp remove large-file-mcp -s user
```

MCP Scopes:

local - Available only in the current project directory
user - Available globally for all projects
project - Defined in .mcp.json for team sharing

Claude Desktop

Add to your claude_desktop_config.json:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "@willianpinho/large-file-mcp"]
    }
  }
}
```

Config file locations:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Restart Claude Desktop after editing.

Other AI Platforms

Gemini:

```json
{
  "tools": [
    {
      "name": "large-file-mcp",
      "command": "npx @willianpinho/large-file-mcp",
      "protocol": "mcp"
    }
  ]
}
```

Usage

Once configured, you can use natural language to interact with large files:

```text
Read the first chunk of /var/log/system.log
```

```text
Find all ERROR messages in /var/log/app.log
```

```text
Show me line 1234 of /code/app.ts with context
```

```text
Get the structure of /data/sales.csv
```

Available Tools

read_large_file_chunk

Read a specific chunk of a large file with intelligent chunking.

Parameters:

filePath (required): Absolute path to the file
chunkIndex (optional): Zero-based chunk index (default: 0)
linesPerChunk (optional): Lines per chunk (auto-detected if not provided)
includeLineNumbers (optional): Include line numbers (default: false)

Example:

```json
{
  "filePath": "/var/log/system.log",
  "chunkIndex": 0,
  "includeLineNumbers": true
}
```
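
To illustrate how a zero-based chunkIndex could map onto line numbers, here is a minimal TypeScript sketch. The helper name chunkLineRange and its return shape are hypothetical and not taken from the server's source, and the sketch ignores OVERLAP_LINES for simplicity.

```typescript
// Hypothetical helper: maps a zero-based chunk index to a 1-indexed,
// inclusive line range. Not taken from the server's source.
interface LineRange {
  startLine: number; // 1-indexed, inclusive
  endLine: number;   // 1-indexed, inclusive
}

function chunkLineRange(
  chunkIndex: number,
  linesPerChunk: number,
  totalLines: number
): LineRange | null {
  if (chunkIndex < 0 || linesPerChunk <= 0) return null;
  const startLine = chunkIndex * linesPerChunk + 1;
  if (startLine > totalLines) return null; // chunk is past the end of the file
  const endLine = Math.min(startLine + linesPerChunk - 1, totalLines);
  return { startLine, endLine };
}

// A 1,200-line log read with 500 lines per chunk has three chunks.
console.log(chunkLineRange(0, 500, 1200)); // lines 1 to 500
console.log(chunkLineRange(2, 500, 1200)); // lines 1001 to 1200
console.log(chunkLineRange(3, 500, 1200)); // null
```

Under these assumptions the last chunk is simply shorter than the others, and an index past the end yields null rather than an empty chunk.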

search_in_large_file

Search for patterns in large files with context.

Parameters:

filePath (required): Absolute path to the file
pattern (required): Search pattern
caseSensitive (optional): Case sensitive search (default: false)
regex (optional): Use regex pattern (default: false)
maxResults (optional): Maximum results (default: 100)
contextBefore (optional): Context lines before match (default: 2)
contextAfter (optional): Context lines after match (default: 2)

Example:

```json
{
  "filePath": "/var/log/error.log",
  "pattern": "ERROR.*database",
  "regex": true,
  "maxResults": 50
}
```
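
The interaction between these parameters can be sketched in TypeScript as follows. This is an in-memory illustration under stated assumptions: searchLines and the SearchMatch shape are hypothetical names, and the real tool works on files rather than arrays. The option names and defaults follow the parameter list above.

```typescript
// Hypothetical in-memory sketch of a search with context lines.
// Option names mirror the documented parameters; the return shape is assumed.
interface SearchOptions {
  caseSensitive?: boolean; // default: false
  regex?: boolean;         // default: false
  maxResults?: number;     // default: 100
  contextBefore?: number;  // default: 2
  contextAfter?: number;   // default: 2
}

interface SearchMatch {
  lineNumber: number; // 1-indexed
  line: string;
  before: string[];
  after: string[];
}

function searchLines(lines: string[], pattern: string, options: SearchOptions = {}): SearchMatch[] {
  const {
    caseSensitive = false,
    regex = false,
    maxResults = 100,
    contextBefore = 2,
    contextAfter = 2,
  } = options;
  // Escape regex metacharacters when the pattern is a literal string.
  const source = regex ? pattern : pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const matcher = new RegExp(source, caseSensitive ? "" : "i");
  const matches: SearchMatch[] = [];
  for (let i = 0; i < lines.length && matches.length < maxResults; i++) {
    if (!matcher.test(lines[i])) continue;
    matches.push({
      lineNumber: i + 1,
      line: lines[i],
      before: lines.slice(Math.max(0, i - contextBefore), i),
      after: lines.slice(i + 1, i + 1 + contextAfter),
    });
  }
  return matches;
}

const sampleLog = [
  "INFO  server started",
  "ERROR cannot reach database",
  "INFO  retrying",
  "error: database timeout",
];
const hits = searchLines(sampleLog, "ERROR.*database", { regex: true, contextBefore: 1, contextAfter: 1 });
console.log(hits.map((m) => m.lineNumber)); // [2, 4]
```

Because caseSensitive defaults to false, the lowercase "error:" line matches as well; passing caseSensitive: true would leave only line 2.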

get_file_structure

Analyze file structure and get comprehensive metadata.

Parameters:

filePath (required): Absolute path to the file

Returns: File metadata, line statistics, recommended chunk size, and sample lines.

navigate_to_line

Jump to a specific line with surrounding context.

Parameters:

filePath (required): Absolute path to the file
lineNumber (required): Line number to navigate to (1-indexed)
contextLines (optional): Context lines before/after (default: 5)
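
As a rough illustration of 1-indexed navigation with context, the TypeScript sketch below clamps the window at the start and end of the file. The function windowAroundLine and the LineWindow shape are hypothetical, and it operates on an in-memory array purely for demonstration.

```typescript
// Hypothetical helper: returns the window of lines around a 1-indexed target.
interface LineWindow {
  startLine: number; // 1-indexed, inclusive
  endLine: number;   // 1-indexed, inclusive
  lines: string[];
}

function windowAroundLine(lines: string[], lineNumber: number, contextLines = 5): LineWindow {
  if (!Number.isInteger(lineNumber) || lineNumber < 1 || lineNumber > lines.length) {
    throw new RangeError(`lineNumber must be between 1 and ${lines.length}`);
  }
  const startLine = Math.max(1, lineNumber - contextLines);
  const endLine = Math.min(lines.length, lineNumber + contextLines);
  // Array indices are zero-based, so shift the 1-indexed start by one.
  return { startLine, endLine, lines: lines.slice(startLine - 1, endLine) };
}

const twentyLines = Array.from({ length: 20 }, (_, i) => `line ${i + 1}`);
console.log(windowAroundLine(twentyLines, 3, 5).startLine); // 1 (clamped at the top of the file)
console.log(windowAroundLine(twentyLines, 10, 2).lines);    // "line 8" through "line 12"
```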

get_file_summary

Get comprehensive statistical summary of a file.

Parameters:

filePath (required): Absolute path to the file

Returns: File metadata, line statistics, character statistics, and word count.

stream_large_file

Stream a file in chunks for processing very large files.

Parameters:

filePath (required): Absolute path to the file
chunkSize (optional): Chunk size in bytes (default: 64KB)
startOffset (optional): Starting byte offset (default: 0)
maxChunks (optional): Maximum chunks to return (default: 10)
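
To make the relationship between chunkSize, startOffset, and maxChunks concrete, here is a small TypeScript sketch that plans the byte ranges one call would cover. The planner planByteChunks is hypothetical, and it assumes "64KB" means 65,536 bytes; the server's actual streaming code may differ.

```typescript
// Hypothetical planner for byte-range chunks; parameter names mirror the
// documented ones, but this is not the server's implementation.
interface ByteChunk {
  index: number;
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

function planByteChunks(
  fileSize: number,
  chunkSize = 64 * 1024, // assumes "64KB" means 65,536 bytes
  startOffset = 0,
  maxChunks = 10
): ByteChunk[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const chunks: ByteChunk[] = [];
  let start = Math.max(0, startOffset);
  while (start < fileSize && chunks.length < maxChunks) {
    const end = Math.min(start + chunkSize, fileSize);
    chunks.push({ index: chunks.length, start, end });
    start = end;
  }
  return chunks;
}

// With the defaults, one call covers at most 10 x 65,536 = 655,360 bytes.
const plan = planByteChunks(1000000);
const lastPlanned = plan[plan.length - 1];
console.log(plan.length, lastPlanned.end); // 10 655360
```

Under these assumptions, reading further into the file means calling again with startOffset set to the previous end, or raising maxChunks or chunkSize.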

Supported File Types

The server intelligently detects and optimizes for:

Text files (.txt) - 500 lines/chunk
Log files (.log) - 500 lines/chunk
Code files (.ts, .js, .py, .java, .cpp, .go, .rs, etc.) - 300 lines/chunk
CSV files (.csv) - 1000 lines/chunk
JSON files (.json) - 100 lines/chunk
XML files (.xml) - 200 lines/chunk
Markdown files (.md) - 500 lines/chunk
Configuration and shell script files (.yml, .yaml, .sh, .bash) - 300 lines/chunk
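
The mapping above can be pictured as a simple lookup from file extension to chunk size. The TypeScript sketch below copies the values from the list; the function linesPerChunkFor, the table name, and the fallback of 500 lines for unknown extensions are assumptions made for illustration.

```typescript
// Values copied from the list above; the function and fallback are hypothetical.
const LINES_PER_CHUNK: Record<string, number> = {
  ".txt": 500, ".log": 500, ".md": 500,
  ".ts": 300, ".js": 300, ".py": 300, ".java": 300, ".cpp": 300, ".go": 300, ".rs": 300,
  ".yml": 300, ".yaml": 300, ".sh": 300, ".bash": 300,
  ".csv": 1000, ".json": 100, ".xml": 200,
};

function linesPerChunkFor(filePath: string, fallback = 500): number {
  // Accept both POSIX and Windows path separators.
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  // No dot, or only a leading dot (e.g. ".bashrc"), means no usable extension.
  if (dot <= 0) return fallback;
  const ext = fileName.slice(dot).toLowerCase();
  return LINES_PER_CHUNK[ext] ?? fallback;
}

console.log(linesPerChunkFor("/data/sales.csv"));      // 1000
console.log(linesPerChunkFor("C:\\logs\\APP.LOG"));    // 500
console.log(linesPerChunkFor("/project/src/main.py")); // 300
```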

Configuration

Customize behavior using environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| `CHUNK_SIZE` | Default lines per chunk | 500 |
| `OVERLAP_LINES` | Overlap between chunks, in lines | 10 |
| `MAX_FILE_SIZE` | Maximum file size in bytes | 10737418240 (10GB) |
| `CACHE_SIZE` | Cache size in bytes | 104857600 (100MB) |
| `CACHE_TTL` | Cache TTL in milliseconds | 300000 (5 minutes) |
| `CACHE_ENABLED` | Enable/disable caching | true |
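
For readers wondering how such variables are typically consumed, here is a hedged TypeScript sketch that parses them into a typed config with the documented defaults. The names ServerConfig and loadConfig are hypothetical, and the rule that only the string "false" disables caching is an assumption.

```typescript
// Hypothetical config loader; defaults follow the table above.
interface ServerConfig {
  chunkSize: number;     // lines per chunk
  overlapLines: number;  // lines shared between adjacent chunks
  maxFileSize: number;   // bytes
  cacheSize: number;     // bytes
  cacheTtl: number;      // milliseconds
  cacheEnabled: boolean;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  // Fall back to the default when a value is missing or not a non-negative integer.
  const int = (name: string, fallback: number): number => {
    const parsed = Number.parseInt(env[name] ?? "", 10);
    return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
  };
  return {
    chunkSize: int("CHUNK_SIZE", 500),
    overlapLines: int("OVERLAP_LINES", 10),
    maxFileSize: int("MAX_FILE_SIZE", 10 * 1024 ** 3), // 10737418240
    cacheSize: int("CACHE_SIZE", 100 * 1024 ** 2),     // 104857600
    cacheTtl: int("CACHE_TTL", 5 * 60 * 1000),         // 300000
    // Assumption: anything other than the string "false" leaves caching on.
    cacheEnabled: (env["CACHE_ENABLED"] ?? "true").toLowerCase() !== "false",
  };
}

const exampleConfig = loadConfig({ CHUNK_SIZE: "1000", CACHE_ENABLED: "false" });
console.log(exampleConfig.chunkSize, exampleConfig.cacheEnabled); // 1000 false
```

In a Node.js process the same function could be called with process.env. Taking the environment as a parameter keeps the sketch easy to test.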

Example with custom settings (Claude Desktop):

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "@willianpinho/large-file-mcp"],
      "env": {
        "CHUNK_SIZE": "1000",
        "CACHE_ENABLED": "true"
      }
    }
  }
}
```

Example with custom settings (Claude Code CLI):

```bash
claude mcp add --transport stdio --scope user large-file-mcp \
  --env CHUNK_SIZE=1000 \
  --env CACHE_ENABLED=true \
  -- npx -y @willianpinho/large-file-mcp
```

Examples

Analyzing Log Files

```text
Analyze /var/log/nginx/access.log and find all 404 errors
```

The AI will use the search tool to find patterns and provide context around each match.

Code Navigation

```text
Find all function definitions in /project/src/main.py
```

Uses regex search to locate function definitions with surrounding code context.

CSV Data Exploration

```text
Show me the structure of /data/sales.csv
```

Returns metadata, line count, sample rows, and recommended chunk size.

Large File Processing

```text
Stream the first 100MB of /data/huge_dataset.json
```

Uses streaming mode to handle very large files efficiently.

Performance

Caching

LRU Cache with configurable size (default 100MB)
TTL-based expiration (default 5 minutes)
80-90% hit rate for repeated access
Significant performance improvement for frequently accessed files
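
The combination of LRU eviction and TTL expiry described above can be sketched in a few lines of TypeScript. This is a hypothetical illustration, not the server's cacheManager: the class name LruTtlCache is invented, and it bounds the number of entries, whereas CACHE_SIZE is documented in bytes.

```typescript
// Minimal LRU cache with TTL expiry, built on Map's insertion order.
// Hypothetical sketch: it bounds the entry count, not the size in bytes.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

let fakeClock = 0;
const chunkCache = new LruTtlCache<string>(2, 1000, () => fakeClock);
chunkCache.set("a", "chunk A");
chunkCache.set("b", "chunk B");
chunkCache.get("a");                 // touch "a" so "b" becomes least recently used
chunkCache.set("c", "chunk C");      // evicts "b"
console.log(chunkCache.get("b"));    // undefined
fakeClock = 1000;
console.log(chunkCache.get("a"));    // undefined (TTL elapsed)
```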

Memory Management

Streaming architecture - files are read line-by-line, never fully loaded
Configurable chunk sizes - adjust based on your use case
Smart buffering - minimal memory footprint for search operations
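
A line-by-line streaming approach of this kind can be sketched as an async generator that keeps only the current partial line in memory. The names linesFromChunks and countMatching are hypothetical, and the chunk source is any iterable of strings rather than the server's own file reader.

```typescript
// Yields complete lines from a stream of text chunks while holding only the
// current partial line in memory. Hypothetical sketch, not the server's code.
async function* linesFromChunks(
  chunks: AsyncIterable<string> | Iterable<string>
): AsyncGenerator<string> {
  let carry = ""; // text after the last newline seen so far
  for await (const chunk of chunks) {
    const parts = (carry + chunk).split("\n");
    carry = parts.pop() ?? ""; // the last piece may be an incomplete line
    for (const part of parts) {
      yield part.endsWith("\r") ? part.slice(0, -1) : part; // tolerate CRLF
    }
  }
  if (carry.length > 0) yield carry; // final line without a trailing newline
}

// Counts matching lines without ever building the full array of lines.
async function countMatching(
  chunks: AsyncIterable<string> | Iterable<string>,
  needle: string
): Promise<number> {
  let count = 0;
  for await (const line of linesFromChunks(chunks)) {
    if (line.includes(needle)) count++;
  }
  return count;
}

// The second "ERROR" is split across two chunks and is still found.
countMatching(["INFO a\nERR", "OR b\nERROR c"], "ERROR").then((n) => console.log(n)); // 2
```

In Node.js, a readable stream opened with a string encoding is async-iterable over strings, so it should be usable as the chunks argument. Treat that as an assumption to verify against your runtime.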

File Size Handling

| File Size | Operation Time | Method |
| --- | --- | --- |
| < 1MB | < 100ms | Direct read |
| 1-100MB | < 500ms | Streaming |
| 100MB-1GB | 1-3s | Streaming + cache |
| > 1GB | Progressive | AsyncGenerator |

Development

Building from Source

```bash
git clone https://github.com/willianpinho/large-file-mcp.git
cd large-file-mcp
pnpm install
pnpm build
```

Development Mode

```bash
pnpm dev    # Watch mode
pnpm lint   # Run linter
pnpm start  # Run server
```

Project Structure

```text
src/
├── index.ts        # Entry point
├── server.ts       # MCP server implementation
├── fileHandler.ts  # Core file handling logic
├── cacheManager.ts # Caching implementation
└── types.ts        # TypeScript type definitions
```

Troubleshooting

File not accessible

Ensure the file path is absolute and the file has read permissions:

```bash
chmod +r /path/to/file
```

Out of memory

Reduce the CHUNK_SIZE environment variable
Disable the cache with CACHE_ENABLED=false
Use stream_large_file for very large files

Slow search performance

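Reduce the maxResults parameter
Use startLine and endLine to limit the search range, if your version of search_in_large_file accepts them (they do not appear in the parameter list under Available Tools)
Ensure caching is enabled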

Claude Code CLI: MCP server not found

Check if the server is installed:

```bash
claude mcp list
```

If not listed, reinstall:

```bash
claude mcp add --transport stdio --scope user large-file-mcp -- npx -y @willianpinho/large-file-mcp
```

Check server health:

```bash
claude mcp get large-file-mcp
```

Usage Metrics

This MCP server is actively maintained and monitored for usage patterns to improve functionality. Usage metrics help us:

Understand which tools are most valuable
Identify performance bottlenecks
Prioritize feature development
Ensure reliability and stability

Monitoring in Production

In production, the server's caching and chunking behavior is tuned through the same environment variables listed under Configuration:

CACHE_ENABLED: Enable/disable caching (default: true)
CACHE_SIZE: Cache size in bytes (default: 104857600 - 100MB)
CACHE_TTL: Cache TTL in milliseconds (default: 300000 - 5 minutes)
CHUNK_SIZE: Default lines per chunk (default: 500)
MAX_FILE_SIZE: Maximum file size in bytes (default: 10737418240 - 10GB)
OVERLAP_LINES: Overlap between chunks (default: 10)

Usage Examples

Recent usage patterns show the server is particularly effective for:

Log Analysis: Processing multi-GB log files with search and navigation
Data Processing: Reading large CSV/JSON files in manageable chunks
Code Review: Navigating large codebases efficiently
System Monitoring: Analyzing system logs and debug outputs
Document Analysis: Processing large text documents

For detailed analytics and usage trends, visit the Glama.ai dashboard.

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

Development Workflow

Fork the repository
Create a feature branch
Make your changes
Ensure code builds and lints successfully
Submit a pull request

See CONTRIBUTING.md for detailed guidelines.

License

MIT

Support

Issues: GitHub Issues
Documentation: This README and inline code documentation
Examples: Check the examples/ directory

Acknowledgments

Built with the Model Context Protocol SDK.

Made for the AI developer community.

