Introduction
What is textrawl and why you need it
textrawl is a Personal Knowledge MCP Server that gives Claude access to your documents, emails, notes, and other knowledge. It uses hybrid search, combining semantic similarity with keyword matching, to find the most relevant content.
What is MCP?
The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external data sources and tools. Originally developed by Anthropic and since donated to the Linux Foundation's Agentic AI Foundation, MCP enables:
- Tool Use: Claude can call functions to search, retrieve, and create content
- Context Sharing: Claude can pull relevant passages from your documents into a conversation
- Privacy: Your documents are stored on infrastructure you control rather than in a third-party service
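To make the tool-use point concrete: MCP messages are JSON-RPC 2.0, and a client invokes a server tool by sending a tools/call request. The sketch below builds such a request in TypeScript. The tool name search_knowledge and its query and limit arguments are hypothetical, chosen only for illustration; the tools a server actually exposes are discovered from its tool listing.

```typescript
// Shape of the JSON-RPC 2.0 request an MCP client sends to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a search request. "search_knowledge", "query", and "limit" are
// hypothetical names used for illustration, not a documented textrawl API.
function buildSearchCall(id: number, query: string, limit: number): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "search_knowledge", arguments: { query, limit } },
  };
}

const exampleRequest = buildSearchCall(1, "quarterly planning", 5);
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice Claude's MCP client constructs these messages for you; the sketch only shows what travels over the wire.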
Key Features
Hybrid Search
textrawl combines two search strategies:
- Semantic Search: Understands meaning using vector embeddings (OpenAI or Ollama)
- Full-Text Search: Finds exact keywords using PostgreSQL tsvector
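Results are merged using Reciprocal Rank Fusion (RRF), which scores each document by its rank in both result lists, so content that ranks highly in either search, and especially in both, rises to the top.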
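As a rough illustration of how RRF merges two ranked lists, here is a minimal sketch. Each document receives 1 / (k + rank) from every list it appears in, and the contributions are summed. The constant k = 60 is the value commonly used with RRF and is an assumption here, as are the function name and document IDs; textrawl's actual implementation and parameters may differ.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d)),
// where rank starts at 1. k = 60 is an assumed, commonly used default.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores[id] = (scores[id] ?? 0) + 1 / (k + rank);
    });
  }
  // Highest fused score first.
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] ?? 0 }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the two search strategies.
const semanticHits = ["doc-a", "doc-b", "doc-c"];
const keywordHits = ["doc-b", "doc-d", "doc-a"];
const fusedResults = reciprocalRankFusion([semanticHits, keywordHits]);
console.log(fusedResults.map((r) => r.id).join(", "));
```

With these inputs the fused order is doc-b, doc-a, doc-d, doc-c: the two documents found by both strategies outrank those found by only one.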
Multi-Format Import
Convert and import various document formats:
- Email: MBOX archives, EML files
- Documents: PDF, DOCX, TXT, Markdown
- Web: HTML pages, saved articles
- Archives: Google Takeout ZIP exports
Automatic Chunking
Large documents are split into searchable chunks:
- 512 tokens (~2048 characters) per chunk
- 50-token overlap between consecutive chunks for context preservation
- Paragraph-aware splitting
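The chunking rules above can be sketched as follows. This is a simplified approximation, not textrawl's actual chunker: it estimates tokens at 4 characters each (the ratio implied by the 512 tokens to ~2048 characters figure) instead of running a real tokenizer, and the function name and parameters are hypothetical.

```typescript
// Rough heuristic taken from the "512 tokens (~2048 characters)" figure.
const CHARS_PER_TOKEN = 4;

function chunkText(text: string, maxTokens: number = 512, overlapTokens: number = 50): string[] {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const overlapChars = overlapTokens * CHARS_PER_TOKEN;
  // Leave room for the carried-over tail plus a blank-line separator.
  const budget = maxChars - overlapChars - 2;
  if (budget <= 0) {
    throw new Error("overlap must be smaller than the chunk size");
  }

  // Split on blank lines, then hard-split any paragraph longer than the budget.
  const pieces: string[] = [];
  for (const paragraph of text.split(/\n\s*\n/)) {
    const trimmed = paragraph.trim();
    for (let start = 0; start < trimmed.length; start += budget) {
      pieces.push(trimmed.slice(start, start + budget));
    }
  }

  const chunks: string[] = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current === "" ? piece : current + "\n\n" + piece;
    if (candidate.length <= maxChars) {
      current = candidate; // still fits: keep packing paragraphs together
      continue;
    }
    chunks.push(current);
    // Start the next chunk with the tail of the previous one as overlap.
    const tail = overlapChars > 0 ? current.slice(-overlapChars) : "";
    current = tail === "" ? piece : tail + "\n\n" + piece;
  }
  if (current !== "") {
    chunks.push(current);
  }
  return chunks;
}

const sampleText = "First paragraph.\n\nSecond paragraph.";
console.log(chunkText(sampleText));
```

Short paragraphs are packed together until the size limit is reached, and each new chunk begins with the last 200 characters (about 50 tokens) of the previous one, so a passage that straddles a boundary remains searchable from either side.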
Self-Hosted
Run textrawl on your infrastructure:
- Local: npm run dev for development
- Docker: docker-compose for production
- Cloud: Deploy to Cloud Run, Railway, or any container platform
Use Cases
Personal Knowledge Base
Import your notes, saved articles, and documents. Ask Claude questions like:
"What did I write about quarterly planning last October?"
Email Search
Convert MBOX archives to searchable knowledge:
"Find all emails from the marketing team about the product launch"
Research Assistant
Import research papers and documentation:
"Summarize the key findings from the papers about hybrid search algorithms"
Code Documentation
Import your project's documentation and ask Claude for help:
"How do I configure authentication based on our security docs?"
Next Steps
- Quick Start - Get running in 5 minutes
- Installation - Detailed setup guide
- Configuration - Environment variables