textrawl

textrawl is a Personal Knowledge MCP Server that gives Claude access to your documents, emails, notes, and other knowledge. It uses hybrid search combining semantic understanding with keyword matching to find the most relevant content.

What is MCP?

The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external data sources and tools. Introduced by Anthropic and later donated to the Linux Foundation's Agentic AI Foundation, MCP enables:

Tool Use: Claude can call functions to search, retrieve, and create content
Context Sharing: Your documents become part of Claude's working knowledge
Privacy: Data stays on your infrastructure, not uploaded to the cloud
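
To make "Tool Use" concrete, here is a minimal sketch of the message an MCP client sends when it calls a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The helper `build_tool_call` and the argument names `query` and `limit` are assumptions for illustration; the actual input schema of textrawl's `search` tool may differ.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# `query` and `limit` are hypothetical argument names, not textrawl's documented schema.
request = build_tool_call(1, "search", {"query": "quarterly planning notes", "limit": 5})
print(request)
```

In practice an MCP client such as Claude Desktop builds and sends these messages for you; the sketch only shows what travels over the wire.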

Why textrawl?

textrawl gives Claude access to your personal documents, emails, notes, and other knowledge. It merges semantic and keyword rankings with Reciprocal Rank Fusion, so results reflect both the meaning of a query and its exact terms.

Hybrid Search

Combine semantic similarity with full-text keyword matching. Adjust weights to optimize for your use case.
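
The weighting idea can be sketched with a small weighted Reciprocal Rank Fusion function. This is an illustration of the general technique, not textrawl's actual `hybrid_search()` implementation: the function name, the parameter names, and the smoothing constant `k=60` (a commonly used default for RRF) are all assumptions.

```python
from collections import defaultdict


def weighted_rrf(semantic_ids, keyword_ids, semantic_weight=1.0, keyword_weight=1.0, k=60):
    """Fuse two best-first rankings of document IDs with weighted RRF.

    A document at 1-based rank r in a list contributes weight / (k + r)
    to its fused score. Documents missing from a list contribute nothing.
    """
    scores = defaultdict(float)
    for weight, ranking in ((semantic_weight, semantic_ids), (keyword_weight, keyword_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search ranking
keyword = ["doc_c", "doc_a", "doc_d"]   # hypothetical full-text ranking

print(weighted_rrf(semantic, keyword))
print(weighted_rrf(semantic, keyword, keyword_weight=3.0))
```

Raising `keyword_weight` favors documents that rank well on exact terms, while raising `semantic_weight` favors documents that are close in meaning. Because RRF works on ranks rather than raw scores, the two searches do not need comparable score scales.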

Persistent Memory

Remember facts about people, projects, and concepts. Build a knowledge graph with relationships between entities. Track conversation context across sessions.
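
As a rough mental model of the memory graph, the sketch below keeps entities, facts, and relations in plain Python structures. The method names echo textrawl's tool names, but the class, its signatures, and its storage layout are hypothetical; textrawl itself stores memory in PostgreSQL with embeddings.

```python
from collections import defaultdict


class MemoryGraph:
    """Toy in-memory graph: entities carry facts, edges carry a relation label."""

    def __init__(self):
        self.facts = defaultdict(list)  # entity name -> list of fact strings
        self.relations = set()          # (source, relation, target) triples

    def remember_fact(self, entity, fact):
        self.facts[entity].append(fact)

    def relate_entities(self, source, relation, target):
        # Register both endpoints so they exist even without any facts.
        self.facts.setdefault(source, [])
        self.facts.setdefault(target, [])
        self.relations.add((source, relation, target))

    def forget_entity(self, entity):
        # Drop the entity, its facts, and every relation that touches it.
        self.facts.pop(entity, None)
        self.relations = {r for r in self.relations if entity not in (r[0], r[2])}

    def query_entity(self, entity):
        return {
            "facts": list(self.facts.get(entity, [])),
            "relations": sorted(r for r in self.relations if entity in (r[0], r[2])),
        }


graph = MemoryGraph()
graph.remember_fact("Ada", "Prefers async status updates")
graph.relate_entities("Ada", "leads", "Project Atlas")
print(graph.query_entity("Ada"))
```

Forgetting an entity also removes its relations, which matches the description of `forget_entity` as deleting an entity together with its associated memories.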

Proactive Insights

Automatically discover cross-source connections, recurring themes, and outliers in your knowledge base after bulk imports.
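
One plausible way to find cross-source connections is to compare embeddings of items that come from different sources and keep the pairs that are unusually similar. The sketch below assumes that approach; the function names, the tuple layout, and the `0.8` threshold are illustrative assumptions, and textrawl's actual insight scan may work differently.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cross_source_connections(items, threshold=0.8):
    """Return (id_a, id_b, similarity) for similar items from different sources.

    `items` is a list of (item_id, source, embedding) tuples.
    """
    pairs = []
    for (id_a, src_a, vec_a), (id_b, src_b, vec_b) in combinations(items, 2):
        if src_a == src_b:
            continue  # only connections that span two sources are of interest
        similarity = cosine(vec_a, vec_b)
        if similarity >= threshold:
            pairs.append((id_a, id_b, round(similarity, 3)))
    # Strongest connections first.
    return sorted(pairs, key=lambda pair: -pair[2])


# Tiny hypothetical 2-dimensional embeddings, for illustration only.
items = [
    ("email-1", "email", [1.0, 0.0]),
    ("note-1", "note", [0.9, 0.1]),
    ("note-2", "note", [0.0, 1.0]),
    ("email-2", "email", [1.0, 0.05]),
]
for connection in cross_source_connections(items):
    print(connection)
```

Comparing every pair is quadratic in the number of items, so a real scan over a large knowledge base would more likely use an approximate nearest-neighbor index.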

Multi-Format Support

Import MBOX emails, HTML pages, PDFs, DOCX files, and more. Convert once, search forever.

MCP Native

Built on the Model Context Protocol. Works with Claude Desktop, Cursor IDE, and any other MCP client.

Privacy First

Self-hosted on your infrastructure. Your documents never leave your control.

Quick Start

# Clone and set up
git clone https://github.com/jeffgreendesign/textrawl.git
cd textrawl
pnpm setup
 
# Start the server
pnpm dev
 
# Test with MCP Inspector
pnpm inspector

MCP Tools

textrawl exposes 18 tools via MCP:

Document Tools

Tool	Purpose
`search`	Hybrid semantic + full-text search with optional memory/conversation fusion
`get_document`	Retrieve full document content
`list_documents`	Browse documents with pagination
`update_document`	Update document title and tags
`add_note`	Create notes with automatic embedding

Memory Tools

Tool	Purpose
`remember_fact`	Store facts about entities with semantic embeddings
`build_knowledge`	Store multiple facts and relations in a single batch call
`query_memory`	Query the memory graph (search, entity, or list modes)
`relate_entities`	Create relationships between entities
`forget_entity`	Delete entity and associated memories
`extract_memories`	Extract entities and facts from text via LLM

Conversation Tools

Tool	Purpose
`save_conversation_context`	Save conversation summary and turns for recall
`query_conversations`	Query past conversations (search, get, or list modes)
`delete_conversation`	Delete a conversation session

Insight Tools

Tool	Purpose
`get_insights`	View discovered patterns and connections
`discover_connections`	Trigger insight scan across knowledge base
`dismiss_insight`	Dismiss an insight from the queue

Stats

Tool	Purpose
`get_stats`	Statistics across knowledge, memory, conversations, and insights

View all tools →

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Claude Desktop                        │
│                    or Cursor IDE                         │
└────────────────────────┬────────────────────────────────┘
                         │ MCP Protocol
                         ▼
┌─────────────────────────────────────────────────────────┐
│                    textrawl Server                       │
│  ┌───────────┐ ┌───────────┐ ┌────────┐ ┌───────────┐  │
│  │  Search   │ │ Documents │ │ Memory │ │  Insights │  │
│  │ (Hybrid)  │ │  (CRUD)   │ │(Graph) │ │(Patterns) │  │
│  └─────┬─────┘ └─────┬─────┘ └───┬────┘ └─────┬─────┘  │
│        │              │           │             │        │
│        └──────────────┴───────────┴─────────────┘        │
│                          ▼                               │
│  ┌─────────────────────────────────────────────────────┐│
│  │           Supabase PostgreSQL + pgvector            ││
│  │  • documents + chunks (document search)             ││
│  │  • memory_entities + observations (memory)          ││
│  │  • conversation_sessions + turns (conversations)    ││
│  │  • proactive_insights (insight discovery)           ││
│  │  • hybrid_search() / memory_search() RPCs           ││
│  └─────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────┘

Learn more about hybrid search →

Next Steps

Quick Start - Get running in 5 minutes
Installation - Detailed setup guide
CLI Tools - Import your documents
