Configuration
Environment variables and configuration options
textrawl is configured via environment variables in a .env file.
Environment Variables
Database Connection
| Variable | Required | Description |
|---|---|---|
| SUPABASE_URL | Yes | Your Supabase project URL |
| SUPABASE_SERVICE_KEY | Yes | Service role key (bypasses RLS) |
Warning: Never commit your service key to version control. Keep it in .env files that are excluded from version control, or in a secret manager.
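A fail-fast check at startup can catch a missing database setting before the first query. The following is a minimal sketch, not textrawl's own code; the helper name `missing_required` is hypothetical, and only the two variable names come from the table above.

```python
# Hypothetical startup check; only the variable names come from the table above.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def missing_required(env):
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_required(os.environ)` (after `import os`) returns an empty list when both settings are present.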
Embedding Provider
Choose between OpenAI (cloud), Google AI (cloud), or Ollama (local):
| Variable | Required | Description |
|---|---|---|
| EMBEDDING_PROVIDER | No | openai (default), ollama, or google |
| OPENAI_API_KEY | If OpenAI | Your OpenAI API key |
| GOOGLE_AI_API_KEY | If Google | Your Google AI API key |
| GOOGLE_EMBEDDING_MODEL | If Google | Model name (default: text-embedding-004) |
| OLLAMA_BASE_URL | If Ollama | Ollama server URL |
| OLLAMA_MODEL | If Ollama | Model name (default: nomic-embed-text) |
OpenAI Configuration:
- Model: text-embedding-3-small
- Dimensions: 1536
- Schema: scripts/setup-db.sql
Google AI Configuration:
- Model: text-embedding-004
- Dimensions: 768
- Schema: scripts/setup-db-google.sql
Ollama Configuration:
- Dimensions: 1024 (nomic-embed-text) or 768 (nomic-embed-text-v2-moe)
- Schema: scripts/setup-db-ollama.sql (1024d) or scripts/setup-db-ollama-v2.sql (768d)
Important: Each provider uses different embedding dimensions. You cannot mix providers without re-embedding all documents. Use the matching SQL schema for your chosen provider.
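The pairing of provider, model, dimensions, and schema can be written as a lookup. This sketch only restates the values listed above; the names `EMBEDDING_SETUPS`, `setup_for`, and `check_vector` are hypothetical and are not part of textrawl.

```python
# (provider, model) -> (dimensions, schema file), copied from the notes above.
EMBEDDING_SETUPS = {
    ("openai", "text-embedding-3-small"): (1536, "scripts/setup-db.sql"),
    ("google", "text-embedding-004"): (768, "scripts/setup-db-google.sql"),
    ("ollama", "nomic-embed-text"): (1024, "scripts/setup-db-ollama.sql"),
    ("ollama", "nomic-embed-text-v2-moe"): (768, "scripts/setup-db-ollama-v2.sql"),
}


def setup_for(provider, model):
    """Return (dimensions, schema file) for a known provider/model pair."""
    try:
        return EMBEDDING_SETUPS[(provider, model)]
    except KeyError:
        raise ValueError(f"No schema listed for {provider}/{model}") from None


def check_vector(vector, provider, model):
    """Raise ValueError if the vector length does not match the schema."""
    expected, _schema = setup_for(provider, model)
    if len(vector) != expected:
        raise ValueError(f"Expected {expected} dimensions, got {len(vector)}")
```

Matching dimensions alone do not make two providers interchangeable: vectors from different models live in different spaces, so switching still requires re-embedding.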
Server Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3000 | Server port |
| LOG_LEVEL | No | info | debug, info, warn, error |
| ALLOWED_ORIGINS | No | * | CORS allowed origins (comma-separated) |
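To illustrate the value formats, here is a rough sketch of how a comma-separated ALLOWED_ORIGINS value and a PORT value could be parsed. The function names are hypothetical and whitespace trimming is an assumption; textrawl's actual parsing may differ.

```python
def parse_allowed_origins(raw="*"):
    """Split a comma-separated origin list, falling back to the '*' default."""
    origins = [origin.strip() for origin in raw.split(",") if origin.strip()]
    return origins or ["*"]


def parse_port(raw=None, default=3000):
    """Return the port as an int, using the documented default when unset."""
    if raw is None or not raw.strip():
        return default
    return int(raw)
```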
Authentication
| Variable | Required | Description |
|---|---|---|
| API_BEARER_TOKEN | Production | Auth token (min 32 characters) |
When set, all API endpoints require the Authorization: Bearer <token> header.
Unprotected endpoints (for health checks):
- /health
- /health/live
- /health/ready
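The rule described above (token required everywhere except the health endpoints, and only when a token is configured) might look like the following. This is an illustrative sketch, not the server's middleware; `is_authorized` is a hypothetical name and the constant-time comparison is a design choice of the sketch.

```python
import hmac

# Health endpoints listed above as unprotected.
UNPROTECTED_PATHS = {"/health", "/health/live", "/health/ready"}


def is_authorized(path, headers, token):
    """Return True if the request may proceed under the bearer-token rule."""
    if not token or path in UNPROTECTED_PATHS:
        return True  # no token configured, or a health-check endpoint
    expected = f"Bearer {token}".encode()
    supplied = headers.get("Authorization", "").encode()
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(supplied, expected)
```

A client would then send `Authorization: Bearer <token>` with every request to a protected endpoint.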
Web UI
| Variable | Required | Default | Description |
|---|---|---|---|
| UI_PORT | No | 3001 | Web UI port for file conversion |
Feature Flags
| Variable | Required | Default | Description |
|---|---|---|---|
| ENABLE_MEMORY | No | true | Enable/disable memory tools |
| ENABLE_CONVERSATIONS | No | true | Enable/disable conversation memory tools |
| ENABLE_INSIGHTS | No | true | Enable/disable proactive insight tools |
| ENABLE_MEMORY_EXTRACTION | No | false | Enable LLM-based memory extraction |
| COMPACT_RESPONSES | No | true | Token-efficient response format |
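The defaults in the table can be expressed as a small resolver. This is a sketch under assumptions: the documented values are true and false, while the extra accepted spellings ("1", "yes", "on") and the name `read_flags` are hypothetical.

```python
# Defaults copied from the table above.
FLAG_DEFAULTS = {
    "ENABLE_MEMORY": True,
    "ENABLE_CONVERSATIONS": True,
    "ENABLE_INSIGHTS": True,
    "ENABLE_MEMORY_EXTRACTION": False,
    "COMPACT_RESPONSES": True,
}
TRUE_VALUES = {"true", "1", "yes", "on"}  # assumption: only "true" is documented


def read_flags(env):
    """Resolve each flag from `env`, falling back to its documented default."""
    flags = {}
    for name, default in FLAG_DEFAULTS.items():
        raw = env.get(name, "").strip().lower()
        flags[name] = default if not raw else raw in TRUE_VALUES
    return flags
```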
Memory Extraction
Required only when ENABLE_MEMORY_EXTRACTION=true:
| Variable | Required | Default | Description |
|---|---|---|---|
| ANTHROPIC_API_KEY | If extraction | - | Anthropic API key for Claude |
| EXTRACTION_MODEL | No | claude-haiku-4-5-20250501 | Model for entity extraction |
Insight Configuration
Controls the LLM model used for proactive insight synthesis (discover_connections and the background insight scheduler). Independent of EXTRACTION_MODEL.
| Variable | Required | Default | Description |
|---|---|---|---|
| INSIGHT_MODEL | No | claude-sonnet-4-6-20250514 | Model for insight synthesis |
| INSIGHT_BATCH_THRESHOLD | No | 50 | Minimum new chunks before triggering an insight scan |
| INSIGHT_DEBOUNCE_SECONDS | No | 300 | Cooldown between automatic insight scans (seconds) |
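One plausible reading of how the threshold and the cooldown interact is that a scan runs only when both conditions hold. The sketch below encodes that reading; it is an assumption about the scheduler's logic rather than its actual code, and `should_scan` is a hypothetical name.

```python
def should_scan(new_chunks, seconds_since_last_scan, threshold=50, debounce_seconds=300):
    """Return True when enough new chunks exist and the cooldown has elapsed.

    `seconds_since_last_scan` is None if no scan has run yet.
    """
    if new_chunks < threshold:
        return False
    if seconds_since_last_scan is None:
        return True
    return seconds_since_last_scan >= debounce_seconds
```

With the defaults, 50 new chunks arriving 299 seconds after the previous scan would wait one more second before a scan starts.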
Chunking
| Variable | Required | Default | Description |
|---|---|---|---|
| CHUNKING_MODE | No | fixed | fixed or semantic (embedding-based topic splitting) |
| SEMANTIC_SIMILARITY_THRESHOLD | No | 0.5 | Threshold for semantic chunking (0-1) |
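To give a feel for what the threshold controls, here is a simplified sketch of embedding-based splitting: a new chunk starts wherever the cosine similarity between neighbouring sentence vectors drops below the threshold. This is a generic illustration with hypothetical function names, and textrawl's semantic mode may work differently.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, vectors, threshold=0.5):
    """Group consecutive sentences, splitting where similarity < threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            chunks.append([])  # topic shift: start a new chunk
        chunks[-1].append(sentences[i])
    return chunks
```

In this sketch a higher threshold produces more, smaller chunks, and a lower one produces fewer, larger chunks.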
Compact Responses
When COMPACT_RESPONSES=true (default), memory tools return token-efficient responses that reduce LLM context usage by 40-60%. This uses short keys like n, t, c instead of name, type, content.
Set COMPACT_RESPONSES=false for human-readable verbose responses during development or debugging.
See Response Format for the complete key mapping.
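For debugging, compact records can be expanded back to verbose keys on the client side. The sketch below covers only the three keys named above (n, t, c); the complete mapping lives in Response Format, and `expand` is a hypothetical helper.

```python
# Only the keys mentioned above; the full mapping is in Response Format.
COMPACT_KEYS = {"n": "name", "t": "type", "c": "content"}


def expand(record):
    """Rename known compact keys to their verbose names; keep others as-is."""
    return {COMPACT_KEYS.get(key, key): value for key, value in record.items()}
```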
Postgres Analysis
| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_URL | No | - | Direct Postgres connection for pg_analyze tools |
| PG_REPORT_DIR | No | ./reports/pg-analysis | Directory for analysis report history |
When DATABASE_URL is set, three Postgres analysis tools become available: pg_analyze, pg_recommendations, and pg_report_history.
Rate Limiting
| Variable | Required | Default | Description |
|---|---|---|---|
| REDIS_URL | No | - | Redis URL for shared rate limiting across instances (e.g. redis://localhost:6379) |
When REDIS_URL is set, rate limit counters are shared across all server instances via Redis. Without it, each instance tracks limits independently in memory (fine for single-instance deployments).
Rate Limits
Built-in rate limiting:
| Endpoint | Limit |
|---|---|
| API (/mcp, /api/*) | 100 requests/minute |
| Upload (/api/upload) | 10 requests/minute |
| Health (/health/*) | 300 requests/minute |
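The per-instance, in-memory behaviour can be pictured as a counter per client and endpoint group. The sketch below uses a fixed one-minute window, which is an assumption (the actual algorithm is not specified here), and it treats upload requests as counting only against the upload limit. All names are hypothetical.

```python
import time

# Requests per minute, from the table above.
LIMITS = {"api": 100, "upload": 10, "health": 300}


def bucket_for(path):
    """Map a request path to a limit group, or None if no limit applies."""
    if path.startswith("/api/upload"):
        return "upload"
    if path == "/health" or path.startswith("/health/"):
        return "health"
    if path == "/mcp" or path.startswith("/api/"):
        return "api"
    return None


class FixedWindowLimiter:
    """In-memory limiter; counters are local to one process."""

    def __init__(self, limits=LIMITS, window_seconds=60, clock=time.monotonic):
        self.limits = limits
        self.window = window_seconds
        self.clock = clock
        self.counters = {}  # (client, bucket) -> (window_start, count)

    def allow(self, client, path):
        bucket = bucket_for(path)
        if bucket is None:
            return True
        now = self.clock()
        start, count = self.counters.get((client, bucket), (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window expired: start counting again
        if count >= self.limits[bucket]:
            return False
        self.counters[(client, bucket)] = (start, count + 1)
        return True
```

Because `counters` lives in process memory, two instances running this sketch would each allow the full limit, which is the situation a shared Redis store avoids.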
Example Configurations
Development (Local)
Production (Cloud Run)
Self-Hosted (Ollama)
Generating Secure Tokens
The setup script generates a secure token automatically:
Or generate manually:
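The original commands are not reproduced in this section. As one possible manual route, assuming Python 3 is available, the standard-library `secrets` module can produce a token that satisfies the 32-character minimum. The helper name `generate_bearer_token` is hypothetical.

```python
import secrets


def generate_bearer_token(num_bytes=32):
    """Return a random hex token; 32 bytes yields 64 hex characters."""
    return secrets.token_hex(num_bytes)


token = generate_bearer_token()
print(f"API_BEARER_TOKEN={token}")
```

The printed line can be pasted into the .env file or stored in a secret manager.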
Next Steps
- Quick Start - Get running in 5 minutes
- Security Hardening - Production security
- Cloud Run Deployment - Deploy to GCP