textrawl
by Jeff Green
Architecture

Embedding Providers

OpenAI vs Ollama for vector embeddings

textrawl supports two embedding providers: OpenAI (cloud) and Ollama (local).
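The provider is selected at startup from configuration. A minimal sketch of that dispatch, assuming an `EMBEDDING_PROVIDER` environment variable as shown in the setup sections below (the function name is illustrative, not textrawl's actual API):

```python
import os

# Vector size each provider produces; the database schema
# must match this value (see the setup sections below).
PROVIDER_DIMENSIONS = {
    "openai": 1536,  # text-embedding-3-small
    "ollama": 1024,  # nomic-embed-text as configured here
}

def select_provider() -> tuple[str, int]:
    """Read EMBEDDING_PROVIDER and return (name, expected dimensions)."""
    name = os.environ.get("EMBEDDING_PROVIDER", "openai").lower()
    if name not in PROVIDER_DIMENSIONS:
        raise ValueError(f"unknown embedding provider: {name}")
    return name, PROVIDER_DIMENSIONS[name]
```

With `EMBEDDING_PROVIDER=ollama` this returns `("ollama", 1024)`; unset, it falls back to OpenAI.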

Provider Comparison

| Feature    | OpenAI                 | Ollama           |
|------------|------------------------|------------------|
| Location   | Cloud                  | Local            |
| Privacy    | Data sent to API       | Data stays local |
| Dimensions | 1536                   | 1024             |
| Model      | text-embedding-3-small | nomic-embed-text |
| Cost       | Pay per token          | Free             |
| Speed      | ~100ms/request         | ~50ms/request    |
| Setup      | API key                | Docker container |
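The dimensions row is the important one: similarity search only works between vectors of equal length, which is why each provider needs its own schema. A small illustration in plain Python (no external dependencies):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors of equal dimension."""
    if len(a) != len(b):
        # e.g. a 1536-dim OpenAI vector against a 1024-dim Ollama vector
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm
```

Comparing a 1536-element vector with a 1024-element vector raises immediately, which is the same constraint pgvector enforces at the column level.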

OpenAI Setup

# .env
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-proj-...

Use the database schema for 1536 dimensions:

# Run in Supabase SQL Editor
scripts/setup-db.sql
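The core of that script is a pgvector column sized to the provider's output. A representative fragment, for illustration only (the real schema lives in scripts/setup-db.sql and may differ):

```sql
-- Illustrative sketch; run scripts/setup-db.sql for the actual schema
create extension if not exists vector;

create table documents (
  id bigserial primary key,
  content text,
  embedding vector(1536)  -- matches text-embedding-3-small
);
```

The Ollama variant of the schema is identical except the column is `vector(1024)`.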

Ollama Setup

# Start Ollama
docker run -d --name ollama -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
 
# Pull the model
docker exec -it ollama ollama pull nomic-embed-text

Configure environment:

# .env
EMBEDDING_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=nomic-embed-text
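With the container running, embeddings come from Ollama's REST API. A minimal sketch using only the standard library, assuming Ollama's `/api/embeddings` endpoint (textrawl's own client code may differ):

```python
import json
import os
import urllib.request

def build_embed_request(text: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    base = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    model = os.environ.get("OLLAMA_MODEL", "nomic-embed-text")
    payload = json.dumps({"model": model, "prompt": text}).encode()
    return urllib.request.Request(
        f"{base}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def embed(text: str) -> list[float]:
    """Send the request and return the embedding vector."""
    with urllib.request.urlopen(build_embed_request(text)) as resp:
        return json.load(resp)["embedding"]
```

`embed("hello world")` should return a 1024-element list when the model is configured as above.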

Use the database schema for 1024 dimensions:

# Run in Supabase SQL Editor
scripts/setup-db-ollama.sql

Switching Providers

Warning: You cannot mix providers. Switching requires re-embedding all documents.

To switch:

  1. Export your documents (or keep source files)
  2. Drop existing tables
  3. Run new schema (different dimensions)
  4. Re-upload all documents
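Step 4 is the expensive part: every document goes back through the new provider's embed call. A sketch of that loop, where `embed` and `store` are stand-ins for textrawl's internals:

```python
def reembed_all(documents, embed, store):
    """Re-embed every document with the new provider and persist it.

    documents: iterable of (doc_id, text) pairs
    embed:     the new provider's embedding function (text -> vector)
    store:     dict-like mapping doc_id -> vector (stand-in for the DB)
    """
    count = 0
    for doc_id, text in documents:
        store[doc_id] = embed(text)  # new dimensions, new table
        count += 1
    return count
```

Re-embedding is billed per token on OpenAI, so for large corpora the cost of the migration itself is worth estimating first.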

Quality Comparison

Both providers offer high-quality embeddings suitable for semantic search:

| Benchmark    | OpenAI | nomic-embed-text |
|--------------|--------|------------------|
| MTEB Average | 62.3   | 59.4             |
| Retrieval    | 54.9   | 52.8             |
| Clustering   | 45.0   | 44.2             |

OpenAI has a slight edge, but Ollama offers privacy and no API costs.

Recommendations

Choose OpenAI if:

  • You need maximum quality
  • Data privacy isn't critical
  • You prefer managed infrastructure

Choose Ollama if:

  • Data must stay local
  • You want to avoid API costs
  • You have GPU resources
  • You're running fully self-hosted
