textrawl
by Jeff Green

Embedding Providers

OpenAI, Google AI, and Ollama for vector embeddings

textrawl supports three embedding providers: OpenAI (cloud), Google AI (cloud), and Ollama (local).

Provider Comparison

Feature     OpenAI                  Google AI            Ollama
Location    Cloud                   Cloud                Local
Privacy     Data sent to API        Data sent to API     Data stays local
Dimensions  1536                    768                  1024 or 768
Model       text-embedding-3-small  text-embedding-004   nomic-embed-text / v2-moe
Cost        Pay per token           Free tier available  Free
Speed       ~100ms/request          ~80ms/request        ~50ms/request
Setup       API key                 API key              Docker container

OpenAI Setup

# .env
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-proj-...

Use the database schema for 1536 dimensions:

# Run in Supabase SQL Editor
scripts/setup-db.sql

Google AI Setup

# .env
EMBEDDING_PROVIDER=google
GOOGLE_AI_API_KEY=your-google-ai-api-key
GOOGLE_EMBEDDING_MODEL=text-embedding-004

Use the database schema for 768 dimensions:

# Run in Supabase SQL Editor
scripts/setup-db-google.sql

Ollama Setup

# Start Ollama
docker run -d --name ollama -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
 
# Pull the model
docker exec -it ollama ollama pull nomic-embed-text

Configure environment:

# .env
EMBEDDING_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=nomic-embed-text
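Before choosing a schema, it can help to confirm what vector length the running model actually returns. The sketch below is an illustration, not part of textrawl: `embedding_dims` is a hypothetical helper that counts the numbers in the `embedding` array of an Ollama `/api/embeddings` response, assuming the response arrives as compact single-line JSON. If `jq` is installed, `jq '.embedding | length'` is a sturdier alternative.

```shell
# Hypothetical helper (not part of textrawl): reads an Ollama
# /api/embeddings JSON response on stdin and prints the vector length.
# Assumes the response is a single line of JSON.
embedding_dims() {
  # 1. Keep only the contents of the "embedding": [...] array.
  # 2. Put each number on its own line.
  # 3. Count the lines that contain a digit.
  sed -n 's/.*"embedding"[[:space:]]*:[[:space:]]*\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -c '[0-9]'
}

# Usage against the container started above:
#   curl -s http://localhost:11434/api/embeddings \
#     -d '{"model": "nomic-embed-text", "prompt": "hello"}' | embedding_dims
```

The printed number should match the vector dimension of the schema you run; if it does not, pick the schema that fits the model you pulled.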

Use the database schema for 1024 dimensions:

# Run in Supabase SQL Editor
scripts/setup-db-ollama.sql
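Each setup above relies on a few provider-specific variables, so a quick pre-flight check can catch an incomplete `.env` early. The following is a minimal sketch under stated assumptions: `missing_env_for_provider` is a hypothetical helper, and treating exactly these variables as required is an assumption drawn from the examples above (textrawl itself may apply defaults for some of them).

```shell
# Hypothetical helper (not part of textrawl): prints the name of every
# variable that is unset or empty for the given provider, and prints
# nothing when the configuration looks complete.
missing_env_for_provider() {
  case "$1" in
    openai) required="OPENAI_API_KEY" ;;
    google) required="GOOGLE_AI_API_KEY" ;;
    ollama) required="OLLAMA_BASE_URL OLLAMA_MODEL" ;;
    *) echo "unknown provider: $1" >&2; return 2 ;;
  esac
  for name in $required; do
    # Indirect variable lookup that also works in POSIX sh.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Usage, after loading .env into the current shell:
#   missing_env_for_provider "$EMBEDDING_PROVIDER"
```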

Switching Providers

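Warning: You cannot mix providers. Embeddings from different models are not comparable, and each schema is built for one vector dimension, so switching requires re-embedding all documents.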

To switch:

  1. Export your documents (or keep source files)
  2. Drop existing tables
  3. Run new schema (different dimensions)
  4. Re-upload all documents
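For step 3, the schema script depends on the provider you are switching to. As a small sketch, the mapping from the setup sections can be written as a lookup; `schema_for_provider` is a hypothetical helper name, while the script paths are the ones listed on this page.

```shell
# Hypothetical helper (not part of textrawl): prints the schema script
# this page lists for each provider.
schema_for_provider() {
  case "$1" in
    openai) echo "scripts/setup-db.sql" ;;
    google) echo "scripts/setup-db-google.sql" ;;
    ollama) echo "scripts/setup-db-ollama.sql" ;;
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

# Usage: show which script to paste into the Supabase SQL Editor.
#   schema_for_provider "$EMBEDDING_PROVIDER"
```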

Quality Comparison

The table below compares published MTEB benchmark scores for OpenAI's text-embedding-3-small and Ollama's nomic-embed-text. Google AI's text-embedding-004 performs comparably on standard retrieval benchmarks but is not included because its evaluation methodology differs.

Benchmark     OpenAI  nomic-embed-text
MTEB Average  62.3    59.4
Retrieval     54.9    52.8
Clustering    45.0    44.2

OpenAI has a slight edge in benchmarks, but Ollama offers privacy and no API costs. Google AI offers a good balance with a free tier and competitive retrieval quality.

Recommendations

Choose OpenAI if:

  • You need maximum quality
  • Data privacy isn't critical
  • You prefer managed infrastructure

Choose Google AI if:

  • You want cloud convenience with a free tier
  • 768-dimension embeddings are sufficient
  • You already use Google Cloud

Choose Ollama if:

  • Data must stay local
  • You want to avoid API costs
  • You have GPU resources
  • You're running fully self-hosted
