textrawl
by Jeff Green

Supabase Requirements

Recommended compute tiers, storage, and database sizing for textrawl

Choose the right Supabase compute tier and understand your database requirements for textrawl.

Vector Dimensions by Provider

The embedding model you choose determines your vector dimensions, which directly impacts storage, index size, and RAM requirements.

Provider   Model                                 Dimensions  Storage per Vector
OpenAI     text-embedding-3-small                1536        ~6 KB
Ollama v1  nomic-embed-text / mxbai-embed-large  1024        ~4 KB
Ollama v2  nomic-embed-text-v2-moe               768         ~3 KB

Lower dimensions = smaller indexes = better performance at every tier.
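
As a rough check, per-vector storage follows directly from the dimension count, since pgvector stores each dimension as a 4-byte float. A sketch (the small per-row header is ignored):

```sql
-- Per-vector storage from dimension count
-- (assumption: pgvector stores each dimension as a 4-byte float)
SELECT provider, dims,
       dims * 4                     AS bytes_per_vector,
       round(dims * 4 / 1024.0, 1) AS approx_kb
FROM (VALUES
  ('OpenAI',    1536),  -- ~6 KB
  ('Ollama v1', 1024),  -- ~4 KB
  ('Ollama v2',  768)   -- ~3 KB
) AS t(provider, dims);
```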

Required Indexes

Textrawl uses up to 6 HNSW vector indexes (depending on which features are enabled), plus GIN indexes for full-text search and B-tree indexes for lookups.

Table                  Index Type  Purpose
chunks                 HNSW        Core semantic search
memory_entities        HNSW        Entity semantic search
memory_observations    HNSW        Memory semantic search
conversation_sessions  HNSW        Conversation semantic search
conversation_turns     HNSW        Turn-level semantic search
proactive_insights     HNSW        Insight semantic search
documents              GIN         Full-text search (tsvector)
memory_observations    GIN         Full-text search (tsvector)
conversation_sessions  GIN         Full-text search on summaries
conversation_turns     GIN         Full-text search on turns

All HNSW indexes should fit in RAM for optimal query performance. If your HNSW indexes exceed shared_buffers, queries will start hitting disk and latency increases significantly.

Storage Estimates

Estimated storage per document (assuming OpenAI 1536-dimension embeddings):

  • 1 document row: ~2 KB
  • ~5 chunks average (512 tokens each): ~62 KB per document (including embeddings)
  • HNSW index overhead: ~7 KB per vector

Documents  Chunks (est.)  DB Size (est.)  Fits in 8 GB Free Disk?
1,000      ~5,000         ~70 MB          Yes
10,000     ~50,000        ~700 MB         Yes
50,000     ~250,000       ~3.5 GB         Yes
100,000    ~500,000       ~7 GB           Yes
150,000+   ~750,000+      ~10 GB+         No (overage at $0.125/GB)
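
The table's sizing can be reproduced with a quick estimate query. A sketch, assuming ~5 chunks and ~70 KB of total footprint per document (the figure implied by ~70 MB per 1,000 documents above):

```sql
-- Rough corpus sizing (assumption: ~70 KB total per document)
SELECT docs,
       docs * 5                                 AS est_chunks,
       pg_size_pretty(docs::bigint * 70 * 1024) AS est_db_size
FROM (VALUES (1000), (10000), (50000), (100000)) AS t(docs);
```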

Compute Tier Recommendations

Supabase Pro plan ($25/mo) includes $10/mo in compute credits, which covers one Micro instance at no extra cost. Upgrade to a larger tier by paying the difference.

Micro -- Included with Pro ($10/mo, covered by credits)

2-core shared ARM, 1 GB RAM, 60 direct connections

  • Included at no extra cost on Pro plan via compute credits
  • Suitable for getting started and light personal use
  • Handles up to ~30K vectors comfortably
  • HNSW indexes must stay under ~250 MB total
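
The ~30K figure can be sanity-checked against the HNSW overhead estimate above (~7 KB per 1536-dimension vector). A sketch:

```sql
-- HNSW RAM needed for ~30K OpenAI-dimension vectors
-- (assumption: ~7 KB index overhead per vector)
SELECT pg_size_pretty(30000::bigint * 7 * 1024) AS hnsw_ram_needed;
-- roughly 205 MB, within the ~250 MB budget
```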

Small -- $15/mo

2-core shared ARM, 2 GB RAM, 90 direct connections

  • Good for personal knowledge bases under 10K documents (~50K chunks)
  • Room for all 6 HNSW indexes at moderate scale

Medium -- $60/mo

2-core shared ARM, 4 GB RAM, 120 direct connections

  • Best value for production personal use
  • Handles 10K-50K documents with all features enabled (memory, conversations, insights)
  • 4 GB gives breathing room for all HNSW + GIN indexes

Large -- $110/mo

2-core dedicated ARM, 8 GB RAM, 160 direct connections

  • First tier with dedicated (non-burst) CPU -- consistent performance
  • For heavy use with 50K+ documents or multiple concurrent users
  • 8 GB RAM keeps all indexes comfortably in memory

XL and Above

For 100K+ documents or multi-tenant deployments. Consider pgvectorscale (DiskANN indexes) at this scale to move vector indexes to SSD instead of RAM.

Disk Type

General Purpose (gp3) is recommended for textrawl. Your workload is read-heavy (search queries) with occasional batch writes (uploads). The included 3,000 IOPS baseline handles this well.

  • 8 GB included free, then $0.125/GB
  • 3,000 IOPS included, $0.024 per additional IOPS

High Performance (io2) is only needed for sustained high-throughput workloads and is not necessary for typical textrawl usage.

Diagnosing Your Current Setup

Run these queries in the Supabase SQL Editor to understand your current database.

Database Size

SELECT pg_size_pretty(pg_database_size(current_database())) AS total_db_size;

Table Sizes

SELECT
  schemaname || '.' || tablename AS table,
  pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size,
  pg_size_pretty(pg_relation_size(schemaname || '.' || tablename)) AS data_size,
  pg_size_pretty(
    pg_total_relation_size(schemaname || '.' || tablename)
    - pg_relation_size(schemaname || '.' || tablename)
  ) AS index_size
FROM pg_tables
WHERE schemaname = 'public'
  AND tablename IN (
    'documents', 'chunks',
    'memory_entities', 'memory_observations', 'memory_relations',
    'conversation_sessions', 'conversation_turns',
    'proactive_insights', 'insight_queue'
  )
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;

Vector Counts

SELECT 'chunks' AS table_name, count(*) AS total_rows, count(embedding) AS vectors
FROM chunks
UNION ALL
SELECT 'memory_entities', count(*), count(embedding) FROM memory_entities
UNION ALL
SELECT 'memory_observations', count(*), count(embedding) FROM memory_observations
UNION ALL
SELECT 'conversation_sessions', count(*), count(summary_embedding) FROM conversation_sessions
UNION ALL
SELECT 'conversation_turns', count(*), count(embedding) FROM conversation_turns
UNION ALL
SELECT 'proactive_insights', count(*), count(embedding) FROM proactive_insights
ORDER BY vectors DESC;

HNSW Index Sizes

SELECT
  indexname,
  pg_size_pretty(pg_relation_size(indexname::regclass)) AS index_size
FROM pg_indexes
WHERE schemaname = 'public'
  AND indexdef ILIKE '%hnsw%'
ORDER BY pg_relation_size(indexname::regclass) DESC;

If the chunks table is missing an HNSW index, create it:

-- Works for all providers (OpenAI 1536d, Ollama v1 1024d, Ollama v2 768d)
-- PostgreSQL infers the dimension from the column type
CREATE INDEX IF NOT EXISTS chunks_embedding_idx ON chunks
USING hnsw (embedding vector_cosine_ops);

Current Compute Tier

SHOW shared_buffers;
SHOW effective_cache_size;
SHOW max_connections;

shared_buffers  max_connections  Likely Tier
~128 MB         60               Nano / Micro
~256 MB         60               Micro
~512 MB         90               Small
~1 GB           120              Medium
~2 GB           160              Large
~4 GB           240              XL

If your total HNSW index size approaches shared_buffers, it's time to upgrade.
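
That check can be made in a single query. A sketch that sums all HNSW indexes in the public schema and reads shared_buffers directly:

```sql
-- Total HNSW index size vs. shared_buffers
SELECT pg_size_pretty(coalesce(sum(pg_relation_size(indexname::regclass)), 0)) AS total_hnsw_size,
       current_setting('shared_buffers') AS shared_buffers
FROM pg_indexes
WHERE schemaname = 'public'
  AND indexdef ILIKE '%hnsw%';
```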

Scaling Tips

  1. Lower dimensions help the most. Switching from OpenAI 1536d to Ollama v2 768d cuts index size in half. This is the cheapest way to scale.

  2. Index bloat happens. Run REINDEX TABLE <table>; periodically on GIN-indexed tables (documents, memory_observations, conversation_sessions, conversation_turns) if indexes grow disproportionately large relative to data size (10:1+ ratio is a sign of bloat).

  3. Temporary compute boosts. For large batch imports, temporarily upgrade your compute tier (billed hourly), build HNSW indexes faster, then scale back down.

  4. Consider pgvectorscale for 100K+ vectors. DiskANN indexes use SSD instead of RAM, making them much cheaper to scale. pgvectorscale is not yet a native Supabase extension, but can be used with Supabase via the pgai Python CLI. For self-hosted PostgreSQL, install it directly.

  5. Keep the spend cap on initially. Textrawl is single-tenant -- you won't get surprise traffic spikes.
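
The 10:1 ratio from tip 2 can be checked directly. A sketch over the GIN-indexed tables named above:

```sql
-- Index-to-data size ratio; 10:1+ suggests bloat worth a REINDEX
SELECT tablename,
       round((pg_total_relation_size(schemaname || '.' || tablename)
              - pg_relation_size(schemaname || '.' || tablename))::numeric
             / greatest(pg_relation_size(schemaname || '.' || tablename), 1), 1)
         AS index_to_data_ratio
FROM pg_tables
WHERE schemaname = 'public'
  AND tablename IN ('documents', 'memory_observations',
                    'conversation_sessions', 'conversation_turns')
ORDER BY 2 DESC;
```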