search_knowledge

Hybrid search combining semantic similarity with full-text keyword matching using Reciprocal Rank Fusion (RRF).

Parameters

Parameter	Type	Required	Default	Description
`query`	string	Yes	-	Natural language search query (1-10000 chars)
`limit`	number	No	10	Maximum results to return (1-50)
`fullTextWeight`	number	No	1.0	Weight for keyword matching (0-2)
`semanticWeight`	number	No	1.0	Weight for semantic similarity (0-2)
`tags`	string[]	No	-	Filter to documents with ALL specified tags
`sourceType`	enum	No	-	Filter by source: `note`, `file`, or `url`
`minScore`	number	No	-	Minimum relevance score threshold (0-1)
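
The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of textrawl: `validate_search_params` is a hypothetical helper, and treating `limit` as an integer is an assumption, since the table only says `number`.

```python
from typing import Any

SOURCE_TYPES = {"note", "file", "url"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_search_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems: list[str] = []

    query = params.get("query")
    if not isinstance(query, str) or not 1 <= len(query) <= 10000:
        problems.append("query must be a string of 1-10000 characters")

    # Assumption: limit is a whole number.
    limit = params.get("limit", 10)
    if not isinstance(limit, int) or isinstance(limit, bool) or not 1 <= limit <= 50:
        problems.append("limit must be an integer from 1 to 50")

    for name in ("fullTextWeight", "semanticWeight"):
        weight = params.get(name, 1.0)
        if not _is_number(weight) or not 0 <= weight <= 2:
            problems.append(f"{name} must be a number from 0 to 2")

    tags = params.get("tags")
    if tags is not None and not (
        isinstance(tags, list) and all(isinstance(t, str) for t in tags)
    ):
        problems.append("tags must be a list of strings")

    source_type = params.get("sourceType")
    if source_type is not None and source_type not in SOURCE_TYPES:
        problems.append("sourceType must be one of: note, file, url")

    min_score = params.get("minScore")
    if min_score is not None and (not _is_number(min_score) or not 0 <= min_score <= 1):
        problems.append("minScore must be a number from 0 to 1")

    return problems
```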

Example Request

{
  "query": "quarterly planning meeting notes",
  "limit": 5,
  "fullTextWeight": 1.0,
  "semanticWeight": 1.5,
  "tags": ["work", "planning"],
  "sourceType": "note",
  "minScore": 0.5
}

Response

{
  "query": "quarterly planning meeting notes",
  "filters": {
    "tags": ["work", "planning"],
    "sourceType": "note",
    "minScore": 0.5
  },
  "totalResults": 2,
  "results": [
    {
      "documentId": "550e8400-e29b-41d4-a716-446655440000",
      "documentTitle": "Q4 Planning Notes",
      "sourceType": "note",
      "tags": ["work", "planning", "q4"],
      "chunkId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "content": "In the Q4 planning meeting, we discussed the roadmap for...",
      "score": 0.89
    },
    {
      "documentId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
      "documentTitle": "Annual Planning Overview",
      "sourceType": "note",
      "tags": ["work", "planning"],
      "chunkId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
      "content": "Quarterly milestones should align with company objectives...",
      "score": 0.76
    }
  ]
}

How It Works

Reciprocal Rank Fusion (RRF)

textrawl runs two parallel searches:

Full-Text Search: PostgreSQL tsvector matching for exact keywords
Semantic Search: pgvector cosine similarity for meaning

Results are combined using RRF:

RRF_score = (fullTextWeight / (k + fts_rank)) + (semanticWeight / (k + semantic_rank))

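Where fts_rank and semantic_rank are a result's positions in the full-text and semantic result lists, and k = 60 (the standard RRF constant).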
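
The formula can be sketched in a few lines of Python. This is an illustration of weighted RRF, not textrawl's actual implementation: `rrf_fuse` is a hypothetical name, ranks are assumed to start at 1, and a chunk missing from one list is assumed to contribute nothing for that list.

```python
from collections import defaultdict

RRF_K = 60  # the constant k from the formula above


def rrf_fuse(fts_ids, semantic_ids, full_text_weight=1.0, semantic_weight=1.0, k=RRF_K):
    """Fuse two ranked lists of chunk IDs into (chunk_id, score) pairs, best first."""
    scores = defaultdict(float)

    # Assumption: ranks are 1-based positions in each result list.
    for rank, chunk_id in enumerate(fts_ids, start=1):
        scores[chunk_id] += full_text_weight / (k + rank)
    for rank, chunk_id in enumerate(semantic_ids, start=1):
        scores[chunk_id] += semantic_weight / (k + rank)

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    fused = rrf_fuse(["a", "b"], ["b", "c"])
    for chunk_id, score in fused:
        print(chunk_id, round(score, 5))
```

A chunk that appears in both lists ("b" here) accumulates two terms and rises to the top. The raw sums produced by this sketch are small and make no attempt to reproduce the score values shown in the response example.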

Weight Tuning

Use Case	fullTextWeight	semanticWeight
Exact phrases	1.5-2.0	0.5-1.0
Conceptual search	0.5-1.0	1.5-2.0
Balanced (default)	1.0	1.0
Keyword-only	2.0	0
Semantic-only	0	2.0

Tip: Start with default weights and adjust based on results. Higher semantic weight helps with paraphrased queries.
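
To see why the weights matter, consider two hypothetical chunks: one ranked 1st by full-text search and 10th by semantic search (an exact keyword hit), and one ranked the other way round (a paraphrase). The sketch below applies the RRF formula to both. The ranks and function names are made up for illustration.

```python
K = 60  # RRF constant


def rrf_score(fts_rank, semantic_rank, full_text_weight, semantic_weight):
    """Weighted RRF score for one chunk, given its rank in each list."""
    return full_text_weight / (K + fts_rank) + semantic_weight / (K + semantic_rank)


def winner(full_text_weight, semantic_weight):
    """Report which hypothetical chunk scores higher under the given weights."""
    exact = rrf_score(1, 10, full_text_weight, semantic_weight)       # keyword hit
    paraphrase = rrf_score(10, 1, full_text_weight, semantic_weight)  # meaning hit
    if exact > paraphrase:
        return "exact"
    if paraphrase > exact:
        return "paraphrase"
    return "tie"


if __name__ == "__main__":
    for weights in [(1.0, 1.0), (2.0, 0.5), (0.5, 2.0)]:
        print(weights, winner(*weights))
```

With default weights the two chunks tie; raising `fullTextWeight` favors the exact keyword hit, and raising `semanticWeight` favors the paraphrase.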

Filtering

By Tags

Filter to documents containing ALL specified tags:

{
  "query": "project update",
  "tags": ["work", "project-x"]
}

Returns only documents tagged with both `work` and `project-x`.
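
The ALL semantics amount to a subset check: every requested tag must be present on the document, and extra tags on the document do not matter. A minimal sketch, with `has_all_tags` as a hypothetical helper name:

```python
def has_all_tags(document_tags, required_tags):
    """True when every required tag appears among the document's tags."""
    return set(required_tags).issubset(document_tags)


if __name__ == "__main__":
    required = ["work", "project-x"]
    print(has_all_tags(["work", "project-x", "q4"], required))  # extra tags are fine
    print(has_all_tags(["work"], required))                     # missing project-x
```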

By Source Type

Filter by how the document was created:

sourceType	Description
`note`	Created via `add_note` tool
`file`	Uploaded via CLI or Web UI
`url`	Web content (future feature)

By Score Threshold

Filter out low-relevance results:

{
  "query": "specific topic",
  "minScore": 0.7
}

Caution: Scores are relative, not absolute. A 0.7 threshold can exclude relevant results, especially for broad queries.
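
Because scores are relative, one alternative is to omit `minScore` and trim results on the client relative to the best hit. The sketch below is one possible approach, not a textrawl feature: `keep_near_top` and its `ratio` parameter are hypothetical.

```python
def keep_near_top(results, ratio=0.8):
    """Keep results whose score is at least `ratio` times the best score."""
    if not results:
        return []
    top = max(r["score"] for r in results)
    cutoff = top * ratio
    return [r for r in results if r["score"] >= cutoff]


if __name__ == "__main__":
    sample = [
        {"documentTitle": "Q4 Planning Notes", "score": 0.89},
        {"documentTitle": "Annual Planning Overview", "score": 0.76},
        {"documentTitle": "Unrelated", "score": 0.30},
    ]
    for result in keep_near_top(sample):
        print(result["documentTitle"], result["score"])
```

A relative cutoff adapts to broad queries where even the best match scores low, at the cost of always returning at least one result.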

Error Responses

Database not configured

{
  "error": "Database not configured",
  "message": "Set SUPABASE_URL and SUPABASE_SERVICE_KEY to enable search."
}

OpenAI not configured

{
  "error": "OpenAI not configured",
  "message": "Set OPENAI_API_KEY to enable semantic search."
}

Search failed

{
  "error": "Search failed",
  "message": "Connection timeout to database"
}
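
A client can separate these error payloads from successful responses before reading `results`. The sketch below assumes that an error is signalled by a top-level `error` key, as in the examples above. `SearchError` and `unwrap_search_response` are hypothetical names.

```python
class SearchError(Exception):
    """Raised when a search_knowledge response carries an error payload."""

    def __init__(self, error, message):
        super().__init__(f"{error}: {message}")
        self.error = error
        self.message = message


def unwrap_search_response(payload):
    """Return the results list, or raise SearchError for an error payload."""
    # Assumption: error responses are identified by a top-level "error" key.
    if "error" in payload:
        raise SearchError(payload["error"], payload.get("message", ""))
    return payload.get("results", [])


if __name__ == "__main__":
    try:
        unwrap_search_response(
            {"error": "Search failed", "message": "Connection timeout to database"}
        )
    except SearchError as exc:
        print(exc)
```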

Best Practices

Start broad, then narrow: Begin without filters, and add them only if there are too many results.
Use semantic weight for questions: Increase semanticWeight for natural language queries.
Use full-text weight for keywords: Increase fullTextWeight for specific terms.
Combine with get_document: Search returns chunks; use get_document to retrieve the full document for context.
Check score distribution: If all scores are low, try rephrasing the query.
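
Since search returns chunks, several results can point at the same document. Before calling get_document, it can help to collapse the results to one entry per `documentId`. This is a minimal sketch using the response fields shown earlier. `best_chunk_per_document` is a hypothetical helper, and the parameters of get_document are not covered here.

```python
def best_chunk_per_document(results):
    """Keep the highest-scoring chunk for each documentId, best first."""
    best = {}
    for result in results:
        doc_id = result["documentId"]
        if doc_id not in best or result["score"] > best[doc_id]["score"]:
            best[doc_id] = result
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    results = [
        {"documentId": "doc-1", "chunkId": "c1", "score": 0.89},
        {"documentId": "doc-1", "chunkId": "c2", "score": 0.81},
        {"documentId": "doc-2", "chunkId": "c3", "score": 0.76},
    ]
    for hit in best_chunk_per_document(results):
        print(hit["documentId"], hit["chunkId"], hit["score"])
```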
