search

Hybrid search combining semantic similarity with full-text keyword matching using Reciprocal Rank Fusion (RRF). Optionally search across entity memories and past conversations with weighted fusion.

Parameters

Parameter	Type	Required	Default	Description
`query`	string	Yes	-	Natural language search query (1-10000 chars)
`limit`	number	No	5	Maximum results to return (1-50)
`fullTextWeight`	number	No	1.0	Weight for keyword matching (0-2)
`semanticWeight`	number	No	1.0	Weight for semantic similarity (0-2)
`tags`	string[]	No	-	Filter to documents with ALL specified tags
`sourceType`	enum	No	-	Filter by source: `note`, `file`, or `url`
`contentType`	enum	No	-	Filter by content type: `email`, `youtube`, `calendar`, `contact`, `webpage`, `document`
`minScore`	number	No	-	Minimum relevance score threshold (0-1)
`includeMemories`	boolean	No	false	Also search entity memories (requires `ENABLE_MEMORY`)
`includeConversations`	boolean	No	false	Also search past conversations (requires `ENABLE_CONVERSATIONS`)
`memoryWeight`	number	No	1.0	Weight for memory results when `includeMemories=true` (0-2)
`conversationWeight`	number	No	0.5	Weight for conversation results when `includeConversations=true` (0-2)
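
The ranges in the table can be checked client-side before a request is sent. The following is a minimal sketch, not textrawl's own validation: the helper name `build_search_args` is hypothetical, the ranges are assumed to be inclusive, and only the parameter names and limits come from the table above.

```python
from typing import Any

# Numeric ranges copied from the parameter table: (min, max), assumed inclusive.
RANGES = {
    "limit": (1, 50),
    "fullTextWeight": (0, 2),
    "semanticWeight": (0, 2),
    "minScore": (0, 1),
    "memoryWeight": (0, 2),
    "conversationWeight": (0, 2),
}


def build_search_args(query: str, **options: Any) -> dict[str, Any]:
    """Return an arguments dict for `search`; raise ValueError on out-of-range input."""
    if not 1 <= len(query) <= 10000:
        raise ValueError("query must be 1-10000 characters")
    for name, value in options.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
    # Options that are left out are not sent, so the documented defaults apply.
    return {"query": query, **options}


args = build_search_args("quarterly planning meeting notes", limit=5, semanticWeight=1.5)
```

Non-numeric options such as `tags` or `sourceType` pass through unchecked in this sketch.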

Example Request

Document-Only Search (Default)

{
  "query": "quarterly planning meeting notes",
  "limit": 5,
  "fullTextWeight": 1.0,
  "semanticWeight": 1.5,
  "tags": ["work", "planning"],
  "sourceType": "note",
  "minScore": 0.5
}

Cross-Source Search

{
  "query": "database architecture decisions",
  "includeMemories": true,
  "includeConversations": true,
  "memoryWeight": 1.5,
  "limit": 10
}

Response

Document-Only Response

{
  "query": "quarterly planning meeting notes",
  "totalResults": 3,
  "results": [
    {
      "documentId": "550e8400-e29b-41d4-a716-446655440000",
      "documentTitle": "Q4 Planning Notes",
      "sourceType": "note",
      "tags": ["work", "planning", "q4"],
      "chunkId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "content": "In the Q4 planning meeting, we discussed the roadmap for...",
      "score": 0.89
    }
  ]
}

Cross-Source Response

When `includeMemories` or `includeConversations` is `true`, results from all enabled sources are fused into a single list ordered by score, and each result carries a `type`:

{
  "query": "database architecture decisions",
  "counts": { "documents": 3, "memories": 2, "conversations": 1 },
  "results": [
    { "type": "document", "score": 0.91, "data": { "id": "...", "title": "DB Design Doc", "content": "..." } },
    { "type": "memory", "score": 0.85, "data": { "entityName": "Database", "content": "Uses PostgreSQL with pgvector" } },
    { "type": "conversation", "score": 0.42, "data": { "sessionKey": "db-planning", "title": "DB Planning Session" } }
  ]
}
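
Because a cross-source response mixes result types in one list, a client will often want to split it by `type`. The sketch below does that for an abbreviated copy of the sample response; the helper name `group_by_type` is hypothetical and the `data` payloads are shortened for illustration.

```python
import json
from collections import defaultdict

# Abbreviated copy of the cross-source sample response.
response = json.loads("""
{
  "query": "database architecture decisions",
  "counts": {"documents": 3, "memories": 2, "conversations": 1},
  "results": [
    {"type": "document", "score": 0.91, "data": {"title": "DB Design Doc"}},
    {"type": "memory", "score": 0.85, "data": {"entityName": "Database"}},
    {"type": "conversation", "score": 0.42, "data": {"sessionKey": "db-planning"}}
  ]
}
""")


def group_by_type(results):
    """Bucket fused results by `type`, keeping the fused order inside each bucket."""
    grouped = defaultdict(list)
    for item in results:
        grouped[item["type"]].append(item)
    return dict(grouped)


grouped = group_by_type(response["results"])
for kind, items in grouped.items():
    print(kind, [item["score"] for item in items])
```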

Output Schema

This tool MUST return `structuredContent` alongside the text response. The `structuredContent` object MUST use canonical verbose keys regardless of the `COMPACT_RESPONSES` setting.

Field	Type	Description
`query`	string	The search query
`totalResults`	integer	Number of results returned
`results`	array	Array of result objects
`results[].type`	enum	`document`, `memory`, or `conversation`
`results[].score`	number	Relevance score
`results[].documentId`	string	Document UUID (document results)
`results[].documentTitle`	string	Document title (document results)
`results[].sourceType`	string	Source type (document results)
`results[].tags`	string[]	Document tags (document results)
`results[].chunkId`	string	Chunk UUID (document results)
`results[].content`	string	Content snippet (truncated to 500 chars)
`results[].entityName`	string	Entity name (memory results)
`results[].entityType`	string	Entity type (memory results)
`results[].sessionId`	string	Session ID (conversation results)
`results[].sessionKey`	string?	Session key (conversation results)
`results[].title`	string?	Session title (conversation results)
`results[].summary`	string?	Session summary (conversation results)
`counts`	object	Source counts (cross-source only)
`counts.documents`	integer	Number of document results
`counts.memories`	integer	Number of memory results
`counts.conversations`	integer	Number of conversation results

How It Works

Reciprocal Rank Fusion (RRF)

textrawl runs two parallel searches:

Full-Text Search: PostgreSQL tsvector matching for exact keywords
Semantic Search: pgvector cosine similarity for meaning

Results are combined using RRF:

RRF_score = (fullTextWeight / (k + fts_rank)) + (semanticWeight / (k + semantic_rank))

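Where k = 60 (the standard RRF constant), and fts_rank and semantic_rank are the result's positions in the full-text and semantic result lists.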
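
The formula can be worked through directly. This is a sketch rather than textrawl's implementation: it assumes ranks are 1-based and that a chunk missing from one result list simply contributes nothing for that list. The function name `rrf_score` is hypothetical.

```python
K = 60  # RRF constant stated above


def rrf_score(fts_rank, semantic_rank, full_text_weight=1.0, semantic_weight=1.0, k=K):
    """Weighted RRF score for one chunk.

    A rank of None means the chunk did not appear in that result list
    (an assumption of this sketch), so that term is skipped.
    """
    score = 0.0
    if fts_rank is not None:
        score += full_text_weight / (k + fts_rank)
    if semantic_rank is not None:
        score += semantic_weight / (k + semantic_rank)
    return score


# Ranked 1st by full-text search and 3rd by semantic search, default weights:
both = rrf_score(1, 3)          # 1/61 + 1/63, about 0.0323
# Found only by semantic search (ranked 1st), with semanticWeight = 2.0:
semantic_only = rrf_score(None, 1, semantic_weight=2.0)  # 2/61
```

Because rank sits in the denominator next to k = 60, the gap between adjacent ranks is small, so the weights have a strong influence on the final ordering.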

Weight Tuning

Use Case	fullTextWeight	semanticWeight
Exact phrases	1.5-2.0	0.5-1.0
Conceptual search	0.5-1.0	1.5-2.0
Balanced (default)	1.0	1.0
Keyword-only	2.0	0
Semantic-only	0	2.0

Tip: Start with default weights and adjust based on results. Higher semantic weight helps with paraphrased queries.
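
To see why the weights matter, consider two chunks with made-up ranks: one that matches the keywords exactly but is semantically distant, and one that paraphrases the query. The ranks and chunk names below are hypothetical; only the formula and k = 60 come from this page.

```python
K = 60


def rrf(fts_rank, semantic_rank, full_text_weight, semantic_weight):
    """Weighted RRF score from the formula above (1-based ranks assumed)."""
    return full_text_weight / (K + fts_rank) + semantic_weight / (K + semantic_rank)


# Hypothetical ranks: name -> (fts_rank, semantic_rank)
chunks = {"exact-match": (1, 8), "paraphrase": (6, 2)}


def top_chunk(full_text_weight, semantic_weight):
    """Return the name of the chunk with the highest weighted RRF score."""
    return max(
        chunks,
        key=lambda name: rrf(*chunks[name], full_text_weight, semantic_weight),
    )


print(top_chunk(2.0, 0.5))  # keyword-heavy weights favour "exact-match"
print(top_chunk(0.5, 2.0))  # semantic-heavy weights favour "paraphrase"
```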

Filtering

By Tags

Filter to documents containing ALL specified tags:

{
  "query": "project update",
  "tags": ["work", "project-x"]
}

This returns only documents tagged with both `work` and `project-x`.
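
The ALL semantics amount to a subset test: every requested tag must be present on the document, and extra tags on the document do not matter. A minimal sketch, assuming exact string comparison (the helper name `matches_all_tags` is hypothetical):

```python
def matches_all_tags(document_tags, required_tags):
    """True when every required tag appears among the document's tags."""
    return set(required_tags).issubset(document_tags)


required = ["work", "project-x"]
print(matches_all_tags(["work", "project-x", "q4"], required))  # has both, plus an extra
print(matches_all_tags(["work", "planning"], required))         # missing "project-x"
```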

By Source Type

sourceType	Description
`note`	Created via `add_note` tool
`file`	Uploaded via CLI or Web UI
`url`	Web content (future feature)

By Content Type

contentType	Description
`email`	Email messages
`youtube`	YouTube watch history
`calendar`	Calendar events
`contact`	Contacts
`webpage`	Saved web pages
`document`	General documents

By Score Threshold

Filter out low-relevance results:

{
  "query": "specific topic",
  "minScore": 0.7
}

Caution: Scores are relative. A 0.7 threshold may filter out relevant results for broad queries.
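
The effect of a threshold can be previewed on results you already have. The sketch below assumes the comparison is inclusive (`score >= minScore`), which this page does not state; the helper name `apply_min_score` and the second score list are hypothetical.

```python
def apply_min_score(results, min_score):
    """Keep results whose score is at least min_score (inclusive, by assumption)."""
    return [r for r in results if r["score"] >= min_score]


# Scores taken from the cross-source sample response.
focused = [{"score": 0.91}, {"score": 0.85}, {"score": 0.42}]
# Made-up scores for a broad query where even the best hit scores modestly.
broad = [{"score": 0.55}, {"score": 0.48}, {"score": 0.31}]

kept_focused = apply_min_score(focused, 0.7)  # two results survive
kept_broad = apply_min_score(broad, 0.7)      # nothing survives
```

In the second case a 0.7 threshold returns an empty list even though the top hits may still be useful, which is the situation the caution describes.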

Error Responses

Error	Cause	Fix
Database not configured	Missing database connection	Set `DATABASE_URL`
Embedding provider not configured	Missing API key	Set `OPENAI_API_KEY` or configure Ollama
Search failed	Database or embedding error	Check connectivity and logs

Best Practices

Start broad, then narrow: Begin without filters and add them only if there are too many results
Use semantic weight for questions: Increase `semanticWeight` for natural language queries
Use full-text weight for keywords: Increase `fullTextWeight` for specific terms
Combine with `get_document`: Search returns chunks; use `get_document` to retrieve the full context
Use cross-source sparingly: Enable `includeMemories`/`includeConversations` only when you need context from those sources
Check score distribution: If all scores are low, try rephrasing the query
