textrawl
by Jeff Green

CLI Tools Overview

Convert files to searchable knowledge with command-line tools

textrawl includes CLI tools that convert various file formats to Markdown and then upload the converted files to your knowledge base.

Quick Start

# Convert files
npm run convert -- mbox ~/Mail/archive.mbox
npm run convert -- html ./saved-pages/
 
# Upload to Supabase
npm run upload -- ./converted/emails

Available Commands

mbox

Convert MBOX email archives (Gmail exports, Thunderbird backups).


html

Convert HTML files and web pages to searchable markdown.


upload

Upload converted files to Supabase with automatic embedding.


Unified Converter

The main entry point for all conversions:

npm run convert -- <command> <path> [options]

| Command | Description |
|---|---|
| mbox <file> | Convert MBOX email archive |
| eml <path> | Convert EML file(s) or directory |
| html <path> | Convert HTML file(s) or directory |
| takeout <path> | Convert Google Takeout archive |
| auto <path> | Auto-detect format and convert |
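
The page does not say how the auto command detects a format. As a rough illustration only, the sketch below guesses a subcommand from the file extension. The mapping and the `guess_command` helper are hypothetical and not part of textrawl; the real tool may inspect file contents instead.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension-to-command mapping (an assumption, not textrawl's logic).
EXTENSION_TO_COMMAND = {
    ".mbox": "mbox",
    ".eml": "eml",
    ".html": "html",
    ".htm": "html",
    ".zip": "takeout",  # assumes a ZIP file is a Google Takeout archive
}


def guess_command(path: str) -> Optional[str]:
    """Return the convert subcommand suggested by the file extension, or None."""
    suffix = Path(path).suffix.lower()
    return EXTENSION_TO_COMMAND.get(suffix)
```

A directory or an unknown extension yields None, in which case a caller would need to choose the subcommand explicitly.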

Common Options

All conversion commands support these options:

| Option | Default | Description |
|---|---|---|
| -o, --output <dir> | ./converted | Output directory |
| -v, --verbose | false | Enable verbose logging |
| --dry-run | false | Preview without writing files |
| -t, --tags <tags...> | [] | Additional tags to add |
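
When scripting conversions, it can help to assemble the command line in one place. The sketch below builds an argument list from the options listed above. The `build_convert_command` helper is hypothetical, and it assumes the variadic `--tags` option accepts space-separated values after the flag.

```python
from typing import List, Optional, Sequence


def build_convert_command(
    command: str,
    path: str,
    output: Optional[str] = None,
    verbose: bool = False,
    dry_run: bool = False,
    tags: Sequence[str] = (),
) -> List[str]:
    """Build the argv for `npm run convert` using only the documented options."""
    argv = ["npm", "run", "convert", "--", command, path]
    if output is not None:
        argv += ["--output", output]
    if verbose:
        argv.append("--verbose")
    if dry_run:
        argv.append("--dry-run")
    if tags:
        # Assumption: multiple tags are passed as separate arguments after --tags.
        argv += ["--tags", *tags]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the project directory. Starting with `dry_run=True` previews the conversion without writing files.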

Workflow

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Source    │     │  Converted  │     │  Supabase   │
│   Files     │ ──► │  Markdown   │ ──► │  Knowledge  │
│  (MBOX,HTML)│     │   + YAML    │     │    Base     │
└─────────────┘     └─────────────┘     └─────────────┘
      │                    │                    │
      │                    │                    │
 npm run convert     Frontmatter         npm run upload
                     with metadata

Output Format

Converted files are saved as Markdown with YAML frontmatter:

---
title: "Email Subject or Page Title"
source_type: email
source_hash: "abc123..."
tags:
  - imported
  - email
created_at: "2024-01-15T10:30:00Z"
converted_at: "2024-03-20T14:22:00Z"
metadata:
  from: "sender@example.com"
  to: "recipient@example.com"
---
 
# Email Subject
 
Email or document content here...
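
To inspect converted files in your own scripts, the frontmatter can be separated from the Markdown body by looking for the two `---` delimiter lines. This is a minimal sketch using only the standard library; `split_frontmatter` is a hypothetical helper, not a textrawl API, and it returns the YAML as an unparsed string.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Split a converted file into (frontmatter_yaml, markdown_body).

    Returns an empty frontmatter string when there is no leading '---' block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    # No closing delimiter found: treat the whole file as body.
    return "", text


sample = """---
title: "Email Subject or Page Title"
source_type: email
---

# Email Subject

Email or document content here...
"""
frontmatter, body = split_frontmatter(sample)
```

The frontmatter string can then be handed to a YAML parser of your choice, for example PyYAML's `yaml.safe_load`, if that package is installed.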

Frontmatter Fields

| Field | Description |
|---|---|
| title | Document title |
| source_type | One of email, web, youtube, calendar, contact |
| source_hash | Hash for deduplication |
| tags | Array of tags |
| created_at | Original creation date |
| converted_at | Conversion timestamp |
| metadata | Format-specific metadata |
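
If you edit converted files by hand before uploading, a quick check of the frontmatter may catch mistakes early. The sketch below validates an already-parsed frontmatter dictionary against the fields listed above. `validate_frontmatter` is a hypothetical helper, and treating every field except metadata as required is an assumption, not documented textrawl behavior.

```python
from datetime import datetime
from typing import Any, Dict, List

# Allowed source_type values, taken from the field list above.
SOURCE_TYPES = ("email", "web", "youtube", "calendar", "contact")
# Assumption: every field except metadata is required.
REQUIRED_FIELDS = ("title", "source_type", "source_hash", "tags", "created_at", "converted_at")


def _is_iso_timestamp(value: Any) -> bool:
    """Check that value is a string in ISO 8601 form such as 2024-01-15T10:30:00Z."""
    if not isinstance(value, str):
        return False
    try:
        # Older Python versions reject a trailing "Z", so normalize it first.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_frontmatter(fm: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in fm]
    if "source_type" in fm and fm["source_type"] not in SOURCE_TYPES:
        problems.append(f"unknown source_type: {fm['source_type']}")
    if "tags" in fm and not isinstance(fm["tags"], list):
        problems.append("tags must be a list")
    for name in ("created_at", "converted_at"):
        if name in fm and not _is_iso_timestamp(fm[name]):
            problems.append(f"{name} is not an ISO 8601 timestamp")
    return problems
```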

The source_hash field prevents duplicate uploads: when you re-run the upload command, documents that were already uploaded are not added again.
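
The general idea behind hash-based deduplication can be sketched as follows. This is an illustration of the technique, not textrawl's implementation: the choice of SHA-256, the decision to hash the document content, and both helper names are assumptions.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def content_hash(content: str) -> str:
    """Hash document content. SHA-256 is an assumed choice, not confirmed for textrawl."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def select_new_documents(
    documents: Iterable[Tuple[str, str]],
    already_uploaded: Set[str],
) -> List[str]:
    """Return the names of documents whose hash has not been seen before.

    `documents` yields (name, content) pairs. `already_uploaded` holds known
    hashes and is updated in place, so a repeated run skips earlier uploads.
    """
    new_names = []
    for name, content in documents:
        digest = content_hash(content)
        if digest in already_uploaded:
            continue  # duplicate content: skip the upload
        already_uploaded.add(digest)
        new_names.append(name)
    return new_names
```

Running the selection twice over the same files with the same hash set returns an empty list the second time, which mirrors the effect of re-running an upload.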

Web UI

textrawl also includes a drag-and-drop web interface for file conversion:

npm run ui
# Opens at http://localhost:3001

Features:

  • Drag-and-drop file upload
  • Real-time conversion progress
  • Auto-upload option after conversion
  • Supports MBOX, EML, ZIP (Takeout), HTML, PDF, DOCX, TXT, MD

