Extract structured data from any document with AI

Send a PDF, image, or text plus a schema. Get clean, typed JSON back with confidence scores. One API call. That simple.

Start Free (100 extractions/mo)
Read the Docs
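// One API call to extract structured data.
// API_KEY and invoicePdfBase64 are placeholders you supply.
const response = await fetch("https://api.ramlabs.dev/v1/extract", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    document: invoicePdfBase64,
    documentType: "pdf",
    schema: "invoice",
  }),
});

// fetch resolves on HTTP error statuses too, so check before parsing
if (!response.ok) throw new Error(`Extraction failed: HTTP ${response.status}`);

// Get typed JSON with confidence scores
const { data, confidence } = await response.json();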
// data.vendor_name = "Acme Corp" (confidence: 0.98)
// data.total_amount = 1500.00  (confidence: 0.99)
📄

Any Document Format

PDF, images (JPEG, PNG, WebP), and plain text. Send base64 or raw text and we handle the rest.

📌

Schema-First Extraction

Define exactly what you want. Use pre-built templates (invoice, receipt, resume, contract) or bring your own schema.

📈

Confidence Scores

Every extracted field includes a 0-1 confidence score so you know when to trust the output and when to flag for review.

Edge-Fast Performance

Deployed on Cloudflare Workers for sub-second cold starts globally. Most extractions complete in 1-3 seconds.

💻

Developer-First DX

Clean REST API, clear error messages, Python and Node.js examples. Get integrated in minutes, not days.

🔒

Usage-Based Pricing

Pay only for what you use. Free tier to start, transparent metered billing via Lemon Squeezy. No surprises.

How It Works

From document to structured data in seconds.

1. Define Your Schema

Use a pre-built template (invoice, receipt, resume, contract) or define any custom JSON schema with the fields you need.
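
This page does not show what a custom schema looks like on the wire. Here is a minimal sketch, assuming the `schema` field accepts either a template name (as in the invoice example) or a JSON Schema object. The object form, `purchaseOrderSchema`, and `buildExtractBody` are illustrative assumptions, not confirmed API details.

```javascript
// Hypothetical custom schema, written as standard JSON Schema.
// ASSUMPTION: the API accepts an object in `schema`; only template
// names such as "invoice" appear in the examples on this page.
const purchaseOrderSchema = {
  type: "object",
  properties: {
    po_number: { type: "string" },
    order_date: { type: "string", format: "date" },
    total_amount: { type: "number" },
  },
  required: ["po_number", "total_amount"],
};

// Build the JSON body for POST /v1/extract.
// `schema` is a template name or a schema object (see assumption above).
function buildExtractBody(document, schema, documentType) {
  const body = { document, schema };
  // documentType is included only when the caller provides one.
  if (documentType !== undefined) body.documentType = documentType;
  return JSON.stringify(body);
}
```

Switching between a template and a custom schema then only changes the second argument, for example `buildExtractBody(text, "invoice")` versus `buildExtractBody(text, purchaseOrderSchema)`.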

2. Send Your Document

POST a PDF, image, or text to the API. Base64-encode files or send raw text. One endpoint handles everything.
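
Base64-encoding a file before sending it can be sketched in Node.js as follows. The field names come from the JavaScript example at the top of this page; the helper names `encodeDocument` and `buildPdfBody` are made up for illustration.

```javascript
// Encode raw file bytes (Buffer or Uint8Array) as a base64 string
// suitable for the `document` field.
function encodeDocument(bytes) {
  return Buffer.from(bytes).toString("base64");
}

// Assemble a request body for a PDF, using the field names from the
// JavaScript example at the top of this page.
function buildPdfBody(pdfBytes, schema) {
  return {
    document: encodeDocument(pdfBytes),
    documentType: "pdf",
    schema,
  };
}
```

In Node.js the bytes would typically come from something like `fs.readFileSync("invoice.pdf")` (the file name is a placeholder), and the result of `buildPdfBody` would be passed through `JSON.stringify` as the request body.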

3. Get Structured JSON

Receive typed JSON matching your schema with per-field confidence scores. Auto-process high-confidence results, flag low ones for review.
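
One way to act on the scores is to split fields into an auto-accepted group and a review group. This sketch assumes the response shape shown in the sample output (`data` plus a `confidence` map keyed by field name). The function name and the 0.9 threshold are arbitrary choices for illustration.

```javascript
// Split extracted fields into auto-accepted and needs-review groups.
// ASSUMPTION: `confidence` maps each field name to a 0-1 score, as in the
// sample responses; the 0.9 default threshold is an arbitrary illustration.
function triageFields(data, confidence, threshold = 0.9) {
  const accepted = {};
  const review = {};
  for (const [field, value] of Object.entries(data)) {
    // A field with no score is treated as low confidence.
    const score = confidence[field] ?? 0;
    (score >= threshold ? accepted : review)[field] = value;
  }
  return { accepted, review };
}
```

A stricter threshold sends more fields to review. The right value depends on how costly a wrong field is in your workflow.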

Why ScoutExtract Over Traditional OCR?

Compare the developer experience.

                   | ScoutExtract                    | AWS Textract                     | Google Document AI
Integration time   | Minutes                         | Hours–Days                       | Hours–Days
Output format      | Typed JSON matching your schema | Raw text blocks + bounding boxes | Entities + key-value pairs
New document types | Zero-shot (just change schema)  | Custom adapters needed           | Requires training data
Confidence scores  | Per-field, 0–1                  | Per-block only                   | Per-entity
Free tier          | 100/month, no card              | 1,000/month (12 months)          | Trial credits

Simple, Usage-Based Pricing

Start free. Scale as you grow. No hidden fees.

Free

$0/mo
  • 100 extractions/month
  • All document types
  • Pre-built templates
  • Community support
Get Started

Pro

$199/mo
  • 5,000 extractions/month
  • Priority processing
  • 120 req/min rate limit
  • Priority support
  • $0.015/extra extraction
Start Trial

Scale

$499/mo
  • 25,000 extractions/month
  • SLA guarantee
  • 300 req/min rate limit
  • Dedicated support
  • $0.01/extra extraction
Start Trial
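
To compare the paid plans at a given volume, the arithmetic can be sketched like this. The figures are copied from the plan cards above. The billing rule is an assumption: overage is charged per extraction above the included quota, with no other fees. The Free plan is left out because no overage rate is listed for it.

```javascript
// Plan figures copied from the pricing cards. Amounts are stored in mills
// (thousandths of a dollar) so the $0.015 overage rate stays an integer.
const PLANS = {
  pro: { baseMills: 199000, included: 5000, overageMills: 15 },
  scale: { baseMills: 499000, included: 25000, overageMills: 10 },
};

// Monthly cost in dollars for a given number of extractions.
// ASSUMPTION: overage is billed per extraction above the included quota,
// with no other fees or rounding.
function monthlyCost(planName, extractions) {
  const plan = PLANS[planName];
  const extra = Math.max(0, extractions - plan.included);
  return (plan.baseMills + extra * plan.overageMills) / 1000;
}
```

Under these assumptions, `monthlyCost("pro", 25000)` and `monthlyCost("scale", 25000)` both come to 499, so Scale only becomes the cheaper plan above 25,000 extractions per month.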

Try It Right Now

Get your API key and extract data in under 60 seconds.

# Step 1: Get your free API key
curl -X POST https://api.ramlabs.dev/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com"}'

# Step 2: Extract data from any document
curl -X POST https://api.ramlabs.dev/v1/extract \
  -H "Authorization: Bearer rex_your_key" \
  -H "Content-Type: application/json" \
  -d '{"document": "Invoice #1234\nDate: 2026-01-15\nVendor: Acme Corp\nTotal: $1,500.00", "schema": "invoice"}'

# Returns:
# {"data": {"invoice_number": "1234", "vendor_name": "Acme Corp", "total_amount": 1500.00, ...},
#  "confidence": {"invoice_number": 0.98, "vendor_name": 0.99, "total_amount": 0.99}}

Full Quickstart Guide

Start extracting structured data today

100 free extractions per month. No credit card required.

Get Your Free API Key