Pricing

Free for self-host. Honest pricing for hosted.

Apache-2.0 means you can run DocuExtract on your own infrastructure at any volume, forever, for $0. The hosted tier exists for users who don't want to run infrastructure — and the paid plans pay for the GPU compute, not the software.

Plans

Pick the volume you actually need.

Every plan includes the visual field-picker, HITL review queue, verbatim grounding, audit trail, all supported languages, and API access. Higher plans bundle more extraction tiers — see the “Extraction tiers” section on each card for exactly what's included vs. overage.

Free

Try it out, evaluate, ship side projects.

50 documents/month
Up to 3 templates
Visual field-picker + review queue
Community support

Extraction tiers
Standard: ✓ Included
Premium: unavailable
Multi-pass: unavailable

Start free

Self-host

Run it on your own infrastructure. Apache-2.0.

Unlimited documents
Unlimited templates
docker compose up
Community support

Extraction tiers
Standard: ✓ Included
Premium: your hardware
Multi-pass: your hardware

View on GitHub

Solo

$99/mo

Indie consultants, accountants, small ops teams.

500 documents/month
Up to 10 templates
API access
Email support

Extraction tiers
Standard: ✓ Included
Premium: $0.20/doc
Multi-pass: $0.35/doc

Start Solo

Team

Business

$1,999/mo

High-volume SMB, scaleups, regulated workflows.

15,000 documents/month
Unlimited templates
All API tiers + webhooks
SLA (99.5% uptime, 4h support)
Dedicated Slack channel

Extraction tiers
Standard: ✓ Included
Premium: ✓ Included
Multi-pass: $0.25/doc

Start Business

Scale

$4,999/mo

Mid-enterprise. Dedicated capacity, custom limits.

50,000 documents/month
Dedicated worker pool
SLA (99.9% uptime, 1h support)
Quarterly business review

Extraction tiers
Standard: ✓ Included
Premium: ✓ Included
Multi-pass: ✓ Included

Start Scale

How extraction tiers work

Every plan includes Standard tier extractions (Tier 0–2 OCR: embedded text, Tesseract, PaddleOCR) up to the plan's monthly quota. Standard handles 80% of typical documents: born-digital PDFs, clean English/Spanish scans, and structured forms.

The Premium tier (which adds vision-LLM OCR for handwriting, multilingual documents, and degraded scans) and the Multi-pass tier (2-pass agreement plus a tiebreaker for high-stakes accuracy) are bundled into higher plans and available on lower paid plans as per-doc overage at a plan-discounted rate. Each plan card shows exactly what's included and what costs extra.
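To make the bundling concrete, here is a rough sketch of how a monthly bill could be estimated from the plan cards above. The `Plan` class and `monthly_cost` function are hypothetical names for illustration, not part of any DocuExtract API. The sketch assumes usage stays within the plan's monthly document quota and covers only the plans whose rates are listed on the cards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Hypothetical model of a plan card (not a DocuExtract type)."""
    monthly_fee: float               # USD per month
    premium_rate: Optional[float]    # USD/doc; 0.0 = included, None = unavailable
    multipass_rate: Optional[float]  # USD/doc; 0.0 = included, None = unavailable


# Figures copied from the plan cards on this page.
PLANS = {
    "free": Plan(0.0, None, None),
    "solo": Plan(99.0, 0.20, 0.35),
    "business": Plan(1999.0, 0.0, 0.25),
    "scale": Plan(4999.0, 0.0, 0.0),
}


def monthly_cost(plan_key: str, premium_docs: int = 0, multipass_docs: int = 0) -> float:
    """Estimate one month's bill: subscription fee plus per-doc tier overage.

    Assumption: total volume is within the plan quota, so only
    tier overage is added on top of the subscription.
    """
    plan = PLANS[plan_key]
    total = plan.monthly_fee
    for tier, rate, docs in (
        ("Premium", plan.premium_rate, premium_docs),
        ("Multi-pass", plan.multipass_rate, multipass_docs),
    ):
        if docs == 0:
            continue
        if rate is None:
            raise ValueError(f"{tier} is unavailable on the {plan_key} plan")
        total += rate * docs
    return round(total, 2)


# Solo with 100 Premium docs and 20 Multi-pass docs in a month.
print(monthly_cost("solo", premium_docs=100, multipass_docs=20))
```

On Solo, 100 Premium documents add $20 and 20 Multi-pass documents add $7 to the $99 subscription. The same usage on Scale adds nothing, since both tiers are included there.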

API pricing (or plan overage)

Pay for the difficulty of the document.

Tier-based per-call pricing so simple documents don't subsidize hard ones. These are the list rates: API-only customers pay them as-is, and hosted plans apply them at a discount as overage for tiers not included in the subscription. A $25/mo minimum applies to API-only access; hosted plans already cover the always-on infra.

Standard

$0.10/doc

Tier 0–2 OCR (embedded text + Tesseract + PaddleOCR). Hard cases route to the review queue rather than escalating.

Best for

Batch invoice/form workflows, born-digital PDFs.

Premium

$0.25/doc

Full cascade including Tier 3 vision-LLM OCR (Qwen 2.5-VL 7B). Handles handwriting, multilingual scans, complex layouts.

Best for

Documents that defeat traditional OCR.

Premium + multi-pass

$0.40/doc

Full cascade plus two independent LLM extraction passes with disagreement detection. Disagreements go to a third-call tiebreaker, then to HITL review if still unresolved.

Best for

High-stakes extractions: legal, medical, financial.
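The multi-pass flow described on this card can be sketched roughly as follows. This is an illustration under stated assumptions, not DocuExtract's actual implementation: `multi_pass_extract` and the `extract` callable are hypothetical, fields are compared by exact string equality, and the tiebreaker only settles a field when it matches one of the first two passes.

```python
from typing import Callable, Dict, List, Tuple

Extraction = Dict[str, str]  # field name -> extracted value


def multi_pass_extract(
    document: str,
    extract: Callable[[str], Extraction],  # hypothetical single LLM extraction pass
) -> Tuple[Extraction, List[str]]:
    """Return (accepted fields, fields that need human review)."""
    # Two independent passes over the same document.
    first, second = extract(document), extract(document)

    accepted: Extraction = {}
    disputed: List[str] = []
    for field in sorted(set(first) | set(second)):
        if field in first and first.get(field) == second.get(field):
            accepted[field] = first[field]  # both passes agree
        else:
            disputed.append(field)          # disagreement detected

    needs_review: List[str] = []
    if disputed:
        # Third call acts as tiebreaker, made only when needed.
        third = extract(document)
        for field in disputed:
            vote = third.get(field)
            if vote is not None and vote in (first.get(field), second.get(field)):
                accepted[field] = vote      # two of three passes agree
            else:
                needs_review.append(field)  # still unresolved: route to HITL
    return accepted, needs_review
```

When the first two passes agree on every field, the third call is never made, which is consistent with the FAQ's description of 2–3 extractions per document.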

Volume discounts kick in at 10K docs/mo. Annual commits get 15–20% off. Contact us for enterprise pricing — we'll quote against your real workload, not list price.
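As a rough illustration of the list rates for API-only access, the sketch below estimates a monthly bill. `api_monthly_bill` is a hypothetical helper, not a DocuExtract API. It assumes the $25/mo minimum works as a floor on the usage total, and it leaves out volume and annual-commit discounts because their exact schedule is quoted per customer.

```python
from typing import Mapping

# List rates in USD per document, copied from the cards above.
LIST_RATES = {"standard": 0.10, "premium": 0.25, "multipass": 0.40}
API_MINIMUM = 25.00  # USD per month, API-only access


def api_monthly_bill(usage: Mapping[str, int]) -> float:
    """Estimate an API-only monthly bill from per-tier document counts.

    Assumption: the monthly minimum is a floor, so the bill is the
    larger of the usage total and $25.
    """
    usage_cost = sum(LIST_RATES[tier] * docs for tier, docs in usage.items())
    return round(max(API_MINIMUM, usage_cost), 2)


# A light month falls back to the minimum; a busier month is billed on usage.
print(api_monthly_bill({"standard": 100}))
print(api_monthly_bill({"standard": 1000, "premium": 200, "multipass": 50}))
```

Under these assumptions, 100 Standard documents ($10 of usage) are billed at the $25 minimum, while the second example is billed at $170 of usage.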

Enterprise · Managed · Consulting

When self-host needs hands. Or scale. Or private models.

DocuExtract is open source — you can run it yourself, forever, for free. When that isn't practical (regulated industry, scale, custom models, dedicated SLA), Inspire AI Lab offers managed deployment and custom engineering on top of it.

Enterprise

Custom

100,000+ docs/month, custom SLA, single-vendor procurement.

100K+ documents/month
AWS Marketplace billing
Custom SLA + named technical contact
SSO, audit log export, retention policies

Talk to sales

Managed deployment

$25K–60K/yr

We run DocuExtract inside your VPC. On your hardware, on your models.

Private deployment on your AWS/GCP/Azure
Custom model fine-tuning on your documents
Hardening, monitoring, on-call rotation
Quarterly accuracy review

Talk to consulting

Custom build

Project quote

Inspire AI Lab consulting engagement on top of DocuExtract.

Custom OCR pipelines for unusual document types
Multilingual extraction validation on your real corpus
Integration into existing systems
Training your team to operate it

Talk to Inspire AI Lab

FAQ

The questions everyone asks.

Is self-hosting really free?
Yes. Apache-2.0 license, docker compose up, run as much volume as your hardware can handle. We don't throttle, phone home, or require a license key. The self-host story is the load-bearing claim of the project.

Why is your hosted tier cheaper than Sensible or Nanonets?
Because we self-host the inference layer on open-weights models (Qwen 2.5-VL 7B by default) on scale-to-zero GPUs. We don't pay per-token to OpenAI or Anthropic. The savings flow through to you.

What's the catch with the free tier?
There isn't one beyond the quota of 50 documents/month and 3 templates, which is what we can sustain at scale for free users. If you need more, self-host (free, unlimited) or upgrade to a paid plan. The free tier is the funnel into our consulting business, not a paywall trap.

Do you train on my documents?
No. Hosted tier: documents are processed and stored only as long as you keep them, and deleted immediately on request. Self-host: nothing leaves your environment at all. We don't train our base models on customer data, period.

What if I need Premium tier extractions on my plan?
Premium is included by default on Team and up. On Solo, Premium-tier extractions are billed per-doc at a plan-discounted rate ($0.20/doc) on top of your subscription. The Free tier is Standard-only; upgrade to a paid plan to access Premium. Every plan card shows exactly which tiers are included vs. overage, so there are no surprises at billing time.

When do I need the multi-pass tier?
When a wrong value would cost you more than $0.40/doc. Legal contracts, medical records, financial reconciliations: anywhere a fabricated value would be expensive to catch downstream. Multi-pass runs 2–3 independent LLM extractions and routes disagreements to human review. Included by default on Scale and Enterprise; available as overage on Solo, Team, and Business.

What languages actually work?
English and Spanish are stable. Eight more are beta. Five Indic languages are experimental. We label by maturity honestly; see the language matrix for the full picture, including known caveats per language.

How do I cancel?
One click in the dashboard. Prorated refund for the unused portion of the month. Your data is yours: export everything as CSV/JSON before you cancel; we delete it after a 30-day grace period.

Can I get a custom deployment in our VPC?
Yes, that's the Managed deployment tier. We deploy DocuExtract inside your AWS / GCP / Azure environment, run the OCR cascade and inference on your hardware, and operate it under contract. Bookings via Inspire AI Lab consulting.

Still deciding? Start free.

50 docs/month is enough to evaluate against a real workload. Upgrade, downgrade, or switch to self-host whenever you want.

Start free
See features
Talk to consulting →