DocuExtract

Pricing

Free for self-host. Honest pricing for hosted.

Apache-2.0 means you can run DocuExtract on your own infrastructure at any volume, forever, for $0. The hosted tier exists for users who don't want to run infrastructure — and the paid plans pay for the GPU compute, not the software.

Plans

Pick the volume you actually need.

Every plan includes the visual field-picker, HITL review queue, verbatim grounding, audit trail, all supported languages, and API access. Higher plans bundle more extraction tiers — see the “Extraction tiers” section on each card for exactly what's included vs. overage.

Free

$0

Try it out, evaluate, ship side projects.

  • 50 documents/month
  • Up to 3 templates
  • Visual field-picker + review queue
  • Community support

Extraction tiers

  • Standard: ✓ Included
  • Premium: unavailable
  • Multi-pass: unavailable

Self-host

$0

Run it on your own infrastructure. Apache-2.0.

  • Unlimited documents
  • Unlimited templates
  • docker compose up
  • Community support

Extraction tiers

  • Standard: ✓ Included
  • Premium: your hardware
  • Multi-pass: your hardware

Solo

$99/mo

Indie consultants, accountants, small ops teams.

  • 500 documents/month
  • Up to 10 templates
  • API access
  • Email support

Extraction tiers

  • Standard: ✓ Included
  • Premium: $0.20/doc
  • Multi-pass: $0.35/doc

Team

Most popular

$499/mo

Mid-market ops teams running real volume.

  • 3,000 documents/month
  • Unlimited templates
  • API access + webhooks
  • Priority email support

Extraction tiers

  • Standard: ✓ Included
  • Premium: ✓ Included
  • Multi-pass: $0.30/doc

Business

$1,999/mo

High-volume SMB, scaleups, regulated workflows.

  • 15,000 documents/month
  • Unlimited templates
  • All API tiers + webhooks
  • SLA (99.5% uptime, 4h support)
  • Dedicated Slack channel

Extraction tiers

  • Standard: ✓ Included
  • Premium: ✓ Included
  • Multi-pass: $0.25/doc

Scale

$4,999/mo

Mid-enterprise. Dedicated capacity, custom limits.

  • 50,000 documents/month
  • Dedicated worker pool
  • SLA (99.9% uptime, 1h support)
  • Quarterly business review

Extraction tiers

  • Standard: ✓ Included
  • Premium: ✓ Included
  • Multi-pass: ✓ Included

How extraction tiers work

Every plan includes Standard tier extractions (Tier 0–2 OCR: embedded text, Tesseract, and PaddleOCR) up to the plan's monthly quota. Standard handles 80% of typical documents: born-digital PDFs, clean English/Spanish scans, and structured forms.

Premium tier (adds vision-LLM OCR for handwriting, multilingual documents, and degraded scans) and Multi-pass (2-pass agreement plus a tiebreaker for high-stakes accuracy) are bundled into higher plans. On lower plans they are available as per-doc overage at a plan-discounted rate. Each plan card shows exactly what's included and what costs extra.
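To make the arithmetic concrete, here is a minimal sketch of a monthly bill estimate built from the plan cards above. The function and field names are hypothetical, not part of any DocuExtract API. It assumes Premium and Multi-pass documents count toward the same monthly quota as Standard ones, and it refuses to guess what happens past the quota, since this page does not state that.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    monthly_fee: float
    quota: int                        # documents per month
    premium_rate: Optional[float]     # None = unavailable, 0.0 = included
    multipass_rate: Optional[float]   # None = unavailable, 0.0 = included


# Figures copied from the plan cards on this page.
PLANS = {
    "Free": Plan(0, 50, None, None),
    "Solo": Plan(99, 500, 0.20, 0.35),
    "Team": Plan(499, 3_000, 0.0, 0.30),
    "Business": Plan(1_999, 15_000, 0.0, 0.25),
    "Scale": Plan(4_999, 50_000, 0.0, 0.0),
}


def monthly_cost(plan_name: str, standard: int = 0, premium: int = 0,
                 multipass: int = 0) -> float:
    """Estimate one month's bill in USD for a hosted plan."""
    plan = PLANS[plan_name]
    if standard + premium + multipass > plan.quota:
        # Assumption: over-quota billing is not described here, so stop.
        raise ValueError(f"{plan_name} quota is {plan.quota} documents/month")
    cost = plan.monthly_fee
    for tier, count, rate in (("Premium", premium, plan.premium_rate),
                              ("Multi-pass", multipass, plan.multipass_rate)):
        if count == 0:
            continue
        if rate is None:
            raise ValueError(f"{tier} is unavailable on {plan_name}")
        cost += count * rate
    return round(cost, 2)
```

For example, `monthly_cost("Solo", standard=400, premium=100)` adds 100 Premium documents at $0.20 to the $99 subscription.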

API pricing (or plan overage)

Pay for the difficulty of the document.

Tier-based per-call pricing so simple documents don't subsidize hard ones. These are the list rates: API-only customers pay them as-is, and hosted plans pay a discounted version as overage for tiers not included in the subscription. A $25/mo minimum applies to API-only access; hosted plans already cover the always-on infra.

Standard

$0.10/doc

Tier 0–2 OCR (embedded text + Tesseract + PaddleOCR). Hard cases route to the review queue rather than escalating.

Best for

Batch invoice/form workflows, born-digital PDFs.

Premium

$0.25/doc

Full cascade including Tier 3 vision-LLM OCR (Qwen 2.5-VL 7B). Handles handwriting, multilingual scans, complex layouts.

Best for

Documents that defeat traditional OCR.

Premium + multi-pass

$0.40/doc

Full cascade + two independent LLM extraction passes with disagreement detection. Third-call tiebreaker, then HITL routing.

Best for

High-stakes extractions: legal, medical, financial.
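The multi-pass flow described above (two passes, a tiebreaker on disagreement, then human review) can be sketched roughly as follows. This is an illustration only, not DocuExtract's implementation: the function name, the per-field comparison, and the exact-string equality check are all assumptions, and the three extractors are passed in as plain callables.

```python
from typing import Callable, Dict, List, Optional, Tuple

Extraction = Dict[str, str]
Extractor = Callable[[str], Extraction]


def multipass_extract(document: str, pass_a: Extractor, pass_b: Extractor,
                      tiebreaker: Extractor) -> Tuple[Extraction, List[str]]:
    """Return (accepted field values, field names routed to human review)."""
    first, second = pass_a(document), pass_b(document)
    third: Optional[Extraction] = None
    accepted: Extraction = {}
    needs_review: List[str] = []

    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a is not None and a == b:
            accepted[field] = a          # both passes agree
            continue
        if third is None:
            # Hypothetical choice: only pay for the third call
            # when at least one field disagrees.
            third = tiebreaker(document)
        c = third.get(field)
        if c is not None and c in (a, b):
            accepted[field] = c          # tiebreaker sides with one pass
        else:
            needs_review.append(field)   # no majority: route to HITL
    return accepted, needs_review
```

A field is accepted only when two of the three calls produce the same value; anything else lands in the review list instead of being guessed.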

Volume discounts kick in at 10K docs/mo. Annual commits get 15–20% off. Contact us for enterprise pricing — we'll quote against your real workload, not list price.
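As a rough illustration of the list rates, the sketch below estimates an API-only monthly bill. The names are hypothetical, and it assumes the $25/mo minimum acts as a floor on the bill rather than an added fee, which this page does not spell out. Volume and annual-commit discounts are not modeled.

```python
# List rates per document, copied from the cards above.
LIST_RATES = {"standard": 0.10, "premium": 0.25, "multipass": 0.40}
API_MONTHLY_MINIMUM = 25.00  # assumption: treated as a floor, not a surcharge


def api_monthly_bill(docs_by_tier: dict) -> float:
    """Estimate an API-only bill in USD from document counts per tier."""
    usage = sum(LIST_RATES[tier] * count
                for tier, count in docs_by_tier.items())
    return round(max(usage, API_MONTHLY_MINIMUM), 2)
```

With these assumptions, 100 Standard documents cost less than the minimum, so the bill is the $25 floor; a mixed workload of 1,000 Standard, 200 Premium, and 50 multi-pass documents is billed on usage.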

Enterprise · Managed · Consulting

When self-host needs hands. Or scale. Or private models.

DocuExtract is open source — you can run it yourself, forever, for free. When that isn't practical (regulated industry, scale, custom models, dedicated SLA), Inspire AI Lab offers managed deployment and custom engineering on top of it.

Enterprise

Custom

100,000+ docs/month, custom SLA, single-vendor procurement.

  • 100K+ documents/month
  • AWS Marketplace billing
  • Custom SLA + named technical contact
  • SSO, audit log export, retention policies

Managed deployment

$25K–60K/yr

We run DocuExtract inside your VPC. On your hardware, on your models.

  • Private deployment on your AWS/GCP/Azure
  • Custom model fine-tuning on your documents
  • Hardening, monitoring, on-call rotation
  • Quarterly accuracy review

Custom build

Project quote

Inspire AI Lab consulting engagement on top of DocuExtract.

  • Custom OCR pipelines for unusual document types
  • Multilingual extraction validation on your real corpus
  • Integration into existing systems
  • Training your team to operate it

FAQ

The questions everyone asks.

Is self-hosting really free?
Yes. Apache-2.0 license, docker compose up, run as much volume as your hardware can handle. We don't throttle, phone home, or require a license key. The self-host story is the load-bearing claim of the project.
Why is your hosted tier cheaper than Sensible or Nanonets?
Because we self-host the inference layer, running open-weights models (Qwen 2.5-VL 7B by default) on scale-to-zero GPUs. We don't pay per-token to OpenAI or Anthropic. The savings flow through to you.
What's the catch with the free tier?
There isn't one beyond the 50-docs/3-templates quota — that's the rate we can sustain at scale for free users. If you need more, self-host (free, unlimited) or upgrade to a paid plan. The free tier is the funnel into our consulting business, not a paywall trap.
Do you train on my documents?
No. Hosted tier: documents are processed and stored only as long as you keep them; deleted on request immediately. Self-host: nothing leaves your environment at all. We don't train our base models on customer data, period.
What if I need Premium tier extractions on my plan?
Premium is included by default on Team and up. On Solo, Premium-tier extractions are billed per-doc at a plan-discounted rate ($0.20/doc) on top of your subscription. Free tier is Standard-only — upgrade to a paid plan to access Premium. Every plan card shows exactly which tiers are included vs. overage so there are no surprises at billing time.
When do I need the multi-pass tier?
When wrong values would cost more than $0.40/doc. Legal contracts, medical records, financial reconciliations: anywhere a fabricated value would be expensive to catch downstream. Multi-pass runs 2–3 independent LLM extractions and routes disagreements to human review. Included by default on Scale and Enterprise; available as overage on Solo, Team, and Business.
What languages actually work?
English and Spanish are stable. Eight more are beta. Five Indic languages are experimental. We label by maturity honestly — see the language matrix for the full picture, including known caveats per language.
How do I cancel?
One click in the dashboard. Pro-rated refund for the unused portion of the month. Your data is yours — export everything as CSV/JSON before you cancel; we delete it after a 30-day grace period.
Can I get a custom deployment in our VPC?
Yes — that's the Managed Deployment tier. We deploy DocuExtract inside your AWS / GCP / Azure environment, run the OCR cascade and inference on your hardware, and operate it under contract. Bookings via Inspire AI Lab consulting.

Still deciding? Start free.

50 docs/month is enough to evaluate against a real workload. Upgrade, downgrade, or switch to self-host whenever you want.