Open Source

The Document AI Framework

Parse, extract, and classify documents with any AI provider. Self-host or use our managed cloud.

npm i @doclo/sdk
Or try Doclo Cloud →
Features

Production-ready extraction in a few lines

Schema Extraction

Define your output shape with a typed schema. Get structured, validated data back with full TypeScript inference.
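
As a rough illustration of what schema-to-output type inference can look like, the sketch below derives a result type from a schema literal and validates an untyped payload against it. The `FieldType`, `Infer`, and `validate` names are hypothetical and written only for this example; they are not the Doclo SDK API, whose actual schema format is described in the docs.

```typescript
// Hypothetical sketch: not the Doclo SDK API.
// A schema maps field names to primitive type tags.
type FieldType = 'string' | 'number' | 'boolean';
type Schema = Record<string, FieldType>;

// Map each tag to the TypeScript type it stands for.
type TypeOf<T> =
  T extends 'string' ? string :
  T extends 'number' ? number :
  T extends 'boolean' ? boolean :
  never;

// Derive the output shape from the schema literal.
type Infer<S extends Schema> = { [K in keyof S]: TypeOf<S[K]> };

// Check an untyped payload against the schema; throw on the first mismatch.
function validate<S extends Schema>(schema: S, data: unknown): Infer<S> {
  if (typeof data !== 'object' || data === null) {
    throw new Error('expected an object');
  }
  const fields: Schema = schema;
  const record = data as Record<string, unknown>;
  for (const key of Object.keys(fields)) {
    if (typeof record[key] !== fields[key]) {
      throw new Error(`field "${key}" should be a ${fields[key]}`);
    }
  }
  return data as Infer<S>;
}

const invoiceSchema = { vendor: 'string', total: 'number', paid: 'boolean' } as const;

// `invoice` is typed as { vendor: string; total: number; paid: boolean }
// (with readonly properties, because the schema uses `as const`).
const invoice = validate(invoiceSchema, { vendor: 'Acme', total: 42.5, paid: false });
```

Because the result type is computed from the schema, a typo such as `invoice.totl` fails at compile time instead of surfacing at runtime.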

View Docs
Doclo SDK
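// `provider` and `schemas` are assumed to be defined elsewhere.
const flow = createFlow()
  // Step 1: classify the document into one of the listed categories.
  .step('categorize', categorize({
    provider,
    categories: ['Passport', 'ID Card',
      'Utility Bill', 'Bank Stmt']
  }))
  // Step 2: pick the schema for the detected category and extract with it.
  .conditional('extract', (ctx) => {
    const schema = schemas[ctx.category];
    return extract({ provider, schema });
  })
  .build();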
OpenAI, Anthropic, Google, Mistral, Datalab, Reducto, and 100+ other leading VLM models via OpenRouter
Integrations

Any model. Any provider. Zero lock-in.

Switch between AI providers and OCR engines with a single line of config. No rewrites, no migration headaches. Your pipeline stays the same.

Doclo SDK

Full control in your own environment

The same engine that powers Doclo Cloud. Self-host on your infrastructure, version everything in git, integrate into any stack.

View Documentation

TypeScript-First

Full type safety from schema to output. Your IDE knows exactly what you're getting back.

Provider Abstraction

Write once, run on any provider. No rewriting when you switch models.
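
One way to picture provider abstraction is a pipeline that depends only on an interface, so that swapping the provider is a one-line change. The sketch below is a minimal illustration under that assumption; `ExtractionProvider`, `runPipeline`, and the two stub providers are hypothetical names, not the Doclo SDK API, and the stubs do no real model or OCR calls.

```typescript
// Hypothetical sketch: these types and stub providers are not the Doclo SDK API.
interface ExtractionProvider {
  name: string;
  // Returns raw field values for a document's text.
  extract(documentText: string): Promise<Record<string, string>>;
}

// Helper: everything before the first line break.
function firstLine(text: string): string {
  const end = text.indexOf('\n');
  return end === -1 ? text : text.slice(0, end);
}

// Two stand-in providers; real ones would call a model or OCR service.
const providerA: ExtractionProvider = {
  name: 'provider-a',
  extract: async (text) => ({ title: firstLine(text) }),
};

const providerB: ExtractionProvider = {
  name: 'provider-b',
  extract: async (text) => ({ title: firstLine(text).trim() }),
};

// The pipeline depends only on the interface, never on a concrete provider.
async function runPipeline(provider: ExtractionProvider, documentText: string) {
  const fields = await provider.extract(documentText);
  return { provider: provider.name, fields };
}

// Switching providers is a one-line change at the call site.
const activeProvider: ExtractionProvider = providerA; // or providerB
const pendingResult = runPipeline(activeProvider, 'Invoice 1001\nTotal: 42.50');
```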

Consensus Voting

Run N extractions, take the majority vote per field.
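
Per-field voting can be sketched in a few lines: collect the value each run returned for a field, count the occurrences, and keep the most common one. The code below is an illustrative sketch, not the Doclo SDK implementation; `majorityVote` and `FieldConsensus` are hypothetical names. It picks the most frequent value (ties go to the value seen first); a stricter variant could require more than half of the runs to agree.

```typescript
// Hypothetical sketch: illustrates per-field voting, not the Doclo SDK API.
type Extraction = Record<string, string>;

interface FieldConsensus {
  value: string;
  votes: number;      // how many runs agreed on `value`
  agreement: number;  // votes divided by the number of runs
}

function majorityVote(runs: Extraction[]): Record<string, FieldConsensus> {
  const result: Record<string, FieldConsensus> = {};

  // Collect every field name that appears in at least one run.
  const fieldNames = new Set<string>();
  for (const run of runs) {
    for (const name of Object.keys(run)) fieldNames.add(name);
  }

  fieldNames.forEach((name) => {
    // Count how often each distinct value was returned for this field.
    const counts = new Map<string, number>();
    for (const run of runs) {
      const value = run[name];
      if (value === undefined) continue; // this run did not return the field
      counts.set(value, (counts.get(value) ?? 0) + 1);
    }

    // Keep the value with the most votes; the first-seen value wins ties.
    let best = '';
    let bestVotes = 0;
    counts.forEach((votes, value) => {
      if (votes > bestVotes) {
        best = value;
        bestVotes = votes;
      }
    });
    result[name] = { value: best, votes: bestVotes, agreement: bestVotes / runs.length };
  });

  return result;
}

const consensus = majorityVote([
  { name: 'Jane Doe', total: '42.50' },
  { name: 'Jane Doe', total: '42.50' },
  { name: 'Jane Doe', total: '47.50' },
]);
// consensus.total.value is '42.50' with 2 of 3 votes.
```

The `agreement` ratio doubles as a cheap confidence signal: fields where the runs disagree can be routed to manual review.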

Automatic Resilience

Fallback chains, circuit breakers, exponential backoff.
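
To make two of these terms concrete, the sketch below combines retries with exponential backoff and a fallback chain: each task is retried with growing delays, and the next task is tried only once the current one is exhausted. It is a minimal illustration, not the Doclo SDK implementation; `backoffDelay`, `withRetry`, and `withFallback` are hypothetical names, the default delays are arbitrary, and circuit breaking is left out for brevity.

```typescript
// Hypothetical sketch: not the Doclo SDK API.
type Task<T> = () => Promise<T>;

// Delay before the retry that follows failed attempt `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try one task up to `retries + 1` times, waiting longer after each failure.
async function withRetry<T>(task: Task<T>, retries: number, baseMs: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < retries) await sleep(backoffDelay(attempt, baseMs));
    }
  }
  throw lastError;
}

// Fallback chain: move to the next task only after the current one
// has used up all of its retries.
async function withFallback<T>(tasks: Task<T>[], retries = 2, baseMs = 100): Promise<T> {
  let lastError: unknown = new Error('no tasks provided');
  for (const task of tasks) {
    try {
      return await withRetry(task, retries, baseMs);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```

A call such as `withFallback([callPrimary, callSecondary])`, where both functions are your own provider wrappers, returns the first successful result and rethrows the last error only if every task fails.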

OpenTelemetry Native

Every extraction traced. Export to your existing stack.

Citations & Traceability

Track exactly where each extracted value came from. Line-level source references with confidence scores.
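
A line-level citation can be modeled as an extracted value paired with a source range and a confidence score. The shape below is only an illustrative guess at such a structure; the `Citation` and `CitedValue` types and their field names are hypothetical and do not describe the Doclo SDK output format.

```typescript
// Hypothetical sketch: the field names are illustrative, not the Doclo SDK output format.
interface Citation {
  page: number;
  startLine: number;   // 1-based, inclusive
  endLine: number;     // 1-based, inclusive
  confidence: number;  // 0 to 1
}

interface CitedValue<T> {
  value: T;
  citation: Citation;
}

// Return the source lines a citation points at, so a reviewer can check the value.
function sourceText(lines: string[], citation: Citation): string {
  return lines.slice(citation.startLine - 1, citation.endLine).join('\n');
}

// Flag values whose confidence falls below a review threshold.
function needsReview<T>(field: CitedValue<T>, threshold = 0.9): boolean {
  return field.citation.confidence < threshold;
}

const pageLines = ['ACME Corp', 'Invoice 1001', 'Date: 2024-03-01', 'Total due: 42.50'];

const total: CitedValue<number> = {
  value: 42.5,
  citation: { page: 1, startLine: 4, endLine: 4, confidence: 0.85 },
};

// sourceText(pageLines, total.citation) returns 'Total due: 42.50'
```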

Building Blocks

Composable nodes for any document workflow

Parse, extract, categorize, split, and more. Each node does one thing well. Chain them together for complex pipelines.

See the full node reference

Parse

Extract text and structure from documents using OCR or Vision Language Models.

Extract

Pull structured data from documents according to your JSON Schema.

Categorize

Classify and route documents by type automatically.

Trigger

Launch sub-flows and reusable components.

Split

Identify and separate multiple documents within a single PDF.

Conditionals

Branch workflows with IF/THEN logic.

FAQ

Common questions

Everything you need to know about building with the Doclo SDK.

Read the docs

What document formats does Doclo support?

Doclo works with a wide range of formats, including PDFs, images, and Office documents. The exact formats supported depend on the models and providers you select for your pipeline.

Get Started

Ship document AI this week, not next quarter

2026 All Rights Reserved