Parse, extract, and classify documents with any AI provider. Self-host or use our managed cloud.
npm i @doclo/sdk
Define your output shape with a typed schema. Get structured, validated data back with full TypeScript inference.
const flow = createFlow()
  .step('categorize', categorize({
    provider,
    categories: ['Passport', 'ID Card', 'Utility Bill', 'Bank Stmt']
  }))
  .conditional('extract', (ctx) => {
    const schema = schemas[ctx.category];
    return extract({ provider, schema });
  })
  .build();
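To illustrate what "typed schema with full TypeScript inference" can mean in practice, here is a minimal sketch of the general technique. It is not the @doclo/sdk API: the `Schema`, `Infer`, and `validate` names and the `passportSchema` fields are hypothetical, chosen only to show how an output type can be derived from a schema value and checked at runtime.

```typescript
// Hypothetical mini schema DSL for illustration; NOT the @doclo/sdk API.
type FieldKind = "string" | "number" | "boolean";
type Schema = Record<string, FieldKind>;

// Map each schema field kind to the TypeScript type it represents.
type KindToType = { string: string; number: number; boolean: boolean };
type Infer<S extends Schema> = { [K in keyof S]: KindToType[S[K]] };

// Check an unknown value against the schema at runtime, then narrow its type.
function validate<S extends Schema>(schema: S, data: unknown): Infer<S> {
  if (typeof data !== "object" || data === null) {
    throw new Error("expected an object");
  }
  const record = data as Record<string, unknown>;
  for (const key of Object.keys(schema)) {
    const expected: FieldKind = schema[key];
    if (typeof record[key] !== expected) {
      throw new Error(`field "${key}" should be a ${expected}`);
    }
  }
  return data as Infer<S>;
}

// `as const` keeps the literal kinds so the output type can be inferred.
const passportSchema = {
  fullName: "string",
  passportNumber: "string",
  age: "number",
} as const;

const passport = validate(passportSchema, {
  fullName: "Jane Doe",
  passportNumber: "X1234567",
  age: 34,
});
// The compiler now treats passport.age as a number and passport.fullName as a string.
```

The point of the pattern is that the schema is the single source of truth: the runtime check and the static type both come from the same object, so they cannot drift apart.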
Anthropic
Google
Mistral
Datalab
Switch between AI providers and OCR engines with a single line of config. No rewrites, no migration headaches. Your pipeline stays the same.
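One common way to get single-line provider switching is to have pipeline code depend on a small interface instead of a concrete client. The sketch below assumes a hypothetical `Provider` contract and stub implementations; the real @doclo/sdk provider types and constructors may look different.

```typescript
// Hypothetical provider contract; the real @doclo/sdk types may differ.
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

// Stub providers standing in for real API clients.
const makeStub = (name: string): Provider => ({
  name,
  complete: async (prompt: string) => `[${name}] ${prompt}`,
});

const providers: Record<string, Provider> = {
  anthropic: makeStub("anthropic"),
  mistral: makeStub("mistral"),
};

// Pipeline code only knows about the Provider interface.
async function classify(provider: Provider, text: string): Promise<string> {
  return provider.complete(`Classify this document: ${text}`);
}

// The single line that changes when switching providers.
const provider = providers["anthropic"];
```

Because `classify` never names a vendor, swapping `"anthropic"` for `"mistral"` leaves the rest of the pipeline untouched.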
The same engine that powers Doclo Cloud. Self-host on your infrastructure, version everything in git, integrate into any stack.
Full type safety from schema to output. Your IDE knows exactly what you're getting back.
Write once, run on any provider. No rewriting when you switch models.
Run N extractions, take the majority vote per field.
Fallback chains, circuit breakers, exponential backoff.
Every extraction traced. Export to your existing stack.
Track exactly where each extracted value came from. Line-level source references with confidence scores.
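The "run N extractions, take the majority vote per field" idea above can be sketched as follows. This is a generic illustration, not Doclo's implementation: the `Extraction` type, the `majorityVote` function, and the tie-breaking rule (first value seen wins) are all assumptions made for the example.

```typescript
type Value = string | number | boolean | null;
type Extraction = Record<string, Value>;

// Majority vote per field across N extraction runs.
// Ties go to the value seen first, which is an arbitrary choice for this sketch.
function majorityVote(runs: Extraction[]): Extraction {
  // Collect every field name that appears in at least one run.
  const fields: string[] = [];
  for (const run of runs) {
    for (const field of Object.keys(run)) {
      if (fields.indexOf(field) === -1) fields.push(field);
    }
  }

  const result: Extraction = {};
  for (const field of fields) {
    // Tally candidate values; JSON.stringify keeps "1" and 1 distinct.
    const tally: { key: string; value: Value; count: number }[] = [];
    for (const run of runs) {
      if (!Object.prototype.hasOwnProperty.call(run, field)) continue;
      const value = run[field];
      const key = JSON.stringify(value);
      let found = false;
      for (const entry of tally) {
        if (entry.key === key) {
          entry.count += 1;
          found = true;
          break;
        }
      }
      if (!found) tally.push({ key, value, count: 1 });
    }
    // Pick the most frequent value; a strict ">" keeps the first-seen value on ties.
    let best = tally[0];
    for (const candidate of tally) {
      if (candidate.count > best.count) best = candidate;
    }
    result[field] = best.value;
  }
  return result;
}

// Three runs disagree on `total`; two of the three agree on 120.5.
const consensus = majorityVote([
  { total: 120.5, currency: "EUR" },
  { total: 120.5, currency: "EUR" },
  { total: 128.5, currency: "EUR" },
]);
```

Voting field by field, instead of picking one whole run, lets a run that misreads a single value still contribute its correct fields.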
Parse, extract, categorize, split, and more. Each node does one thing well. Chain them together for complex pipelines.
Extract text and structure from documents using OCR or Vision Language Models.
Pull structured data from documents according to your JSON Schema.
Classify and route documents by type automatically.
Launch sub-flows and reusable components.
Identify and separate multiple documents within a single PDF.
Branch workflows with IF/THEN logic.
Doclo works with a broad spectrum of formats including PDFs, images, and Office documents. The exact formats supported depend on the models and providers you select for your pipeline.