DocuTray

Python SDK

The official Python library for the DocuTray API. It provides access to document processing capabilities: OCR conversion, document identification, data extraction, and knowledge bases.

Installation

pip install docutray

Requires Python 3.10+.

Quick Start

Synchronous Usage

from pathlib import Path
from docutray import Client

client = Client(api_key="your-api-key")

# Convert a document
result = client.convert.run(
    file=Path("invoice.pdf"),
    document_type_code="invoice"
)
print(result.data)

client.close()
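In the synchronous example, client.close() is skipped if convert.run raises. One way to guard against that is contextlib.closing from the standard library, which relies only on the close() method shown above. This is a minimal sketch, not the SDK's documented pattern: the helper convert_and_close and the stand-in _FakeClient are hypothetical names, used so the snippet runs without an API key or network access. Whether Client can be used directly in a with statement is not stated on this page.

```python
from contextlib import closing
from pathlib import Path
from types import SimpleNamespace


def convert_and_close(client, file: Path, document_type_code: str):
    """Run one conversion and always close the client afterwards."""
    # closing() calls client.close() on exit, even if convert.run raises.
    with closing(client):
        return client.convert.run(file=file, document_type_code=document_type_code)


class _FakeClient:
    """Hypothetical stand-in that mimics only the calls used in this guide."""

    def __init__(self):
        self.closed = False
        self.convert = SimpleNamespace(run=self._run)

    def _run(self, file, document_type_code):
        # Echo the inputs back in a .data attribute, like the examples above.
        return SimpleNamespace(data={"file": file.name, "type": document_type_code})

    def close(self):
        self.closed = True


fake = _FakeClient()
result = convert_and_close(fake, Path("invoice.pdf"), "invoice")
print(result.data, fake.closed)
```

With the real SDK, the same helper would be called with Client(api_key="your-api-key") in place of the stand-in.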

Asynchronous Usage

import asyncio
from pathlib import Path
from docutray import AsyncClient

async def main():
    async with AsyncClient(api_key="your-api-key") as client:
        result = await client.convert.run(
            file=Path("invoice.pdf"),
            document_type_code="invoice"
        )
        print(result.data)

asyncio.run(main())
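The async client becomes more useful when several documents are processed at once. The sketch below fans out convert.run calls with asyncio.gather. It assumes that a single AsyncClient can safely serve concurrent requests and that the API accepts them without rate limiting, neither of which is confirmed on this page. convert_many and _FakeAsyncClient are hypothetical names; the stand-in client exists only so the snippet runs offline.

```python
import asyncio
from pathlib import Path
from types import SimpleNamespace


async def convert_many(client, files, document_type_code):
    """Convert several files concurrently with one shared async client."""
    tasks = [
        client.convert.run(file=f, document_type_code=document_type_code)
        for f in files
    ]
    # gather() returns results in the same order as the input files.
    return await asyncio.gather(*tasks)


class _FakeAsyncClient:
    """Hypothetical stand-in that mimics only the calls used in this guide."""

    def __init__(self):
        self.convert = SimpleNamespace(run=self._run)

    async def _run(self, file, document_type_code):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return SimpleNamespace(data={"file": file.name})

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False


async def demo():
    async with _FakeAsyncClient() as client:
        return await convert_many(client, [Path("a.pdf"), Path("b.pdf")], "invoice")


results = asyncio.run(demo())
print([r.data["file"] for r in results])
```

For large batches, an asyncio.Semaphore around each call would be a reasonable way to cap the number of requests in flight.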

Configuration

# Via constructor
client = Client(api_key="your-api-key")

# Via environment variable (DOCUTRAY_API_KEY)
client = Client()
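This page does not say what happens when neither the constructor argument nor DOCUTRAY_API_KEY is set, or which one takes precedence. An application that wants a clear failure at startup could resolve the key itself before constructing the client. The helper below is a hypothetical sketch of that idea, not SDK behavior; it prefers the explicit argument and falls back to the environment variable.

```python
import os


def resolve_api_key(explicit=None, environ=None):
    """Return the API key from the argument or DOCUTRAY_API_KEY, else fail."""
    environ = os.environ if environ is None else environ
    # Assumption: an explicitly passed key should win over the environment.
    key = explicit or environ.get("DOCUTRAY_API_KEY")
    if not key:
        raise RuntimeError("Set DOCUTRAY_API_KEY or pass api_key explicitly")
    return key


# A plain dict stands in for os.environ so the example is reproducible.
print(resolve_api_key(environ={"DOCUTRAY_API_KEY": "from-env"}))
print(resolve_api_key("from-arg", environ={"DOCUTRAY_API_KEY": "from-env"}))
```

The resolved value can then be passed as Client(api_key=resolve_api_key()).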

Resources

Client

The main entry points for the SDK are Client (synchronous) and AsyncClient (asynchronous).

API Resources

  • Convert — Document conversion and data extraction
  • Identify — Automatic document type identification
  • DocumentTypes — Document type catalog and schema validation
  • Steps — Workflow step execution
  • KnowledgeBases — Knowledge base management and semantic search

Error Handling

Types

Response and model types:
