AI-Powered Document Conversion API
Convert PDFs to WCAG 2.2 AA accessible HTML at scale. Poppler handles extraction, Claude handles conversion, and 12 automated rules validate every output, which ships with a production-ready design system.
Enterprise only — API access is available for enterprise customers with custom agreements. Contact our team to discuss volume pricing and SLA requirements.
Full-stack conversion pipeline
From raw PDF bytes to validated, accessible HTML — every step is automated and auditable.
AI-powered extraction
Claude converts PDF content into semantic, accessible HTML with proper heading hierarchy, landmark regions, and ARIA attributes.
WCAG 2.2 AA validation
12 automated accessibility rules score every conversion. Each rule produces pass/fail results with actionable remediation guidance.
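As a rough illustration of rule-based scoring, the sketch below runs a list of checks over an HTML string and reports the percentage of rules passed. The two rules shown, the `RuleResult` type, and the scoring formula are assumptions made for illustration; the service's actual 12 rules and their weighting are not described on this page.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RuleResult:
    rule_id: str
    passed: bool
    remediation: str  # guidance to show when the rule fails


def check_lang(html: str) -> RuleResult:
    # Hypothetical rule: the <html> element declares a language.
    ok = re.search(r'<html[^>]*\blang="[^"]+"', html, re.I) is not None
    return RuleResult("html-lang", ok, "Add a lang attribute to <html>.")


def check_img_alt(html: str) -> RuleResult:
    # Hypothetical rule: every <img> tag carries an alt attribute.
    imgs = re.findall(r"<img\b[^>]*>", html, re.I)
    ok = all(re.search(r"\balt=", tag, re.I) for tag in imgs)
    return RuleResult("img-alt", ok, "Add alt text to every image.")


RULES: List[Callable[[str], RuleResult]] = [check_lang, check_img_alt]


def score(html: str) -> Tuple[int, List[RuleResult]]:
    """Run every rule and return (percentage passed, per-rule results)."""
    results = [rule(html) for rule in RULES]
    passed = sum(r.passed for r in results)
    return round(100 * passed / len(results)), results
```

Keeping each rule as a small function that returns its own remediation text makes it straightforward to report pass/fail results alongside guidance.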
Design system CSS
Every output includes a production-ready design system with CSS custom properties, responsive typography, and print-optimized styles.
Multi-language support
Process documents in English, German, French, Spanish, Italian, and Dutch with language-aware extraction and proper lang attributes.
Large document chunking
Documents over 20 pages are automatically split into optimally sized chunks, converted individually, and stitched into a single output.
OCR for scanned documents
Image-only PDFs are detected automatically. Optical character recognition extracts text before conversion to ensure full accessibility.
Five steps from PDF to accessible HTML
Upload
POST your PDF via multipart form data with optional metadata.
Extract
Poppler extracts text, images, tables, and form fields per page.
Convert
Claude transforms extracted content into semantic HTML with ARIA.
Validate
12 accessibility rules score the output and flag issues.
Deliver
Receive the HTML, metadata, score, and extraction stats in one response.
Human-in-the-loop workflow
AI handles the heavy lifting, but every conversion passes through human review before reaching your clients. No fully autonomous delivery — ever.
AI converts
The conversion pipeline extracts, converts, and validates the document — producing an accessibility score automatically.
Admin reviews
A human operator previews the HTML output, evaluates the accessibility score, and decides whether to deliver the document or retry the conversion.
Client accepts
The client reviews the delivered document and formally accepts, requests revisions, or provides feedback.
Full audit trail
Every action — trigger, completion, delivery, acceptance — is recorded as a typed domain event with timestamps.
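One way to picture a typed, timestamped event log is the sketch below. The event names, the `DomainEvent` fields, and the in-memory `AuditTrail` are all hypothetical; the actual event schema is not documented on this page.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List


class EventType(Enum):
    # Hypothetical names mirroring the actions listed above.
    TRIGGERED = "conversion.triggered"
    COMPLETED = "conversion.completed"
    DELIVERED = "conversion.delivered"
    ACCEPTED = "conversion.accepted"


@dataclass(frozen=True)
class DomainEvent:
    type: EventType
    job_id: str
    actor: str  # e.g. "pipeline", "admin", "client"
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class AuditTrail:
    """Append-only, in-memory event log (illustration only)."""

    def __init__(self) -> None:
        self._events: List[DomainEvent] = []

    def record(self, event: DomainEvent) -> None:
        self._events.append(event)

    def for_job(self, job_id: str) -> List[DomainEvent]:
        # Events come back in the order they were recorded.
        return [e for e in self._events if e.job_id == job_id]
```

Making the events immutable and stamping them in UTC at creation keeps the trail consistent regardless of where each action originates.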
Why it matters: Fully automated accessibility conversion can produce plausible-looking HTML that misses critical semantic issues. Our human review layer catches what automated scoring cannot — ensuring your clients receive genuinely accessible documents, not just high-scoring ones.
Simple to integrate
A single POST request converts your PDF. No SDKs required.
cURL
curl -X POST https://api.docaccessible.com/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "source_format=pdf" \
  -F "request_id=req_abc123" \
  -F "job_id=job_xyz789"

Python
import requests
# Open the PDF in a context manager so the file handle is always closed.
with open("document.pdf", "rb") as pdf:
    response = requests.post(
        "https://api.docaccessible.com/convert",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": pdf},
        data={
            "source_format": "pdf",
            "request_id": "req_abc123",
            "job_id": "job_xyz789",
        },
    )
response.raise_for_status()  # Fail fast on HTTP errors before parsing JSON.
result = response.json()
print(f"Score: {result['metadata']['accessibility_score']}/100")
print(result["html"])

JavaScript
const formData = new FormData();
// fileBuffer holds the PDF bytes. Wrapping it in a Blob works with the
// standard FormData available in browsers and Node 18+.
formData.append(
  "file",
  new Blob([fileBuffer], { type: "application/pdf" }),
  "document.pdf"
);
formData.append("source_format", "pdf");
formData.append("request_id", "req_abc123");
formData.append("job_id", "job_xyz789");
const response = await fetch(
  "https://api.docaccessible.com/convert",
  {
    method: "POST",
    headers: { Authorization: "Bearer YOUR_API_KEY" },
    body: formData,
  }
);
const { html, metadata } = await response.json();
console.log(`Score: ${metadata.accessibility_score}/100`);

Example response
{
  "html": "<!DOCTYPE html><html lang=\"en\">...</html>",
  "metadata": {
    "model": "claude-sonnet-4-20250514",
    "processing_time_ms": 4230,
    "pages_processed": 12,
    "tokens_used": { "input": 28400, "output": 15200 },
    "extraction": {
      "images_found": 3,
      "tables_found": 2,
      "has_form_fields": false,
      "ocr_recommended": false
    },
    "document": {
      "title": "Annual Accessibility Report",
      "author": "Compliance Team",
      "language": "en"
    },
    "accessibility_score": 92
  }
}

Built for production workloads
Enterprise agreements include everything you need to integrate at scale.
Circuit breaker
Automatic failure detection with a 5-minute cooldown. Prevents cascade failures across your document pipeline.
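For readers unfamiliar with the pattern, here is a minimal circuit breaker sketch with a 300-second (5-minute) cooldown. The failure threshold, the class names, and the single-trial half-open behavior are assumptions; the page does not say how the service decides to open or close its breaker.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected during the cooldown."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,  # assumed value, not from the service
        cooldown_s: float = 300.0,  # the 5-minute cooldown
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown_s:
                raise CircuitOpenError("cooling down; call rejected")
            # Cooldown elapsed: allow one trial call (half-open state).
            # A single further failure reopens the circuit.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

Client code that submits many documents could wrap each upload in `breaker.call(...)` so that a failing downstream dependency is not hammered while it recovers.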
Domain events
Typed audit events for every conversion: triggered, completed, failed, delivered. Full observability out of the box.
Request correlation
End-to-end request ID tracing from API call through extraction, conversion, and delivery.
Smart chunking
Token-balanced splitting for documents of 200+ pages. Each chunk is converted independently and stitched seamlessly.
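To give a feel for splitting by token count, the sketch below packs consecutive pages greedily under a token budget. This is a simpler stand-in, not the service's balancing algorithm, and the `chunk_pages` name, the per-page token estimates, and the budget are hypothetical.

```python
from typing import List


def chunk_pages(page_tokens: List[int], max_tokens: int) -> List[List[int]]:
    """Group consecutive 1-based page numbers so each chunk fits the budget.

    A single page larger than the budget still gets a chunk of its own.
    """
    chunks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for page, tokens in enumerate(page_tokens, start=1):
        # Close the current chunk if this page would exceed the budget.
        if current and used + tokens > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(page)
        used += tokens
    if current:
        chunks.append(current)
    return chunks


# Five pages with estimated token counts, packed under an 800-token budget.
print(chunk_pages([400, 300, 500, 200, 600], 800))  # [[1, 2], [3, 4], [5]]
```

Because pages stay in order and chunks never overlap, the converted chunks can be concatenated in sequence to rebuild the document.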
AI self-repair
When validation detects issues, the system feeds errors back to the AI for automatic remediation before delivery.
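The feedback loop described above can be sketched as a bounded validate-and-repair cycle. The `validate` and `repair` callables and the `max_attempts` limit are placeholders: in the service the repair step would be a model call, and its retry policy is not stated on this page.

```python
from typing import Callable, List, Tuple


def repair_loop(
    html: str,
    validate: Callable[[str], List[str]],
    repair: Callable[[str, List[str]], str],
    max_attempts: int = 2,  # assumed limit to keep the loop bounded
) -> Tuple[str, List[str]]:
    """Validate, feed errors back to the repair step, and re-validate.

    Returns the final HTML and any errors that remain unresolved.
    """
    errors = validate(html)
    attempts = 0
    while errors and attempts < max_attempts:
        html = repair(html, errors)  # stands in for the AI remediation call
        errors = validate(html)
        attempts += 1
    return html, errors
```

Returning the remaining errors, instead of raising, lets a human reviewer see exactly what automated remediation could not fix.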
Volume pricing
Custom rate limits and per-page pricing tiers for high-volume enterprise integrations.
Ready to convert at scale?
Talk to our team about enterprise API access, volume pricing, and custom pipeline configurations.