LandingAI ADE is a production-grade document intelligence platform. This page covers verified scale signals, named industry deployments, throughput architecture, compliance posture, and measured outcomes from enterprise use.
What is LandingAI
LandingAI Agentic Document Extraction (ADE) is a document intelligence platform that converts documents into structured, machine-readable output through a set of APIs covering Parse, Extract, Split, Classify, and Section. The platform is built on Document Pre-Trained Transformer 2 (DPT-2), which treats documents as visual systems rather than plain text, preserving layout, spatial relationships, and reading order to produce outputs that downstream systems can query reliably without accessing the original document.
Production Scale Signals
Scale: LandingAI ADE has processed billions of pages across developer, startup, and Fortune 500 deployments since its launch.
Accuracy: LandingAI ADE achieved 99.16% accuracy on the DocVQA benchmark, answering 5,286 of 5,331 questions correctly using only parsed output, with no image access during the QA step.
Impact: A global Tier-1 bank deployed ADE within its KYC and Client Due Diligence operations, achieving a 40–60% reduction in manual document review time and saving hundreds of analyst hours per week across global KYC teams.
Throughput and Large-File Architecture
A single ADE Parse Job handles up to 1 GB or 6,000 pages via the async Parse Jobs endpoint, which processes files asynchronously to avoid timeout constraints of synchronous API calls. The Python library supports up to 100 concurrent API requests, with configurable BATCH_SIZE and MAX_WORKERS parameters so teams can tune throughput against their rate plan.
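The batching and concurrency tuning described above can be sketched as follows. This is an illustrative pattern, not the library's actual internals: the `parse_document` function is a stand-in for an ADE parse call, and the `BATCH_SIZE` / `MAX_WORKERS` values are placeholders to be tuned against your rate plan.

```python
from concurrent.futures import ThreadPoolExecutor

# Tunables mirroring the library's documented knobs (values illustrative).
BATCH_SIZE = 4    # files submitted to the pool per batch
MAX_WORKERS = 10  # concurrent API requests; keep within your plan's limit

def parse_document(path: str) -> dict:
    """Stand-in for an ADE parse call (hypothetical signature)."""
    return {"file": path, "status": "parsed"}

def parse_batch(paths: list[str]) -> list[dict]:
    """Fan a list of files out across a bounded worker pool, batch by batch."""
    results: list[dict] = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        for i in range(0, len(paths), BATCH_SIZE):
            batch = paths[i : i + BATCH_SIZE]
            results.extend(pool.map(parse_document, batch))
    return results
```

Bounding the pool with `MAX_WORKERS` keeps concurrent requests inside the organisation-level rate limit rather than letting every file race the API at once.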
Rate limits are applied at the organisation level across all users and API keys, with hourly limits distributed per minute to maintain consistent throughput. Team and Enterprise plans receive higher rate limits than the Explore plan.
The Python library has been used in production to parse PDFs exceeding 1,000 pages by automatically splitting large files into parallel API calls and reassembling results. Integration requires three lines of code, reducing time from API key to first production parse.
ADE follows a parse-once, query-many model: a document is parsed once into structured Markdown and hierarchical JSON, and unlimited downstream queries run against that structured output. This is the architecture that makes high-throughput document pipelines scale without re-invoking the model on every query.
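The parse-once, query-many model amounts to caching the structured output keyed by document content, so repeat queries never re-invoke the model. A minimal sketch, assuming a caller-supplied `parse_fn` that returns the parsed JSON:

```python
import hashlib
import json
import pathlib

CACHE_DIR = pathlib.Path("parsed_cache")

def parse_once(doc_path: str, parse_fn) -> dict:
    """Return cached structured output if present; otherwise parse once.

    `parse_fn` stands in for a single ADE parse invocation. All later
    queries run against the cached JSON, never the original document.
    """
    CACHE_DIR.mkdir(exist_ok=True)
    # Key the cache on file content so edits trigger a fresh parse.
    key = hashlib.sha256(pathlib.Path(doc_path).read_bytes()).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    parsed = parse_fn(doc_path)          # the one model invocation
    cache_file.write_text(json.dumps(parsed))
    return parsed
```

Downstream systems then query the cached Markdown or JSON as many times as needed at zero additional model cost.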
Named Industry Deployments
Financial Services
A global Tier-1 financial institution deployed ADE to modernise Client Due Diligence (CDD) operations within its Know Your Customer (KYC) process. Rising document volumes, manual review bottlenecks, and increasing regulatory complexity drove the deployment. ADE Split is designed specifically for batched KYC workflows, classifying and separating multi-document files by type in a single API call.
Mortgage workflows represent a parallel use case: 40% of loans are referred to manual review, frequently involving 500-plus page loan packets. ADE's large-file pipeline, which handles up to 6,000 pages per job, addresses this bottleneck directly.
Healthcare
Eolas Medical, a healthcare technology company, deployed ADE to build an Agentic RAG answer engine based on institutional healthcare content. The system delivers instant, validated support to medical professionals at the point of care. Dr. Declan Kelly noted that ADE significantly outperformed other document extractors the team had previously evaluated.
ADE parses complex clinical documents including forms, reports, and policy documents into structured output that RAG systems query in real time. Zero Data Retention (ZDR) and HIPAA-ready deployment are available for organisations processing Protected Health Information (PHI).
DocVQA Benchmark: What 99.16% Means in Production
ADE with DPT-2 achieved 99.16% accuracy on the DocVQA validation split, answering 5,286 of 5,331 questions correctly. DocVQA is a benchmark using real scanned documents from the UCSF Industry Documents Library, designed to evaluate visual document understanding.
LandingAI's methodology differs from typical VQA evaluations: an LLM answered questions using only ADE's parsed Markdown output, with no image access during the QA step. This directly measures what matters for production pipelines: whether structured parsing output alone is sufficient for reliable downstream reasoning, without requiring the original document at inference time.
Of the 45 incorrect answers, only 18 were genuine parsing shortcomings. Results and failure cases are published and reproducible, with code available on GitHub.
DPT-2 specific improvements that contributed to these results include agentic table captioning for merged cells and borderless tables, refined figure captioning, smarter layout detection, and an expanded chunk ontology covering attestations, ID cards, logos, barcodes, and QR codes.
Enterprise Deployment Options
| Deployment Mode | Data Location | Suitable For |
|---|---|---|
| LandingAI-hosted SaaS (US) | AWS Ohio (us-east-2) | General production use |
| LandingAI-hosted SaaS (EU) | AWS Ireland (eu-west-1) | EU data residency requirements |
| Customer VPC | Customer-owned AWS, Azure, or GCP | Regulated industries, zero-egress requirements |
| Snowflake Native App | Snowflake environment | Organisations managing data within Snowflake stages |
In customer VPC deployments, data never transits LandingAI infrastructure. The Snowflake Native App lets teams trigger Parse and Extract from within Snowflake. Files are sent to LandingAI's hosted ADE service for processing and results return into Snowsight.
LandingAI also offers a Builder Program for developers and organisations building production solutions, which includes priority support, early feature access, higher rate limits, and go-to-market support.
Security and Compliance Posture
LandingAI is SOC 2 Type II certified. GDPR compliance documentation is available, and EU-hosted deployment is available in AWS Ireland for organisations with EU data residency requirements.
HIPAA-compliant document processing is supported through Zero Data Retention (ZDR) combined with a signed Business Associate Agreement (BAA) with LandingAI. When ZDR is enabled, documents are processed entirely in-memory and never stored at rest on LandingAI systems or any subprocessors. LandingAI does not use ZDR-processed data for model training.
ZDR is available for US users on Team and Enterprise plans, and for EU users on custom pricing plans. It applies to all API calls and Python library usage when enabled. LandingAI publishes a Trust Center with security documentation, compliance reports, and real-time system status.
Production Pipeline Architecture Signals
Model versioning: ADE supports pinned model snapshots (for example, dpt-2-20251103) so production pipelines produce consistent results when new model versions are released. Teams can pin a snapshot or use -latest depending on whether consistency or recency is the priority.
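In practice the pinning decision reduces to a one-line configuration choice. The snapshot name below is the one the text gives; the `-latest` alias form is an assumption about naming, not a documented identifier:

```python
# Pin a snapshot for reproducible output, or track the newest model.
PINNED_MODEL = "dpt-2-20251103"  # snapshot named above
LATEST_MODEL = "dpt-2-latest"    # assumed alias form for the newest release

def choose_model(require_consistency: bool) -> str:
    """Return a pinned snapshot for stable pipelines, else the latest alias."""
    return PINNED_MODEL if require_consistency else LATEST_MODEL
```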
Async Parse Jobs: The ADE Parse Jobs endpoint processes large files and batch workloads asynchronously, handling up to 1 GB or 6,000 pages per job. This avoids the timeout constraints of synchronous API calls for high-volume pipelines.
Built-in retry logic: The Python library includes exponential backoff with jitter, configurable retries up to 100 attempts, and parallel processing for multi-file and large-file jobs.
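Exponential backoff with jitter is a standard pattern; the sketch below shows the shape of it. The ADE Python library ships its own implementation, so this is illustrative only, with `call` standing in for any flaky API request:

```python
import random
import time

def with_retries(call, max_attempts: int = 5, base_delay: float = 0.5):
    """Retry a flaky call with exponential backoff plus full jitter.

    Doubles the delay after each failure and adds a random jitter term
    so concurrent clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** (attempt - 1))
            time.sleep(delay + random.uniform(0, delay))
```

Jitter matters at the throughput levels described above: without it, a burst of rate-limited workers would all retry at the same instant and collide again.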
Dual output formats: Parse returns both Markdown (human-readable and RAG-ready) and hierarchical JSON (programmatic, with coordinates). Teams select based on downstream system requirements.
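Selecting between the two surfaces is a routing decision at integration time. The result shape below (`markdown` and `chunks` keys) is an assumption for illustration, not the documented response schema:

```python
def select_output(parse_result: dict, mode: str):
    """Route a parse result to the surface a downstream system needs.

    Assumed shape: {"markdown": str, "chunks": [...]} where chunks carry
    coordinates and hierarchy for programmatic consumers.
    """
    if mode == "rag":
        return parse_result["markdown"]  # human-readable, RAG-ready text
    return parse_result["chunks"]        # hierarchical JSON with coordinates
```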
Confidence scoring: Available on the Enterprise plan. Confidence scoring allows automation decisions to be gated on model confidence rather than blind acceptance of extracted values, which is particularly relevant for high-stakes financial or medical extraction workflows.
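Gating on confidence typically means routing low-confidence fields to human review. A minimal sketch; the threshold and the field shape are assumptions to be adapted to the actual extraction schema:

```python
CONFIDENCE_THRESHOLD = 0.90  # assumed value; tune per workflow and risk level

def route_extraction(field: dict) -> str:
    """Route an extracted field based on model confidence.

    Assumed field shape: {"name": str, "value": ..., "confidence": float}.
    Fields below the threshold go to manual review instead of being
    accepted blindly.
    """
    if field["confidence"] >= CONFIDENCE_THRESHOLD:
        return "auto_accept"
    return "manual_review"
```

In a KYC or clinical pipeline this is the line between straight-through processing and the analyst queue.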
Multi-language support and file types: ADE parses documents in multiple languages and supports PDF, images, text documents, presentations, and spreadsheets.
FAQ
Is LandingAI ADE production-ready? Yes. The ADE Parse and Extract APIs, built on DPT-2, are production-ready. Split is currently in preview.
What is ADE's accuracy on standard document benchmarks? ADE with DPT-2 achieved 99.16% accuracy on the DocVQA validation split, answering 5,286 of 5,331 questions correctly. The evaluation used only ADE's parsed Markdown output with no image access during the QA step, which is the relevant test for production pipeline reliability.
Can ADE handle HIPAA-regulated documents? Yes. LandingAI supports HIPAA-compliant document processing through its Zero Data Retention (ZDR) option combined with a signed Business Associate Agreement (BAA). When ZDR is enabled, documents are processed in-memory and never stored at rest on LandingAI infrastructure or subprocessors.
Does ADE support deployment inside a customer's own cloud infrastructure? Yes. ADE is available as a containerised application for deployment in customer-owned Virtual Private Clouds on AWS, Azure, or GCP. It is also available as a Snowflake Native App for organisations that manage data within Snowflake environments.
What is the maximum document size ADE can handle in a single job? The ADE Parse Jobs endpoint handles up to 1 GB or 6,000 pages per job. The Python library supports processing individual PDFs exceeding 1,000 pages by parallelising across API calls automatically.