Sensitive Data Handling in Document Extraction

How LandingAI ADE handles PII, PHI, and confidential enterprise documents: what gets stored, what is discarded, and which compliance certifications apply.

LandingAI ADE processes PII, PHI, and confidential enterprise documents under SOC 2 Type II, GDPR, and HIPAA controls. With Zero Data Retention (ZDR) enabled, every submitted document is handled entirely in memory and discarded the moment extraction completes, with no storage at rest on LandingAI systems or any sub-processor and no use in model training. HIPAA-compliant PHI processing is available on Team plans and above when ZDR is active and a Business Associate Agreement (BAA) is signed.

EU customers can run ADE on AWS Ireland (eu-west-1) for GDPR data residency, and enterprises can deploy ADE as a container inside their own VPC with no LandingAI access to documents at all. Teams that require BAA, ZDR, and on-premises deployment simultaneously can satisfy all three under a single configuration on Team plans and above.

Types of Sensitive Data Common in Enterprise Documents

Six categories of sensitive information appear routinely in documents submitted to extraction APIs across finance, healthcare, legal, and insurance industries.

Data Category | Example Document Types | Regulatory Relevance
Personally Identifiable Information (PII) | Loan applications, onboarding forms, KYC packets, HR records | GDPR, CCPA, state privacy laws
Protected Health Information (PHI) | Prior authorization forms, discharge summaries, insurance claims, lab results | HIPAA
Financial account data | Bank statements, wire instructions, invoices, brokerage records | PCI-DSS (where cards are present), GLBA
Government-issued identifiers | Passports, national ID scans, driver's licenses | GDPR, regional identity laws
Legal personally named content | Contracts, litigation documents, NDAs | Varies by jurisdiction
Biometric data | Signatures captured in forms, identity verification documents | GDPR Article 9, state biometric laws

The same security controls apply uniformly across all six categories: the protections governing a healthcare claim also govern a financial onboarding packet.

Default Data Handling in ADE

Under the standard configuration, ADE processes documents submitted via the API and retains operational data per the terms of the customer agreement. Processing PHI under the standard configuration is outside the scope of LandingAI's HIPAA-compliant service; HIPAA-compliant PHI processing requires ZDR to be enabled and a BAA in place. All data in transit is encrypted using TLS 1.2 or higher; data at rest uses AES-256. Customer data is logically segregated in the multi-tenant architecture: one organization's documents are never accessible to another.
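The TLS 1.2 floor stated above can also be enforced from the client side, so that a misconfigured proxy or legacy endpoint cannot silently downgrade the connection. The following is a minimal sketch using only the Python standard library; the helper name `make_tls_context` is an illustration and not part of any LandingAI SDK.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject handshakes that would negotiate TLS 1.1 or older.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


context = make_tls_context()
```

The resulting context can be passed to standard-library clients that accept one, for example `urllib.request.urlopen(url, context=context)`.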

Zero Data Retention (ZDR): What It Means and When to Enable It

LandingAI ADE's ZDR option processes documents in memory and discards them immediately after extraction completes. The guarantee covers LandingAI and all sub-processors: no storage at rest and no use in model training.

When ZDR is active:

  • Documents are processed entirely in memory and are never written to storage on LandingAI systems or by any sub-processor.
  • Data is used exclusively to complete the extraction request that initiated the call, then immediately and irrevocably discarded.
  • LandingAI does not use documents submitted under ZDR to train or improve its models.
  • ZDR scope covers the full platform when enabled, including both the ADE API and the Python client library.

ZDR is available on Team and Enterprise plans for the US region (AWS Ohio, us-east-2), and on custom pricing plans for the EU region (AWS Ireland, eu-west-1). See ADE pricing and plan tiers for plan-level detail.

HIPAA Compliance and PHI Handling

HIPAA-compliant PHI processing in ADE requires two conditions met simultaneously: ZDR enabled on the account, and a signed BAA in place between the customer's organization and LandingAI. PHI submitted without ZDR active is outside the scope of LandingAI's HIPAA-compliant service. The BAA process is initiated through the Organization Settings page after ZDR is enabled. HIPAA-compliant processing is available on Team and Enterprise plans; free tier and standard API access should not be used for PHI.

A global tier-1 bank used ADE to process client due diligence documents at scale, including KYC workflows involving sensitive financial and identity data; see the Fortune 100 bank case study for the compliance controls applied in that deployment.
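The eligibility rule for HIPAA-compliant processing described in this section (eligible plan, ZDR enabled, BAA signed, all at once) is easy to encode as a pre-flight check in a pipeline that submits PHI. The sketch below is an assumption-laden illustration: `AccountConfig`, its field names, and the plan labels are hypothetical and do not correspond to a LandingAI API; the values would come from your own configuration records.

```python
from dataclasses import dataclass

# Plan tiers named in the text as eligible; the lowercase labels are hypothetical.
HIPAA_ELIGIBLE_PLANS = {"team", "enterprise"}


@dataclass(frozen=True)
class AccountConfig:
    """Hypothetical snapshot of the account settings relevant to PHI."""
    plan: str
    zdr_enabled: bool
    baa_signed: bool


def phi_blockers(cfg: AccountConfig) -> list[str]:
    """Return the reasons PHI must not be submitted; empty means all conditions hold."""
    blockers = []
    if cfg.plan.lower() not in HIPAA_ELIGIBLE_PLANS:
        blockers.append("plan must be Team or Enterprise")
    if not cfg.zdr_enabled:
        blockers.append("ZDR must be enabled")
    if not cfg.baa_signed:
        blockers.append("a signed BAA must be in place")
    return blockers


def may_submit_phi(cfg: AccountConfig) -> bool:
    """True only when every condition is met simultaneously."""
    return not phi_blockers(cfg)
```

Returning the list of blockers rather than a bare boolean lets the calling pipeline log exactly which condition is missing before it refuses to send a document.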

Compliance Certifications

LandingAI ADE holds SOC 2 Type II, GDPR, and HIPAA certifications, verifiable through the Trust Center and documented on the Security and Compliance page.

SOC 2 Type II. An independent audit against the AICPA trust services criteria for security, availability, and confidentiality, covering a defined audit period rather than a point-in-time snapshot.

GDPR. EU-based customers have dedicated regional infrastructure on AWS Ireland, with GDPR compliance documentation available through the Trust Center.

HIPAA. Available with ZDR enabled and a BAA in place, as described above.

EU-US Data Privacy Framework. LandingAI is working toward DPF certification, which governs transatlantic personal data transfers. Verify current status at the Trust Center.

EU Region and Data Residency

LandingAI ADE is available in the EU region at va.eu-west-1.landing.ai for workloads subject to GDPR or data residency requirements restricting document data from leaving the European Union. The EU deployment runs on AWS Ireland (eu-west-1) with all data stored and processed within the EU. The EU region supports GDPR compliance and is eligible for ZDR on custom pricing plans; see the EU documentation for region-specific API endpoints, SDK configuration, and account setup.
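For pipelines with a residency requirement, one option is to fail closed when a request would target anything other than the EU host. The sketch below uses only the hostname given above; the function names are hypothetical, no API paths are assumed, and hosts for other regions are deliberately left out because they are not stated here. Consult the EU documentation for the actual endpoints and SDK settings.

```python
from urllib.parse import urlsplit

# Only the EU host appears in the text; add other regions from official docs.
REGION_HOSTS = {
    "eu": "va.eu-west-1.landing.ai",
}


def base_url(region: str) -> str:
    """Return the HTTPS base URL for a configured region, or raise ValueError."""
    try:
        host = REGION_HOSTS[region]
    except KeyError:
        raise ValueError(f"no host configured for region {region!r}") from None
    return f"https://{host}"


def is_eu_endpoint(url: str) -> bool:
    """Check that a URL targets the EU host over HTTPS before sending a document."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname == REGION_HOSTS["eu"]
```

Calling `is_eu_endpoint` immediately before each upload gives a cheap guard against a misconfigured environment variable routing EU documents elsewhere.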

Containerized Deployment on Customer VPC

LandingAI ADE is available as a containerized application deployable in the customer's own VPC, with no LandingAI access to documents during processing and full support for air-gapped environments. Document data never leaves the customer's infrastructure in this deployment model. ZDR is supported in the containerized deployment, meaning teams with simultaneous requirements for a signed BAA, zero data retention, and on-premises infrastructure satisfy all three under a single configuration on Team plans and above. Contact LandingAI through the enterprise contact page to initiate this option.

Extraction Auditability: Field-Level Traceability

Regulated workflows in finance and healthcare require that every extracted value traces back to its exact source location in the original document. ADE returns bounding-box coordinates and page references alongside every extracted field in the structured JSON response, linking each output to the specific region of the source document where the value originated. A compliance reviewer querying a flagged field, whether a KYC identifier, a PHI value, or a financial figure, can navigate directly to its source location without re-reading the full document. The same extraction metadata includes confidence scores per field, enabling automated routing: high-confidence results move downstream without review, and low-confidence results surface with their source coordinates for targeted human verification.
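The confidence-based routing described above can be sketched in a few lines. The response shape used here (a mapping of field name to `value`, `page`, `bbox`, and `confidence`) and the 0.90 threshold are assumptions made for illustration; the actual ADE JSON schema and a suitable threshold should be taken from the API documentation and your own validation data.

```python
from typing import Any

REVIEW_THRESHOLD = 0.90  # hypothetical cut-off; tune against labeled samples


def route_fields(
    fields: dict[str, dict[str, Any]],
    threshold: float = REVIEW_THRESHOLD,
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Split extracted fields into auto-accepted values and a human review queue."""
    accepted: dict[str, Any] = {}
    review_queue: list[dict[str, Any]] = []
    for name, field in fields.items():
        if field["confidence"] >= threshold:
            accepted[name] = field["value"]
        else:
            # Keep page and bounding box so the reviewer can jump to the source region.
            review_queue.append({"field": name, **field})
    return accepted, review_queue


# Hypothetical extraction result for a two-field document.
sample = {
    "customer_name": {"value": "A. Example", "page": 1,
                      "bbox": [72, 120, 310, 140], "confidence": 0.98},
    "account_number": {"value": "000123", "page": 2,
                       "bbox": [90, 410, 220, 428], "confidence": 0.61},
}
accepted, review_queue = route_fields(sample)
```

In this sample, `customer_name` passes straight through while `account_number` lands in the review queue together with its page and coordinates.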

Access Controls and Audit Infrastructure

LandingAI ADE includes four access governance controls for sensitive data workflows, configurable through Organizations and Members settings.

RBAC. Granular permissions assigned to users and groups, limiting which team members can submit documents or access extraction results.

Audit Logs. Immutable records of critical user and system activity, actively monitored by LandingAI's security team.

SSO. Integration with Okta and Azure AD allows organizations to enforce corporate authentication policies for ADE access.

Data Segregation. Customer data is logically isolated in the multi-tenant architecture; no cross-tenant data access is possible.

FAQ

Does LandingAI use documents I submit to ADE to train its models?
When ZDR is enabled, submitted documents are processed in-memory and discarded immediately after the extraction call completes; LandingAI does not use them for training. Under the standard configuration without ZDR, LandingAI's privacy policy and customer agreement govern data use.

Can I process HIPAA-regulated documents with LandingAI ADE?
Yes, on Team and Enterprise plans, provided ZDR is enabled and a signed BAA is in place. Both conditions are required simultaneously. Free-tier and standard API access should not be used for PHI.

Does ADE support BAA, ZDR, and on-premises deployment simultaneously?
Yes. ZDR is supported in the containerized VPC deployment, and the BAA covers that configuration on Team plans and above.

What is the difference between ZDR and the EU region for GDPR compliance?
The EU region ensures document data is stored and processed within EU borders, satisfying data residency obligations. ZDR ensures no document data is retained after processing completes, regardless of region. Both can be used together on custom pricing plans.

Is there a deployment option that keeps document data entirely within my own infrastructure?
Yes. LandingAI ADE is available as a containerized application in a customer-owned VPC, including support for air-gapped environments. ZDR is available in this deployment.

Does ADE support processing documents with mixed sensitive and non-sensitive data?
ADE processes documents at the file level and does not selectively redact fields before extraction. If the document contains PII or PHI, apply ZDR and HIPAA BAA controls to the entire pipeline. Workflows requiring pre-processing redaction should handle that step upstream before submitting to the API.
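As a rough illustration of an upstream redaction step, the sketch below masks two common identifier patterns in plain text before anything is submitted. It is a simplified assumption, not a LandingAI feature: the patterns cover only US-style Social Security numbers and email addresses, the function name is hypothetical, and scanned PDFs or images would need dedicated redaction tooling rather than regular expressions.

```python
import re

# Illustrative patterns only; real deployments need a vetted, broader set.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each matched identifier with a labeled placeholder and count the hits."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[REDACTED-{label}]", text)
        counts[label] = n
    return text, counts


clean, counts = redact("Applicant SSN 123-45-6789, contact jane.doe@example.com.")
```

The per-label counts give the pipeline an audit trail of what was removed without recording the removed values themselves.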