Extracting Handwritten Text with ADE: From the Classroom to the Archive


Ava Xia October 9, 2025

Agentic Document Extraction (ADE) from LandingAI brings handwritten information into the digital world by converting essays, prescriptions, and centuries-old manuscripts into structured, searchable, and analyzable data. Designed to interpret documents the way humans do, ADE combines vision-language intelligence with an agentic workflow that reads handwriting, tables, signatures, and stamps together as part of a single context.

This technology addresses a long-standing problem: conventional OCR systems often fail on cursive writing and complex layouts, while ADE reads both and places them in context across different industries. Whether it is grading student work, digitizing fragile historical letters, or parsing medical notes, ADE provides precise transcriptions that stay true to their visual source.

In this post, we’ll take a journey through a diverse set of real-world documents to show how ADE deciphers handwritten text from classrooms, archives, hospitals, and even Cold War intelligence files.

In the Classroom: Deciphering Student Work
- Example: Economics Essay
- Example: Math Classroom Worksheet
- Examples: Fill-in-the-blank Worksheet and Multiplication Table
In the Clinic: Parsing Critical Medical Notes
- Example: Handwritten Prescription
In the Archives: Rescuing History from Fading Ink
- Example: 1793 Swedish Letter
- Example: 1855 Bill of Sale for an Enslaved Girl
- Example: 1859 Civil War-Era Letter
- Example: 1910 Oregon Angler’s License
- Example: 1930s Spanish Municipal Record
In Government and Intelligence: Analyzing Layered Information
- Example: Certificate of Veterinary Inspection (Santa’s Reindeer)
- Example: CIA Berlin Handbook (1961)

Why Handwriting Persists—And Why Extraction Matters

Far from obsolete, handwritten material is still a cornerstone of information capture in critical sectors:

Education: Teachers grade millions of written assignments and exams by hand.
Healthcare: Vital prescriptions and patient chart notes are frequently handwritten.
Archives: Centuries of irreplaceable cultural and historical documents exist only on paper.
Government & Business: Official forms, receipts, and field notes are often filled out manually.

Each of these documents represents a barrier between raw information and actionable knowledge. ADE takes a data-centric approach, trained on diverse, domain-specific document sets, from school worksheets to clinical notes and historical manuscripts. This focused training allows it to recognize the nuances of each context: distinguishing handwritten answers from printed templates, interpreting mathematical symbols with precision, and following the natural flow of written language in essays. In doing so, ADE acts as the bridge, transforming handwritten content into information that’s searchable, analyzable, and accessible.

A Look at the ADE Workflow: Parse and Extract

ADE operates on a two-step workflow to turn complex documents into structured data: Parse and Extract. We can see this in action using the example of a student’s AP Calculus BC answer sheet.

In the Parse stage, ADE accurately transcribes the entire page, seamlessly handling a complex mix of mathematical integrals, English instructions, and even handwritten Chinese characters (“没学”) alongside a playful emoji the student drew as an answer to one of the problems.

Once the page is parsed, the Extract step comes into play. Here, we define a schema that organizes the information we need, such as isolating only the answers on the page. The schema can also carry instructions. In this example, we added an answerExplained field to the schema, with an instruction for ADE to generate a natural-language explanation of the solution process, even though no such text exists in the original document. In the “Extracted Results” panel, ADE extracts the handwritten integral and result (∫… = 7.333) into answerText and also fills answerExplained with a clear, human-readable interpretation: “The area of region R is found by integrating the function √x from…”.
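To make the schema idea concrete, here is a minimal sketch of what such a definition might look like, written as a Python dictionary in JSON Schema style. Only the field names answerText and answerExplained come from the example above; the "answers" wrapper, the descriptions, and the overall shape are illustrative assumptions, not the exact schema used in the playground.

```python
import json

# Hypothetical extraction schema in JSON Schema style.
# Only answerText and answerExplained are taken from the example;
# the "answers" wrapper and the description strings are assumptions.
answer_sheet_schema = {
    "type": "object",
    "properties": {
        "answers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "answerText": {
                        "type": "string",
                        "description": "The student's handwritten answer, transcribed as written.",
                    },
                    "answerExplained": {
                        "type": "string",
                        "description": "A natural-language explanation of the solution process.",
                    },
                },
                "required": ["answerText", "answerExplained"],
            },
        }
    },
    "required": ["answers"],
}

# Serialize the schema so it can be pasted into a tool or sent with a request.
schema_json = json.dumps(answer_sheet_schema, indent=2)
print(schema_json)
```

The description on answerExplained is where the instruction lives: it asks for generated text, while answerText asks only for a transcription.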

The ADE workflow concludes here, delivering structured JSON data ready for any downstream task. To demonstrate what is possible with this output, our Showcase Playground includes a Chat interface. This example shows how a developer could build an application that allows users to ask conversational questions about a document. For instance, a user could query the meaning of the Chinese words and receive a correct definition. The playground also illustrates how to build in important safeguards: when asked to solve the math problem (which requires generating new information), the chat refuses. For developers who want a head start on building their own retrieval and chat applications, we provide helper scripts and workflow examples in our ADE LLM Retrieval repository on GitHub.
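As one example of a downstream task, the sketch below flattens an extraction result into CSV using only the Python standard library. The payload is a hand-written stand-in, not real ADE output: its "answers" wrapper is an assumption, and the two values are the strings quoted in this post, with their elisions left as-is.

```python
import csv
import io
import json

# Hand-written stand-in for an extraction result (not real ADE output).
# The "answers" wrapper is an assumption; the values are quoted from the post.
extracted_json = json.dumps(
    {
        "answers": [
            {
                "answerText": "∫… = 7.333",
                "answerExplained": "The area of region R is found by integrating the function √x from…",
            }
        ]
    },
    ensure_ascii=False,
)

FIELDS = ["answerText", "answerExplained"]

def answers_to_csv(payload):
    """Convert the JSON payload into CSV text with one row per answer."""
    data = json.loads(payload)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for answer in data["answers"]:
        # Keep only the known fields so extra keys do not break the writer.
        writer.writerow({name: answer.get(name, "") for name in FIELDS})
    return buffer.getvalue()

csv_text = answers_to_csv(extracted_json)
print(csv_text)
```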

Once you understand the workflow, you can experiment with your own handwritten documents using LandingAI’s playground shown in these images, and copy the sample code from the playground to integrate ADE directly into your own agentic workflow.

Now it is time to explore more examples from the world of handwriting.

In the Classroom: Deciphering Student Work

The Challenge: Student work is a complex mix of printed instructions, diagrams, and unique, developing handwriting. The task is to extract not just words, but mathematical notations, symbols, and feedback, all from the same page.

Example: Economics Essay

This example of a handwritten A-Level economics essay highlights ADE’s ability to interpret complex documents with mixed content types, capturing nuances that standard OCR would miss. The system demonstrates two advanced capabilities here. First, its layout analysis is precise enough to recognize critical non-textual cues, correctly identifying the tiny, handwritten checkmark in the “Question 9” box at the top of the page, which is essential for knowing which prompt the student chose to answer. Second, ADE doesn’t just ignore the embedded diagram; it analyzes the visual information and generates a detailed description of the student’s hand-drawn economic graph. As seen in the parsed output, it identifies the axes and the specific economic curves: “<A hand-drawn economic chart showing Price (P) on the y-axis and Output (Q) on the x-axis. It includes curves labeled MC, AC, AR, MR…>”. This combination of recognizing small selection marks and describing complex, hand-drawn visuals shows a deep contextual understanding of the entire document, providing a much richer and more accurate digitization than simple text extraction alone.

Example: Math Classroom Worksheet

This example of a mathematics worksheet demonstrates ADE’s proficiency in handling highly technical and structured educational content. The system shows a strong capability in recognizing and accurately transcribing complex mathematical notations, including the limits and differential equations present in the problems. Furthermore, by using the defined schema, ADE can extract crucial metadata from the page to understand its structure. It correctly identifies non-textual symbols like the checkmarks ✅ to determine which answer has been selected for a given question. In parallel, it also parses and extracts the point value, or mark, associated with each problem. This dual ability to understand both the complex mathematical content and the structural metadata makes it a powerful tool for digitizing and analyzing academic materials.

Examples: Fill-in-the-blank Worksheet and Multiplication Table

In the chat interface, we can see how a developer could build an application to perform contextual analysis, such as automatically grading student worksheets by applying mathematical or grammatical rules to the extracted answers.

For the multiplication table, we simply asked the system to find any incorrect calculations. It immediately reviewed the handwritten answers and correctly identified that “3 x 9 = 28,” “8 x 3 = 22,” and “6 x 3 = 16” were all calculated incorrectly.
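The same check can also be done deterministically once the answers are extracted. The sketch below is a rule-based alternative, not how the playground chat works, and it assumes each handwritten entry comes back as a string in the form "a x b = c". The function name find_incorrect is hypothetical.

```python
import re

# Matches entries such as "3 x 9 = 28". The "a x b = c" format is an
# assumption about how the extracted answers are represented.
FACT_PATTERN = re.compile(r"^\s*(\d+)\s*[x×*]\s*(\d+)\s*=\s*(\d+)\s*$")

def find_incorrect(entries):
    """Return (entry, expected_product) for every entry whose product is wrong."""
    wrong = []
    for entry in entries:
        match = FACT_PATTERN.match(entry)
        if match is None:
            continue  # skip anything that is not a simple multiplication fact
        a, b, claimed = (int(group) for group in match.groups())
        if a * b != claimed:
            wrong.append((entry, a * b))
    return wrong

# The three wrong answers quoted above, plus one correct entry added for contrast.
entries = ["3 x 9 = 28", "8 x 3 = 22", "6 x 3 = 16", "2 x 4 = 8"]
for entry, expected in find_incorrect(entries):
    print(f"{entry}  (expected {expected})")
```

A rule like this is cheap to run on every worksheet, and the chat-style review remains useful for open-ended questions that a fixed pattern cannot cover.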

Similarly, for the English grammar exercise, we tasked it with finding grammatical errors. The chat not only pinpointed the mistake in “Sentence 9, ‘They is dancing,’” but also provided the correct alternative. When we asked for a summary, it accurately reported that only one sentence was incorrect.

In the Clinic: Parsing Critical Medical Notes

The Challenge: Medical handwriting is famously difficult to read. It’s a dense mix of abbreviations, dosage shorthand, and rushed script where a single misinterpretation can have serious consequences.

Example: Handwritten Prescription

A handwritten prescription from Dr. S. S. Shukla in India shows ADE deciphering the rushed, slanted script. It correctly parses instructions like “D3 must 60 k, 1 tab weekly” and “Lacthep plus 10 ml SOS,” structuring the output for a pharmacy or electronic health record system.

ADE accurately extracts medication names and dosage instructions from a doctor’s prescription.

In the Archives: Rescuing History from Fading Ink

The Challenge: Historical manuscripts present a battle against time. Fading ink, archaic letterforms, bleed-through, and deteriorating paper create significant noise that can render a document unreadable to standard OCR.

Example: 1793 Swedish Letter

This letter from Stockholm, dated 1793, poses a significant challenge with its aged, discolored paper and archaic German Kurrent handwriting. ADE successfully navigated these obstacles, accurately transcribing the difficult cursive script into modern, readable text. The analysis goes much deeper, however, as the system also identifies and classifies the signature and the prominent red wax seal together as a formal ‘attestation’, even generating a detailed description of this area. The utility extends even further in the chat interface, where a researcher can then instantly translate the transcribed German letter into English, breaking down language barriers.

A 230-year-old letter, digitized and made searchable by ADE.

Example: 1855 Bill of Sale for an Enslaved Girl

This 1855 bill of sale for an enslaved girl named Polly is a sobering historical document that presents significant digitization challenges due to its content and physical state. The paper is heavily damaged, with tears and ink bleed obscuring text along the edges and in the creases. ADE accurately reads the 19th-century cursive and deconstructs the document into its key components: the main legal text, the witness signatures, and the seller’s official mark.

Example: 1859 Civil War-Era Letter

This letter from 1859, a deeply personal and historically significant artifact from a formerly enslaved man to his mother, requires both technical precision and a deep respect for the content. This is where ADE’s capabilities with archival materials are essential. It accurately transcribed the 19th-century cursive script, navigating the challenges of faded ink and creased paper to ensure that poignant lines like “If I succeed in my undertakings I will send you all the good news” are captured perfectly. By digitizing such a fragile and invaluable artifact with this level of accuracy, ADE makes its story searchable and accessible to researchers worldwide, preserving a vital piece of history for future generations.

Text is recovered from a fragile, damaged historical letter.

Example: 1910 Oregon Angler’s License

This 1910 Oregon Angler’s License is an excellent example of a historical government form, combining a printed template with handwritten data, official signatures, and an embossed seal. ADE’s approach to this document is multifaceted. Instead of merely extracting the handwritten fields like age and height as isolated data points, it synthesizes them with the printed text to generate a complete, readable summary of the license holder’s description. The system then deconstructs the lower portion of the document, identifying the official signatures as “attestations.” Most impressively, it analyzes the embossed county seal, reading the text on it, describing its central emblem of a sun and mountains, and even extracting the signature written directly over the seal. This comprehensive analysis, which combines data synthesis with the structural and visual breakdown of official seals, is invaluable for digitizing municipal and government archives, capturing the full context and authenticity of each record.

Handwriting is cleanly extracted from designated fields on a government-issued license.

Example: 1930s Spanish Municipal Record

This image of a 1982 municipal ordinance from Argentina showcases the system’s ability to deconstruct a complex official document. It accurately parses the typed Spanish text while also identifying and analyzing the various authenticating elements. The output demonstrates how it reads the text from within the circular blue ink stamps, provides a visual description of them, and classifies them as official attestations. Furthermore, it isolates the handwritten signatures, correctly extracts the legible names and titles of the officials, and categorizes them as formal signatures, preserving the legal and administrative structure of the original record.

The playground’s chat interface shows a proof-of-concept application querying a Spanish document in English. It can synthesize information from multiple locations to answer complex questions, like identifying all officials who signed the document. It also performs specific data retrieval, accurately extracting details like the monetary range for a service. Finally, the chat shows the application providing a contextual summary of the document’s origin and purpose. This illustrates how ADE’s data enables the creation of tools that make dense or foreign-language documents immediately accessible and understandable.

In Government and Intelligence: Analyzing Layered Information

The Challenge: Official documents often contain multiple layers of information: a printed base text, typed-in data, handwritten annotations, and official stamps. The goal is to separate and catalog each layer correctly.

Example: Certificate of Veterinary Inspection

This form for Santa’s reindeer features typed headers and handwritten entries.

It showcases ADE’s ability to parse complex, table-heavy forms. The system accurately recognizes the entire “ANIMAL IDENTIFICATION” grid, extracting each handwritten entry, such as the reindeer’s names, and correctly associating it with the proper column header. The analysis extends beyond the table to the entire document, meticulously breaking down the veterinarian’s signature block into a structured “attestation” with their name, address, and license number. This demonstrates a powerful, end-to-end capability to digitize and structure complex administrative forms, converting handwritten tables into organized, machine-readable data.
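Associating each entry with its column header amounts to turning table rows into keyed records. Here is a minimal sketch of that step, assuming the parsed grid is available as a header row plus a list of cell rows. The column names and cell values below are placeholders for illustration, not the contents of the actual form, and rows_to_records is a hypothetical helper.

```python
def rows_to_records(header, rows):
    """Pair each cell with its column header, producing one dict per table row."""
    records = []
    for row in rows:
        if len(row) != len(header):
            # A length mismatch usually means a cell was merged or missed.
            raise ValueError(f"expected {len(header)} cells, got {len(row)}: {row}")
        records.append(dict(zip(header, row)))
    return records

# Placeholder header and rows; the real form's columns and entries may differ.
header = ["Name", "Species", "Sex"]
rows = [
    ["Dasher", "Reindeer", "F"],
    ["Dancer", "Reindeer", "F"],
]

records = rows_to_records(header, rows)
for record in records:
    print(record)
```

Raising on a mismatched row, rather than silently padding it, keeps a misread cell from shifting every later value into the wrong column.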

Example: CIA Berlin Handbook (1961)

This declassified 1961 CIA “Berlin Handbook” cover is an excellent example of a complex, multi-layered document, and it showcases the system’s ability to deconstruct an “information collage” by accurately processing each distinct element.

ADE correctly transcribes the mix of typed, stamped, and handwritten text, capturing everything from the “SECRET” classification to the critical “MASTER COPY” and distribution annotations written in the margin.

Beyond the text, the system provides a deep analysis of the page’s graphics. It identifies the official CIA seal, extracting the text from it while also generating a detailed description of its visual components (the eagle, shield, and compass rose). Furthermore, it recognizes the stylized drawing of the Brandenburg Gate, providing a rich architectural description of the neoclassical monument. This ability to parse and understand mixed-media documents—separating text, official seals, and illustrations—is invaluable for creating comprehensive digital archives from complex government and intelligence records.

Conclusion

From a student’s quick algebra scribble to a centuries-old parchment, ADE is built to read, interpret, and structure the rich world of handwritten content.

By bridging analog expression with digital intelligence, ADE empowers educators to grade faster, historians to preserve our shared heritage, and professionals everywhere to unlock handwritten knowledge once thought unreachable.

Ready to get started?

Explore the GitHub repo
Test ADE live in the Visual Playground

Contact Enterprise Sales to discuss deployment and pricing for large workloads
