Patent-Pending Base64.ai Segmentation AI Technology for Reading Multiple Documents in a Single Image

Background

Before Base64.ai's patent-pending Segmentation AI technology, Artificial Intelligence could not understand multiple documents in a single image, such as several receipts in one expense report. Segmentation AI is designed to detect and segment the most salient objects within an image. It recognizes the physical boundaries of each document, so it can crop the documents and process each one individually. This eliminates the need for your team to split documents manually before processing, which reduces validation time and lowers the chance of classification and data extraction errors. It is particularly useful for tasks such as onboarding and application processing, where a single image often contains multiple documents, such as multiple IDs. Segmentation AI requires no additional setup: Base64.ai automatically detects the objects in an image before classifying each document.

How does the Artificial Intelligence split documents?

Segmentation AI employs a specialized neural network for salient object detection. This process comprises two fundamental steps: first, it identifies the most important elements on the page, and then it isolates these elements from the surrounding content.
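The two steps above can be illustrated with a minimal sketch. It is not Base64.ai's implementation: it assumes the saliency network's output has already been thresholded into a 0/1 mask, and it swaps in a plain flood fill to isolate each salient region. The names find_document_boxes, crop, and min_area are hypothetical.

```python
from collections import deque


def find_document_boxes(mask, min_area=4):
    """Return (top, left, bottom, right) boxes for connected salient regions.

    `mask` is a list of rows of 0/1 values. It stands in for the thresholded
    output of a saliency model (an assumption made for this sketch).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Step 2: isolate one region with a breadth-first flood fill,
            # tracking its extent as we go.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Ignore specks too small to be a document.
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the inclusive box out of a row-major image (list of rows)."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Step 1 stand-in: a toy saliency mask with two separate "documents".
example_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
]
example_boxes = find_document_boxes(example_mask)
```

Each box in example_boxes can then be passed to crop to obtain a separate image per document, which is the point at which the documents can be processed individually.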

In one example, an image containing both a New York State ID and a Ukrainian passport was processed and automatically split into two separate document files. Although both files originate from the same image, they yield distinct results, each corresponding to a different region of that image.

Once a multi-document image is split, our AI detects the document type and uses the most relevant model for data extraction. Each file is labeled with the model name under 'Name', signifying its classification. After splitting, each document can be reviewed separately. In the new files, data associated with other documents will not appear in the results, OCR, or API response.
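The routing step described above might look like the following sketch. The document types, the model names, and the Segment and route names are all hypothetical placeholders, not Base64.ai identifiers; only the 'Name' label comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: int
    detected_type: str  # output of a hypothetical document-type classifier


# Hypothetical mapping from detected document type to extraction model.
MODEL_BY_TYPE = {
    "state_id": "example-state-id-model",
    "passport": "example-passport-model",
}
DEFAULT_MODEL = "example-generic-model"  # fallback for unrecognized types


def route(segments):
    """Label each split document with the model chosen for its type."""
    results = []
    for segment in segments:
        model = MODEL_BY_TYPE.get(segment.detected_type, DEFAULT_MODEL)
        # Each result carries only its own segment; nothing from the other
        # documents in the original image is attached to it.
        results.append({"segment": segment.segment_id, "Name": model})
    return results


example_results = route([Segment(0, "state_id"), Segment(1, "passport")])
```

Here example_results holds one entry per split document, each labeled with its own model under 'Name'.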

Signatures, tables, and faces are likewise extracted only for the document they belong to.
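One plausible way to keep such detections with the right document is to assign each one to the document whose bounding box contains its center. The sketch below assumes that approach for illustration only; assign_detections and the box layout (top, left, bottom, right) are hypothetical.

```python
def assign_detections(doc_boxes, detections):
    """Group detections by the document box that contains their center.

    doc_boxes:  list of (top, left, bottom, right) document boxes.
    detections: list of (kind, (top, left, bottom, right)) tuples, where
                kind is a label such as "face", "signature", or "table".
    Returns a dict mapping document index to the list of kinds found in it.
    Detections whose center lies outside every document are dropped.
    """
    assigned = {index: [] for index in range(len(doc_boxes))}
    for kind, (top, left, bottom, right) in detections:
        center_y = (top + bottom) / 2
        center_x = (left + right) / 2
        for index, (d_top, d_left, d_bottom, d_right) in enumerate(doc_boxes):
            if d_top <= center_y <= d_bottom and d_left <= center_x <= d_right:
                assigned[index].append(kind)
                break  # a detection belongs to at most one document
    return assigned


# Two side-by-side documents and three detections (toy coordinates).
example_docs = [(0, 0, 100, 200), (0, 250, 100, 450)]
example_detections = [
    ("face", (10, 10, 50, 50)),
    ("signature", (70, 300, 90, 400)),
    ("face", (10, 260, 50, 300)),
]
example_assignment = assign_detections(example_docs, example_detections)
```

In this toy case the first document receives one face, and the second receives a signature and a face.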

While our AI can automatically detect multiple documents, you have the option to fine-tune your processing workflow by accessing the "Segmentation" feature under "AI Features" in flow settings. This allows you to ensure that multi-document images are consistently split.