Extract Data From Files | Base64 Help Center

Use our OCR API to extract text from the following supported file formats:

Images: JPEG, PNG, GIF, HEIC, SVG, WEBP, TIFF
Microsoft Office: DOC, DOCX, XLS, XLSX, PPT, PPTX
Open Office: ODS, ODT, ODP
PDF: Both digital and image-only files are supported. PDFs may be single or multi-page and may contain multiple document types (e.g., 3 ID pages plus 1 invoice).
ZIP: May contain only the supported file formats
MSG: Outlook message files and their contents (e.g., an email's PDF attachments)
Audio: MP3, OGG, FLAC, WAV
Video: MOV, MP4, AVI, WMV, M4V
Text: CSV
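
As a rough client-side pre-check, the list above can be turned into an extension lookup before uploading. This is only a sketch: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `unsupported_zip_entries` are hypothetical, the `jpg` and `tif` aliases are assumptions, and the service may identify files by content rather than by extension. Whether a ZIP nested inside a ZIP is accepted is not stated here, so treat that case as unverified.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Extensions taken from the supported-format list; "jpg" and "tif" are assumed aliases.
SUPPORTED_EXTENSIONS = {
    "jpeg", "jpg", "png", "gif", "heic", "svg", "webp", "tiff", "tif",  # images
    "doc", "docx", "xls", "xlsx", "ppt", "pptx",                        # Microsoft Office
    "ods", "odt", "odp",                                                # Open Office
    "pdf", "zip", "msg",
    "mp3", "ogg", "flac", "wav",                                        # audio
    "mov", "mp4", "avi", "wmv", "m4v",                                  # video
    "csv",                                                              # text
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension appears in the supported list."""
    suffix = PurePosixPath(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS


def unsupported_zip_entries(zip_bytes: bytes) -> list[str]:
    """List the ZIP entries whose extensions are not in the supported list."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir() and not is_supported(info.filename)
        ]
```

An empty result from `unsupported_zip_entries` suggests the archive only holds files with supported extensions; it does not guarantee that the service will accept them.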

You may send the document's MIME type and binary content in Base64 encoding:
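
The exact request body for this option is not shown in this section. The sketch below illustrates one common way to combine a MIME type with Base64-encoded bytes, namely a `data:` URI inside a JSON object. The field name `document`, the data-URI layout, and the helper name `build_base64_payload` are all assumptions for illustration, not the confirmed API schema.

```python
import base64
import json


def build_base64_payload(data: bytes, mime_type: str, field: str = "document") -> str:
    """Build a JSON body carrying the file as a Base64 data URI.

    ASSUMPTION: the field name and the data-URI format are illustrative only;
    check the API reference for the actual schema.
    """
    encoded = base64.b64encode(data).decode("ascii")
    data_uri = f"data:{mime_type};base64,{encoded}"
    return json.dumps({field: data_uri})


# Example: encode a small CSV document held in memory.
body = build_base64_payload(b"hello", "text/csv")
```

Passing the MIME type explicitly, rather than guessing it from the filename, keeps the caller in control of how the document is labeled.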

or simply provide the URL of the document:

{"url":"https://base64.ai/static/content/features/data-extraction/models/1.png"}
{"url":"https://base64.ai/static/content/features/data-extraction/models/health/sbc/1.pdf"}

The Content-Type: application/json header is always required. Password-protected files are not supported.
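
To show how the URL form and the required header fit together, here is a minimal sketch that prepares a POST request with Python's standard library. The endpoint is left as a parameter because this section does not name it, and the helper name `build_url_request` is hypothetical. Any authentication header the service requires is not covered here and is therefore omitted.

```python
import json
import urllib.request


def build_url_request(endpoint: str, document_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request that references a document by URL.

    ASSUMPTION: `endpoint` must be supplied by the caller; it is not given in this section.
    """
    body = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},  # always required
        method="POST",
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)` once the real endpoint and any required credentials have been added.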