Parse Any Invoice to Structured JSON

Upload any invoice — typed, scanned, or photographed — and get back a validated JSON object with every field extracted, verified, and ready to integrate.

Accepted formats: PDF, JPEG, PNG, TIFF · Max 20 MB

How AI Invoice Parsing Works

Upload Any Invoice

Drop a native PDF, a scanned document, or a photo of a paper invoice. The AI handles any layout, orientation, language, or quality level.

AI Reads & Validates

The AI extracts every field, then verifies four things: tax number formats, address plausibility, that unit price × quantity equals each line total, and that the sum of line totals plus tax equals the invoice total.

Receive Structured JSON

Download a normalised invoice JSON with consistent field names — seller, buyer, line items, tax breakdown, payment terms — ready for any app or database.

The Invoice Stack Reality

Most invoice parsing tools assume clean, computer-generated PDFs. Your suppliers don't send those. They send scanned faxes, smartphone photos of paper invoices, thermal-printed receipts with missing fields, and exported PDFs from a dozen different accounting systems — each with a completely different layout. InvoiceXML's AI engine was built for that reality. It reads invoices the way a human accountant would — understanding context, inferring missing values, and flagging anything it cannot verify with confidence.

Scanned & Photographed Invoices

OCR combined with semantic AI understands invoice structure regardless of scan quality, rotation, or partial occlusion — no template configuration required.

Tax Number Verification

Extracted VAT and business registration numbers are verified against country-specific format rules, catching transposition errors and truncated values before they hit your system.
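
As a rough illustration of a country-specific format rule, the sketch below checks a VAT identifier against a pattern chosen by its country prefix. The function name `vat_format_ok` and the three-country table are assumptions made for this example, not InvoiceXML's actual rule set. A pattern check like this catches truncated or malformed values; catching transposed digits would additionally need each country's check-digit rule, which is left out here.

```python
import re

# Illustrative format rules for three EU VAT identifiers (assumed subset).
# A production rule set would cover every supported country and add
# check-digit validation where the country defines one.
VAT_PATTERNS = {
    "DE": re.compile(r"DE\d{9}"),        # Germany: 9 digits
    "IT": re.compile(r"IT\d{11}"),       # Italy: 11 digits
    "NL": re.compile(r"NL\d{9}B\d{2}"),  # Netherlands: 9 digits, 'B', 2 digits
}


def vat_format_ok(vat_id: str) -> bool:
    """Return True if vat_id matches the pattern for its country prefix."""
    # Strip spaces, dots and hyphens that OCR or typists often insert.
    normalised = re.sub(r"[\s.\-]", "", vat_id).upper()
    pattern = VAT_PATTERNS.get(normalised[:2])
    return bool(pattern and pattern.fullmatch(normalised))
```

With this table, `vat_format_ok("DE12345678")` is rejected because one digit is missing, while an unknown prefix is rejected because no rule exists for it.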

Line-Item Arithmetic Check

Every line total is recomputed from unit price and quantity. Tax amounts are recalculated per category. The invoice grand total is verified. Discrepancies are flagged in the output.
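
The arithmetic check can be pictured roughly as follows. This is a minimal sketch, assuming hypothetical field names (`lines`, `unit_price`, `quantity`, `line_total`, `tax_total`, `grand_total`) and amounts stored as decimal strings; the real output schema may differ, and the per-category tax recalculation is omitted for brevity.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def check_arithmetic(invoice: dict) -> list[str]:
    """Return a list of human-readable discrepancies (empty if none)."""
    issues = []
    net_sum = Decimal("0")
    for number, line in enumerate(invoice["lines"], start=1):
        # Recompute the line total from unit price and quantity.
        expected = (Decimal(line["unit_price"]) * Decimal(line["quantity"])).quantize(
            CENT, rounding=ROUND_HALF_UP
        )
        stated = Decimal(line["line_total"])
        if expected != stated:
            issues.append(f"line {number}: expected {expected}, found {stated}")
        net_sum += stated
    # Sum of stated line totals plus tax should equal the grand total.
    expected_total = net_sum + Decimal(invoice["tax_total"])
    stated_total = Decimal(invoice["grand_total"])
    if expected_total != stated_total:
        issues.append(f"grand total: expected {expected_total}, found {stated_total}")
    return issues
```

Using `Decimal` rather than `float` avoids binary rounding noise, so a flagged discrepancy reflects the document and not the arithmetic.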

Address Cross-Referencing

Seller and buyer addresses are parsed into structured components (street, city, postal code, country) and checked for internal consistency — catching OCR errors in postal codes or truncated street names.
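
One way to think about the internal-consistency check is a per-country postal code pattern applied to the parsed components. The sketch below assumes a hypothetical address shape with `street`, `city`, `postal_code`, and a two-letter `country` code, and covers only three countries for illustration.

```python
import re

# Assumed subset of postal code formats, keyed by ISO 3166-1 alpha-2 code.
POSTAL_PATTERNS = {
    "DE": re.compile(r"\d{5}"),
    "FR": re.compile(r"\d{5}"),
    "NL": re.compile(r"\d{4} ?[A-Z]{2}"),
}


def address_issues(address: dict) -> list[str]:
    """Return a list of problems found in a parsed address (empty if none)."""
    issues = []
    for field in ("street", "city", "postal_code", "country"):
        if not (address.get(field) or "").strip():
            issues.append(f"missing {field}")
    country = address.get("country") or ""
    postal = (address.get("postal_code") or "").strip().upper()
    pattern = POSTAL_PATTERNS.get(country)
    # Only check the postal code when a rule exists for the country.
    if pattern and postal and not pattern.fullmatch(postal):
        issues.append(f"postal code {postal!r} does not fit {country} format")
    return issues
```

A typical OCR slip such as the letter O read in place of a zero ("1O115" for a German address) fails the five-digit pattern and is reported.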

Schema Validation

The extracted data is validated against the EN 16931 invoice data model to ensure all required fields are present, correctly typed, and within acceptable value ranges before the JSON is returned.
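
To make the idea concrete, here is a minimal sketch of a required-field and type check. The JSON keys are hypothetical, and the handful of fields shown (invoice number, issue date, currency code, seller name, buyer name) is only a small subset of what EN 16931 makes mandatory; it is not the validator the service runs.

```python
from datetime import date

# Hypothetical JSON keys mapped to the Python type each value should have.
REQUIRED = {
    "invoice_number": str,
    "issue_date": str,
    "currency": str,
    "seller_name": str,
    "buyer_name": str,
}


def schema_issues(invoice: dict) -> list[str]:
    """Return a list of schema problems (empty if the invoice passes)."""
    issues = []
    for key, expected_type in REQUIRED.items():
        value = invoice.get(key)
        if value is None or value == "":
            issues.append(f"{key}: missing")
        elif not isinstance(value, expected_type):
            issues.append(f"{key}: expected {expected_type.__name__}")
    # Value-range checks on fields that are present and correctly typed.
    issue_date = invoice.get("issue_date")
    if isinstance(issue_date, str) and issue_date:
        try:
            date.fromisoformat(issue_date)
        except ValueError:
            issues.append("issue_date: not a valid ISO 8601 date")
    currency = invoice.get("currency")
    if isinstance(currency, str) and currency:
        if not (len(currency) == 3 and currency.isalpha() and currency.isupper()):
            issues.append("currency: expected a three-letter uppercase code")
    return issues
```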

Integration-Ready Output

The JSON uses consistent, predictable field names regardless of the source language or format — drop it directly into your database, ERP import, or data pipeline without transformation.
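
As an example of what a database import might look like, the sketch below writes line items into SQLite using Python's standard `sqlite3` module. The table layout and the JSON keys (`invoice_number`, `lines`, `description`, `quantity`, `unit_price`, `line_total`) are assumptions for illustration only.

```python
import sqlite3


def store_line_items(conn: sqlite3.Connection, invoice: dict) -> int:
    """Insert the invoice's line items and return how many rows were written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS line_items ("
        "invoice_number TEXT, description TEXT, "
        "quantity TEXT, unit_price TEXT, line_total TEXT)"
    )
    rows = [
        (
            invoice["invoice_number"],
            line["description"],
            line["quantity"],
            line["unit_price"],
            line["line_total"],
        )
        for line in invoice["lines"]
    ]
    # Parameterised inserts keep extracted text from being read as SQL.
    conn.executemany("INSERT INTO line_items VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Because the keys stay the same whatever the source document looked like, one import routine of this kind can serve every supplier.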

Frequently Asked Questions

Does this work with scanned or photographed invoices?

Yes — that is the primary use case. The AI reads the document like a human, recognising fields regardless of layout, orientation, scan quality, or language. It handles thermal-printed receipts, faxed documents, and smartphone photos of paper invoices.

What validation is applied to the extracted data?

We apply four layers of validation: schema validation (all required fields are present and correctly typed), VAT/tax number verification against country-specific format rules, address plausibility checks, and full line-item arithmetic verification (unit price × quantity = line total, sum of line totals + tax = invoice total).

What does the JSON output look like?

The output is a normalised invoice JSON object with consistent field names regardless of the source document's language or format. It includes seller, buyer, invoice header, line items, tax breakdown, payment terms, and a confidence score for each extracted field.
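
Per-field confidence scores are most useful for routing doubtful values to a human reviewer. The sketch below assumes, purely for illustration, that each extracted field is an object of the form `{"value": ..., "confidence": ...}`; the actual output shape may differ, and the sample document is invented.

```python
import json


def low_confidence_fields(node, threshold=0.9, path=""):
    """Yield (path, value, confidence) for every field below the threshold."""
    if isinstance(node, dict):
        if "value" in node and "confidence" in node:
            # Leaf field in the assumed {"value", "confidence"} shape.
            if node["confidence"] < threshold:
                yield path, node["value"], node["confidence"]
            return
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            yield from low_confidence_fields(child, threshold, child_path)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from low_confidence_fields(child, threshold, f"{path}[{index}]")


# Invented sample in the assumed shape.
sample = json.loads("""
{
  "seller": {
    "name":   {"value": "Muster GmbH", "confidence": 0.98},
    "vat_id": {"value": "DE12345678",  "confidence": 0.62}
  },
  "lines": [
    {"description": {"value": "Paper A4", "confidence": 0.95},
     "line_total":  {"value": "20.00",    "confidence": 0.88}}
  ]
}
""")

for field_path, value, confidence in low_confidence_fields(sample):
    print(f"{field_path}: {value!r} (confidence {confidence})")
```

The threshold of 0.9 is arbitrary; in practice it would be tuned to how much manual review the team can absorb.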

What formats are accepted?

PDF (native and scanned), JPEG, PNG, TIFF, and WEBP. The maximum file size is 20 MB. For Factur-X or ZUGFeRD PDFs, the embedded XML is used as the primary data source, improving accuracy further.

How is this different from the Extract XML endpoint?

The Extract XML endpoint extracts the embedded CII XML from a Factur-X or ZUGFeRD PDF, falling back to AI generation if no XML is found. The Extract JSON endpoint always uses AI parsing and runs the full validation pipeline — tax numbers, addresses, and arithmetic — returning a normalised JSON object rather than an XML document. Use JSON when you need structured data for app integration; use XML when you need a standards-compliant e-invoice document.

Ready to automate your invoices?

Start your 30-day free trial. No credit card required.

Get started