
XML Extraction API Reference

Pull the embedded XML attachment out of a hybrid PDF (Factur-X, ZUGFeRD, or any PDF/A-3 carrying a CII or UBL invoice). The endpoint streams the XML straight from the PDF container without any transformation. If the PDF has no embedded XML, the API returns a 400 response with errorCode 4006 (NoEmbeddedXml).

POST /v1/extract/xml

Code Example

curl -X POST https://api.invoicexml.com/v1/extract/xml \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf"

Try it out online, no coding required

Upload a Factur-X or ZUGFeRD PDF and download the extracted XML instantly, right in your browser.


Request

Parameter   Type     Description
file *      binary   The invoice file to process.

Content-Type: multipart/form-data

The source and target formats are part of the endpoint path, and everything else (syntax, declared profile, specification identifier) is read from the document itself, so there is nothing more to configure.

Headers

Header            Value
Authorization *   Bearer YOUR_API_KEY
Content-Type      multipart/form-data
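
For callers who are not using curl, the request above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function names build_extract_request and extract_xml are hypothetical, and sending the part as application/pdf is an assumption. The endpoint URL, the "file" field name, and the Bearer header come from this page.

```python
import uuid
import urllib.request
from pathlib import Path

API_URL = "https://api.invoicexml.com/v1/extract/xml"


def build_extract_request(pdf_path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the multipart POST for /v1/extract/xml."""
    path = Path(pdf_path)
    boundary = uuid.uuid4().hex
    # One multipart part named "file", matching the documented request parameter.
    # The part's Content-Type (application/pdf) is an assumption.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = head + path.read_bytes() + tail
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract_xml(pdf_path: str, api_key: str) -> bytes:
    """Send the request and return the raw XML bytes of a 200 response.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(build_extract_request(pdf_path, api_key)) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without calling the API.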

Response

200 Extracted XML

Returns the embedded XML document as a file download.

Content-Type: application/xml

The response filename is derived from the uploaded file: {original-name}.xml. The Content-Disposition header is set to attachment for direct download.
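
A client that saves the download may want to compute the name it expects and read the name the server actually sent. The sketch below assumes the Content-Disposition value carries a standard filename parameter (the exact header format is not shown on this page); both helper names are hypothetical.

```python
from email.message import Message
from pathlib import PurePath


def expected_xml_name(uploaded_name: str) -> str:
    """Swap the uploaded file's extension for .xml, per the naming rule above."""
    return PurePath(uploaded_name).with_suffix(".xml").name


def filename_from_disposition(header_value: str, fallback: str) -> str:
    """Read the filename parameter from a Content-Disposition value.

    Returns fallback when the header carries no filename.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    # Keep only the final path component in case the header includes a path.
    return PurePath(name).name if name else fallback
```

For example, expected_xml_name("invoice-2026.pdf") gives "invoice-2026.xml", which can serve as the fallback when the header has no filename.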

Frequently Asked Questions

What PDFs work with this endpoint?

PDF/A-3 hybrid invoices that carry an embedded CII or UBL XML attachment, e.g. Factur-X (factur-x.xml), ZUGFeRD (zugferd-invoice.xml), or Peppol PDFs. The API recognises the standard attachment names defined by each format.

What if the PDF has no embedded XML?

The API returns a 400 response with errorCode 4006 (NoEmbeddedXml). To extract invoice data from a PDF that has no embedded XML, use POST /v1/parse/json, which uses AI to read the visual content.
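
The fallback decision described here could be sketched as follows. The shape of the error body is not documented on this page, so the code assumes a JSON object with a top-level "errorCode" integer; the helper names are hypothetical.

```python
import json
from typing import Optional

NO_EMBEDDED_XML = 4006  # errorCode named NoEmbeddedXml on this page


def error_code(status: int, body: bytes) -> Optional[int]:
    """Return errorCode from an error response, or None if unavailable.

    Assumes the error body is JSON with a top-level "errorCode" field.
    """
    if status < 400:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return None
    code = payload.get("errorCode") if isinstance(payload, dict) else None
    return code if isinstance(code, int) else None


def should_fall_back_to_parse(status: int, body: bytes) -> bool:
    """True when the PDF had no embedded XML, so /v1/parse/json is the next step."""
    return status == 400 and error_code(status, body) == NO_EMBEDDED_XML
```

Any other error code is left to the caller, since retrying with /v1/parse/json only makes sense when the XML attachment is missing.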

Is the extracted XML modified or validated?

No. The XML is returned exactly as it sits inside the PDF, byte-for-byte. If you need Schematron / EN 16931 validation, pass the output to POST /v1/validate/{format}.

Can I get JSON from a Factur-X PDF instead of XML?

Yes. POST /v1/extract/json takes the same hybrid PDF, extracts the embedded XML, parses it, and returns an InvoiceDocument JSON object.

What is the output filename?

The filename is derived from the uploaded PDF: if you upload invoice-2026.pdf, you receive invoice-2026.xml. The Content-Disposition header is set to attachment for direct download.