Digitising document flow

Many organizations have to cope with incoming documents that do not fit digitally with their internal automation processes. Examples are orders sent to a wholesaler as PDF or Excel files, written work slips at installation companies, order acknowledgements and purchase orders.

Especially with periodically recurring documents, processing by hand can be very time-consuming. Digitising these document flows can save considerable cost, and it prevents the errors that come with manually typing data into the company's source systems.

Below we explain more about the different options and the costs involved in different scenarios.

PDFs and other common files

The most commonly used type is the system-generated PDF received by email. This type of PDF is generally very reliable because its consistent formatting makes it easy to read, and it is therefore very suitable for digital processing.

Another well-known type is the PDF that results from scanning a paper document. Such a document is readable for humans, but it can be a lot more challenging for systems. You depend on the quality of the original document and of the scanner used. The most common problems are skewed or rotated pages and poorly legible text.

Excel sheets (or prints of them) are often used as output to share externally. Whatever layout is used, Excel sheets are relatively easy to process, provided they are built up in a usable structure.

Another commonly used form of 'document' is the JSON or XML message, which is designed for automated processing. These messages are often sent in addition to the PDF, for example with invoices.

All these file types are easy to use for the sender, but often not for the receiver, because they mean manual processing in the receiver's systems.

Conversion for automated processing

Which types of tools are there?

Selecting a tool

The right tool is best selected based on the volume per document type and the (one-time) time needed to create the template in the tool.

Processing computer-generated PDF: € 0,20 per document

Processing scanner-generated PDF: € 0,75 – € 1,00 per document

These prices are based on a document of 5 pages and include fixed and variable costs, calculated on a volume of 1.000 documents.
The larger the volume of documents, the lower the price per document will be. In the end, the best comparison is between these costs and the cost of processing the documents manually.
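The volume effect can be illustrated with a small sketch. The split between fixed and variable costs below is a hypothetical assumption, chosen only so that the result at 1.000 documents matches the € 0,20 figure above; it does not represent actual tool prices.

```python
def price_per_document(fixed_cost: float, variable_cost: float, volume: int) -> float:
    """Average price per document: fixed costs spread over the volume, plus the variable cost."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return fixed_cost / volume + variable_cost


# Hypothetical split (assumption): EUR 150 fixed, e.g. template setup,
# and EUR 0.05 variable per document.
FIXED_COST = 150.0
VARIABLE_COST = 0.05

for volume in (500, 1000, 5000):
    price = price_per_document(FIXED_COST, VARIABLE_COST, volume)
    print(f"{volume} documents: EUR {price:.2f} per document")
```

With any such split, the fixed part shrinks per document as the volume grows, while the variable part stays the same.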


Processing 1.000 documents per month by hand, at 5 minutes per document, takes about 83 hours. At an hourly rate of € 23,-, those 83 hours add up to € 1.909,- in labor that automation can save.

Processing 1.000 computer-generated PDFs at € 0,20 per document adds up to a cost of € 200,-.

Processing 1.000 scanner-generated PDFs at € 1,00 per document adds up to a cost of € 1.000,-.

Depending on the types of documents you have to process, your net savings per month are somewhere between € 900,- and € 1.700,-!
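The calculation above can be written out as a short sketch, so you can substitute your own volume, handling time and hourly rate. The function names are illustrative; the numbers are the ones used in this article, with hours rounded down to whole hours as in the text.

```python
def manual_cost(documents: int, minutes_per_document: int, hourly_rate: float) -> float:
    """Monthly labor cost of manual processing, using whole hours as in the text."""
    hours = documents * minutes_per_document // 60  # 1000 * 5 // 60 = 83
    return hours * hourly_rate


def monthly_saving(documents: int, minutes_per_document: int,
                   hourly_rate: float, price_per_document: float) -> float:
    """Labor cost saved minus the cost of automated processing."""
    labor = manual_cost(documents, minutes_per_document, hourly_rate)
    return labor - documents * price_per_document


# Figures from the article: 1000 documents, 5 minutes each, EUR 23 per hour.
computer_generated = monthly_saving(1000, 5, 23.0, 0.20)
scanner_generated = monthly_saving(1000, 5, 23.0, 1.00)
print(f"Computer-generated PDFs: EUR {computer_generated:.0f} saved per month")
print(f"Scanner-generated PDFs: EUR {scanner_generated:.0f} saved per month")
```

A mix of both document types will land somewhere between these two results.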

The most feasible business case arises when a mix of tools is used. An analysis of the incoming documents and their frequency is the main basis for your choices.

If you manually process 1 file per working day for one customer (internal or external), that already adds up to 260 documents per year.

Automated processing

The real time savings are achieved when incoming documents can be recognized, converted and passed on to internal automation without human intervention. All non-PDF files, such as XLS, CSV, JSON and XML, can be processed directly, without the conversion step.
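As a minimal sketch of what "processed directly" can mean, the example below parses CSV, JSON and XML content with the Python standard library only. The function name and the flat XML structure are assumptions made for illustration; XLS files are left out because reading them requires a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_structured(filename: str, content: str):
    """Parse a structured file into plain Python data, based on its extension."""
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension == "csv":
        # One dict per row, keyed by the header line.
        return list(csv.DictReader(io.StringIO(content)))
    if extension == "json":
        return json.loads(content)
    if extension == "xml":
        # Assumes a flat message: one root element with simple child elements.
        root = ET.fromstring(content)
        return {child.tag: child.text for child in root}
    raise ValueError(f"unsupported file type: {extension}")


print(parse_structured("order.csv", "sku,qty\nA1,2\n"))
print(parse_structured("order.xml", "<order><sku>A1</sku><qty>2</qty></order>"))
```

Because these formats already carry their structure, no template or recognition step is needed before mapping.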

The steps in the process are:

  1. read the files from their location: mailbox, FTP or another source
  2. send the PDF to the external processor
  3. receive the converted result
  4. optional: enrich the data from other sources
  5. map the data to the message format the destination requires
  6. deliver the message to the endpoint of the internal automation
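The steps above can be sketched in code as follows. This is a generic illustration, not how Dovetail implements a flow: `IncomingFile`, `run_flow` and the callables passed in are hypothetical names, and the external processor is represented by a `convert` function you would supply yourself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class IncomingFile:
    name: str
    content: bytes


def run_flow(
    files: Iterable[IncomingFile],
    convert: Callable[[bytes], dict],
    mapping: Dict[str, str],
    deliver: Callable[[dict], None],
    enrich: Optional[Callable[[dict], dict]] = None,
) -> List[dict]:
    """Run every file through the steps and return the delivered messages."""
    messages = []
    for incoming in files:                    # step 1: files read from the source
        data = convert(incoming.content)      # steps 2-3: send out, receive result
        if enrich is not None:
            data = enrich(data)               # step 4: optional enrichment
        # step 5: rename source fields to the fields the destination expects
        message = {target: data.get(source) for source, target in mapping.items()}
        deliver(message)                      # step 6: hand over to the endpoint
        messages.append(message)
    return messages


# Usage with stand-in functions instead of a real processor and endpoint.
delivered = run_flow(
    files=[IncomingFile("order.pdf", b"%PDF")],
    convert=lambda content: {"OrderNo": "42"},
    mapping={"OrderNo": "order_id"},
    deliver=print,
)
```

Keeping conversion, enrichment and delivery as separate functions makes it possible to swap the conversion tool per document type without touching the rest of the flow.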

The process for a computer-generated PDF

In this example the input arrives by email, but it can come from FTP or another source as well.

The process for a scanner-generated PDF

This process is very similar. The only difference is the type of PDF you are working with and thus the tooling you need to convert it to a processable data format. 

In this example the input arrives by email, but it can come from FTP or another source as well.

The general process in Dovetail

Dovetail can be fed with data from a variety of sources and in different data types. By transforming, enriching and organising the data, it can be made ready for mapping to any other application.

More about Dovetail iPaaS

An iPaaS is an excellent way to support your application integration. iPaaS is short for integration Platform as a Service. Dovetail is such a platform.

The Dovetail application is a no-code, easy-to-use integration tool that can connect almost anything.
