Monitoring & reporting

The desire to improve a little bit, every day
… was our reason for taking monitoring & reporting to the next level.

We always say that integration is complex, but not difficult. It is complex because the process is extensive and reaches well beyond the technical side, yet it is not difficult if you have the right knowledge. Our field is also complex because it depends on external applications and the technology they use, which is never the same and always evolving.

As the number of integrations has grown, so has the need for predictability, especially in those parts of our solution that were not available to every consultant or were more difficult to understand. In daily practice, we distinguish a few common causes of disruption or overload of the infrastructure:

Pending exchanges

In each step of the integration flow, data is exchanged from one component to the next. If an exchange cannot be handled within a certain time frame, a queue of “pending exchanges” builds up. This slows down the handling of the entire flow and also puts a load on the infrastructure used. We often see this with (external) endpoints that cannot process the data offered fast enough. The best comparison from daily life is a traffic jam on the highway: the number of cars using the highway at the same time determines whether we can keep driving and at what speed or, in the worst case, come to a standstill.
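The traffic-jam effect can be made concrete with a little arithmetic. The sketch below is purely illustrative and not part of Dovetail: the function name and the per-minute figures are assumptions chosen to show how a backlog grows while arrivals exceed what an endpoint can process, and how slowly it drains afterwards.

```python
def simulate_backlog(arrivals_per_minute, capacity_per_minute):
    """Return the number of pending exchanges at the end of each minute.

    Hypothetical model: every minute new exchanges arrive, and the endpoint
    processes at most `capacity_per_minute` of whatever is waiting.
    """
    pending = 0
    history = []
    for arrivals in arrivals_per_minute:
        pending += arrivals                           # new exchanges join the queue
        pending -= min(pending, capacity_per_minute)  # the endpoint drains what it can
        history.append(pending)
    return history


# A burst of 150 exchanges per minute against an endpoint that handles 100.
backlog = simulate_backlog([80, 150, 150, 150, 60, 60], capacity_per_minute=100)
print(backlog)  # [0, 50, 100, 150, 110, 70]
```

Even after the burst ends, the queue needs several more minutes to clear, which is why a short peak at an endpoint can delay a flow for much longer than the peak itself.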

Within the flow manager in Dovetail you can get insight into the number of pending exchanges. The trace functionality provides information that helps to optimise the flows. It is “best practice” to assume that the “happy flow” will not always work and to be critical of the performance in the first weeks after going live.

Failed exchanges

In this case it “just” goes wrong: an exchange cannot be processed and the flow falls back into the error flow. While a “pending exchange” basically causes a delay, a “failed exchange” most likely disrupts the functional process. Finding the cause of a “failed exchange” is therefore critical, and we consider finding preventive measures to be an iterative process, because when you put a flow live you often do not know everything yet.

We usually see failed exchanges around (external) endpoints that provide an unforeseen answer, or no answer at all. It is therefore important to pay extra attention to that endpoint and to handle the provided answers properly. At the same time, you have to be prepared for what you don’t know yet or couldn’t know at that point.

Variables affecting the result

Besides pending and failed exchanges, how well an integration platform functions depends on how it is used. In practice, this use is shaped by the quality of the integrations built, as described above, and certainly also by the quality of the endpoints. Other factors are the volume and frequency of incoming and outgoing data, the volume of file format conversions and mappings, and the number of flows and components used. The available shared or private infrastructure, in particular the number of cores and available memory, is another factor. Scaling up is not free: it adds costs directly, and those costs must be passed on.

The new Health Check API

In Dovetail version 4.13.1, the health check API has been further expanded. In addition to flow information (pending, failed, completed), it now also offers data about CPU and memory use in the test and production containers. By combining data about flows and infrastructure, it is possible to zoom in and find correlations that say something about the “health” of flows.
[Image: Pending exchanges]
We show how the infrastructure behaves in relation to 80% and 100% of available capacity. To determine whether action is required, measurements are collected per minute, which enables us to compare the percentage exceedance with that of the previous 24 hours.
[Image: Measurement capacity]
Over the course of December, this information can be made available to partners and customers on a weekly basis.
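To illustrate the comparison against the 80% and 100% thresholds, here is a minimal sketch. It does not call the health check API; it works on plain lists of per-minute measurements. The names `Exceedance`, `exceedance` and `compare_windows` are hypothetical, and reading “percentage exceedance” as the share of samples above each threshold is our assumption.

```python
from dataclasses import dataclass


@dataclass
class Exceedance:
    above_80: float  # percentage of samples above 80% of capacity
    at_100: float    # percentage of samples at or above 100% of capacity


def exceedance(samples, capacity):
    """Percentage of per-minute samples over the 80% and 100% thresholds."""
    if not samples:
        return Exceedance(0.0, 0.0)
    above_80 = sum(1 for s in samples if s > 0.8 * capacity)
    at_100 = sum(1 for s in samples if s >= capacity)
    n = len(samples)
    return Exceedance(100.0 * above_80 / n, 100.0 * at_100 / n)


def compare_windows(current, previous, capacity):
    """Change in exceedance, in percentage points, between two windows."""
    now = exceedance(current, capacity)
    before = exceedance(previous, capacity)
    return Exceedance(now.above_80 - before.above_80, now.at_100 - before.at_100)


# Memory use in MB against an assumed 1024 MB container limit.
# Real windows would hold 1440 samples (24 hours of per-minute data).
previous = [500] * 9 + [900]
current = [500] * 6 + [900] * 3 + [1024]
print(compare_windows(current, previous, capacity=1024))
# Exceedance(above_80=30.0, at_100=10.0)
```

A positive difference means the container spent more of the current window above a threshold than it did in the previous one, which is a signal to look at the flows that were active at those moments.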

Error route and flow function monitoring

To get a better grip on failed exchanges and function monitoring, our consultants have developed some best practices that our partners can add to their own Dovetail implementation(s). These catch errors and process them into a database for reporting and analysis.

Error route monitoring (ERM)

The goal of error route monitoring is to collect, analyse and report failed exchanges, with all response information, so that the error can be prevented next time. An example is a failed authentication because a user needs to reset something in their own system. When the person responsible for such a process is informed about an error that has occurred, they can often solve the problem themselves and thus prevent the integration process from coming to a standstill (for a long time).
[Image: Dovetail Flow for management by exception]
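The idea of collecting failed exchanges in a database can be sketched outside Dovetail as follows. This is not the Dovetail flow itself: the table layout, function names, flow names and URLs are all hypothetical, and SQLite stands in for whatever database an implementation would actually use.

```python
import sqlite3
from datetime import datetime, timezone


def open_error_store(path=":memory:"):
    """Open (or create) a store for failed exchanges. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS failed_exchange (
               occurred_at TEXT NOT NULL,
               flow_name   TEXT NOT NULL,
               endpoint    TEXT NOT NULL,
               status_code INTEGER,
               response    TEXT
           )"""
    )
    return conn


def record_failure(conn, flow_name, endpoint, status_code, response):
    """Store one failed exchange together with the response information."""
    occurred_at = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO failed_exchange VALUES (?, ?, ?, ?, ?)",
        (occurred_at, flow_name, endpoint, status_code, response),
    )
    conn.commit()


def failures_per_endpoint(conn):
    """Report how often each endpoint failed, grouped by status code."""
    rows = conn.execute(
        """SELECT endpoint, status_code, COUNT(*)
           FROM failed_exchange
           GROUP BY endpoint, status_code
           ORDER BY COUNT(*) DESC, endpoint"""
    )
    return rows.fetchall()


conn = open_error_store()
record_failure(conn, "orders-to-erp", "https://erp.example.com/orders", 401, "token expired")
record_failure(conn, "orders-to-erp", "https://erp.example.com/orders", 401, "token expired")
record_failure(conn, "stock-sync", "https://wms.example.com/stock", None, "timeout")
print(failures_per_endpoint(conn))
```

A report like this makes recurring causes visible. In this made-up data, the repeated 401 on one endpoint points to an expired credential that the owner of that system can fix themselves.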

Flow function monitoring (FFM)

In FFM everything that does not meet the assumptions and expectations is caught. Earlier we mentioned the so-called “happy flow” in which the behaviour of endpoints is taken into account as much as possible. This does not prevent (undocumented) responses from leading to something unexpected. The result of FFM is that previously unknown responses from endpoints become known and can be handled functionally within the definition of the flow.
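As a rough illustration of catching what falls outside the assumptions, the sketch below compares each endpoint response with a list of expected combinations and keeps a tally of everything else. The expected set, the function name and the choice to look only at status code and content type are assumptions made for the example; a real flow would check whatever its design actually relies on.

```python
# Hypothetical "happy flow" assumptions: the responses the flow was designed for.
EXPECTED_RESPONSES = {
    (200, "application/json"),
    (201, "application/json"),
    (404, "application/json"),
}


def classify_response(status_code, content_type, seen_unknown):
    """Return 'expected' or 'unexpected' and count each unexpected combination."""
    # Ignore parameters such as "; charset=utf-8" when comparing content types.
    media_type = content_type.split(";")[0].strip().lower()
    key = (status_code, media_type)
    if key in EXPECTED_RESPONSES:
        return "expected"
    seen_unknown[key] = seen_unknown.get(key, 0) + 1
    return "unexpected"


unknown = {}
print(classify_response(200, "application/json; charset=utf-8", unknown))  # expected
# For example, a maintenance page served as HTML with status 200.
print(classify_response(200, "text/html", unknown))  # unexpected
print(unknown)  # {(200, 'text/html'): 1}
```

Once an unexpected combination shows up in the tally, it is no longer unknown: it can be added to the expected set and given its own handling in the flow.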

[Image: Error monitoring and handling in Dovetail]
ERM and FFM can be set up by the partner or customer using flow examples and descriptions that will soon be available in the Dovetail Academy.
