Friday 12th May 2017

Data Collection Gaps in performance data

You may be seeing gaps in your performance data between 2:33 and 3:40 Europe/Berlin time last night.

After a period of unresponsiveness between 2:20 and 2:39, one of our Elasticsearch nodes reported a number of unassigned shards in its cluster status that did not heal automatically. Usually the healthy nodes synchronize the affected shards to the problematic node and everything recovers automatically after a while. We have seen this behavior before, and the course of action was to restart the affected node to trigger the Elasticsearch cluster to resynchronize everything with the restarted node.

Due to circumstances that we are still investigating, monitoring data for a subset of customers was not stored between 2:33 and 3:40 Europe/Berlin time, but all trace data for this period is available. We are terribly sorry for this issue. All the data is still available in our event logs and we will begin to restore it tomorrow morning.