Wednesday 17th January 2018

Data Collection: Processing Large Queue Backlog

After a Spectre/Meltdown-related reboot of one of our workers' queue servers, we are slow in processing a large backlog that has piled up since 4:00 (Europe/Berlin time) this morning. We are investigating whether we can speed up the recovery.

4:00 The reboot of a queue server for a Linux kernel update led to an uneven distribution of unprocessed jobs across the queue servers, which our workers don't handle well. Processing of the queue server with the large backlog is very slow.

Edit 11:30 We noticed the backlog and are now investigating the cause and possible remedies.

Edit 12:15 We have processed all remaining queue jobs and are up to date again.