Tuesday 8th May 2018

Processing of Traces is slow

Trace processing is currently very slow, and we are working on a fix. The backlog of traces keeps growing, and we can only process them at a much reduced rate.

Edit 09:01 We have identified a likely cause: degraded performance of our network filesystem. We are working with our hosting provider to confirm and fix it.

Edit 09:10 After speaking to our hosting provider, we have identified maintenance work on NFS as the likely cause. It should be over soon, at which point trace processing should return to normal.

Edit 10:45 NFS performance has not improved, and no improvement is in sight. We have used the time to plan a migration to local disks instead, a change that was long overdue and already scheduled.

Edit 14:15 We have rolled out a fix that moves storage from NFS to the local disks of our worker machines, and we are already seeing massively improved processing speeds. We expect the backlog from the last 5 hours to be processed within a few minutes.

Edit 14:57 The backlog has been fully processed. Traces are once again available in near real time after the daemons running on your servers send them.

We apologize for the inconvenience.