Tonight all host machines running our virtual machines, including single points of failure such as the primary database, are being rebooted. This can lead to short downtimes or delays.
**Edit 03:36:** A first analysis shows that the reboot triggered a long restart of the database primary server, causing a three-hour stretch of ingested data to be lost, from 00:19 to 03:39 Europe/Berlin. We will analyse why our failsafes did not prevent this long outage, as the system is designed not to fail this way.
**Edit 03:27:** The database primary server is up again and the workers are processing the backlog. This will take a while, and given the length of the primary's downtime we are not yet sure whether the data is complete.
**Edit 01:53:** Almost all machines have been restarted, but the database primary is still outstanding.
**Edit 00:26:** The database primary is currently being rebooted, causing downtime for the UI and workers. Data is still being ingested and will be processed once the primary is up again.