On June 19, 2019, at 14:00 UTC, a firewall configuration update performed as part of our regular maintenance exposed a previously undetected bug in the configuration files, one that our automated testing environment had not flagged. This triggered a cascading failure in our core network.
The issue was recognized immediately and the update was rolled back, but the backup configuration turned out to contain the same bug. This made a manual override of all configurations necessary, which took around 10 minutes. Systems began recovering at 14:15 UTC, and tracking systems were fully back online at 14:20 UTC.
All incoming traffic was back to normal at 15:00 UTC.
As a result, we have significantly expanded our automated testing environment, revised our deployment procedures, and are auditing all of our backup configurations.
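For illustration, the sketch below shows the kind of pre-deploy validation an expanded automated testing environment can run against a candidate firewall configuration. It is a minimal example, not our actual tooling: the JSON rule format, field names, and function names are all hypothetical assumptions made for this sketch.

    # Minimal sketch of a pre-deploy firewall-config validation step.
    # The JSON rule format and every field name here are hypothetical,
    # chosen only to illustrate catching malformed rules before a
    # configuration is pushed to production.
    import ipaddress
    import json
    import sys

    REQUIRED_FIELDS = {"action", "source", "destination", "port"}
    VALID_ACTIONS = {"allow", "deny"}

    def validate_rule(rule: dict) -> list[str]:
        """Return a list of problems found in a single rule (empty if OK)."""
        errors = []
        missing = REQUIRED_FIELDS - rule.keys()
        if missing:
            errors.append(f"missing fields: {sorted(missing)}")
            return errors
        if rule["action"] not in VALID_ACTIONS:
            errors.append(f"unknown action: {rule['action']!r}")
        for side in ("source", "destination"):
            try:
                ipaddress.ip_network(rule[side])
            except ValueError:
                errors.append(f"invalid CIDR in {side}: {rule[side]!r}")
        if not (isinstance(rule["port"], int) and 0 < rule["port"] <= 65535):
            errors.append(f"port out of range: {rule['port']!r}")
        return errors

    def main(path: str) -> int:
        with open(path) as fh:
            rules = json.load(fh)
        failed = False
        for i, rule in enumerate(rules):
            for problem in validate_rule(rule):
                print(f"rule {i}: {problem}")
                failed = True
        return 1 if failed else 0  # a non-zero exit code blocks the deploy

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1]))

Running a check like this against both the primary and the backup configuration in the same pipeline is what would catch the failure mode seen here, where both copies shared the same bug.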
We sincerely apologize for the disruption.