9:48 GMT0 a partial traffic outage was reported in the EU. Operators were unable to log in, and end users were unable to open the widget.
10:17 GMT0 the issue was traced to an unresponsive instance of our IAM service. The faulty instance was rebooted.
11:55 GMT0 the issue reappeared, probably caused by Kubernetes rebalancing itself.
12:14 GMT0 the root cause was attributed to a faulty Kubernetes node, which was turned off and replaced. The service is operational again.
Posted Apr 15, 2020 - 11:37 UTC
Resolved
A faulty Kubernetes node has been causing issues with our European infrastructure. The problem initially manifested as an authentication issue and was later revealed to be a Kubernetes malfunction.