On Wednesday, January 29, our monitoring systems began reporting excessive CPU usage on some of the Kubernetes (K8s) nodes that host the single-sign-on Login application (i.e., https://login.library.nyu.edu). The CPU issue was resolved and did not in itself cause any application outages. However, one of the pods used for application caching was rescheduled, and afterward the Login application could no longer discover that pod. To users, this appeared as not being logged in even after they had successfully logged in. Because only one of the cache pods couldn’t be found, the behavior was inconsistent.