On September 17, 2023, the backend API servers that handle our Managed Database requests were unable to process requests. This prevented customers from creating new Managed Database clusters and from modifying or removing existing ones.
The cause was large log files that were not rotated properly. As a result, the disks reached capacity, which prevented requests from being handled and degraded the service.
To resolve the issue, our team manually rotated the log files as needed, which allowed the Managed Database service to resume processing requests normally.
Going forward, our team will improve automated log rotation and refine our alerting for these events. These changes should help prevent similar issues and allow us to detect and resolve them sooner if they do occur.
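
For readers who want a concrete picture of the two measures named above, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the tooling used on the Managed Database servers: the function names, the 10 MB size cap, the five retained backups, and the 85% alert threshold are all hypothetical values chosen for the example.

```python
import logging
import shutil
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path, max_bytes=10_000_000, backups=5):
    """Return a logger whose log file is rotated before it exceeds max_bytes.

    Hypothetical helper: the size cap and backup count are example values.
    """
    logger = logging.getLogger(f"rotating:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep example output out of the root logger
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        logger.addHandler(handler)
    return logger


def disk_used_fraction(path="/"):
    """Return the used fraction (0.0 to 1.0) of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_alert(used_fraction, threshold=0.85):
    """Return True when disk usage is at or above the (assumed) alert threshold."""
    return used_fraction >= threshold
```

In this sketch, size-based rotation bounds the total space a single log can occupy (roughly the size cap multiplied by the number of retained files plus one), while the threshold check is the kind of signal an alert could fire on well before a disk is completely full.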