Starting around 01:00 UTC on March 5, 2025, some customers began experiencing issues with multiple LKEs (Linode Kubernetes Engine) failing to autoscale when IP ACL was enabled, resulting in service disruptions. Approximately 20-25 clusters across various regions were affected globally. Akamai’s investigation traced the issue to the latest software release deployment. New nodes were unable to join LKEs with IP ACL enabled, preventing the autoscaler feature from functioning properly. As a temporary mitigation, Akamai restarted the gateway pods, but this provided only short-term relief. Further analysis identified the root cause as a software defect. To address the issue, Akamai rolled back the release. The rollback was completed around 15:00 UTC on all affected clusters, effectively resolving the problem. Akamai will continue to investigate the root cause and take appropriate preventive actions. We apologize for the impact and thank you for your patience and continued support.
This summary provides an overview of our current understanding of the incident given the information available. Our investigation is ongoing, and any information herein is subject to change.