On January 9, 2025, starting at approximately 21:18 UTC, customers using the US-ORD-1 Object Storage cluster in Chicago experienced intermittently elevated latency on cluster operations. Telemetry from our canary monitoring showed that operations in this data center were notably slower than in other clusters, and multiple customers were affected. Our investigation traced the problem to a configuration issue: a machine with an incorrect configuration was inadvertently left in rotation, which degraded performance across the cluster. To address the issue, we removed the misconfigured machine from the production environment, and cluster performance returned to pre-incident levels at approximately 02:40 UTC on January 13, 2025.
We are conducting a comprehensive root cause analysis to identify preventive measures and reduce the likelihood of similar incidents. Enhancements to our monitoring and configuration validation processes are already underway. We sincerely apologize for the inconvenience this incident caused and appreciate your patience and support.
At Akamai, we are dedicated to improving the reliability and performance of our systems and services.
This summary provides an overview of our current understanding of the incident based on the information available. Our investigation is ongoing, and any information herein is subject to change.