On March 17, 2024, at approximately 16:38 UTC, numerous internal alerts fired across multiple data centers, indicating that various servers were unreachable for an extended period. At the same time, customers began reporting issues with multiple products.
During the impact window, customers may have experienced networking issues across all data centers, with those in Amsterdam (NL-AMS) more significantly affected. In Amsterdam, customers were unable to deploy new Linodes, boot existing Linodes, or run other host-level jobs, and the data center experienced a near-total outage. Additionally, customers may have had issues accessing services such as Object Storage and Linode Kubernetes Engine (LKE).
The investigation revealed that the issue was caused by an ongoing release of Akamai Compute’s internal backend API component. The release caused route servers to build incomplete route tables for BGP due to inconsistencies in the API response during the rollout, resulting in connectivity issues in all data centers.
When the release concluded at 17:10 UTC, data became consistent between endpoints again, and most data centers recovered independently.
To mitigate the remaining impact, we began restarting route servers in all data centers at approximately 18:30 UTC. The restarts were carried out in phases, beginning with the most impacted regions, and finished at approximately 21:34 UTC on March 17. After monitoring our systems for some time, we confirmed that the issue was resolved.
In the near term, we have placed a hold on backend API releases until we have a better way of ensuring that the route data served to route servers is consistent.
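As an illustration only, one general way to guard against this class of failure is to have a route consumer compare the route data returned by several API endpoints and keep its last known-good table whenever they disagree. The sketch below is not our implementation; the function names, the (prefix, next hop) route shape, and the idea of polling multiple endpoints are all assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

# Hypothetical route shape for this example: (prefix, next_hop).
Route = Tuple[str, str]


def consistent_routes(
    responses: Iterable[Iterable[Route]],
) -> Optional[FrozenSet[Route]]:
    """Return the route set all endpoints agree on, or None if they disagree."""
    snapshots = [frozenset(r) for r in responses]
    if not snapshots:
        # No data at all is treated the same as inconsistent data.
        return None
    first = snapshots[0]
    if all(s == first for s in snapshots[1:]):
        return first
    return None


def next_route_table(
    current: FrozenSet[Route],
    responses: Iterable[Iterable[Route]],
) -> FrozenSet[Route]:
    """Adopt the new route set only when every endpoint returned the same data."""
    agreed = consistent_routes(responses)
    # On disagreement, keep the last known-good table instead of a partial one.
    return current if agreed is None else agreed


if __name__ == "__main__":
    full = [("192.0.2.0/24", "10.0.0.1"), ("198.51.100.0/24", "10.0.0.2")]
    partial = full[:1]  # simulates an endpoint answering mid-rollout
    table = frozenset(full)
    # Endpoints disagree, so the existing table is retained.
    print(next_route_table(table, [full, partial]) == table)
```

The trade-off in a check like this is that a consumer may hold stale routes for the duration of a rollout, which is usually preferable to advertising an incomplete table.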
This summary provides an overview of our current understanding of the incident, given the information available. Our investigation is ongoing, and any information herein is subject to change.