Linode Status

Service Issue - Intermittent connection drops on LKE pod to pod traffic

Incident Report for Linode

Postmortem

While troubleshooting a customer issue on the LKE-E side, we became aware of an intermittent issue causing pod-to-pod connection timeouts on LKE clusters across all data centers. The investigation at the time pointed to "noisy network neighbors" on the hosts as the cause of the timeouts. Additional investigation indicated that the issue had been present since approximately January 20th, 2025.

Our LKE engineering team began testing on standard LKE tier server sets and was able to replicate the issue over a 3-hour period in the Los Angeles data center.

Akamai ultimately discovered two different issues that led to the observed behavior. We traced most of the occurrences on server sets running Dedicated Linode plans to problems with the underlying host. In most cases, the cause was memory pressure on the host, which affected the network of every guest running on it. We correlated the customer’s reports with their decision to change all premium nodepools to dedicated nodepools at the beginning of the year.

The networking problems we noticed on premium nodepools were in fact being drowned out by the noisy dedicated server sets. Once we isolated the premium nodepools, we were able to correlate the customers' reports with a known issue in our Envoy proxy configuration.

To mitigate the issue, we released a patch containing a fix.

Akamai will schedule a meeting to outline lessons learned and next steps to ensure similar incidents do not happen in the future.

This summary provides an overview of our current understanding of the incident given the information available. Our investigation is ongoing and any information herein is subject to change.

Posted Jun 05, 2025 - 23:36 UTC

Resolved

We haven't observed any additional issues with the Linode Kubernetes Engine (LKE), and will now consider this incident resolved. If you continue to experience issues, please contact us at 855-454-6633 (+1-609-380-7100 Intl.), or send an email to support@linode.com for assistance.

Posted Apr 15, 2025 - 00:35 UTC

Monitoring

At this time we have been able to correct the issue affecting the Linode Kubernetes Engine (LKE). We will be monitoring this to ensure that the service remains stable. If you are still experiencing issues and are unable to open a Support ticket, please call us at 855-454-6633 (+1-609-380-7100 Intl.), or send an email to support@linode.com.

Posted Apr 14, 2025 - 22:24 UTC