Linode Status

Service Issue - Block Storage - IN-MAA (Chennai)
Incident Report for Linode
Postmortem

On December 10, 2024, at approximately 07:25 UTC, TrafficPeak experienced a service disruption that affected our customers' ability to upload, retrieve, or view logs. At the same time, we received a report of service degradation and connectivity issues at our Chennai data center (IN-MAA1).

Following an initial investigation, we ruled out a widespread network issue and determined that the root cause was a problem with the Block Storage service. Too many new hosts were added to the cluster simultaneously, which degraded the cluster until it became unusable. We also established that the TrafficPeak service disruption and the service degradation at the Chennai data center were both caused by this same issue.

By approximately 10:30 UTC, we mitigated the situation by temporarily disabling Ceph cluster rebalancing at IN-MAA1 and restarting all Object Storage Daemons (OSDs) in the affected data center, which restored service.
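
For readers unfamiliar with the mitigation, the following is a minimal sketch of how rebalancing can be paused and resumed on a Ceph cluster. It assumes the standard `norebalance` OSD flag; this report does not state which mechanism was actually used, and the helper name `rebalance_flag_command` is hypothetical. The function only builds the command line and does not run it.

```python
from typing import List


def rebalance_flag_command(pause: bool) -> List[str]:
    """Build the Ceph CLI command that pauses or resumes rebalancing.

    Assumption: rebalancing is controlled through the cluster-wide
    `norebalance` OSD flag. The incident report does not confirm this.
    """
    action = "set" if pause else "unset"
    return ["ceph", "osd", action, "norebalance"]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch is
    # safe to run on a machine without a Ceph cluster.
    print(" ".join(rebalance_flag_command(pause=True)))
    print(" ".join(rebalance_flag_command(pause=False)))
```

On a host with admin access, the returned list could be passed to `subprocess.run(cmd, check=True)`. Building the command separately from running it keeps the sketch testable without a cluster.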

To prevent a recurrence, we have updated our internal documentation to limit host additions to one at a time, and we will re-enable rebalancing in the impacted cluster under controlled conditions. We are also evaluating further enhancements to improve system resilience and avoid similar incidents.
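
The one-at-a-time rule above can be sketched as a simple gate: add a host, then wait for the cluster to report healthy before adding the next. This is an illustrative sketch only, not our internal tooling. The function names, the `add_host` and `cluster_is_healthy` callbacks, and the polling defaults are all hypothetical.

```python
import time
from typing import Callable, Iterable, List


def wait_until_healthy(
    cluster_is_healthy: Callable[[], bool],
    poll_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None],
) -> None:
    """Poll the health check, raising TimeoutError if it never passes."""
    for _ in range(max_polls):
        if cluster_is_healthy():
            return
        sleep(poll_seconds)
    raise TimeoutError("cluster did not become healthy in time")


def add_hosts_one_at_a_time(
    hosts: Iterable[str],
    add_host: Callable[[str], None],
    cluster_is_healthy: Callable[[], bool],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Add hosts serially, gating each addition on cluster health.

    `add_host` and `cluster_is_healthy` are caller-supplied (hypothetical)
    hooks; how they talk to the storage cluster is out of scope here.
    """
    added: List[str] = []
    # Never start expanding a cluster that is already unhealthy.
    wait_until_healthy(cluster_is_healthy, poll_seconds, max_polls, sleep)
    for host in hosts:
        add_host(host)
        added.append(host)
        # Let the cluster settle before the next host is introduced.
        wait_until_healthy(cluster_is_healthy, poll_seconds, max_polls, sleep)
    return added
```

If the cluster does not recover within the polling budget, the function raises instead of continuing, so a stalled expansion stops after a single host rather than compounding.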

This summary provides an overview of our current understanding of the incident given the information available. Our investigation is ongoing and any information herein is subject to change.

Posted Dec 17, 2024 - 20:46 UTC

Resolved
We haven’t observed any additional issues with the Block Storage service in the Chennai data center, and we now consider this incident resolved. If you continue to experience problems, please open a Support ticket for assistance.
Posted Dec 10, 2024 - 16:55 UTC
Monitoring
We have corrected the issues affecting the Block Storage service in Chennai. Based on current observations, the service has resumed normal operations. We will continue monitoring to ensure that it remains stable while we investigate the root cause. Please be aware that some customers may experience minor performance degradation during this period. If you continue to experience problems, please open a Support ticket for assistance.
Posted Dec 10, 2024 - 11:07 UTC
Identified
Our team has identified the issue affecting the Block Storage service in our Chennai data center. We are working quickly to implement a fix, and we will provide an update as soon as the solution is in place.
Posted Dec 10, 2024 - 10:22 UTC
Update
Our investigation narrowed down the impact to an issue affecting the Block Storage service in our Chennai data center. During this time, users may experience connection timeouts and errors with this service. We will share additional updates as we have more information.
Posted Dec 10, 2024 - 09:47 UTC
Investigating
Our team is investigating an issue affecting connectivity in our Chennai data center. During this time, users may experience intermittent connection timeouts and errors for all services deployed in this data center. We will share additional updates as we have more information.
Posted Dec 10, 2024 - 09:10 UTC
This incident affected: Block Storage (IN-MAA (Chennai) Block Storage).