Linode Status

Connectivity Issue - Hosted DNS Service - US-East (Newark)
Incident Report for Linode
Postmortem

Starting at around 17:26 UTC on June 25, 2023, Linode’s operations team was alerted to slow resolution times for the DNS resolvers in the Newark data center. These alerts occurred throughout the day, but they appeared to be intermittent, and each was quickly resolved by restarting a service on the resolvers.
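
For readers who want to check resolution times from their own hosts, the following is a minimal sketch of a single timed lookup. It is not Linode's monitoring; the function name `timed_lookup` and the 0.5 second `slow_after` threshold are assumptions chosen for illustration.

```python
import socket
import time
from typing import Callable, Tuple


def timed_lookup(
    hostname: str,
    resolve: Callable[[str], object] = lambda name: socket.getaddrinfo(name, None),
    slow_after: float = 0.5,  # assumed threshold in seconds, not a Linode value
) -> Tuple[str, float]:
    """Resolve `hostname` once and classify the attempt as ok, slow, or failed."""
    start = time.perf_counter()
    try:
        resolve(hostname)
    except OSError:
        # socket.gaierror (raised on resolution failure) is a subclass of OSError.
        return "failed", time.perf_counter() - start
    elapsed = time.perf_counter() - start
    return ("slow" if elapsed > slow_after else "ok"), elapsed
```

By default the lookup goes through the system resolver configuration, so a local cache on the host can hide slowness upstream. The `resolve` parameter exists only so that a different lookup function can be substituted.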

At around 14:34 UTC on June 26, 2023, these alerts for slow response times on Newark’s DNS resolvers recurred, and more frequently than on the previous day. Linode’s operations team began an initial investigation, and by 14:37 UTC, it was clear that a service on the DNS resolvers was regularly crashing. Restarting the service would recover it, but only for a brief period of time, leading to flapping alerts.
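
The crash-and-restart cycle described above is what monitoring systems usually call flapping. One common way to surface it is to count healthy/unhealthy transitions inside a recent time window. The sketch below illustrates that idea only; the function names, the window, and the `max_flaps` threshold are hypothetical and do not describe Linode's alerting.

```python
from typing import Sequence, Tuple

# Each sample is (timestamp_in_seconds, service_was_healthy), sorted by timestamp.
Sample = Tuple[float, bool]


def count_flaps(samples: Sequence[Sample], window: float, now: float) -> int:
    """Count healthy<->unhealthy transitions among samples in the last `window` seconds."""
    recent = [ok for ts, ok in samples if now - window <= ts <= now]
    return sum(1 for prev, cur in zip(recent, recent[1:]) if prev != cur)


def is_flapping(samples: Sequence[Sample], window: float, now: float,
                max_flaps: int = 3) -> bool:
    """Report flapping when the transition count exceeds `max_flaps` (assumed threshold)."""
    return count_flaps(samples, window, now) > max_flaps
```

A check like this treats repeated short recoveries as one ongoing problem, rather than as a series of alerts that each appear to resolve.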

These alerts were originally not thought to indicate a customer-impacting issue, but Linode’s Support team started noticing a trend of tickets at around 14:52 UTC on June 26, 2023, with a particularly strong uptick at 15:19 UTC. The combination of these customer reports and recurrent alerts for service crashes prompted an official start of the incident process and deeper investigation at 15:31 UTC.

By 16:18 UTC, a potential fix for the recurrent crashes was identified, and work commenced to implement the fix on the resolvers. This potential fix was fully implemented on all resolvers by 17:18 UTC, after which Linode’s incident response team observed the performance of the resolvers. With no signs of further crashes by 18:31 UTC, the problem was believed to be fixed, and the status page was updated accordingly.

However, at 18:48 UTC, an internal report of failed DNS queries emerged, followed by additional customer reports at 18:55 UTC. This prompted investigation into additional aspects of the DNS resolvers, and it was decided to move the status page back to an investigating state at 19:46 UTC.

At 20:24 UTC, an additional problem involving certain erroneous DNS queries was identified, prompting an exploration of potential fixes involving these queries. By 20:55 UTC, a tentative fix was implemented, and response times from all DNS resolvers improved immediately. After monitoring this fix, the status page was set to a monitoring status at 23:14 UTC and to resolved at 01:10 UTC on June 27, 2023.

To help prevent this issue from recurring, Linode will be exploring means to improve the resiliency and monitoring of its DNS resolver systems. Additionally, Linode will be pursuing improvements to its documentation and procedures so that potential customer impacts are detected sooner and the incident process begins as quickly as possible.

Posted Jul 03, 2023 - 21:20 UTC

Resolved
We are no longer seeing issues affecting our Hosted DNS Service, and are considering this incident resolved.
If you continue to experience problems, please open a ticket with our Support Team.
Posted Jun 27, 2023 - 01:10 UTC
Monitoring
At this time, we have been able to correct the issue affecting our Hosted DNS Service. We will be monitoring this to ensure that connectivity remains stable.

If you continue to experience problems, please open a ticket with our Support Team.
Posted Jun 26, 2023 - 23:14 UTC
Identified
We have identified an additional issue with our DNS resolvers and have implemented a fix. We are continuing to investigate any further problems while monitoring the efficacy of this fix.

Customers in Newark are encouraged to use our local DNS resolvers moving forward. If you are experiencing issues with these resolvers, we recommend using a public DNS resolver.

Should you have any problems with DNS resolution using our Newark resolvers, please open a ticket with our Support Team.
Posted Jun 26, 2023 - 21:35 UTC
Investigating
Despite promising initial observations, it appears that the mitigation did not fix the issue with DNS resolution from Linodes in Newark. We are actively continuing to explore this issue and pursuing other potential fixes.

This issue is only affecting DNS resolution from our Newark data center -- our public nameservers and DNS Manager are unaffected. Customers in Newark are encouraged to use a public resolver on their Linodes in the meantime.
Posted Jun 26, 2023 - 19:44 UTC
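
As an illustration of the public-resolver workaround mentioned in the update above, here is a minimal sketch that renders `resolv.conf` contents from a list of resolver addresses. The helper name `render_resolv_conf` is hypothetical, and the two addresses (Cloudflare's 1.1.1.1 and Google's 8.8.8.8) are examples only, not a Linode recommendation.

```python
import ipaddress
from typing import Iterable


def render_resolv_conf(resolvers: Iterable[str]) -> str:
    """Return resolv.conf text with one `nameserver` line per valid IP address."""
    lines = []
    for addr in resolvers:
        ipaddress.ip_address(addr)  # raises ValueError on a malformed address
        lines.append(f"nameserver {addr}")
    if not lines:
        raise ValueError("at least one resolver is required")
    return "\n".join(lines) + "\n"


# Example public resolvers, chosen for illustration only.
PUBLIC_RESOLVERS = ["1.1.1.1", "8.8.8.8"]
print(render_resolv_conf(PUBLIC_RESOLVERS), end="")
```

The sketch only produces the text. On many distributions `/etc/resolv.conf` is managed by another service, so a direct edit may be overwritten, and the glibc resolver reads at most three `nameserver` entries.
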
Monitoring
At this time, we have been able to correct the issue affecting our Hosted DNS Service. We will be monitoring this to ensure that connectivity remains stable.

For full clarity, this issue only affected DNS resolution for Linodes in Newark. Linode's public DNS nameservers and DNS Manager were unaffected by this issue.

If you continue to experience problems, please open a ticket with our Support Team.
Posted Jun 26, 2023 - 18:31 UTC
Identified
Our team has identified the issue affecting our Hosted DNS Service. We are working quickly to implement a fix, and we will provide an update as soon as the solution is in place.

In the meantime, we encourage customers in Newark to configure a public resolver for their services.
Posted Jun 26, 2023 - 17:26 UTC
Investigating
Our team is investigating an issue affecting our Hosted DNS Service. During this time, users may experience an elevated rate of errors, including records not updating or resolving as expected. We will share additional updates as we have more information.
Posted Jun 26, 2023 - 16:02 UTC
This incident affected: Hosted DNS Service.