Linode would like to apologize for the downtime in Fremont over the last few days. While we do our best to plan for and prevent downtime, problems can (and will) happen, and just like any hosting provider we are not immune to that fact.
The Fremont facility is consulting with the UPS manufacturer to make sure the system is more robust in order to protect against similar failures in the future. We plan to follow up with them and ensure that the reliability of Linode's infrastructure meets our expectations.
Here is the RFO from the facility for both outages:
On Saturday night at Fremont 1, around 9pm during a thunderstorm, there was a power incident involving the electric power utility, lasting approximately 3 seconds, that damaged two UPSes and caused them to fail. The UPS paralleling system automatically went into bypass in order to restore power. The UPS service technicians inspected the units, determined the specific components that failed, and ordered parts.
The failed UPSes were damaged by power fluctuations that occurred during the thunderstorm.
On Tuesday morning at Fremont 1 at 7:27am PST, the electric power utility had a 1 second power incident. Because the system was still running on bypass, this had an effect where it normally would not have.
The UPS service technicians have the replacement parts and are on site performing the necessary repairs to restore the 2 failed UPS units to normal operational status.
Our goal has always been to choose the highest-quality facilities, equipment, and infrastructure to minimize these occurrences, and to have procedures in place to recover quickly when unexpected problems occur. As always, our entire team is committed to recovering quickly from problems, and we're proud of how our team handled the two outages.
Over a dozen machines were permanently damaged by the power failures and had to be replaced with standby hardware, which we keep at the ready at all times, in every facility. After the second power failure on Tuesday, we had all but seven host servers back online in under an hour. Experiences like this only result in making Linode better.
Downtime is unfortunate, but it can and will happen. Planning for it is the only way to avoid being affected when it does occur. We're sorry for any inconvenience this may have caused you.