Four Hours Without Gmail [NYT Bits Blog]
When millions of users depend on a single web mail service, the consequences can be catastrophic if it goes haywire for even a few hours. At around 4:30 AM Eastern time yesterday, Google's Gmail service went offline for about four hours. The outage had some impact on U.S. Gmail users, who were just beginning their day, but its impact in Europe and Asia was more pronounced because it occurred in the middle of the work day Tuesday. Google Apps, a popular choice for corporate email set-ups, was also affected, since it runs on the same system as Gmail.
Google explained the outage as a result of routine maintenance at one of the data centers that serve Gmail. When Google took that data center offline for a software update, the traffic it normally handled overwhelmed the other data centers, knocking them out one after another. In Google's words:
"This morning, there was a routine maintenance event in one of our European data centers. This typically causes no disruption because accounts are simply served out of another data center. Unexpected side effects of some new code that tries to keep data geographically close to its owner caused another data center in Europe to become overloaded, and that caused cascading problems from one data center to another. It took us about an hour to get it all back under control."
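The cascading overload described in that statement can be illustrated with a toy model. This is only a sketch and not Google's actual system: the data center names, capacities, and load figures are invented for illustration, and the rule that a failed center's load is split evenly among the survivors is an assumption.

```python
def simulate_cascade(capacity, load, offline_first):
    """Take one data center offline and spread its load over the survivors.

    capacity and load are dicts keyed by (hypothetical) data center name.
    Returns the names of the data centers in the order they went down.
    """
    load = dict(load)          # work on a copy so the caller's dict is untouched
    failed = []
    pending = [offline_first]  # centers that are about to shed their load

    while pending:
        name = pending.pop(0)
        shed = load.pop(name)  # this center stops serving traffic
        failed.append(name)
        if not load:
            break              # no survivors left to absorb the load

        # Assumption: the shed load is split evenly among the survivors.
        share = shed / len(load)
        for other in load:
            load[other] += share

        # Any survivor pushed past its capacity goes down next.
        for other in load:
            if load[other] > capacity[other] and other not in pending:
                pending.append(other)

    return failed


capacity = {"A": 100, "B": 100, "C": 100}

# Plenty of headroom: only the center under maintenance goes offline.
print(simulate_cascade(capacity, {"A": 40, "B": 40, "C": 40}, "A"))  # ['A']

# Little headroom: taking A offline overloads B and C in turn.
print(simulate_cascade(capacity, {"A": 70, "B": 70, "C": 70}, "A"))  # ['A', 'B', 'C']
```

In the second run each center is at 70 percent of capacity, so the extra share from the offline center is enough to push both survivors over their limits. That is the general shape of the failure Google describes, even though the real trigger was a side effect of new data-placement code rather than a simple even split.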
Happily, about four hours after the outage began, Google had restored its data centers and Gmail was accessible again. I felt their pain when I heard about the outage, since we've all experienced this kind of misfortune.
Google has championed the idea of "cloud computing," which basically means storing all your mission-critical data on the Internet, and promotes the concept through applications like Gmail and Google Docs. Cloud computing has some definite advantages when it comes to accessing information and sharing it readily, but when the cloud goes down...well, you know.



