(WEB HOST INDUSTRY REVIEW) — Yesterday’s Gmail outage, which lasted approximately two hours, was caused by a miscalculation during an otherwise routine act of maintenance, according to an announcement made by Google (www.google.com) Tuesday evening.
Ben Treynor, Google’s VP of engineering and site reliability czar, said in a post to the Gmail blog, “I’d like to apologize to all of you — today’s outage was a Big Deal, and we’re treating it as such. We’ve already thoroughly investigated what happened, and we’re currently compiling a list of things we intend to fix or improve as a result of the investigation.”
Treynor says the company took a “small fraction” of the Gmail servers offline for routine upgrades Tuesday. However, the company “underestimated the load which some recent changes placed on the request routers.” Around 12:30 p.m. Pacific, some of the routers became overloaded and transferred the load onto remaining request routers, which were all overloaded within a few minutes.
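The knock-on effect Treynor describes can be illustrated with a toy model. The sketch below is not Google's routing logic; it assumes, purely for illustration, that load is spread evenly across request routers and that an overloaded router drops out one at a time, passing its share to the rest. The function name and all numbers are hypothetical.

```python
def simulate_cascade(total_routers, offline, load, capacity_per_router):
    """Return how many routers are still serving once the load settles.

    Hypothetical model: load is shared evenly, and whenever the active
    routers cannot carry the total load, one more router drops out.
    """
    active = total_routers - offline
    # Integer comparison: total load versus combined capacity of active routers.
    while active > 0 and load > capacity_per_router * active:
        active -= 1  # an overloaded router fails; its share moves to the others
    return active


if __name__ == "__main__":
    # Ten routers, each able to carry 10 units, facing 95 units of load.
    print(simulate_cascade(10, 0, 95, 10))  # all ten routers cope
    print(simulate_cascade(10, 1, 95, 10))  # one taken offline: none survive
```

In this model, taking a single router out of ten leaves the remaining nine just short of the capacity they need, and each failure makes the shortfall worse until none remain. With more headroom the same maintenance step would be harmless.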
According to Treynor, IMAP and POP access and mail processing, which don’t use the same routers, continued to work normally during the outage.
Treynor’s apology acknowledged that the outage generated, on Twitter, blogs and various other online outlets for airing grievances, the sort of discussion volume normally reserved for, say, an astonishing celebrity death.
The depth and breadth of the dismay, however, speak volumes about Gmail’s role as an essential part of many people’s businesses and day-to-day (or even moment-to-moment) existence.
Treynor says Google has “turned [its] full attention to helping ensure this kind of event doesn’t happen again.” That effort includes increasing router capacity well beyond peak demand, as well as re-thinking some of the routing infrastructure strategies at play, adding more failure isolation and designing the service to slow gracefully when overloaded, rather than cut off.
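The difference between slowing gracefully and cutting off can be sketched in a few lines. This is a simplified illustration under assumed conditions, not a description of Gmail's design: requests arrive in discrete ticks, a router can serve a fixed number per tick, and the function names are hypothetical.

```python
def run_hard_cutoff(arrivals, capacity):
    """Serve everything when within capacity; serve nothing when overloaded."""
    return [n if n <= capacity else 0 for n in arrivals]


def run_graceful(arrivals, capacity):
    """Serve up to capacity each tick and queue the excess for later ticks.

    Returns the requests served per tick and the backlog left at the end.
    """
    backlog = 0
    served = []
    for n in arrivals:
        backlog += n
        done = min(backlog, capacity)  # never exceed what the router can handle
        served.append(done)
        backlog -= done
    return served, backlog


if __name__ == "__main__":
    arrivals = [8, 12, 15, 5]  # hypothetical requests per tick
    print(run_hard_cutoff(arrivals, 10))
    print(run_graceful(arrivals, 10))
```

Under the hard cutoff, the two overloaded ticks serve no one at all. The graceful version keeps serving at full capacity and leaves a small backlog, so users would see delays rather than an outage.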
“We’ll be hard at work over the next few weeks implementing these and other Gmail reliability improvements,” he says.