Data Center Outage Hits 400,000 Clients of Email Marketing Tool MailChimp

A video still from MailChimp's website shows its campaign builder panel

(WEB HOST INDUSTRY REVIEW) — Email marketing platform MailChimp experienced an outage at one of its data centers on Monday, affecting about 400,000 customers. According to a blog post by founder and CEO Ben Chestnut on Tuesday, the issues stemmed from multiple hardware failures at its US1 data center.

According to the post, US1 is MailChimp’s oldest data center and hosts the most users. MailChimp noticed three of its “small user” database groups (users with lists of fewer than 25,000 recipients) failing. At around 2 pm Monday, it disabled access to those groups to prevent users from creating new campaigns that would be lost if MailChimp had to restore from backup.

Chestnut says one-third of the affected users saw no campaign or data loss during the restore, but for the remaining users the failing hardware had corrupted data, so MailChimp had to revert to backups from 1 am ET Jan. 2, 2012. He says that no subscribe or unsubscribe data was processed during the outage.

Through its logs, MailChimp says it found 788 users who lost campaign data between 1 am and 3 pm, and it sent those users an email with refunds to reconcile the loss.

The final batch of users was brought back online at 8:24 am on January 3, according to the blog post.

MailChimp says it is still investigating exactly what went wrong, but the RAID controllers for the SSDs weren’t working reliably. It had replaced its older hardware with new SSD-equipped servers to prepare for Thanksgiving and Christmas volume. It plans to switch back from the SSDs “as soon as the dust settles.”

MailChimp has a user base of 1.2 million and was able to keep some customers online because it spreads its users across three data centers in the US.

MailChimp does not provide an SLA to its users, but a post on its forum from March 2011 says downtime is rare for MailChimp.

“…While it can be an inconvenience to you, your own recipients will usually not experience anything more than a delay in delivery. Whenever we get a report of an issue with our system we work hard to remedy the problem as quickly as possible but sometimes it may be out of our hands (like a DDoS attack hammering away at the data center) and then all we can do is wait for things to die down,” MailChimp said in the post.

Nicole Henderson

About

Nicole Henderson writes full-time for the Web Host Industry Review where she covers daily news and features online, as well as in print. She has a bachelor of journalism from Ryerson University in Toronto, and has been writing for the WHIR since September 2010. You can find her on Twitter @NicoleHenderson.
