This article has been updated to include a statement from Juniper Networks.
Web security and performance provider CloudFlare suffered an hour-long outage over the weekend after pushing out a change that caused a system-wide failure of its Juniper edge routers, according to several reports.
The post-mortem on Sunday’s outage is hosted on CloudFlare’s Posterous blog, which is inaccessible at the time of writing, but according to CNET the issue started at 1:47 a.m. PDT Sunday, when CloudFlare detected a DDoS attack on one of its customers.
The outage affected Juniper routers running the Flowspec protocol, which allows customers to broadcast router rules to a large number of routers efficiently, CNET says.
In analyzing the attack, CloudFlare identified the attack packets as being between 99,971 and 99,985 bytes long, far exceeding the 4,470-byte maximum packet size on CloudFlare’s network. A rule targeting packets of that size was nonetheless accepted by Flowspec and relayed to CloudFlare’s edge network.
“What should have happened is that no packet should have matched that rule because no packet was actually that large,” CloudFlare CEO Matthew Prince writes in the post-mortem. “What happened instead is that the routers encountered the rule and then proceeded to consume all their RAM until they crashed.”
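To illustrate the mismatch Prince describes, here is a minimal sketch of a sanity check that flags a packet-length rule that can never match any packet the network carries. It is not CloudFlare's or Juniper's tooling: the function name `rule_can_match` and its parameters are hypothetical, and the only figures taken from the reports are the 4,470-byte maximum and the 99,971 to 99,985 byte range.

```python
# Assumption: 4,470 bytes is the maximum packet size cited in the reports.
MAX_PACKET_BYTES = 4470


def rule_can_match(min_bytes: int, max_bytes: int,
                   max_packet_bytes: int = MAX_PACKET_BYTES) -> bool:
    """Return True if a packet of permitted size could fall in [min_bytes, max_bytes].

    Hypothetical helper for illustration only; it models a rule as a
    simple inclusive range of packet lengths.
    """
    if min_bytes > max_bytes:
        return False  # empty range, nothing can match
    # The rule is reachable only if its lower bound fits within the maximum size.
    return min_bytes <= max_packet_bytes


if __name__ == "__main__":
    # The range reported for the attack packets lies entirely above the maximum.
    print(rule_can_match(99_971, 99_985))
    # A range of ordinary packet sizes is reachable.
    print(rule_can_match(1_400, 1_500))
```

A check of this kind would only flag the rule as suspicious before it is pushed; it says nothing about the router software issue that Juniper describes in its statement.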
The outage affected all 23 of CloudFlare’s data centers and the approximately 175,000 customers around the world that rely on its DNS and web proxy service. CloudFlare also has more than 100 web hosting partners that resell its performance and security service to their own customers.
This is one of a handful of major outages CloudFlare has had since its launch, and it is a blow to the company, which just last week launched its next-generation web optimization protocol, Railgun, with more than 25 hosting partners.
Prince says CloudFlare will make it up to its paying customers, though at this point it isn’t clear what the reimbursement will look like. CloudFlare introduced very competitive SLAs with its high-end plans last year: a 100 percent SLA with its business plan and a 2500 percent SLA with its enterprise plan.
UPDATE Monday, 3:27 pm ET: “Juniper Networks is aware of and investigating a reported network outage with one of our customers, Cloudflare. While we have not completed our investigation, we believe this incident was triggered by a product issue that Juniper identified last October, when a patch was also made available. Our customer support team is actively supporting Cloudflare in its efforts to resolve the issue and we are not aware of any other customers experiencing similar issues.”
Talk back: Were you affected by the CloudFlare outage over the weekend? How did your customers who use CloudFlare react? Let us know in a comment.