A distributed denial-of-service (DDoS) attack against Bitbucket.org, a popular Web site hosted on Amazon Web Services (AWS), will lead to significant changes in how AWS handles network monitoring and customer support.
Bitbucket.org founder Jesper Noehr said that the outage last weekend was quickly taken care of once the extent of the problem was clear, but AWS took more than 18 hours to agree with his team's assessment of the attack.
Bitbucket.org hosts open-source coding projects, and Noehr said the attacks stemmed from a dispute between members of a certain project. He declined to identify the project but said it was based around the online game World of Warcraft, whose communities were prone to argument.
"There's a community around this project that has a history of doing this," said Noehr.
Noehr said he was surprised to find that AWS support couldn't pin down the cause of the problem. It turned out that a flood of network traffic had overwhelmed the standard 1 Gbps of bandwidth offered by AWS, cutting the Web site off from its Elastic Block Store (EBS) volumes, which are also hosted on AWS. Noehr said the delay came because the attack overwhelmed the outward-facing public IP addresses of his instance but not the internal ones.
That meant even though AWS support staff couldn't see anything wrong at the virtual machine level, normal traffic couldn't reach Bitbucket.org. The problem was that no one could see the complete picture, said Noehr. Once AWS support looked at the problem as he suggested, they fixed the issue right away.
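The kind of check that exposes this sort of saturation can be sketched in a few lines of Python. This is an illustration only, not Bitbucket.org's or Amazon's actual tooling: the `Sample` type, the `saturated_intervals` helper, the 90% threshold and the sample figures are all hypothetical, and the only number taken from the account above is the 1 Gbps link capacity. The idea is to compare inbound traffic on the public interface against the link's capacity instead of looking only at the health of the virtual machine.

```python
from dataclasses import dataclass

# 1 Gbps is the bandwidth figure cited in the article; everything else
# in this sketch (names, threshold, sample data) is an assumption.
LINK_CAPACITY_BPS = 1_000_000_000


@dataclass
class Sample:
    timestamp_s: float  # time the counter was read, in seconds
    rx_bytes: int       # cumulative bytes received on the public interface


def utilization(prev: Sample, curr: Sample,
                capacity_bps: int = LINK_CAPACITY_BPS) -> float:
    """Fraction of link capacity used between two counter readings."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    bits = (curr.rx_bytes - prev.rx_bytes) * 8
    return bits / elapsed / capacity_bps


def saturated_intervals(samples, threshold=0.9):
    """Return (start, end) times where utilization met the threshold."""
    return [
        (a.timestamp_s, b.timestamp_s)
        for a, b in zip(samples, samples[1:])
        if utilization(a, b) >= threshold
    ]


# Hypothetical readings: a quiet interval followed by a flood.
readings = [
    Sample(0.0, 0),
    Sample(10.0, 125_000_000),
    Sample(20.0, 1_325_000_000),
]
flooded = saturated_intervals(readings)
```

With these made-up readings, the first interval uses about a tenth of the link and the second about 96% of it, so only the second is flagged. A virtual machine can look perfectly healthy through both.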
"We had to do an educated guess on [the attack] over statistics gathered over several hours," he said. Noehr also noted that some of Amazon's internal controls didn't function as expected.
"They have [Quality-of-Service] on their internal network, but it didn't work," said Noehr. QoS is a set of techniques for prioritizing and managing network traffic so that important traffic keeps flowing when a link is congested. Noehr said he has had extensive post-mortem discussions on the attack with Amazon engineers and management. He doesn't fault the company's attention to the issue or its sincerity; he did, however, express surprise that Amazon wasn't able to see the attack more quickly.
"I was a bit surprised this was something we had to point out to them," he said. On the other hand, he acknowledged that Bitbucket.org could have done a lot more to reduce the impact of attacks like this, and that it had not been taking full advantage of AWS's unique characteristics.
Noehr said that since the attacks, he's been aggressively courted by other hosters looking to capitalize on the publicity and paint Amazon as not ready for prime time, especially with regard to security. Noehr finds that unfair but said he's undecided about whether to move his site off AWS. He said it's mostly a matter of price, but he will definitely be implementing redundant cloud storage to make sure he can get to his site's data if such attacks recur.
Peter DeSantis, vice president of Amazon Elastic Compute Cloud (EC2), said that the company was taking the lesson about its tardy detection of Bitbucket.org's problem to heart. He said that, from Amazon's perspective, the black eye smarted, and that the company would be changing its customer service playbook and network policies to prevent a recurrence. He declined to characterize the denial of service definitively as a malicious attack and did not speak to the reported failure of QoS on EC2.
"It's our job to understand that people have the same kind of visibility and ability to diagnose problems and we're going to take a lesson from this," he said. DeSantis said that the outage was basically a fluke. He also said that if a scalable architecture had been in place for Bitbucket.org, there would have been enough bandwidth available to absorb even very large "resource starvation" traffic spikes.
DeSantis said that Bitbucket.org's experience meant that Amazon had to do a better job of helping customers take proactive measures, such as distributing instances for redundancy and safety. He said that a cloud computing environment offered distinct advantages that many customers were not aware of or had not learned about, since AWS is entirely self-service and hands-off. "We are underplaying tools that are at people's disposal," he said.
DeSantis was not specific about how he would endeavor to get strategic information about AWS services to customers, but he did say that Bitbucket.org's experience would form the nucleus of a new approach for Amazon.
Carl Brooks is the Technology Writer at SearchCloudComputing.com. Contact him at firstname.lastname@example.org.