Instead of spending six months crunching data, researchers at the Institute for Systems Biology (ISB) turned to the AWS cloud and completed the work in one day -- and for only $16.
A "Why didn't we do this before?" moment isn't unique to the scientists at ISB, however. Enterprises are also discovering the benefits of high-performance computing (HPC) in the cloud.
Approximately 35% of HPC today is done in the cloud, according to Addison Snell, CEO of research firm Intersect360. Typically, enterprises use HPC in the cloud in a bursty way, for a particularly challenging, data-rich computation or when demand exceeds in-house supply. Currently, enterprise use represents 2% to 3% of the HPC in the cloud market, but Snell expects this to be the fastest-growing segment of the market in coming years.
The advantages of HPC in the cloud are clear: It's scalable, on-demand, fast and inexpensive. But the barriers are clear too. Most enterprises that need HPC already have it -- and the necessary expertise to run it -- in-house. Some organizations are concerned about security in the public cloud. And others may worry about the latency effect of moving large amounts of data. Because of those factors, Snell believes enterprises will continue to explore using HPC in the cloud, but slowly and as a way to augment the HPC they already have in-house.
When enterprises do turn to the cloud for HPC, they have three obvious choices: Amazon Web Services (AWS), Google Compute Engine and Microsoft Azure. Each offers a unique set of features.
AWS leads the way
The market leader, AWS, has offered HPC in the cloud since 2006. Snell said the company's long history, relative ease of use, and growing and varied number of computing instance configurations have all contributed to its popularity.
The heart of the AWS HPC offering is an infrastructure-as-a-service (IaaS) platform with concurrent clusters on demand and vast amounts of storage. Stock market trading system developer Tradeworx first turned to AWS when a joint project with the U.S. Securities and Exchange Commission (SEC) turned up nearly a petabyte of data, said Mike Beller, Tradeworx's CTO. The SEC wanted its customers to have access to the data, along with the tools and analytics to use with it, but didn't want to build out all the storage. Using AWS, Tradeworx and its affiliate, Thesys Technologies, were able to create the trading platform in the cloud in under six months, Beller said. "We could quickly get a large data set under control." Beller believes AWS offers a lot of options and scenarios for enterprises that want to run HPC in the cloud. "You can specifically choose hardware virtually dedicated to the whole piece, or a cluster option. You have a lot of choices."
And pricing choices are available, too, said Dr. Nathan Price, associate director of ISB. "Being able to burst up to 100 nodes and having something take one day instead of 100 days is important for us," he said of his group's research in identifying the molecular footprint of diseases. Not only is ISB able to "rent" these resources instead of purchasing them, saving a significant amount of money, but the group also takes advantage of Amazon's fairly aggressive spot instance pricing. "Spot instance pricing lets us do things at one-tenth the cost and just when we need to." AWS allows customers to "name their own price" on spare computing power: A customer submits a bid, and if the bid is above the current spot instance price, the computation can run.
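The bidding rule and the burst arithmetic Price describes can be sketched in a few lines of Python. This is only an illustration of the description above, not the AWS API: the function names, the hourly price list and the bid are all hypothetical, and real spot pricing has details this sketch ignores.

```python
# Illustrative sketch only; names and prices are hypothetical, not AWS calls.

def job_can_run(bid, spot_price):
    """Rule as described above: the job runs when the bid is above the spot price."""
    return bid > spot_price


def hours_running(bid, hourly_spot_prices):
    """Count the hours in which the bid beats the (hypothetical) spot price."""
    return sum(1 for price in hourly_spot_prices if job_can_run(bid, price))


def wall_clock_days(total_node_days, nodes):
    """Days needed for a perfectly parallel job spread evenly across `nodes`."""
    return total_node_days / nodes


# Hypothetical spot prices (dollars per node-hour) over five hours.
prices = [0.03, 0.05, 0.08, 0.04, 0.02]
print(hours_running(0.05, prices))    # hours in which a $0.05 bid would run
print(wall_clock_days(100, 1))        # one node working alone
print(wall_clock_days(100, 100))      # bursting to 100 nodes
```

The same 100 node-days of work is done either way; bursting changes only how long the researchers wait, which is why renting capacity briefly can be attractive.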
Google Compute Engine powers partnerships
Like AWS, Google's HPC offering is also an IaaS platform. Customers can choose between Hadoop, the open source version of Google MapReduce, and Cloud Dataflow. The company's pay-per-minute strategy and virtual machines (VMs) that start up in seconds make it fast and economical to start processing data in the cloud. Longer-term projects are discounted through the company's "sustained use pricing" policy.
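To see why per-minute billing matters for short, bursty jobs, here is a rough Python comparison against a hypothetical scheme that rounds usage up to whole hours. The rate and function names are assumptions made for illustration; they are not Google's actual prices or billing rules, and sustained use discounts are left out.

```python
import math

# Illustrative sketch only; the rate below is hypothetical.


def cost_per_minute_billing(minutes_used, cents_per_hour):
    """Charge for exactly the minutes used."""
    return minutes_used * cents_per_hour / 60


def cost_hourly_rounded(minutes_used, cents_per_hour):
    """Hypothetical alternative: round usage up to the next whole hour."""
    return math.ceil(minutes_used / 60) * cents_per_hour


# A 75-minute job on a VM assumed to cost 60 cents per hour.
print(cost_per_minute_billing(75, 60))  # cents charged for 75 minutes
print(cost_hourly_rounded(75, 60))      # cents charged for two full hours
```

The gap between the two grows with the number of short jobs, since each one can waste up to an hour under whole-hour rounding.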
Competitive pricing is helpful, but researchers working on The Cancer Genome Atlas were primarily looking for a partner when they asked Google for help in processing an estimated 2.5 petabytes (PB) of data, said Dr. Sheila Reynolds, senior research scientist on the project. Google had worked with Reynolds' team at ISB previously, so naturally, Reynolds said, they reached out to Google first and did not shop around. Today, she said, "it's not so much that we're using Google for this. Google is really a part of our team." Reynolds feels the Google group has ownership of ISB's research project and that has resulted in better communication, closer collaboration and a feeling of wanting to get it right because the project is so important.
Collecting, storing and analyzing all the data -- 1 million pieces of data for each of two tumor types for 30 different kinds of cancer -- is just the first step. The key is to make the data available to researchers everywhere, even if they're not computer-savvy. Google's experience developing APIs has been critical to making this happen, Reynolds said, because without it, the data would remain accessible only to a small percentage of those who need it.
Microsoft Azure puts Windows in the cloud
Like the other HPC in the cloud products, Azure offers scalable, pay-as-you-go high-performance computing, but with the added option of working with an on-premises Windows environment that can reach into the cloud automatically when needed. Intersect360's Snell said Windows can be a barrier to acquiring HPC customers -- "the established high-performance computing market is dominated by Linux" -- but he said Azure is attractive to companies new to HPC or closely tied to Windows.
Actuarial consulting company Milliman was looking for a platform-as-a-service HPC solution in the cloud, said Paul Maher, CTO at Life Technology Solutions. The nature of the company's life insurance business meant it needed more horsepower at the end of every year for data analysis, and going to the cloud made practical and financial sense. After looking at all the options, Milliman chose Azure because "of a really distinct advantage in performance" that would allow the company to easily create a software-as-a-service (SaaS) product for its customers. Another deciding factor, Maher said, is that "we're selling solutions to the biggest insurance companies and Microsoft has a lot of maturity in partnering." Maher said his company needed to feel like a true partner, not just another commodity developer left on its own to figure things out.
Milliman's built-for-Azure SaaS product, Integrate, is already gaining traction, doubling its number of customers in the last 12 months, Maher said.
About the author:
Valerie Rice Silverthorne is a writer and editor with nearly 30 years' experience covering business, trade, technology, real estate and lifestyle trends. She was an award-winning business writer for The San Jose Mercury News and a Forbes Magazine top "30 under 30" journalist. She was the editor of ZDNet.com and PC Week/Inside, and a senior executive editor of PC Week and Electronic Business. She works as a freelance writer from her home in Amesbury, Mass. Contact her at [email protected].