AWS ran the numbers and came out with several new accelerated computing instance types that promise better performance for compute-intensive applications.
Elastic graphics processing unit (GPU) instances optimize performance for apps that handle workloads such as data analytics, machine learning and deep learning. Field-programmable gate array (FPGA) instances also provide better performance for well-defined tasks and supported algorithms, but they are less flexible than Elastic GPU instances.
Both instance types, which feature dedicated math coprocessors, are in their infancies. They also both work in concert with CPUs for accelerated computing within math-intensive apps.
Elastic GPUs, however, have a more mature application development ecosystem, which enables developers to more easily change or adapt GPU-based applications. For that reason, Elastic GPUs are likely a better fit for workloads such as exploratory analytics, as they make it easier to dynamically update and test different algorithms. With GPUs, developers merely run different software to implement these approaches. FPGAs, in contrast, need more tuning to test different algorithms. For these reasons, GPUs are also a better fit for data visualization, 3D modeling and simulation.
FPGAs promise better performance and lower power consumption than CPUs alone or CPUs combined with GPUs. While it's not clear how those power savings translate into cost savings, many AWS FPGA applications run considerably faster. For example, Ryft, a big data analytics platform company, said its search processes run faster on FPGAs, and it processes a 1 TB log file 91 times faster than on a CPU alone.
Options for Elastic GPU instances
AWS breaks its Elastic GPU offerings into two instance types: G3 and P2. Elastic Compute Cloud (EC2) G3 instances optimize performance for graphics processing workloads, like virtual desktops, gaming, industrial design and 3D applications. Instance models include one to four GPUs and 8 to 32 GB of GPU memory. Each Nvidia Tesla M60 GPU supports 2,048 parallel processing cores. Enterprises can also turn to an Elastic GPU partner, such as Ansys or Siemens PLM, for engineering simulation.
P2 instances boost general-purpose computing projects written in Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). Instance models include one to 16 GPUs and 61 to 732 GB of RAM. P2 instances are ideal for machine learning, high-performance databases, seismic analysis, molecular modeling and genomics. Developers can use tools such as MXNet, Caffe, Caffe2, TensorFlow, Theano, Cognitive Toolkit, Torch and Keras, which are built into the Deep Learning Amazon Machine Image (AMI).
AWS' P2 instance partners include:
- Clarifai for image and video recognition APIs;
- Altair Engineering for the analysis, management and visualization of business and engineering simulations; and
- MathWorks for faster computation on top of its MATLAB software.
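As a concrete illustration, a P2 instance preloaded with the Deep Learning AMI can be launched programmatically. The sketch below only builds the keyword arguments you would pass to boto3's EC2 `run_instances` call; the AMI ID and key pair name are placeholders, not real identifiers:

```python
# Sketch: parameters for launching a P2 instance with the Deep Learning
# AMI via boto3's EC2 client. The AMI ID and key name are hypothetical.

def p2_launch_params(ami_id, key_name, instance_type="p2.xlarge"):
    """Build the keyword arguments for ec2_client.run_instances()."""
    return {
        "ImageId": ami_id,             # e.g. the Deep Learning AMI for your region
        "InstanceType": instance_type, # p2.xlarge = 1 GPU, p2.16xlarge = 16 GPUs
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }

params = p2_launch_params("ami-XXXXXXXX", "my-keypair")
# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   boto3.client("ec2").run_instances(**params)
print(params["InstanceType"])
```

Swapping the `instance_type` argument for `"p2.16xlarge"` scales the same request up to the 16-GPU model.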
Options for FPGA instances
AWS provides two F1 instance types for FPGA accelerated computing. The f1.2xlarge instance supports eight virtual CPUs (vCPUs), one FPGA chip, 122 GB of RAM and 470 GB of solid-state drive storage. The f1.16xlarge instance supports 64 vCPUs, eight FPGA modules and 976 GB of RAM. AWS provides an FPGA developer AMI to code apps that run on F1 instances. These instances are built on the Xilinx 16 nm architecture, as AWS does not yet support the faster Xilinx 8 nm architecture.
Elastic FPGA partners include:
- Aldec for semiconductor design simulation;
- Aon Benfield for insurance analytics;
- Atomic Rules for network processing;
- CME Group for derivatives analytics;
- Edico Genome for genomic sequencing;
- Falcon Computing for C++ app optimization;
- Mipsology for deep learning;
- National Instruments for systems engineering;
- NGCodec for video compression;
- Reconfigure.io for Go app development;
- Ryft for data analytics; and
- Teradeep for video analytics.
High-level APIs get the job done
Math coprocessor application development is a relatively new field compared with CPU development, and few developers know how to translate raw accelerated computing power into speedier apps. To get the best performance from GPUs and FPGAs, developers must decompose algorithms to execute in parallel across many independent processors, and they must pay close attention to an instance's RAM, storage and CPU bottlenecks to optimize this process.
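The decomposition idea can be sketched in plain Python: split a large input into independent chunks, process each chunk on a separate worker, then combine the partial results. This is the same map-reduce pattern a GPU or FPGA applies at much finer granularity; the function names here are illustrative, not AWS APIs:

```python
# Sketch: decomposing a reduction (sum of squares) into independent
# chunks that run in parallel across worker processes -- the pattern
# GPU and FPGA kernels apply at much finer granularity.
from multiprocessing import Pool

def partial_sum_of_squares(chunk):
    """Each worker handles one independent slice of the data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Decompose: split the input into roughly equal, independent chunks.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Map: execute each chunk on its own process.
    with Pool(workers) as pool:
        partials = pool.map(partial_sum_of_squares, chunks)
    # Reduce: combine the partial results.
    return sum(partials)

if __name__ == "__main__":
    data = list(range(1000))
    print(parallel_sum_of_squares(data))  # matches the serial result
```

The chunks must be truly independent for this to work; any shared state between them reintroduces the serialization the decomposition was meant to remove.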
A variety of programming languages have emerged to help code these applications. AWS supports DirectX, Open Graphics Library (OpenGL), OpenCL and CUDA for its Elastic GPU instances, and OpenCL for FPGA instances.
DirectX and OpenGL provide portability for graphics-intensive computation on GPUs, including video editing, image recognition and engineering. OpenCL, a separate Khronos Group standard, aims to simplify the development of general-purpose applications that run across GPUs from many vendors. Nvidia created CUDA to optimize general-purpose computation on its own popular GPU architecture.
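The programming model shared by CUDA and OpenCL can be sketched in pure Python: a "kernel" function computes a single output element identified by its global index, and the runtime launches one logical thread per element. In the sketch below the launch loop is serial, standing in for the parallel hardware, and the function names are illustrative rather than part of either API:

```python
# Sketch of the CUDA/OpenCL data-parallel model: a kernel computes one
# output element per logical thread, identified by a global index.
# A real GPU runs these threads concurrently; the loop below is a
# serial stand-in for the hardware launch.

def saxpy_kernel(i, a, x, y, out):
    """Per-element kernel: out[i] = a * x[i] + y[i] (classic SAXPY)."""
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    """Stand-in for a kernel launch over n logical threads."""
    for i in range(n):  # on a GPU, these iterations execute in parallel
        kernel(i, *args)

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(saxpy_kernel, 3, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

Because each logical thread touches only its own output element, the iterations can run in any order or all at once, which is what lets the same kernel scale across thousands of GPU cores.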
Until recently, developers relied on low-level programming tools highly specific to individual FPGA implementations. FPGA vendors have begun to adopt OpenCL, which will allow for greater flexibility. However, these vendors still have a way to go to reach the level of flexibility that comes with OpenCL apps built for GPUs.
Enterprises will only derive significant benefit from apps developed directly in CUDA, OpenGL or OpenCL when they plan to create value through better analytics, machine learning or artificial intelligence applications. Most enterprises will be better served by the APIs of analytics services deployed on FPGA and Elastic GPU instances rather than writing that code themselves. AWS works with a wide variety of partners that ease usage and expand capabilities for FPGA and Elastic GPU instances without making IT teams do the grunt work.