IT professionals tend to think of two things when they hear high-performance computing: availability and scalability (speed). Scalability helps ensure that an architecture can handle an increasing number of requests without slowing down.
If you're familiar with high-performance computing (HPC), you're probably familiar with Big O notation, such as O(n). Big O notation describes how an algorithm's running time grows as the amount of data it processes grows. Essentially, it ignores the actual time it takes for an algorithm or function to run. Instead, it focuses on how many steps are needed, relative to the size of the data set, to produce the results.
For example, a function like this, which visits each element once, is O(n), or linear. For work that has to touch every element, that is the most effective version a developer can hope for:
for each X
But not all versions are as effective:
for each X as a
    for each X as b
In the above example, the code iterates over every element for each element, so it becomes O(n*n), or O(n^2). This scales poorly for large data sets, because the time to complete the computation grows quadratically: doubling the amount of data quadruples the work.
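To make the difference concrete, here is a minimal Python sketch that simply counts loop iterations for the single loop and for the nested loop. The function names `linear_ops` and `quadratic_ops` are illustrative and do not come from any library.

```python
def linear_ops(items):
    """Visit each element once: n steps for n elements."""
    steps = 0
    for _ in items:
        steps += 1
    return steps


def quadratic_ops(items):
    """Visit every element once per element: n * n steps."""
    steps = 0
    for _a in items:
        for _b in items:
            steps += 1
    return steps


# Print the step counts side by side for growing input sizes.
for n in (10, 100, 1000):
    data = range(n)
    print(n, linear_ops(data), quadratic_ops(data))
```

Growing the input by a factor of 10 multiplies the single loop's steps by 10 and the nested loop's steps by 100.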
Big O and parallel computing
When developers start to work with multiple processes and compute instances, they can break up a complex and expensive algorithm in ways that would not make much sense in a traditional single-machine environment. For example, take that nested algorithm, which essentially iterates over each element for each element in our data set:
for each X as a
// Split proc(a) into a new process
// which can be run in parallel
for each X as b
Suddenly, there are two algorithms -- each of them O(n). If each "proc(a)" call is allowed to run asynchronously on its own compute instance, the calls run side by side in a parallel computing model. The total work is still O(n^2), but the elapsed time can approach O(n): instead of requiring quadratically more time, the job trades time for resources.
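The same split can be sketched in Python with the standard library's `concurrent.futures`. This is only an illustration under stated assumptions: the data set `X` and the body of `proc` are made up for the example, and a thread pool stands in for what would be separate processes or compute instances on AWS. For CPU-bound work in CPython, threads would not actually run faster; the point here is the structure.

```python
from concurrent.futures import ThreadPoolExecutor

X = list(range(8))  # stand-in data set (assumption for this example)


def proc(a):
    """Inner O(n) loop: compare one element against every element."""
    return sum(1 for b in X if b <= a)


def run_serial():
    # The original nested form: O(n) calls, each doing O(n) work, in order.
    return [proc(a) for a in X]


def run_parallel(max_workers=4):
    # Each proc(a) call is independent, so the calls can be handed
    # to separate workers. map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(proc, X))


print(run_parallel() == run_serial())
```

Both versions do the same total work and return the same results; only the scheduling of the `proc(a)` calls differs.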
Working toward AWS HPC with O(n) and video encoding
With potential areas for parallel computing identified, let's look at a real-world example of how it can increase overall throughput.
One common use for AWS is video processing. It's even more relevant today than it was when the cloud provider first launched, as we now have dozens of different types of devices: desktop PCs, TVs, tablets, smartphones and everything in between. These devices require different file formats, and developers also need to handle a wide range of bandwidth across different types of network connectivity. For example, an end user with a 50 Mbps data connection can easily stream a video in 4K resolution. On a cellular connection, however, a lower resolution is ideal.
To start, assume a developer uploads a full-quality 4K video and wants to encode it into 10 different video formats or qualities. The algorithm to do so would look like this:
for each video
    for each format
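If the developer has 10 different formats, that comes to 10n encoding jobs for n videos. Strictly speaking, Big O notation drops the constant, so this is still O(n), but every video waits on 10 encodes in sequence. That's not bad, but, obviously, it would be better to split that up and run each encoding step in parallel.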
for each video
for each format
Not only does this cut the elapsed processing time, but it also lets the developer easily add new file formats by adding them to the "encode_all" trigger. To expand this to AWS HPC, the developer could use an Amazon Simple Notification Service (SNS) topic to broadcast an event whenever a new video needs to be encoded, and then have encoders subscribe to that topic. New encoding types can then be added as needed without modifying existing code.
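As a rough local illustration of running each encoding step in parallel, the sketch below fans one video out to one task per format. Only the name `encode_all` comes from the text; the `FORMATS` list, the `encode` placeholder and the output naming are assumptions made for the example, and no real encoding happens.

```python
from concurrent.futures import ThreadPoolExecutor

FORMATS = ["2160p", "1080p", "720p", "480p"]  # hypothetical subset of formats


def encode(video, fmt):
    """Placeholder encoder: returns the name of the output it would produce."""
    return f"{video}.{fmt}.mp4"


def encode_all(video, formats=FORMATS):
    # One task per format; adding a format means adding a list entry.
    with ThreadPoolExecutor(max_workers=len(formats)) as pool:
        return list(pool.map(lambda fmt: encode(video, fmt), formats))


print(encode_all("talk"))
```

Because the formats live in a list, supporting another output is a one-line change rather than a change to the loop itself.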
This idea of event-driven computing is very powerful with AWS, and it is similar to the observer pattern in traditional computing. Let's take a look at how this type of pattern works:
AWS HPC and event-driven processing
Once a video is submitted to an SNS topic for processing, SNS automatically sends a message to the Amazon Simple Queue Service (SQS) queue for each type of desired output. This message would likely contain just a link to the original video in Amazon Simple Storage Service (S3) that the developer wants to convert. The developer would then have a fleet of Elastic Compute Cloud (EC2) instances for each SQS queue to process those messages and act on them -- in this case, encoding the video into the appropriate format and uploading the result to S3.
To properly handle scalability, an Auto Scaling group should govern each fleet of EC2 instances and automatically increase its size when the number of SQS messages waiting to be processed surpasses a defined threshold. This lets the developer automatically scale up the number of servers during periods of high use -- more uploads -- and then scale back down again afterward.
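The fan-out described here can be modeled in a few lines of plain Python. This is an in-memory stand-in, not the AWS API: the `Topic` class, the `worker` function, the message shape and the S3 URL are all hypothetical, chosen only to show that each subscribed queue receives its own copy of every published message.

```python
import json
from collections import deque


class Topic:
    """In-memory stand-in for an SNS topic (not the real service)."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        # Stand-in for subscribing an SQS queue to the topic.
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        # Every subscribed queue receives its own copy of the message.
        body = json.dumps(message)
        for queue in self.queues.values():
            queue.append(body)


def worker(fmt, queue):
    """Drain one queue, 'encoding' each video into this worker's format."""
    outputs = []
    while queue:
        msg = json.loads(queue.popleft())
        outputs.append(f"{msg['video_url']} -> {fmt}")
    return outputs


topic = Topic()
q1080 = topic.subscribe("1080p")
q720 = topic.subscribe("720p")
topic.publish({"video_url": "s3://example-bucket/original.mp4"})
print(worker("1080p", q1080))
print(worker("720p", q720))
```

In this model, supporting a new output format is one more `subscribe` call plus a worker for that queue; the publishing code does not change.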
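The scaling decision itself can be pictured as a small function of queue depth. The sketch below is not how Auto Scaling is implemented or configured; in AWS this would typically be driven by a CloudWatch alarm on an SQS metric such as ApproximateNumberOfMessagesVisible. The function name `desired_instances` and its default numbers are assumptions for illustration.

```python
import math


def desired_instances(queue_depth, msgs_per_instance=10, min_size=1, max_size=20):
    """Return a fleet size proportional to backlog, clamped to group limits."""
    needed = math.ceil(queue_depth / msgs_per_instance)
    return max(min_size, min(max_size, needed))


# An empty queue, a modest backlog and a burst of uploads.
for depth in (0, 25, 500):
    print(depth, desired_instances(depth))
```

The lower bound keeps at least one encoder running when the queue is empty, and the upper bound caps cost during a burst of uploads.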
With this architecture, adding a new type of video format requires a few easy steps: Add a new SQS queue, set up a new Auto Scaling group with EC2 instances to process that queue and subscribe the queue to the SNS topic that is called when a new video is uploaded. The new queue will begin receiving messages for requests for processing videos, and any new videos submitted going forward will automatically be encoded in the new format.
Even better, if one encoding format fails -- say an entire EC2 instance group stops working -- it doesn't affect the other formats. For example, if the 1080p video format isn't encoding properly, the developer can easily fall back to the 720p format until the problem is fixed. Additionally, because the requests are queued in SQS, they will be processed automatically once the encoder is fixed, which backfills the videos that were missed while that encoding group was being upgraded or repaired.
AWS HPC often starts by modeling processes with Big O notation, which helps identify where an application can be split into multiple processes to improve overall performance. Splitting work this way is usually preferable wherever it is possible, but don't forget that dividing a process into multiple pieces adds overhead, so any single job tends to be slightly slower end to end. However, if you need to scale your processes to handle high throughput, it's a great way to go.