Large or complex workloads present challenges for developers, who can all too easily sink too much time into managing resources when application performance degrades. AWS Batch is a tool that lets developers easily run large numbers of batch computing jobs on AWS.
AWS Batch runs large jobs on Elastic Compute Cloud (EC2) instances using Docker images, and it serves as an alternative for workloads that aren't particularly suited to AWS Lambda. Developers familiar with Amazon EC2 Container Service (ECS) and Amazon Simple Queue Service should be comfortable with the similarly designed AWS Batch.
ECS doesn't allow automatic provisioning of instances based on required capacity. When existing ECS instances hit maximum capacity, the service doesn't allow additional tasks to execute. AWS Batch solves this problem with a configuration that's similar to Auto Scaling in an ECS-style environment.
AWS Batch is another tool to manage tasks that run on AWS, and IT teams only need to pay for the underlying resources the service uses. Developers can configure AWS Batch to maintain a minimum level of capacity, and that capacity is billed even when it sits idle, so they need to be careful and clean up unused compute environments -- or at least remove minimum capacity for environments that are no longer in use.
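As a rough illustration of removing minimum capacity, the sketch below builds the keyword arguments that the boto3 Batch client's update_compute_environment call accepts. The environment name is hypothetical, and the actual API call is left as a comment because it needs AWS credentials; treat this as a starting point rather than a complete cleanup procedure.

```python
def remove_minimum_capacity_request(environment_name):
    """Build arguments that set a compute environment's minimum vCPUs to zero."""
    return {
        "computeEnvironment": environment_name,
        "computeResources": {"minvCpus": 0},
    }


def disable_environment_request(environment_name):
    """Build arguments that disable an environment that is no longer in use."""
    return {
        "computeEnvironment": environment_name,
        "state": "DISABLED",
    }


# "old-batch-env" is a hypothetical environment name used for illustration.
scale_down = remove_minimum_capacity_request("old-batch-env")
disable = disable_environment_request("old-batch-env")
print(scale_down)
print(disable)

# With credentials configured, the requests could be sent like this:
#   import boto3
#   batch = boto3.client("batch")
#   batch.update_compute_environment(**scale_down)
#   batch.update_compute_environment(**disable)
```

An environment generally has to be disabled and detached from its queues before it can be deleted, so dropping the minimum to zero is the quicker way to stop paying for idle instances.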
ECS enables developers to execute long-running or constantly running tasks with some Auto Scaling support. AWS Batch isn't ideal for serving up user requests or running tasks that never complete. It also shouldn't be used for tasks that complete very quickly, as it's better to run those on AWS Lambda.
AWS Batch is split into three different parts: compute environments, queues and job definitions.
IT teams can configure compute environments to maintain a minimum capacity to quickly process AWS Batch jobs, or set them up to only spin up capacity as needed. A compute environment sets up an ECS cluster and an Auto Scaling group, which handle automatic scaling, manage redundancy and enable fault tolerance.
Developers submit AWS Batch jobs to queues, which hold requests for processing jobs and deliver those requests to processes running within the compute environment. Queues, which are connected to one or more compute environments, enable developers to prioritize certain tasks.
Set up multiple compute environments so that certain tasks run more quickly on On-Demand Instances, while burst capacity is directed to Spot Instances.
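One way to picture a pair of environments like this is as two request payloads for the boto3 Batch client's create_compute_environment call. The sketch below only builds the dictionaries. Every name, subnet ID, security group ID, role and vCPU figure in it is a placeholder assumption, not a recommended value.

```python
def compute_environment_request(name, resource_type, min_vcpus, max_vcpus,
                                bid_percentage=None):
    """Build arguments for a managed compute environment.

    resource_type is "EC2" for On-Demand Instances or "SPOT" for Spot Instances.
    """
    resources = {
        "type": resource_type,
        "minvCpus": min_vcpus,            # capacity kept running even when idle
        "maxvCpus": max_vcpus,            # upper bound for automatic scaling
        "instanceTypes": ["optimal"],     # let the service pick instance sizes
        "subnets": ["subnet-EXAMPLE"],    # placeholder
        "securityGroupIds": ["sg-EXAMPLE"],  # placeholder
        "instanceRole": "ecsInstanceRole",   # placeholder role name
    }
    if resource_type == "SPOT":
        if bid_percentage is not None:
            # Maximum Spot price as a percentage of the On-Demand price.
            resources["bidPercentage"] = bid_percentage
        # Placeholder; Spot environments may also need a Spot fleet role.
        resources["spotIamFleetRole"] = "EXAMPLE-spot-fleet-role-arn"
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": resources,
        "serviceRole": "AWSBatchServiceRole",  # placeholder role name
    }


# Hypothetical environments: steady On-Demand capacity plus Spot for bursts.
on_demand = compute_environment_request("on-demand-env", "EC2", 8, 32)
spot = compute_environment_request("spot-env", "SPOT", 0, 64, bid_percentage=60)
print(on_demand["computeResources"]["minvCpus"], spot["computeResources"]["type"])

# With credentials configured:
#   import boto3
#   boto3.client("batch").create_compute_environment(**on_demand)
```

Here the On-Demand environment keeps a nonzero minimum so jobs start quickly, while the Spot environment starts from zero and only spins up capacity as needed.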
For example, our company processes press releases and blog posts. Press releases are high priority and time-sensitive, while blog articles are not. So, we set up a high-priority queue for press releases and a standard-priority queue for blog articles. Both queues process in an On-Demand Instance environment, as well as a Spot Instance environment. The On-Demand Instance environment runs enough capacity to handle all the press release articles -- with some additional capacity for surges.
During normal use, the On-Demand Instance environment has some leftover capacity that can process blog articles. When there is a surge in press releases, the extra capacity handles press releases. Blog posts move to the Spot Instance environment, which might not have enough capacity to keep up with everything right away. Eventually, when press releases slow down, blog articles can take up the extra capacity in the On-Demand Instance environment again.
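The queue layout in this example could be expressed roughly as follows, using the argument names of the boto3 Batch client's create_job_queue call. Queue names, environment names and priority numbers are illustrative assumptions. In AWS Batch, a higher priority integer is evaluated first, and a lower order number marks the environment that is tried first.

```python
def job_queue_request(name, priority, environments):
    """Build arguments for a job queue mapped to environments in preference order."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": environment}
            for position, environment in enumerate(environments, start=1)
        ],
    }


# Hypothetical names; both queues prefer On-Demand and fall back to Spot.
environments = ["on-demand-env", "spot-env"]
press_releases = job_queue_request("press-releases", 10, environments)
blog_articles = job_queue_request("blog-articles", 1, environments)

for queue in (press_releases, blog_articles):
    print(queue["jobQueueName"], queue["priority"], queue["computeEnvironmentOrder"])

# With credentials configured:
#   import boto3
#   boto3.client("batch").create_job_queue(**press_releases)
```

Because both queues share the same environments, the higher-priority press release queue gets first claim on On-Demand capacity, and blog articles spill over to Spot when it is busy.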
A job definition describes the Docker container a job runs, including configuration details and the commands that must run. Job definitions are the AWS Batch equivalent of ECS task definitions; each consists of a command and an input for the command to run on a specific Docker image. Developers submit jobs based on these definitions to queues, and the jobs then run on compute environments.
Job definitions can run any standard Docker image from Docker Hub, or they can run custom Docker images from Amazon EC2 Container Registry. When submitting an individual job, developers can also override any of the command or configuration variables.
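A minimal sketch of a job definition follows, shaped like the arguments to the boto3 Batch client's register_job_definition call. The definition name, image choice and parameter are assumptions made for illustration. The resolve_command helper is not part of any AWS library; it is a local approximation of how Ref:: placeholders in a command are filled from parameters, included only to make the "command plus input" idea concrete.

```python
def job_definition_request(name, image, command, default_parameters=None,
                           vcpus=1, memory_mib=512):
    """Build arguments for registering a container job definition."""
    request = {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "command": command,
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }
    if default_parameters:
        request["parameters"] = dict(default_parameters)
    return request


def resolve_command(command, parameters):
    """Approximate placeholder substitution: Ref::key becomes parameters[key]."""
    resolved = []
    for token in command:
        if token.startswith("Ref::"):
            key = token[len("Ref::"):]
            # Unknown keys are left untouched in this sketch.
            resolved.append(parameters.get(key, token))
        else:
            resolved.append(token)
    return resolved


# Hypothetical definition that echoes a message using a public Docker Hub image.
definition = job_definition_request(
    "echo-message", "busybox", ["echo", "Ref::message"],
    default_parameters={"message": "hello"},
)
print(resolve_command(definition["containerProperties"]["command"],
                      definition["parameters"]))

# With credentials configured:
#   import boto3
#   boto3.client("batch").register_job_definition(**definition)
```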
Submitting batch jobs from SDKs
While submitting jobs through the AWS Management Console is a starting point, most applications need to submit jobs periodically and programmatically. The AWS SDKs for each language provide methods to submit jobs, and developers can override job definition settings, such as the command and environment variables, when they submit a new job. Each submission runs one or more task instances, and as jobs are submitted, the total required capacity increases to handle the additional workload.
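With the AWS SDK for Python (boto3), a submission might look roughly like the sketch below, which builds the arguments for the Batch client's submit_job call. The job, queue and definition names and the environment variable are hypothetical. The array_size option is one way a single submission can fan out into several task instances.

```python
def submit_job_request(job_name, queue, definition, command=None,
                       environment=None, array_size=None):
    """Build arguments for submitting a job with optional per-job overrides."""
    request = {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": definition,
    }
    overrides = {}
    if command is not None:
        overrides["command"] = command
    if environment:
        overrides["environment"] = [
            {"name": name, "value": value}
            for name, value in sorted(environment.items())
        ]
    if overrides:
        request["containerOverrides"] = overrides
    if array_size is not None and array_size >= 2:
        # An array job runs this many child jobs from one submission.
        request["arrayProperties"] = {"size": array_size}
    return request


# Hypothetical names throughout.
job = submit_job_request(
    "press-release-1234",
    "press-releases",
    "process-article",
    command=["process", "--id", "1234"],
    environment={"ARTICLE_TYPE": "press_release"},
)
print(job)

# With credentials configured:
#   import boto3
#   response = boto3.client("batch").submit_job(**job)
#   print(response["jobId"])
```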
Unlike in ECS, a job that has no capacity to run stays in the queue and waits for more capacity. Additionally, the compute environment automatically scales up, to at most its maximum capacity, and adheres to other restrictions, such as Spot Instance pricing.
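The waiting behavior can be pictured with a toy model. This is not how the AWS Batch scheduler is implemented; it is a simplified sketch that assumes capacity is measured in vCPUs, that capacity scales to meet demand within the configured minimum and maximum, and that jobs are placed greedily in queue order.

```python
def target_capacity(min_vcpus, max_vcpus, demanded_vcpus):
    """Clamp the demanded capacity to the environment's configured bounds."""
    return max(min_vcpus, min(max_vcpus, demanded_vcpus))


def place_jobs(pending_jobs, max_vcpus, min_vcpus=0):
    """Split queued (name, vcpus) jobs into those that run and those that wait."""
    demand = sum(vcpus for _, vcpus in pending_jobs)
    capacity = target_capacity(min_vcpus, max_vcpus, demand)
    running, waiting, used = [], [], 0
    for name, vcpus in pending_jobs:
        if used + vcpus <= capacity:
            running.append(name)
            used += vcpus
        else:
            # No room left: the job stays queued until capacity frees up.
            waiting.append(name)
    return running, waiting, capacity


queue = [("job-a", 4), ("job-b", 4), ("job-c", 4)]
running, waiting, capacity = place_jobs(queue, max_vcpus=8)
print(running, waiting, capacity)  # ['job-a', 'job-b'] ['job-c'] 8
```

In this model, job-c is not rejected. It simply waits until one of the running jobs finishes and frees its vCPUs.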