Amazon Elastic MapReduce requires end users to enter AWS credentials to access data, and that opens the environment up to various potential errors.
IT teams make two common mistakes with AWS credentials in Elastic MapReduce (EMR). The first involves the Privacy Enhanced Mail (.pem) file that holds the private key for secure shell (SSH) access: if the file's permissions are too open, the SSH client refuses to use it. AWS uses the .pem file format to hold the private half of an EC2 key pair.
In general, a .pem file can hold a public certificate or even a complete certificate chain, but the file downloaded for an EC2 key pair contains the private key. Administrators can use the chmod command (for example, chmod 400 on the key file) to restrict the .pem file to its owner so that SSH accepts the key.
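The permission fix can also be scripted. The following is a minimal sketch, assuming a POSIX system; the helper names restrict_key_file and is_too_open are hypothetical, and the demonstration runs against a throwaway temporary file rather than a real key.

```python
import os
import stat
import tempfile


def is_too_open(path):
    """Return True if group or other users have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_key_file(path):
    """Make the file readable by its owner only, like `chmod 400`."""
    os.chmod(path, stat.S_IRUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstration on a temporary stand-in for a key file (hypothetical content).
with tempfile.NamedTemporaryFile(suffix=".pem", delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o644)           # simulate a key file left world-readable
was_open = is_too_open(demo_path)    # group/other can read it at this point
final_mode = restrict_key_file(demo_path)
still_open = is_too_open(demo_path)
os.remove(demo_path)                 # clean up the stand-in file

print(oct(final_mode))
```

Checking before changing is a design choice: a script can report which key files are too permissive without silently modifying them.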
A second mistake with AWS credentials is using the wrong key pair. This can happen when more than one Hadoop cluster exists and it is easy to confuse which key pair belongs to which cluster. Admins can check the cluster details to see which key pair was used to create the cluster, then use that key pair for the corresponding cluster. Once the key pair is verified and the .pem file permissions are set properly, connect to the master node through SSH, pointing the client at a different .pem file if the first one does not match.
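As a rough illustration of that check, the sketch below reads the key pair name from a cluster description and builds the matching SSH command. It assumes the response shape returned by the boto3 EMR client's describe_cluster call; the helper names, cluster ID, host name and key name are hypothetical, and the sample response is hand-written so the example runs without AWS access.

```python
import shlex


def key_name_for_cluster(response):
    """Return the EC2 key pair name recorded in a describe_cluster response."""
    attrs = response["Cluster"].get("Ec2InstanceAttributes", {})
    return attrs.get("Ec2KeyName")


def ssh_command(response, pem_path, user="hadoop"):
    """Build the SSH command for the cluster's master node."""
    host = response["Cluster"]["MasterPublicDnsName"]
    return "ssh -i {} {}@{}".format(shlex.quote(pem_path), user, host)


def fetch_cluster(cluster_id):
    """Fetch live cluster details; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    return boto3.client("emr").describe_cluster(ClusterId=cluster_id)


# Hand-written sample response (hypothetical values) standing in for fetch_cluster().
sample = {
    "Cluster": {
        "Id": "j-EXAMPLE",
        "MasterPublicDnsName": "ec2-203-0-113-10.compute-1.amazonaws.com",
        "Ec2InstanceAttributes": {"Ec2KeyName": "analytics-key"},
    }
}

local_key_name = "analytics-key"  # the key pair the admin intends to use
matches = key_name_for_cluster(sample) == local_key_name
command = ssh_command(sample, "analytics-key.pem")
print(matches, command)
```

Against a real account, the sample dictionary would be replaced with the result of fetch_cluster and the actual cluster ID.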
When using AWS Identity and Access Management (IAM) for security, set Amazon Elastic Compute Cloud (EC2) permissions in the IAM user's policies so that EMR can access EC2 instances on that user's behalf. If those permissions are not set properly, EMR will return an EC2 authorization error. IAM users that access Amazon EMR also need, at a minimum, permission to list clusters. In an IAM user policy, the elasticmapreduce: prefix precedes every EMR action, so the Action element would specify elasticmapreduce:ListClusters or whichever other actions are vital for the user's EMR tasks. Developers can also use the wildcard character, as in elasticmapreduce:*, to allow all actions under EMR.
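To make the policy structure concrete, here is a minimal sketch that builds such a policy document as JSON. The helper name emr_policy is hypothetical, and using "*" as the Resource is an assumption made for brevity, not a recommendation.

```python
import json


def emr_policy(actions):
    """Build an IAM policy document that allows the given EMR actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Every EMR action carries the elasticmapreduce: prefix.
                "Action": ["elasticmapreduce:" + name for name in actions],
                "Resource": "*",  # assumption: not scoped to specific clusters
            }
        ],
    }


minimal = emr_policy(["ListClusters"])  # minimum access: list clusters
allow_all = emr_policy(["*"])           # wildcard: every EMR action

print(json.dumps(minimal, indent=2))
```

The wildcard version is convenient for experiments, but listing only the actions a user needs keeps the policy closer to least privilege.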
The Condition element can also use cluster tags to give more granular control over EMR resources. As an alternative to writing permissions by hand, admins can use the managed policies for Amazon EMR, which provide more consistent access and are updated as permissions change over time.
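A tag-based restriction might look like the sketch below, which limits a pair of EMR actions to clusters carrying a particular tag. It relies on the elasticmapreduce:ResourceTag/ condition key; the helper name, the tag key "department", the tag value "analytics" and the chosen actions are all hypothetical.

```python
import json


def tag_scoped_statement(actions, tag_key, tag_value):
    """Allow EMR actions only on clusters tagged tag_key=tag_value."""
    return {
        "Effect": "Allow",
        "Action": ["elasticmapreduce:" + name for name in actions],
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                # Condition key that matches a tag already on the cluster.
                "elasticmapreduce:ResourceTag/" + tag_key: tag_value
            }
        },
    }


tagged_policy = {
    "Version": "2012-10-17",
    "Statement": [
        tag_scoped_statement(
            ["DescribeCluster", "ListSteps"], "department", "analytics"
        )
    ],
}

print(json.dumps(tagged_policy, indent=2))
```

Not every EMR action can be restricted by resource tags, so a policy like this is usually paired with a separate statement for actions such as ListClusters.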