Business leaders continue to seek critical insights from the data their organizations process. But collecting and managing that information is a serious challenge for IT teams, which increasingly turn to cloud edge computing for help.
Enterprises generate raw data through internet of things (IoT) sensors and many other sources. In a traditional computing architecture, data passes from its source across a WAN to a storage resource that's often housed in the public cloud. Businesses can then scale up big data clusters to apply analytics and query that data.
Unfortunately, traditional client-server architectures that centralize resources in a primary data center don't work well for this kind of data collection model. The principal problem is the network. Big data analysis only works if you have a massive volume of data, which means there could be up to billions of data sources. And each source moves data across a network to storage. There just isn't enough WAN bandwidth to handle data from every source from every organization.
Take it to the edge
To get around this limitation, IT teams can collect, store and process data closer to its point of origin -- a concept known as edge computing.
For example, an industrial plant with thousands of process and environmental sensors could create a small data center on-site to collect, store and perform preprocessing on data. Once those tasks are complete, the IT team can move the data asynchronously to a corporate data center or the cloud.
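As a rough illustration of that collect, preprocess and forward pattern, the Python sketch below reduces a window of raw readings to a small summary on-site and hands it to a background thread for upload. The class name, the summary fields and the `upload` callback are all hypothetical. They stand in for whatever storage and transport an actual plant would use.

```python
import queue
import statistics
import threading
from typing import Callable, Dict, List


def summarize(readings: List[float]) -> Dict[str, float]:
    """Reduce a window of raw sensor readings to a compact summary."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": statistics.fmean(readings),
    }


class EdgeForwarder:
    """Buffers summaries locally and uploads them on a background thread."""

    def __init__(self, upload: Callable[[Dict[str, float]], None]) -> None:
        self._upload = upload  # hypothetical callback that sends data off-site
        self._pending: "queue.Queue[Dict[str, float]]" = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, readings: List[float]) -> None:
        # Preprocess on-site, then queue; the caller never waits on the WAN.
        self._pending.put(summarize(readings))

    def _drain(self) -> None:
        while True:
            summary = self._pending.get()
            try:
                self._upload(summary)
            except Exception:
                # Sketch only: a real deployment would retry or persist to disk.
                pass
            finally:
                self._pending.task_done()

    def flush(self) -> None:
        """Block until every queued summary has been handed to upload()."""
        self._pending.join()
```

Calling `submit()` returns immediately, so sensor collection is not held up by a slow or unavailable link. Only the summaries cross the WAN, not every raw reading.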
Each remote location could have the on-site facilities to store and perform computing tasks on collected data. When organizations use multiple edge locations, the computing model can include advanced infrastructure concepts that move beyond the traditional client-server model, such as mesh computing and peer-to-peer computing.
Cloud edge computing expands user capabilities
Most enterprises are reluctant to make a large investment in on-premises resources for occasional or periodic queries, because those big data resources would be idle much of the time. Fortunately, public cloud providers, such as AWS, offer more practical and cost-effective big data capabilities.
For enterprises in the cloud, edge computing is more of a reality. An AWS user can spin up thousands of Hadoop instances as needed, execute queries and then terminate resources when finished.
Additionally, AWS offers several services that can help enterprises manage big data and IoT. The AWS IoT service can connect billions of IoT endpoints to AWS resources and services, such as Elastic Compute Cloud instances, Amazon Simple Storage Service (S3), AWS Lambda and Amazon Machine Learning.
AWS Greengrass enables devices to connect to the cloud and run AWS Lambda functions -- all while the service collects and analyzes data closer to the source. Greengrass even works when AWS connectivity is disrupted. There are three parts to the AWS Greengrass system: Greengrass Core, the AWS IoT Device software development kit (SDK) and the AWS cloud itself.
Greengrass Core creates a local communications hub that connects the public cloud to data-generating devices, such as IoT sensors. It also supports local execution of AWS Lambda code and manages data security and caching for the user.
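To give a sense of what locally executed Lambda code might look like, here is a minimal Python handler that decides on-site whether a reading is worth sending to the cloud. The event fields, the threshold and the return shape are assumptions made for illustration. Publishing the result would go through the Greengrass SDK, which this sketch leaves out.

```python
from typing import Any, Dict, Optional

TEMPERATURE_LIMIT_C = 80.0  # hypothetical alert threshold


def function_handler(event: Dict[str, Any], context: Optional[Any] = None) -> Dict[str, Any]:
    """Decide locally whether a sensor reading needs to leave the site."""
    reading = float(event["temperature_c"])     # hypothetical field name
    sensor = event.get("sensor_id", "unknown")  # hypothetical field name
    return {
        "sensor_id": sensor,
        "temperature_c": reading,
        # Only out-of-range readings are flagged for forwarding.
        "forward_to_cloud": reading > TEMPERATURE_LIMIT_C,
    }
```

Because the decision is made next to the sensor, routine readings never need to cross the WAN, and the logic keeps running if the connection to AWS drops.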
Requirements for cloud edge computing
Part of the difficulty with edge computing comes when IT teams try to connect a public cloud to the edge of a network. There are numerous compatibility requirements that a business must address before it deploys edge services.
In some cases, the requirements are relatively straightforward. For example, AWS Snowball Edge comes preconfigured with an Amazon S3-compatible device endpoint, Network File System support and the native ability to run AWS Lambda functions as data replicates to the device. IT teams need to determine how best to connect that device to the current enterprise LAN. Snowball Edge provides a 10 Gigabit Ethernet RJ45 port, a 25 GbE small form-factor pluggable port and a 40 GbE quad small form-factor pluggable (QSFP+) port. Users only need to supply the appropriate cable and connect the port to the LAN.
But it can be more problematic to connect IoT devices to AWS. Enterprises must select IoT devices that are compatible with Amazon cloud services and support the AWS IoT Device SDK -- though it is possible for developers to write their own SDK. Any device that supports Transport Layer Security should be able to support the AWS IoT Device SDK, but verify the compatibility between devices and AWS before deployment.
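One quick sanity check is whether a device's runtime can negotiate a modern TLS version at all. The Python sketch below uses only the standard `ssl` module. It assumes TLS 1.2 as the minimum version, which the text above does not state, and both function names are hypothetical. Loading a device certificate is left out because the file paths would be guesses.

```python
import ssl


def build_device_tls_context() -> ssl.SSLContext:
    """Client-side TLS settings a device could use to reach a cloud endpoint."""
    context = ssl.create_default_context()            # verifies server certificates
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed minimum version
    return context


def runtime_supports_tls12() -> bool:
    """Report whether this Python/OpenSSL build can negotiate TLS 1.2."""
    return ssl.HAS_TLSv1_2
```

A check like this only confirms that the TLS stack is present. It does not replace testing the device against the actual AWS endpoint before deployment.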
Give data transfer challenges a cold shoulder
Network data transfers are not always practical; sometimes, too much data streams from too many devices. When this occurs, IT teams can import data to the cloud via AWS Snowball Edge, a physical storage device that ships from the edge location to AWS.
Snowball Edge holds 100 TB of data and can serve as a temporary local storage tier. An IT team orders the Snowball Edge as an AWS job through the AWS Management Console. Upon arrival, the team connects it to the local network, where it can collect data from IoT devices, apps and other sources. When the collection is complete, the storage device ships to the desired AWS region, where the data loads into an S3 bucket for analytics projects.
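Some back-of-the-envelope arithmetic shows why shipping a device can beat the network. The Python sketch below estimates how long 100 TB takes to move at the Snowball Edge port speeds listed earlier, and over a 1 Gbps WAN link. The WAN figure is hypothetical and chosen only for comparison. The estimates assume a sustained line rate with no protocol overhead, so real transfers will be slower.

```python
def transfer_hours(terabytes: float, gigabits_per_second: float) -> float:
    """Ideal time to move a payload at a sustained line rate (no overhead)."""
    bits = terabytes * 1e12 * 8  # decimal terabytes to bits
    seconds = bits / (gigabits_per_second * 1e9)
    return seconds / 3600


CAPACITY_TB = 100  # Snowball Edge capacity cited above

LINKS = [
    ("10 GbE", 10),
    ("25 GbE", 25),
    ("40 GbE", 40),
    ("1 Gbps WAN (hypothetical)", 1),
]

for label, rate in LINKS:
    print(f"{label}: {transfer_hours(CAPACITY_TB, rate):.1f} hours")
```

Under these idealized assumptions, filling the device locally takes about 22 hours at 10 GbE and under 6 hours at 40 GbE. The same payload over the hypothetical 1 Gbps WAN link would need roughly 222 hours, or more than nine days.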
Greengrass users also need to identify and deploy systems capable of hosting Greengrass Core software on premises. Currently, Greengrass Core-compatible systems require an x86-64, ARMv7 or AArch64 (ARMv8) CPU architecture and Linux kernel 4.4 or later; cited examples include Ubuntu 14.04 LTS and Jessie (kernel 4.1/4.4). AWS documentation details Greengrass Core dependencies.
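A first-pass screen of candidate hosts can be scripted. The Python sketch below checks a machine's operating system, CPU architecture and kernel version against the requirements described above. The function names are hypothetical, and the architecture strings are the spellings `platform.machine()` is assumed to report on Linux. It does not check the further dependencies listed in AWS documentation.

```python
import platform
import re
from typing import Tuple

# Assumed platform.machine() spellings for x86-64, ARMv7 and AArch64.
SUPPORTED_MACHINES = {"x86_64", "armv7l", "aarch64"}
MINIMUM_KERNEL = (4, 4)


def parse_kernel(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '4.4.0-210-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def host_is_candidate(system: str, machine: str, release: str) -> bool:
    """Apply the OS, architecture and kernel checks to one host description."""
    return (
        system == "Linux"
        and machine.lower() in SUPPORTED_MACHINES
        and parse_kernel(release) >= MINIMUM_KERNEL
    )


def check_this_host() -> bool:
    """Run the same checks against the machine executing this script."""
    return host_is_candidate(platform.system(), platform.machine(), platform.release())
```

Running `check_this_host()` on each prospective gateway gives a quick yes or no before a team invests time in a full installation.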
Cloud edge computing doesn't just stop with compatible host systems and endpoint devices. Enterprises must also manage physical device lifecycles. IT teams need to identify, troubleshoot and replace failed devices, upgrade aging devices and replace device batteries. This extra attention requires business policies and procedures that help optimize thousands -- even millions -- of individual devices.