Definition

data lake

This definition is part of our Essential Guide: An admin's guide to AWS data management

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question.
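The flat layout described above can be illustrated with a small sketch. This is a toy in-memory model, not how any particular product implements a lake: the names `DataElement`, `FlatDataLake`, `ingest` and `query`, and the tag keys used, are all hypothetical. It only shows the three ideas from the paragraph: raw payloads kept in their native format, a unique identifier per element, and metadata tags used to pull out a smaller relevant set.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One raw item in the lake, kept as-is in its native format."""
    payload: bytes
    tags: dict
    # Unique identifier assigned at ingestion time (assumption: a random UUID).
    element_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class FlatDataLake:
    """Hypothetical in-memory lake: one flat namespace keyed by identifier."""

    def __init__(self):
        self._elements = {}

    def ingest(self, payload, **tags):
        """Store the raw payload untouched and return its identifier."""
        element = DataElement(payload=payload, tags=tags)
        self._elements[element.element_id] = element
        return element.element_id

    def query(self, **wanted):
        """Return every element whose tags match all requested values."""
        return [
            e for e in self._elements.values()
            if all(e.tags.get(key) == value for key, value in wanted.items())
        ]


lake = FlatDataLake()
lake.ingest(b'{"order": 17, "total": 42.5}', source="web", kind="order", fmt="json")
lake.ingest(b"2015-05-01,store-3,19.99", source="pos", kind="order", fmt="csv")
lake.ingest(b"GET /index.html 200", source="web", kind="access-log", fmt="text")

# A business question about orders narrows the lake to a smaller set,
# which can then be analyzed separately.
orders = lake.query(kind="order")
```

There are no folders or nested paths here: elements differ only by identifier and tags, which is the sense in which the architecture is flat.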

The term data lake is often associated with Hadoop-oriented object storage. In such a scenario, an organization's data is first loaded into the Hadoop platform, and then business analytics and data mining tools are applied to the data where it resides on Hadoop's cluster nodes of commodity computers.

Like big data, the term data lake is sometimes disparaged as being simply a marketing label for a product that supports Hadoop. Increasingly, however, the term is being accepted as a way to describe any large data pool in which the schema and data requirements are not defined until the data is queried.
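Deferring the schema until query time is often called schema-on-read. The sketch below is a minimal illustration under stated assumptions: the sample records, the `read_with_schema` helper and the `sales_schema` mapping are hypothetical, and the CSV branch assumes the columns appear in the same order as the schema's fields. Records are stored as raw text, and types are imposed only when a query reads them.

```python
import csv
import io
import json

# Raw records as they were loaded: no types or column names were enforced.
RAW_RECORDS = [
    ("json", '{"customer": "acme", "total": "42.50"}'),
    ("csv", "globex,19.99"),
]


def read_with_schema(fmt, raw, schema):
    """Parse one raw record and coerce it to the schema chosen at query time."""
    if fmt == "json":
        fields = json.loads(raw)
    elif fmt == "csv":
        # Assumption: CSV columns are in the same order as the schema's fields.
        values = next(csv.reader(io.StringIO(raw)))
        fields = dict(zip(schema.keys(), values))
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return {name: cast(fields[name]) for name, cast in schema.items()}


# The schema is defined by the query, not by the stored data.
sales_schema = {"customer": str, "total": float}
rows = [read_with_schema(fmt, raw, sales_schema) for fmt, raw in RAW_RECORDS]
```

A different question could read the same raw records with a different schema, for example treating `total` as a string, without reloading or restructuring anything.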

See also: Hadoop data lake

 

This was last updated in May 2015


1 comment


The data lake is the new oil in big data. Several companies are moving in that direction, and the first data lake as a service was recently launched by Bigstep.