Cloud Data Architectures: Data Lake and Data Mesh
Understand two types of cloud data architectures: data lake and data mesh.
This lesson reviews two popular cloud data architecture frameworks: data lake and data mesh.
Data lake
A data lake is a popular data architecture comparable to a data warehouse. It’s a storage repository that holds a large amount of data, but unlike a data warehouse, where data is structured, a data lake keeps data in its raw format. The following table summarizes the other differences:
Data Lake vs. Data Warehouse
Topic  | Data Lake  | Data Warehouse  | 
Data Format  | Stores unstructured, semi-structured, and structured data in its raw format.  | Stores only structured data, after transformation.  | 
Schema  | Schema-on-read: The schema is defined after data is stored, when it is read.  | Schema-on-write: The schema is defined before data is stored.  | 
Usecase  | 
  | 
  | 
Data Quality  | Data is in its raw format without cleaning, so data quality is not ensured.  | Data is highly curated, resulting in higher data quality.  | 
Cost  | Storage and operational costs are both lower.  | Storing data in a data warehouse is usually more expensive and time-consuming.  | 
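To make the schema row of the table concrete, here is a minimal Python sketch contrasting schema-on-write with schema-on-read. Everything in it is hypothetical and for illustration only: the sample events, the `SCHEMA` mapping, and the helper functions are assumptions, not the API of any particular data lake or warehouse product.

```python
import json

# Hypothetical raw events, one JSON document per line, as they might arrive from a source system.
RAW_EVENTS = [
    '{"user_id": 1, "amount": "19.99", "note": "first order"}',
    '{"user_id": 2, "amount": "5.00"}',
    '{"user_id": "3", "amount": "oops"}',
]

# Hypothetical schema: field name -> type used to parse the value.
SCHEMA = {"user_id": int, "amount": float}


def apply_schema(record: dict) -> dict:
    """Cast each schema field to its declared type; raises on bad or missing data."""
    return {field: cast(record[field]) for field, cast in SCHEMA.items()}


def write_to_warehouse(lines):
    """Schema-on-write: validate before storing, so only conforming rows are kept."""
    table, rejected = [], []
    for line in lines:
        try:
            table.append(apply_schema(json.loads(line)))
        except (KeyError, ValueError, TypeError):
            rejected.append(line)
    return table, rejected


def write_to_lake(lines):
    """Schema-on-read, step 1: store every line untouched, in its raw format."""
    return list(lines)


def read_from_lake(stored_lines):
    """Schema-on-read, step 2: apply the schema only when the data is queried."""
    rows = []
    for line in stored_lines:
        try:
            rows.append(apply_schema(json.loads(line)))
        except (KeyError, ValueError, TypeError):
            continue  # bad records surface at read time, not at write time
    return rows


warehouse_table, rejected = write_to_warehouse(RAW_EVENTS)
lake_files = write_to_lake(RAW_EVENTS)
lake_rows = read_from_lake(lake_files)
```

In this sketch, the warehouse path rejects the malformed third event at write time and drops the `note` field, which is not part of the schema. The lake path keeps all three raw lines, including `note`, and only discovers the malformed record when the schema is applied at read time. This mirrors the data quality trade-off described in the table.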
The following diagram illustrates the key components of a data lake:
Ingestion layer: The ingestion layer collects raw data and loads it into the data lake. The raw data is not modified in this layer.
Processing layer: A data lake uses object storage to store data. Object storage stores data with metadata tags and a unique identifier, which makes searching and accessing data easier. Due to the variety and high volume of data, a data lake usually provides ...