Data Lake
Overview
A centralized repository for storing structured and unstructured data at any scale.
Typically has at least three layers (zones):
Raw
Staging
Consumption
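The flow between the three zones can be sketched as a small local pipeline. This is only an illustration under stated assumptions, not a reference layout: the directory names, the JSON-lines input, the "user" and "amount" fields, and every function name are hypothetical, and a real lake would normally sit on object storage rather than a local folder.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical zone names, mirroring the three layers listed above.
ZONES = ("raw", "staging", "consumption")


def init_lake(root: Path) -> dict[str, Path]:
    """Create one directory per zone and return a name -> path mapping."""
    paths = {zone: root / zone for zone in ZONES}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths


def ingest_raw(paths: dict[str, Path], name: str, payload: str) -> Path:
    """Raw zone: store the source payload exactly as received."""
    target = paths["raw"] / name
    target.write_text(payload, encoding="utf-8")
    return target


def stage(paths: dict[str, Path], raw_file: Path) -> Path:
    """Staging zone: parse JSON lines, drop malformed or incomplete records."""
    cleaned = []
    for line in raw_file.read_text(encoding="utf-8").splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and "user" in record and "amount" in record:
            cleaned.append(
                {"user": str(record["user"]), "amount": float(record["amount"])}
            )
    target = paths["staging"] / raw_file.name
    target.write_text("\n".join(json.dumps(r) for r in cleaned), encoding="utf-8")
    return target


def publish(paths: dict[str, Path], staged_file: Path) -> Path:
    """Consumption zone: aggregate staged records into an analysis-ready CSV."""
    totals: dict[str, float] = {}
    for line in staged_file.read_text(encoding="utf-8").splitlines():
        record = json.loads(line)
        totals[record["user"]] = totals.get(record["user"], 0.0) + record["amount"]
    target = paths["consumption"] / "totals_by_user.csv"
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["user", "total_amount"])
        for user in sorted(totals):
            writer.writerow([user, totals[user]])
    return target


def run_demo() -> list[list[str]]:
    """Push a tiny sample through all three zones and return the final CSV rows."""
    sample = (
        '{"user": "a", "amount": 2}\n'
        "not json\n"
        '{"user": "b", "amount": 1.5}\n'
        '{"user": "a", "amount": 3}\n'
    )
    with tempfile.TemporaryDirectory() as tmp:
        paths = init_lake(Path(tmp))
        raw_file = ingest_raw(paths, "events.jsonl", sample)
        staged_file = stage(paths, raw_file)
        published = publish(paths, staged_file)
        with published.open(newline="", encoding="utf-8") as handle:
            return list(csv.reader(handle))
```

Calling run_demo() writes the sample into the raw zone unchanged, keeps only the three valid records in staging, and publishes per-user totals in the consumption zone. Keeping the raw copy untouched means later stages can be re-run if the cleaning rules change.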
Use cases
Analytics: querying the stored data with analytics tools such as
Google BigQuery
Apache Spark
Amazon Athena
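As a rough illustration of the analytics use case, the sketch below runs a SQL query over a small table of curated data. It uses Python's built-in sqlite3 module purely as a local stand-in; it is not how BigQuery, Spark, or Athena are called, and each of those has its own client and SQL dialect. The table name, columns, and the top_users function are hypothetical.

```python
import sqlite3


def top_users(rows, limit=2):
    """Load (user, total_amount) rows into an in-memory table and return
    the users with the highest totals, largest first."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table standing in for a curated data set.
        conn.execute("CREATE TABLE totals (user TEXT, total_amount REAL)")
        conn.executemany("INSERT INTO totals VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT user, total_amount FROM totals "
            "ORDER BY total_amount DESC LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

For example, top_users([("a", 5.0), ("b", 1.5), ("c", 9.0)]) returns the two largest totals, with "c" first and "a" second.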
Trivia