Data Lake

Overview

  • A centralized repository for storing structured and unstructured data at any scale.

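  • Typically has at least 3 layers (zones):

    • Raw: data as ingested, kept in its original format

    • Staging: data that has been cleaned and validated

    • Consumption: curated data ready for analytics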
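
The three zones can be pictured as folders that data moves through. The following is a minimal sketch only, assuming a local directory stands in for the lake's object storage. The zone folder names, the function names (`init_lake`, `ingest_raw`, `stage`, `publish`) and the `user` / `amount` fields are all hypothetical, chosen for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical zone folder names; a real lake would use buckets or prefixes.
ZONES = ("raw", "staging", "consumption")


def init_lake(root: Path) -> None:
    """Create one directory per zone under the lake root."""
    for zone in ZONES:
        (root / zone).mkdir(parents=True, exist_ok=True)


def ingest_raw(root: Path, name: str, payload: str) -> Path:
    """Land data in the raw zone exactly as received, without parsing it."""
    target = root / "raw" / name
    target.write_text(payload, encoding="utf-8")
    return target


def stage(root: Path, name: str) -> Path:
    """Read a raw JSON-lines file, drop malformed or incomplete rows,
    and write the cleaned rows to the staging zone."""
    cleaned = []
    for line in (root / "raw" / name).read_text(encoding="utf-8").splitlines():
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if (
            isinstance(row, dict)
            and "user" in row
            and isinstance(row.get("amount"), (int, float))
        ):
            cleaned.append({"user": row["user"], "amount": float(row["amount"])})
    target = root / "staging" / name
    target.write_text("\n".join(json.dumps(r) for r in cleaned), encoding="utf-8")
    return target


def publish(root: Path, name: str) -> Path:
    """Aggregate staged rows into per-user totals and write them as CSV
    to the consumption zone."""
    totals = {}
    for line in (root / "staging" / name).read_text(encoding="utf-8").splitlines():
        row = json.loads(line)
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    target = root / "consumption" / (Path(name).stem + "_totals.csv")
    with target.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["user", "total_amount"])
        for user in sorted(totals):
            writer.writerow([user, totals[user]])
    return target
```

Calling `init_lake`, `ingest_raw`, `stage` and `publish` in that order on a temporary directory walks one file through all three zones. The raw copy is never modified, so later stages can be re-run from it if the cleaning rules change.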

Use cases

  • For analytics: querying the data with tools such as

    • Google BigQuery

    • Apache Spark

    • Amazon Athena
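
As a rough illustration of the analytics use case, the sketch below runs a SQL aggregation over a folder of CSV files. It assumes Python's built-in `sqlite3` as a local stand-in for the engines listed above. It does not use any of their APIs, and it copies the files into memory rather than scanning them in place as those engines do. The function name `query_csv_files`, the table name `events`, and the `country` / `amount` columns are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def query_csv_files(folder: Path, sql: str) -> list:
    """Load every CSV file in `folder` into an in-memory table named
    `events` (hypothetical schema: country, amount) and run `sql` on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (country TEXT, amount REAL)")
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as fh:
            rows = [(r["country"], float(r["amount"])) for r in csv.DictReader(fh)]
        conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

A query such as `SELECT country, SUM(amount) FROM events GROUP BY country` then treats all files in the folder as one table, which is the same mental model the listed tools offer over files in a lake.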
