Glue Data Catalog

Overview

  • A metadata repository that stores schema information (table structure, column types, partitions, data location, etc.) for datasets held in stores such as Amazon S3, Amazon Redshift, and Amazon RDS. It holds only metadata; the data itself stays where it is.

  • Acts as a centralized metadata store that can be used by Athena, EMR, Redshift Spectrum, and other AWS services.
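To make "metadata store" concrete, here is a minimal Python sketch of reading a table's schema and location back out of the catalog. It assumes boto3 and its Glue `get_table` call; the database, table, bucket, and column names are hypothetical, and the sample response is hand-written and trimmed to the fields the helper reads.

```python
from typing import Any, Dict


def summarize_table(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull schema, partition keys, and data location out of a GetTable response."""
    table = response["Table"]
    storage = table.get("StorageDescriptor", {})
    return {
        "name": f'{table["DatabaseName"]}.{table["Name"]}',
        "location": storage.get("Location"),
        "columns": {c["Name"]: c["Type"] for c in storage.get("Columns", [])},
        "partition_keys": [k["Name"] for k in table.get("PartitionKeys", [])],
    }


def fetch_table_summary(database: str, table: str) -> Dict[str, Any]:
    """Look the table up in the Data Catalog (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so summarize_table works without AWS access

    glue = boto3.client("glue")
    return summarize_table(glue.get_table(DatabaseName=database, Name=table))


# Hypothetical response, trimmed to the fields used above.
SAMPLE_RESPONSE = {
    "Table": {
        "Name": "orders",
        "DatabaseName": "sales_db",
        "StorageDescriptor": {
            "Columns": [
                {"Name": "order_id", "Type": "bigint"},
                {"Name": "amount", "Type": "double"},
            ],
            "Location": "s3://example-bucket/orders/",
        },
        "PartitionKeys": [{"Name": "order_date", "Type": "string"}],
    }
}

summary = summarize_table(SAMPLE_RESPONSE)
print(summary)
```

Note that the summary contains no rows, only a description of the table. Services such as Athena use this description to locate and interpret the files in S3.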

Features

  • Stores table definitions, schemas, and partition information.

  • Integrates with Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR.

  • Tables can be created manually (console, API, or DDL) or created and kept up to date automatically by a Glue crawler.

  • Analogy: Think of it as a database schema that keeps track of where and how your data is stored.
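The two ways of populating the catalog can be sketched in Python as follows. This is an illustration under assumptions, not a reference implementation: it uses boto3's Glue `create_table` and `start_crawler` calls, assumes Parquet files in S3, and every database, table, bucket, and crawler name is hypothetical.

```python
from typing import Any, Dict, List, Tuple

Column = Tuple[str, str]  # (column name, Hive type)


def build_parquet_table_input(
    name: str,
    location: str,
    columns: List[Column],
    partition_keys: List[Column],
) -> Dict[str, Any]:
    """Build the TableInput structure for an external Parquet table."""
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "PartitionKeys": [{"Name": n, "Type": t} for n, t in partition_keys],
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": location,  # where the data files live
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }


def register_table(database: str, table_input: Dict[str, Any]) -> None:
    """Manual route: write the table definition to the catalog yourself."""
    import boto3  # imported lazily; needs AWS credentials when called

    boto3.client("glue").create_table(DatabaseName=database, TableInput=table_input)


def refresh_with_crawler(crawler_name: str) -> None:
    """Automatic route: start an existing crawler that infers and updates tables."""
    import boto3

    boto3.client("glue").start_crawler(Name=crawler_name)


# Hypothetical table definition; nothing is sent to AWS here.
orders_input = build_parquet_table_input(
    name="orders",
    location="s3://example-bucket/orders/",
    columns=[("order_id", "bigint"), ("amount", "double")],
    partition_keys=[("order_date", "string")],
)
```

Calling `register_table("sales_db", orders_input)` would register the table by hand, while `refresh_with_crawler("orders-crawler")` would leave schema discovery to a crawler that has already been configured. The manual route gives exact control over types; the crawler route is convenient when the schema or partitions change often.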
