Glue
serverless ETL service
A fully managed ETL (Extract, Transform, Load) service that can extract data from various sources, transform it into the required format, and load it into a target data store.
Prepare & transform data for analytics.
Avoid reprocessing old data: with job bookmarks, Glue can resume a job from where it left off.
GUI for creating, running, and monitoring ETL jobs.
Clean & normalize data using pre-built transformations.
Streaming ETL is built on Apache Spark Structured Streaming.
Compatible with Kinesis Data Streams, Apache Kafka, and Amazon MSK (managed Kafka).
Data transformation = AWS Glue.
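The note above about resuming a job without reprocessing old data corresponds to Glue job bookmarks. As a rough sketch, the snippet below builds the request for boto3's `start_job_run` with the `--job-bookmark-option` job parameter set. The job name `nightly-orders-etl` and the helper names `build_job_run_request` and `start_job` are hypothetical, chosen only for illustration. Actually starting a run assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict

# Glue special job parameter that controls bookmarks, and its accepted values.
BOOKMARK_PARAM = "--job-bookmark-option"
BOOKMARK_VALUES = {"job-bookmark-enable", "job-bookmark-disable", "job-bookmark-pause"}


def build_job_run_request(job_name: str, bookmark: str = "job-bookmark-enable") -> Dict[str, Any]:
    """Build keyword arguments for boto3's glue.start_job_run() (hypothetical helper)."""
    if bookmark not in BOOKMARK_VALUES:
        raise ValueError(f"unknown bookmark option: {bookmark}")
    return {"JobName": job_name, "Arguments": {BOOKMARK_PARAM: bookmark}}


def start_job(job_name: str) -> str:
    """Start the job with bookmarks enabled; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(**build_job_run_request(job_name))
    return response["JobRunId"]


# "nightly-orders-etl" is a made-up job name used only for this example.
request = build_job_run_request("nightly-orders-etl")
print(request)
```

Passing the parameter is only part of the picture: inside the job script, bookmarks also rely on the job calling `job.commit()` and on each source being given a `transformation_ctx`, so this should be read as a starting point rather than a complete setup.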
Parquet format: a columnar storage file format optimized for use with big data processing frameworks.
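To make "columnar" concrete, here is a toy illustration in plain Python. It is not Parquet's actual on-disk encoding (which also involves row groups, compression, and metadata); it only contrasts a row-oriented layout with a column-oriented one. The sample records and the helper `to_columns` are invented for the example.

```python
from typing import Any, Dict, List

# Row-oriented layout: each record keeps all of its fields together.
rows: List[Dict[str, Any]] = [
    {"order_id": 1, "country": "DE", "amount": 20.0},
    {"order_id": 2, "country": "FR", "amount": 35.5},
    {"order_id": 3, "country": "DE", "amount": 12.5},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list of values per column (hypothetical helper)."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: all values of one field are stored together.
columns = to_columns(rows)

# Row layout: every whole record is visited just to total one field.
total_from_rows = sum(record["amount"] for record in rows)

# Columnar layout: only the "amount" column needs to be read.
total_from_columns = sum(columns["amount"])

print(columns["amount"], total_from_columns)
```

An analytic query that aggregates a single field touches only that column in the second layout, which is the main reason columnar formats suit analytics workloads.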