Data Engineer
Last updated
Variety: can store many types of data: structured, semi-structured, or unstructured (such as images and video)
RDS
S3
Glue
Comprehend
Volume
Velocity
Kinesis
Lambda
Veracity/Validity
Value
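
As a rough illustration of the Velocity pairing above (Kinesis feeding Lambda), here is a minimal sketch of a Lambda handler that decodes a batch of Kinesis records. It assumes each record carries a JSON payload, which these notes do not specify. `make_event` is a hypothetical helper that builds a fake event for local testing only.

```python
import base64
import json


def handler(event, context):
    """Decode every Kinesis record in the batch and return the parsed payloads."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis record data arrives base64-encoded inside the Lambda event.
        raw = base64.b64decode(record["kinesis"]["data"])
        # Assumption: the producer wrote JSON; other encodings need other parsing.
        payloads.append(json.loads(raw))
    return {"count": len(payloads), "payloads": payloads}


def make_event(items):
    """Hypothetical local helper: wrap Python objects in a Kinesis-shaped event."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(item).encode()).decode()}}
            for item in items
        ]
    }
```

Calling `handler(make_event([{"id": 1}]), None)` locally exercises the decoding path without any AWS resources.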
OLTP (Online Transaction Processing) | OLAP (Online Analytical Processing) |
---|---|
Latest state of data | Latest state plus historical data |
Normalized (3rd normal form) | Normalization can cause slowness |
Optimize for point queries | |
Query latency matters | Latency is not as important |
Optimize for | |
Common Table Expressions (CTEs) can add latency | Use CTEs instead of subqueries |
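
The contrast in the table above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `orders` table and its rows are made up, and SQLite stands in for whatever OLTP or OLAP engine is actually used. The first query is an OLTP-style point lookup by primary key; the second is an OLAP-style aggregate written with a CTE rather than a nested subquery.

```python
import sqlite3

# In-memory database with a small, hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   REAL NOT NULL
);
INSERT INTO orders (order_id, customer, amount) VALUES
    (1, 'alice', 20.0),
    (2, 'bob',   35.0),
    (3, 'alice', 15.0),
    (4, 'carol', 50.0);
""")

# OLTP style: fetch the latest state of one row by its key (a point query).
point_row = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# OLAP style: aggregate across many rows; the CTE names the intermediate result.
top_customers = conn.execute("""
WITH totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM totals
WHERE total >= 35
ORDER BY total DESC, customer
""").fetchall()

print(point_row)
print(top_customers)
```

The point query touches a single row through the primary key, while the aggregate scans the whole table, which is why latency expectations differ between the two workloads.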
Computing | Distributed computing |
---|---|
EMR | EMR |
Lambda | Batch |
Glue | Step Functions |
Redshift |
EMR | Glue |
---|---|
Full-featured, distributed Hadoop environment | Fully managed ETL |
additional framework & hardware |