DynamoDB

NoSQL key-value database

Overview

Fully managed NoSQL (schemaless) key-value DB
Fast performance with seamless scalability (scale up or scale down without downtime)

Search data

There are 2 ways to search data

Query: must specify the partition key; a sort key condition is optional.
Scan: reads the ENTIRE table and returns all attributes by default. An optional filter expression is applied after the items are read.
- Scan result is divided into pages. 1 page is <= 1 MB in size.
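The paging behaviour above can be sketched in Python. This is a minimal sketch, assuming a boto3-style low-level client whose `scan` call returns `Items` and, when more pages remain, `LastEvaluatedKey`. The helper name `scan_all` is hypothetical, not part of any SDK.

```python
def scan_all(client, table_name, **scan_kwargs):
    """Yield every item of a table by following Scan pages until none remain."""
    start_key = None
    while True:
        if start_key is not None:
            # Resume where the previous page stopped.
            scan_kwargs["ExclusiveStartKey"] = start_key
        page = client.scan(TableName=table_name, **scan_kwargs)
        yield from page.get("Items", [])
        # LastEvaluatedKey is absent on the final page.
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            break

# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   for item in scan_all(boto3.client("dynamodb"), "ProductCatalog"):
#       print(item)
```

The client is passed in as a parameter so the loop can be exercised with a stub, without touching a real table.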

Table classes

Standard

Standard-IA

Lower cost for infrequently accessed (IA) data, for example:

Application logs
E-commerce order history
Old social media posts
Past gaming achievements

Benefit

Serverless

That means no server to provision, patch, or manage. No software to install, maintain or operate.

Capacity modes

Provisioned: you set the read/write capacity yourself.
On-demand: for less predictable workloads; pay for what you consume.

Auto-scaling

Automatically scales up or down:

Throughput
Storage

Use cases

Integrate with AWS Lambda and act as the database for serverless applications.

Features

Push-button scaling / Auto scaling

Import/Export to S3

zero-ETL -> OpenSearch

Integration with OpenSearch Service. Use DynamoDB data as a source in Amazon OpenSearch Ingestion to automatically replicate your table data to OpenSearch Service indexes.

DynamoDB - DAX

Normal DynamoDB is low latency (in milliseconds), but DAX can provide microsecond latency.
In-memory cache, specially designed for DynamoDB.
Can improve read performance up to 10x. Supports millions of requests / sec.
Use case:
- Improve performance of READ-heavy or bursty workloads. If you want to improve WRITE performance, put SQS in front of DynamoDB to buffer the writes.

DynamoDB - Stream

Captures item-level changes (PutItem, UpdateItem, or DeleteItem) in your table and writes them to a DynamoDB stream, where each record is kept in a log for 24 hours. In plain English: if your data is modified, DynamoDB records the change so that consumers can react to it. You can then process the stream with a Lambda function.

How does it work?

Associate the stream's ARN with a Lambda function.
Lambda polls the stream and invokes the function synchronously when it detects new stream records.

An actual modification must be made to an item for it to be considered an event. If you send an UPDATE request that does not change anything, DynamoDB simply ignores it.

Use case:

An app in one region modifies data in a DynamoDB table; another app in another region reads the change to update another table or to build statistics about the data.
An app sends a notification to all users as soon as a new item is added to the table.

StreamViewType

When an item in the table is modified, StreamViewType determines what information is written to the stream for this table. If you do not want to expose any PII to the stream, you can use KEYS_ONLY.

KEYS_ONLY: only the key attributes (PartitionKey + SortKey) of the modified item are captured.
NEW_IMAGE: captures the item as it appears after the change.
OLD_IMAGE: captures the item as it appeared before the change.
NEW_AND_OLD_IMAGES: captures both the old and the new value.
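To make the view types concrete, here is a minimal sketch of a Lambda handler that reads stream records. It assumes the standard stream event shape (`Records`, `eventName`, and a `dynamodb` section with `Keys`, `OldImage`, `NewImage`). The summary dictionary it returns is only an illustration, not something Lambda requires.

```python
def handler(event, context):
    """Summarise each stream record: event name, keys, and whichever images exist."""
    changes = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        changes.append({
            "event": record["eventName"],  # INSERT, MODIFY, or REMOVE
            "keys": data["Keys"],          # present for every view type
            "old": data.get("OldImage"),   # only with OLD_IMAGE / NEW_AND_OLD_IMAGES
            "new": data.get("NewImage"),   # only with NEW_IMAGE / NEW_AND_OLD_IMAGES
        })
    return changes
```

With KEYS_ONLY, both `old` and `new` come back as `None`, which is why that view type is the safe choice when items contain PII.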

DynamoDB - Global tables

Multi-region, multi-master solution.
Table: collection of data in a particular topic.
- Item (row): collection of attribute.
- Attribute (column):
Need to enable DynamoDB Streams first.

Point-in-time Recovery

PITR protects against accidental write or delete operations.

Restore to any point in time, with per-second granularity
Covers the last 35 days, with no downtime

ACID transaction

DynamoDB transactions enable reading and writing multiple items across multiple tables as an all-or-nothing operation. A transaction can check a prerequisite condition before writing to a table.
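As an illustration of the all-or-nothing behaviour with a prerequisite condition, the sketch below builds the input for a `TransactWriteItems` call. The table names (`Inventory`, `Orders`), the attribute names, and the helper `build_order_transaction` are all hypothetical, chosen only for the example.

```python
def build_order_transaction(order_id, product_id, quantity):
    """Build TransactWriteItems input: record an order only if stock is sufficient."""
    qty = {"N": str(quantity)}
    return {
        "TransactItems": [
            {
                # Decrement stock, but only if enough units are left.
                "Update": {
                    "TableName": "Inventory",
                    "Key": {"ProductId": {"S": product_id}},
                    "UpdateExpression": "SET Stock = Stock - :qty",
                    "ConditionExpression": "Stock >= :qty",
                    "ExpressionAttributeValues": {":qty": qty},
                }
            },
            {
                # Create the order, but only if this OrderId is not taken.
                "Put": {
                    "TableName": "Orders",
                    "Item": {
                        "OrderId": {"S": order_id},
                        "ProductId": {"S": product_id},
                        "Quantity": qty,
                    },
                    "ConditionExpression": "attribute_not_exists(OrderId)",
                }
            },
        ]
    }

# Usage (assumes boto3 and existing tables with these hypothetical names):
#   import boto3
#   boto3.client("dynamodb").transact_write_items(
#       **build_order_transaction("o-1", "p-9", 2))
```

If either condition fails, neither write is applied, so the stock count and the order table cannot drift apart.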

Data plane API

PutItem: put a single item.
BatchWriteItem: put or delete up to 25 items.
GetItem: get a single item.
BatchGetItem: get up to 100 items from 1 or more tables.
UpdateItem: update one or more attributes of an item.
DeleteItem: delete a single item.
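Because BatchWriteItem accepts at most 25 items per call, larger lists have to be split. The following is a minimal sketch, assuming a boto3-style client whose `batch_write_item` returns `UnprocessedItems`. The helper name `batch_put` is hypothetical, and real code would pause between retries rather than loop immediately.

```python
def batch_put(client, table_name, items, batch_size=25):
    """Write items in chunks of at most 25, resending anything left unprocessed."""
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        pending = {table_name: [{"PutRequest": {"Item": item}} for item in chunk]}
        while pending:
            response = client.batch_write_item(RequestItems=pending)
            # An empty or missing UnprocessedItems map ends the retry loop.
            pending = response.get("UnprocessedItems") or {}

# Usage (assumes boto3 and an existing table):
#   import boto3
#   batch_put(boto3.client("dynamodb"), "ProductCatalog",
#             [{"Id": {"N": str(i)}} for i in range(60)])
```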

LSI vs GSI

| | LSI | GSI |
| --- | --- | --- |
| Scope | Same partition key as the base table | Different partition key from the base table |
| Querying | Can only query within the same partition | Can query across the entire table |
| Creation time | Must be created at the same time as the table | Can be created or modified after the table is created |
| Throughput | Shares throughput with the base table | Separate throughput from the base table |

Locking mechanisms

Databases employ locking mechanisms so that concurrent writers do not silently overwrite each other's changes and every update is applied to the latest version of the data. Different locking strategies suit different use cases. Some of these are:

Optimistic Locking: each item has an attribute that acts as a version number. When you retrieve an item from a table, the application records the version number of that item. You can update the item, but only if the version number on the server side has not changed.
Pessimistic Locking: an entity is locked in the database for the entire time an application is working on it, so nobody else can modify it in the meantime.
Overly Optimistic Locking: used for systems that have only one user or operation performing changes at a single time.
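The optimistic strategy can be sketched with a small in-memory model. This is only a simulation of the idea: the class `VersionedStore` and the exception `VersionConflict` are hypothetical names. In DynamoDB itself the same check would be expressed as a condition expression on the version attribute of the write.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the caller read."""


class VersionedStore:
    """In-memory stand-in for a table whose items carry a version number."""

    def __init__(self):
        self._items = {}

    def put_new(self, key, value):
        self._items[key] = {"value": value, "version": 1}

    def get(self, key):
        item = self._items[key]
        return item["value"], item["version"]

    def update(self, key, new_value, expected_version):
        item = self._items[key]
        # Reject the write if someone else updated the item after we read it.
        if item["version"] != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {item['version']}")
        item["value"] = new_value
        item["version"] += 1
        return item["version"]
```

Two clients that read the same version can both attempt an update, but only the first succeeds. The second receives `VersionConflict` and has to re-read the item before trying again.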

Read data

GetItem

aws dynamodb get-item \
    --table-name ProductCatalog \
    --key '{"Id":{"N":"1"}}' \
    --projection-expression "Description, RelatedItems[0], ProductReviews.FiveStar"

Query
Scan

Security

ACID transaction: native, server-side support for transactions
Encryption at Rest using KMS

Pricing

DynamoDB charges for reading, writing, and storing data in your DynamoDB tables, along with any optional features you choose to turn on.

Best practices

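Avoid using the Scan operation on a large table or index; prefer Query where possible. If you must scan, use a filter (--filter-expression) and a projection (--projection-expression) to return only the data you need. Note that the filter is applied after the items are read, so it does not reduce the capacity consumed.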

Turn on ConsistentRead if you want strongly consistent reads, because a recent PutItem or UpdateItem might not yet be reflected in every replica.
Cache popular items. Use DAX for caching reads.

Concepts

Table, item, attribute: the core components of DynamoDB.
- Table: collection of items.
- Item: collection of attributes.
Primary key: unique identifier of each item in the table.
- Partition key: mandatory. -> Hash function -> Hash key.
- Sort key (optional): additional key for querying data.
GSI (Global Secondary Index): can use a different partition key as well as a different sort key from the base table, to speed up queries on non-key attributes. All reads from GSIs and streams are eventually consistent.
LSI (Local Secondary Index)
- The same partition key as the base table.
- Both tables and LSIs provide two read consistency options: eventually consistent (default) and strongly consistent reads.
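The "partition key -> hash function -> hash key" step in the primary key notes can be illustrated with a toy model. DynamoDB's real hash function and partition layout are internal, so the use of MD5 and a fixed partition count here is purely an assumption for illustration; `partition_for` and `distribution` are hypothetical helpers.

```python
import hashlib
from collections import Counter


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index with a stand-in hash (MD5)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partition_count


def distribution(keys, partition_count):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partition_count) for key in keys)
```

Feeding `distribution` a list where one key dominates shows every request for that key landing on the same partition, which is the hot-key situation a well-chosen partition key avoids.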
WCU (write capacity unit):
- 1 API call that writes data to your table = 1 write request
- For 1 item up to 1 KB in size
  - 1 WCU = 1 standard write request / sec
  - 1 WCU = 0.5 transactional write request / sec, i.e. 1 transactional write requires 2 WCUs.
RCU (read capacity unit):
- 1 API call that reads data from your table = 1 read request (strongly consistent, eventually consistent, or transactional).
- For 1 item up to 4 KB in size
  - 1 RCU = 1 strongly consistent read request / sec
  - 1 RCU = 2 eventually consistent read requests / sec
  - 1 RCU = 0.5 transactional read request / sec
- 1 RCU = 4 KB/sec. 1 WCU = 1 KB/sec -> with one unit of each, in one second you can read 4 KB (strongly consistent) but write only 1 KB.
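The arithmetic above can be checked with a small calculator. This is a sketch based on the rules in these notes (item sizes round up to the next 4 KB block for reads and the next 1 KB block for writes). The function names are hypothetical.

```python
import math


def read_capacity_units(item_size_kb, mode="strong"):
    """RCUs for one read per second of an item of the given size."""
    blocks = math.ceil(item_size_kb / 4)  # reads are billed in 4 KB blocks
    per_block = {"strong": 1.0, "eventual": 0.5, "transactional": 2.0}[mode]
    return blocks * per_block


def write_capacity_units(item_size_kb, transactional=False):
    """WCUs for one write per second of an item of the given size."""
    blocks = math.ceil(item_size_kb)      # writes are billed in 1 KB blocks
    return blocks * (2 if transactional else 1)
```

For a 6 KB item, a strongly consistent read rounds up to two 4 KB blocks and costs 2 RCUs, an eventually consistent read costs 1 RCU, and a transactional read costs 4 RCUs. A 2.5 KB write rounds up to 3 WCUs, or 6 WCUs inside a transaction.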
PartiQL: SQL-compatible query language that makes it easier to interact with data in AWS services like Amazon DynamoDB, S3 Select, and Glacier Select.
Composite key = Partition key + Sort key.
Throttled: occurs when the configured RCU or WCU is exceeded (ProvisionedThroughputExceededException). Reasons for this exception are:
- request rate > provisioned throughput
- wrong choice of partition key -> uneven distribution of data
- frequent access to the same key in a partition -> hot key, if your access pattern exceeds 3000 RCU or 1000 WCU on a single partition, regardless of the capacity mode (provisioned or on-demand)

The AWS SDKs for DynamoDB automatically retry requests that receive this exception, so your request eventually succeeds unless the retry queue is too large to finish.
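The retry behaviour can be pictured as exponential backoff with jitter. The sketch below is not the SDK's actual implementation: `ThrottledError` is a hypothetical stand-in for ProvisionedThroughputExceededException, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying throttled attempts with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # Double the ceiling on each attempt, then pick a random delay under it.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. The random jitter keeps many throttled clients from retrying at the same instant.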

Trivia

DAX is an in-memory acceleration service that accelerates DynamoDB tables. DAX cannot be used with other databases.
DynamoDB can support tables of virtually any size.
DynamoDB can scale to > 10 trillion requests / day with > 20 million requests / sec.
A single DynamoDB Scan call can retrieve at most 1 MB of data.
The maximum size of an item in a DynamoDB table is 400 KB.
The Limit parameter of a Query or Scan is not the number of matching items. It is the maximum number of items to evaluate, before any filter is applied. 😄
Each table can have up to 20 GSI and 5 LSI (default quota).
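With the legacy global tables version (2017.11.29), you can add a replica only when the table is empty, so do it before inserting any data. The current version (2019.11.21) does not have this restriction.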
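The Limit behaviour mentioned in the trivia is easy to get wrong, so here is a toy model of it in plain Python. It does not call DynamoDB. `query_page` is a hypothetical function that only mimics the order of operations: cut off at the limit first, filter afterwards.

```python
def query_page(items, limit, predicate):
    """Evaluate at most `limit` items, then apply the filter to those only."""
    evaluated = items[:limit]
    return [item for item in evaluated if predicate(item)]


numbers = list(range(1, 11))  # ten items: 1..10
evens = query_page(numbers, limit=5, predicate=lambda n: n % 2 == 0)
print(evens)  # only the first five items were evaluated
```

Although five of the ten items are even, a limit of 5 returns just the even numbers among the first five items evaluated, so a page can contain fewer matches than the limit, or none at all.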


Last updated 3 months ago