DynamoDB
NoSQL key-value database
Fully managed NoSQL (schemaless) key-value database.
Fast performance with seamless scalability (scale up or down without downtime).
There are 2 ways to search data:
Query: must specify the partition key; a sort key condition is optional.
Scan: reads the ENTIRE table and returns all attributes by default. An optional filter expression is applied after the items are read.
Scan results are divided into pages. Each page holds at most 1 MB of data.
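The paging behaviour can be sketched in Python. This is a minimal sketch that does not touch a live table: `scan_all` accepts any callable shaped like boto3's `client.scan` (a response dict with `Items` and, while more pages remain, `LastEvaluatedKey`). `make_fake_scan` is a hypothetical in-memory stand-in that pages by item count rather than by the 1 MB size limit.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def scan_all(scan: Callable[..., Page], **kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item from a paginated Scan-style call.

    Each response holds one page in "Items". If more pages remain, it also
    holds "LastEvaluatedKey", which is passed back as "ExclusiveStartKey".
    """
    while True:
        page = scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return
        kwargs["ExclusiveStartKey"] = last_key


def make_fake_scan(items: List[Dict[str, Any]], page_size: int) -> Callable[..., Page]:
    """Hypothetical stand-in for client.scan, for illustration only."""

    def fake_scan(**kwargs: Any) -> Page:
        start = kwargs.get("ExclusiveStartKey", {}).get("index", 0)
        end = start + page_size
        page: Page = {"Items": items[start:end]}
        if end < len(items):
            page["LastEvaluatedKey"] = {"index": end}
        return page

    return fake_scan


table = [{"pk": f"user#{i}"} for i in range(5)]
all_items = list(scan_all(make_fake_scan(table, page_size=2), TableName="Users"))
```

With a real client, the same loop would be called as `scan_all(client.scan, TableName=...)`; the table name `Users` here is an assumption.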
Lower cost for infrequently accessed (IA) data, such as:
Application logs
E-commerce order history
Old social media posts
Past gaming achievements
That means no server to provision, patch, or manage. No software to install, maintain or operate.
Provisioned: you set the read/write capacity yourself.
On-demand: for less predictable workloads; pay for what you consume.
Automatically scales up and down:
Throughput
Storage
Integrates with AWS Lambda, acting as the database for your functions.
Integrates with OpenSearch Service: use DynamoDB data as a source in Amazon OpenSearch Ingestion to automatically replicate your table data to OpenSearch Service indexes.
Standard DynamoDB has low latency (milliseconds), but DAX can provide microsecond latency.
In-memory cache, designed specifically for DynamoDB.
Can improve read performance up to 10x. Supports millions of requests / sec.
Use case:
Improve performance of READ-heavy or bursty workloads. If you want to handle heavy WRITE traffic, put SQS in front of DynamoDB instead.
Captures item-level changes (PutItem, UpdateItem, or DeleteItem) in your table and pushes them to a DynamoDB stream, which keeps the records in a log for 24 hours. In plain English, if your data is modified, DynamoDB will notify you. You can then process the stream with a Lambda function.
How does it work?
Associate the stream's ARN with a Lambda function.
Lambda polls the stream and invokes the function synchronously when it detects new stream records.
An actual modification must be made to an item for it to count as an event. If you send an update request that does not change anything, DynamoDB does not write a stream record.
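A Lambda function that consumes the stream can be sketched as follows. This is a minimal sketch: the `Records`, `eventName`, and `dynamodb.Keys` fields follow the stream event shape Lambda receives, while `sample_event` is a hypothetical, trimmed event used only to exercise the handler locally.

```python
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Count stream records by event type (INSERT, MODIFY, REMOVE)."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record["eventName"]
        keys = record["dynamodb"]["Keys"]  # key attributes of the changed item
        counts[event_name] = counts.get(event_name, 0) + 1
        print(f"{event_name} on item with keys {keys}")
    return counts


# Hypothetical sample event, reduced to the fields the handler reads.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"pk": {"S": "user#1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"pk": {"S": "user#1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"pk": {"S": "user#2"}}}},
    ]
}
result = handler(sample_event, None)
```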
Use case:
An app in one region modifies data in a DynamoDB table; another app in another region reads the changes to update another table or to compute statistics on the data.
An app sends a notification to all users as soon as a new item is added to the table.
When an item in the table is modified, StreamViewType determines what information is written to the stream for this table. If you do not want to expose any PII to the stream, you can use KEYS_ONLY.
KEYS_ONLY: only the key attributes (partition key + sort key) of the modified item are captured.
NEW_IMAGE: captures the item as it appears after the change.
OLD_IMAGE: captures the item as it appeared before the change.
NEW_AND_OLD_IMAGES: captures both the old and the new version of the item.
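The effect of each view type can be illustrated with a small simulation. This is a sketch, not DynamoDB code: `stream_payload` is a hypothetical helper that mimics which of the `Keys`, `NewImage`, and `OldImage` sections a stream record would carry for a given StreamViewType.

```python
from typing import Any, Dict, List, Optional

Item = Dict[str, Any]
# The four values accepted by StreamViewType (the old-value view is OLD_IMAGE).
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_payload(view_type: str, key_names: List[str],
                   old_item: Optional[Item], new_item: Optional[Item]) -> Dict[str, Item]:
    """Return the sections a stream record would contain (simulation only)."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown StreamViewType: {view_type}")
    source = new_item if new_item is not None else old_item
    if source is None:
        raise ValueError("need at least one of old_item / new_item")
    # Key attributes are always included, whatever the view type.
    payload: Dict[str, Item] = {"Keys": {name: source[name] for name in key_names}}
    if new_item is not None and view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        payload["NewImage"] = dict(new_item)
    if old_item is not None and view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        payload["OldImage"] = dict(old_item)
    return payload


old = {"pk": "user#1", "email": "old@example.com"}
new = {"pk": "user#1", "email": "new@example.com"}
keys_only = stream_payload("KEYS_ONLY", ["pk"], old, new)
both = stream_payload("NEW_AND_OLD_IMAGES", ["pk"], old, new)
```

Here `keys_only` contains no email address at all, which is why KEYS_ONLY suits tables holding PII.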
Multi-region, multi-master solution.
Table: collection of data on a particular topic.
Item (row): collection of attributes.
Attribute (column): a single data element of an item.
DynamoDB Streams must be enabled first.
PITR (point-in-time recovery) protects against accidental write or delete operations.
Recover to any point in time, down to the second.
Covers the last 35 days, with no downtime.
DynamoDB transactions enable reading and writing multiple items across multiple tables as an all-or-nothing operation. A transaction can check a prerequisite condition before writing to a table.
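A write transaction with a prerequisite check might be assembled like this. This is a sketch under assumptions: the tables `Customers`, `Orders`, and `Inventory` and their attributes are hypothetical, and the list is only built, not sent. With boto3 it would be passed as `client.transact_write_items(TransactItems=items)`.

```python
from typing import Any, Dict, List


def build_order_transaction(order_id: str, customer_id: str, sku: str) -> List[Dict[str, Any]]:
    """Build the TransactItems list for placing one order (all or nothing)."""
    return [
        # Prerequisite: the customer must exist. Nothing is written here.
        {"ConditionCheck": {
            "TableName": "Customers",
            "Key": {"CustomerId": {"S": customer_id}},
            "ConditionExpression": "attribute_exists(CustomerId)",
        }},
        # Create the order, but only if the order id is not already taken.
        {"Put": {
            "TableName": "Orders",
            "Item": {"OrderId": {"S": order_id}, "CustomerId": {"S": customer_id}},
            "ConditionExpression": "attribute_not_exists(OrderId)",
        }},
        # Decrement stock, but only if at least one unit is left.
        {"Update": {
            "TableName": "Inventory",
            "Key": {"Sku": {"S": sku}},
            "UpdateExpression": "SET Stock = Stock - :qty",
            "ConditionExpression": "Stock >= :qty",
            "ExpressionAttributeValues": {":qty": {"N": "1"}},
        }},
    ]


items = build_order_transaction("order#42", "customer#7", "sku#abc")
operations = [next(iter(entry)) for entry in items]
```

If any of the three conditions fails, none of the writes is applied. Each operation targets a different item, since a single transaction cannot contain two operations on the same item.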
PutItem: put a single item.
BatchWriteItem: put or delete up to 25 items.
GetItem: get a single item.
BatchGetItem: get up to 100 items from one or more tables.
UpdateItem: update one or more attributes of an item.
DeleteItem: delete a single item.
Feature | LSI | GSI |
---|---|---|
Scope | Same partition key from the base table | Different partition key from the base table |
Querying | Can only query within the same partition | Can query across the entire table |
Creation time | Must be created at the same time as the table | Can be created or modified after the table is created |
Throughput | Shares throughput with the base table | Separate throughput from the base table |
Databases employ locking mechanisms to keep data consistent when it is updated concurrently. There are several locking strategies that suit different use cases. Some of these are:
Optimistic Locking: each item has an attribute that acts as a version number. When you retrieve an item from a table, the application records the version number of that item. You can update the item, but only if the version number on the server side has not changed.
Pessimistic Locking: an entity is locked in the database for the entire time a client is working with it.
Overly Optimistic Locking: used for systems in which only one user or operation makes changes at any given time.
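Optimistic locking can be sketched with an in-memory table. This is an illustration, not DynamoDB code: `FakeTable` and `ConditionalCheckFailed` are hypothetical stand-ins. Against a real table, the version check would be expressed as a ConditionExpression on the write.

```python
from typing import Any, Dict


class ConditionalCheckFailed(Exception):
    """Hypothetical stand-in for a failed conditional write."""


class FakeTable:
    """In-memory table that stores a version number on every item."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, attrs: Dict[str, Any]) -> None:
        self._items[key] = {**attrs, "version": 1}

    def get(self, key: str) -> Dict[str, Any]:
        return dict(self._items[key])  # copy, like a client-side snapshot

    def update_if_version(self, key: str, attrs: Dict[str, Any], expected: int) -> None:
        current = self._items[key]
        if current["version"] != expected:
            raise ConditionalCheckFailed(f"expected v{expected}, found v{current['version']}")
        self._items[key] = {**current, **attrs, "version": expected + 1}


table = FakeTable()
table.put("item#1", {"price": 10})

# Two clients read the same version of the item.
seen_by_a = table.get("item#1")
seen_by_b = table.get("item#1")

table.update_if_version("item#1", {"price": 12}, seen_by_a["version"])  # succeeds
try:
    table.update_if_version("item#1", {"price": 8}, seen_by_b["version"])
    b_succeeded = True
except ConditionalCheckFailed:
    b_succeeded = False  # client B must re-read and retry

final = table.get("item#1")
```

Client B's stale write is rejected instead of silently overwriting client A's change.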
GetItem
Query
Scan
ACID transaction: native, server-side support for transactions
Encryption at Rest using KMS
DynamoDB charges for reading, writing, and storing data in your DynamoDB tables, along with any optional features you choose to turn on.
Avoid Scan operations on large tables or indexes. Use a filter (--filter-expression) and a projection (--projection-expression) to return only the data you need.
Turn on ConsistentRead if you want strongly consistent reads, because a recent PutItem or UpdateItem might not yet be reflected in every replica.
Cache popular items. Use DAX for caching reads.
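The filter, projection, and ConsistentRead settings can be combined in a single request. The sketch below only builds the keyword arguments for a boto3 `client.query` call. The table name `Orders` and the attributes `CustomerId`, `OrderId`, `TotalCents`, and `Status` are assumptions made for illustration.

```python
from typing import Any, Dict


def build_query_kwargs(customer_id: str, strongly_consistent: bool = False) -> Dict[str, Any]:
    """Build arguments for a Query that returns only shipped orders."""
    return {
        "TableName": "Orders",
        # Narrows the read to one partition (this is what makes it a Query).
        "KeyConditionExpression": "CustomerId = :cid",
        # Applied after the items are read, so it trims the response only.
        "FilterExpression": "#st = :status",
        # Return just these attributes instead of whole items.
        "ProjectionExpression": "OrderId, TotalCents",
        # "Status" is a reserved word, so it needs a placeholder name.
        "ExpressionAttributeNames": {"#st": "Status"},
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":status": {"S": "SHIPPED"},
        },
        "ConsistentRead": strongly_consistent,
    }


eventual = build_query_kwargs("customer#7")
strong = build_query_kwargs("customer#7", strongly_consistent=True)
```

A strongly consistent read costs twice the read capacity of an eventually consistent one, so `ConsistentRead` is left off unless the caller asks for it.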
Table, item, and attribute are the core components of DynamoDB.
Table: collection of items.
Item: collection of attributes.
Primary key: uniquely identifies each item in a table.
GSI (Global Secondary Index): uses a partition key, and optionally a sort key, that can differ from the base table's, to speed up queries on non-key attributes. All reads from GSIs and streams are eventually consistent.
Partition key: mandatory. Its value is passed through a hash function to decide which partition stores the item, hence the name hash key.
Sort key (optional): an additional key for querying data.
LSI (Local Secondary Index)
Uses the same partition key as the base table, but a different sort key.
Both tables and LSIs provide two read consistency options: eventually consistent (default) and strongly consistent reads.
WCU (write capacity unit):
1 API call that writes data to your table = 1 write request.
For 1 item up to 1 KB in size:
1 WCU = 1 standard write request / sec
1 WCU = 0.5 transactional write requests / sec, i.e. 1 transactional write requires 2 WCUs.
RCU (read capacity unit):
1 API call that reads data from your table = 1 read request (strongly consistent, eventually consistent, or transactional).
For 1 item up to 4 KB in size:
1 RCU = 1 strongly consistent read request / sec
1 RCU = 2 eventually consistent read requests / sec
1 RCU = 0.5 transactional read requests / sec
1 RCU = 4 KB/sec and 1 WCU = 1 KB/sec, so in one second a single unit lets you read 4 KB (strongly consistent) but write only 1 KB.
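The capacity arithmetic above can be turned into a small calculator. This is a sketch: the function names and the `mode` strings are made up for illustration, and item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) boundary.

```python
import math


def read_capacity_units(item_size_kb: float, reads_per_sec: int, mode: str = "strong") -> int:
    """RCUs needed for a steady read rate of items of the given size."""
    units_per_read = math.ceil(item_size_kb / 4)  # reads are billed in 4 KB steps
    multiplier = {"strong": 1.0, "eventual": 0.5, "transactional": 2.0}[mode]
    return math.ceil(units_per_read * reads_per_sec * multiplier)


def write_capacity_units(item_size_kb: float, writes_per_sec: int,
                         transactional: bool = False) -> int:
    """WCUs needed for a steady write rate of items of the given size."""
    units_per_write = math.ceil(item_size_kb)  # writes are billed in 1 KB steps
    return units_per_write * writes_per_sec * (2 if transactional else 1)


# 10 reads/sec of 6 KB items: each read spans two 4 KB units.
rcu_strong = read_capacity_units(6, 10, "strong")
rcu_eventual = read_capacity_units(6, 10, "eventual")
rcu_transactional = read_capacity_units(6, 10, "transactional")

# 10 writes/sec of 2.5 KB items: each write spans three 1 KB units.
wcu_standard = write_capacity_units(2.5, 10)
wcu_transactional = write_capacity_units(2.5, 10, transactional=True)
```

For these inputs the results are 20, 10, and 40 RCUs, and 30 and 60 WCUs.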
PartiQL: SQL-compatible query language that makes it easier to interact with data in AWS services like Amazon DynamoDB, S3 Select, and Glacier Select.
Composite key = Partition key + Sort key.
Throttling: occurs when the configured RCU or WCU is exceeded; DynamoDB returns ProvisionedThroughputExceededException. Reasons for this exception are:
Request rate > provisioned throughput.
Poor choice of partition key -> uneven distribution of data.
Frequent access to the same key in a partition -> hot key. A single partition supports at most 3,000 RCU and 1,000 WCU, regardless of the capacity mode (provisioned or on-demand).
The AWS SDKs for DynamoDB automatically retry requests that receive this exception, so your request eventually succeeds unless the retry queue is too large to finish.
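The retry behaviour can be approximated with exponential backoff and jitter. This is a generic sketch, not the SDKs' actual retry code: `ThroughputExceeded` is a hypothetical stand-in for the throttling exception, and the delay values are arbitrary.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 0.05, max_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry a throttled operation, doubling the delay cap on each attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # full jitter spreads out retries
    raise AssertionError("unreachable")


# Demo: an operation that is throttled twice, then succeeds.
calls = {"n": 0}


def flaky_write() -> str:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ThroughputExceeded()
    return "ok"


delays: List[float] = []
outcome = call_with_backoff(flaky_write, sleep=delays.append)  # record, do not sleep
```

Passing `sleep=delays.append` keeps the demo instant while still showing how many waits occurred.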
DAX is an in-memory acceleration service that accelerates DynamoDB tables. DAX cannot be used with other databases.
DynamoDB can support tables of virtually any size.
DynamoDB can scale to more than 10 trillion requests / day, with peaks of more than 20 million requests / sec.
A single DynamoDB Scan call can retrieve at most 1 MB of data.
The maximum size of an item in a DynamoDB table is 400 KB.
The Limit parameter of a Query or Scan request is not the number of matching items. It is the maximum number of items to evaluate. 😄
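The difference between evaluated and matching items can be shown with a small simulation. This is a sketch: `query_page` is a hypothetical in-memory helper, and only the response field names `Items`, `Count`, and `ScannedCount` mirror the real API.

```python
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]


def query_page(items: List[Item], limit: int,
               matches_filter: Callable[[Item], bool]) -> Dict[str, Any]:
    """Simulate one page: apply the limit first, then the filter."""
    evaluated = items[:limit]  # the limit caps how many items are read
    matched = [item for item in evaluated if matches_filter(item)]
    return {"Items": matched, "Count": len(matched), "ScannedCount": len(evaluated)}


orders = [
    {"id": 1, "status": "PENDING"},
    {"id": 2, "status": "SHIPPED"},
    {"id": 3, "status": "PENDING"},
    {"id": 4, "status": "SHIPPED"},
    {"id": 5, "status": "SHIPPED"},
]

page = query_page(orders, limit=3, matches_filter=lambda o: o["status"] == "SHIPPED")
```

Although three orders are shipped, a limit of 3 evaluates only the first three items, so the page holds one match (`Count` is 1, `ScannedCount` is 3). The remaining matches arrive on later pages.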
Each table can have up to 20 GSI and 5 LSI (default quota).
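With the legacy version of global tables (2017.11.29), you can add a replica only when the table is empty, so do it before inserting any data. The current version (2019.11.21) lets you add replicas to a table that already contains data.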