S3
Overview
Features
Supports LIST/GET/PUT/COPY/POST/DELETE operations.
S3 - Static Website Hosting
S3 can host static websites and make them accessible on the Internet.
The website endpoint has the form: http://bucket-name.s3-website-aws-region.amazonaws.com (some Regions use s3-website.aws-region with a dot instead)
No SSL: website endpoints serve HTTP only.
The endpoint link includes the bucket name & the region name.
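A minimal sketch of how the endpoint is assembled from the bucket name and Region. The function name `website_endpoint` and the `dash_style` flag are hypothetical helpers for illustration; which separator a given Region uses should be checked against the AWS endpoint list.

```python
def website_endpoint(bucket_name: str, region: str, dash_style: bool = True) -> str:
    """Build the static website endpoint for a bucket.

    dash_style=True  -> s3-website-<region> (the form shown in these notes)
    dash_style=False -> s3-website.<region> (the form some Regions use)
    """
    separator = "-" if dash_style else "."
    # Website endpoints are HTTP only, so the scheme is always http://
    return f"http://{bucket_name}.s3-website{separator}{region}.amazonaws.com"


print(website_endpoint("my-site", "us-east-1"))
# http://my-site.s3-website-us-east-1.amazonaws.com
```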
S3 - Bucket Versioning
Helps you recover from accidental changes, such as overwrites & deletes.
Can be enabled at the bucket level.
Turning ON versioning is a best practice.
Once enabled, versioning cannot be disabled, only suspended.
Before versioning is enabled, the Version ID of every object is null.
After enabling Bucket Versioning, you might need to update your lifecycle rules to manage previous versions of objects -> versioning only applies to newly created objects.
After suspending Bucket Versioning:
Lifecycle rules set for previous object versions will still apply.
Existing objects in your bucket do not change.
A newly added object gets a null Version ID and replaces an existing object with the same name and a null Version ID.
S3 - Analytics
Used to analyze storage access patterns to help you decide when to transition the right data to the right storage class.
S3 - Replication
Versioning MUST be enabled on both the source & destination buckets.
Replication is ASYNC.
S3 - CORS
CORS headers must be enabled if a client makes cross-origin requests.
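As an illustration, a CORS configuration could be written as the dictionary below, in the shape the PutBucketCors API accepts. The origin `https://www.example.com` and the chosen methods are assumptions for the example, not values from these notes.

```python
import json

# Hypothetical rule: let pages served from one origin read objects with GET/HEAD.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],  # assumed client origin
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,  # how long browsers may cache the preflight response
        }
    ]
}

print(json.dumps(cors_configuration, indent=2))
```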
S3 - Event Notifications
Receive notifications when an event happens in your S3 bucket.
S3 - Pre-signed URLs
Provide a temporary access URL to objects in your private bucket.
URL expiration depends on how you generate the URL:
S3 console: 1 min ~ 12 hours
AWS CLI: 3,600 sec by default, up to 168 hours (7 days)
Use cases: allow only logged-in users to see your premium videos.
S3 - Object Lock
Object versioning must be enabled.
Blocks deletion of an object version for a specified amount of time.
Retention mode
Compliance (strict mode): an object version can't be overwritten or deleted by any user, even the root user.
Governance (softer mode): most users can't overwrite or delete an object version; users with special permissions still can.
S3 Glacier - Vault Lock
See Vault Lock.
S3 Storage Lens
A fully managed S3 storage analytics solution that provides a comprehensive view of
object storage usage
activity trends
recommendations to optimize costs.
Storage Lens allows you to analyze object access patterns across all of your S3 buckets and generate detailed metrics and reports.
S3 - Object Lambda
Allows you to add your own code to S3 GET requests to modify and process data as it's being returned to an application.
Use cases: data that needs to be transformed on the fly, such as redacting PII from the data.
Pricing
S3 Classes
Classes are listed in order of storage price, most expensive first.
Standard
Standard Infrequent Access (Standard-IA)
lower cost than Standard
use cases: disaster recovery, backups
Intelligent-Tiering: automatically moves your data to an infrequent access tier based on access patterns
One Zone-IA
11 9's of durability in a single AZ, but data is lost when the AZ is destroyed.
use cases: storing backup copies of on-premises data, or data that can be recreated.
Glacier: low-cost storage used for archiving/backup
Glacier Instant Retrieval
Glacier Flexible Retrieval
Glacier Deep Archive: long-term storage
S3 Lifecycle policy
Helps optimize S3 storage cost.
Ex: transition all objects of a bucket from the Standard class -> Standard-IA 6 months after upload.
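The example rule above could be sketched as the following lifecycle configuration, in the shape the PutBucketLifecycleConfiguration API accepts. The rule ID is a hypothetical name, and "6 months" is approximated as 180 days, which is an assumption.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "standard-to-ia-after-6-months",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = every object in the bucket
            "Transitions": [
                # 6 months approximated as 180 days after object creation
                {"Days": 180, "StorageClass": "STANDARD_IA"}
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```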
Storage class analysis
Helps decide when to transition objects to the right class.
Gives recommendations for Standard & Standard-IA. Does not work for One Zone-IA or Glacier.
S3 Requester Pays
Performance
S3 scales per prefix. Requests per second per prefix:
3,500 PUT/COPY/POST/DELETE
5,500 GET/HEAD
Latency between 100-200 ms
Supports a FOLDER concept to group objects.
Use as many prefixes as possible to achieve the required throughput and desired performance.
Objects may be replicated across AZs, but within a single Region. One Zone-IA objects are stored in 1 AZ.
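A rough sizing sketch based on the per-prefix limits above. It assumes requests are spread evenly across prefixes, which real workloads may not achieve; `prefixes_needed` is a hypothetical helper.

```python
import math

PUT_RPS_PER_PREFIX = 3_500  # PUT/COPY/POST/DELETE per second per prefix
GET_RPS_PER_PREFIX = 5_500  # GET/HEAD per second per prefix


def prefixes_needed(target_get_rps: int, target_put_rps: int = 0) -> int:
    """Smallest number of prefixes that covers both targets, assuming even load."""
    for_gets = math.ceil(target_get_rps / GET_RPS_PER_PREFIX)
    for_puts = math.ceil(target_put_rps / PUT_RPS_PER_PREFIX)
    return max(1, for_gets, for_puts)


print(prefixes_needed(target_get_rps=22_000))                        # 4
print(prefixes_needed(target_get_rps=20_000, target_put_rps=20_000))  # 6
```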
S3 - Transfer acceleration
A bucket-level feature.
Use S3 Transfer Acceleration to enable fast, easy and secure transfer of files over LONG distances. Files are sent to an AWS edge location and routed from there to the target S3 bucket. -> Speeds up transfers by 50-500%.
Use cases:
Need to collect data from various locations.
S3 Select & Glacier Select
Uses server-side filtering (simple SQL) for better performance, less data transfer, and lower CPU cost at the client. S3 Select lets you query only the necessary data inside CSV files, identified by the bucket's name and the object's key.
For more complex queries, consider using Athena.
Supports CSV, JSON, and Parquet.
Use cases: retrieve only a subset of data, best for simple SQL (no JOIN, no function, no array...)
S3 Batch operation
Perform batch operations on existing S3 objects.
S3 Inventory
S3 Inventory provides a report of your S3 objects and their corresponding metadata on a daily or weekly basis for a specified S3 bucket or a shared prefix.
These reports include the type of server-side encryption each object is using, along with its replication status.
Features
Configurable to include all or specific object versions.
Can report on various metadata fields such as size, last modified date, storage class, and encryption status.
Supports output in CSV, ORC, and Parquet formats, enabling straightforward integration with analytics tools.
Security
User-based: IAM policies
Resource-based:
Bucket policy: ex: allow cross-account access
Object ACL (can be disabled)
Bucket ACL (can be disabled)
Encryption of data at rest:
SSE-S3
SSE-KMS
SSE-C (customer-provided key)
Encryption of data in transit: TLS
Strong read-after-write consistency: you can access the most recent data immediately after a write (create or overwrite).
S3 - MFA delete
To avoid accidental deletion in S3 bucket:
Enable versioning
Enable MFA delete
S3 - Encryption
Method | Key management | Encryption process | Extras |
---|---|---|---|
Client-side | you | you | None |
SSE-C | you | S3 | None |
SSE-S3 | S3 | S3 (AES-256) | |
SSE-KMS | S3 & KMS | S3 | |
With SSE-C, you can download an encrypted object through the GetObject API by providing the S3 object key and the encryption key.
Amazon S3 now applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for every bucket in Amazon S3. You only need to use the x-amz-server-side-encryption header if you want to override the default SSE-S3 encryption and use a different encryption option like SSE-KMS or SSE-C.
Note
Server-side encryption encrypts only the object data, not the object metadata.
SSE-KMS supports symmetric keys, not asymmetric keys.
SSE-KMS limitation
KMS limits the number of requests per second.
On download, S3 calls the KMS Decrypt API.
On upload, S3 calls the KMS GenerateDataKey API.
The KMS quota differs between Regions: 5,500, 10,000, or 30,000 requests per second.
Request a quota increase in the Service Quotas console.
Best practices
Encrypt your data, either on the client side or on the server side.
Turn on versioning.
Consider using multipart uploads (divide into parts & upload in parallel) for objects that are over 100 MB.
If > 5 GB, multipart upload is a MUST.
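The multipart thresholds above can be sketched as a small planning helper. The 100 MB and 5 GB thresholds come from these notes. The 10,000-part cap and 5 MiB minimum part size are S3 multipart limits not listed here. Binary units (MiB/GiB) and the default 100 MiB part size are assumptions, and `upload_plan` is a hypothetical name.

```python
MB = 1024 ** 2
GB = 1024 ** 3

RECOMMENDED_THRESHOLD = 100 * MB  # above this, multipart is recommended
REQUIRED_THRESHOLD = 5 * GB       # above this, a single PUT is not possible
MAX_PARTS = 10_000                # S3 multipart limit (not from these notes)
MIN_PART_SIZE = 5 * MB            # S3 minimum for every part except the last


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return (a + b - 1) // b


def upload_plan(object_size: int, part_size: int = 100 * MB) -> dict:
    """Pick an upload strategy and part count for an object of object_size bytes."""
    if object_size <= RECOMMENDED_THRESHOLD:
        return {"strategy": "single PUT", "parts": 1}
    # Grow the part size if needed so the object fits in MAX_PARTS parts.
    part_size = max(part_size, MIN_PART_SIZE, ceil_div(object_size, MAX_PARTS))
    required = object_size > REQUIRED_THRESHOLD
    strategy = "multipart (required)" if required else "multipart (recommended)"
    return {"strategy": strategy, "parts": ceil_div(object_size, part_size)}


print(upload_plan(50 * MB))
print(upload_plan(1 * GB))
print(upload_plan(6 * GB))
```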
Trivia
Bucket names are GLOBALLY unique, but buckets are defined at the REGIONAL level.
The maximum size of an object is 5 TB.
Multipart upload is recommended for files > 100 MB. If a file is larger than 5 GB, multipart upload is a must.
S3 provides no API that can search for objects based on object metadata.
Naming convention
No UPPERCASE, no underscore
3-63 chars long
Not an IP
Must start with lowercase letter or number
Must NOT start with prefix xn--
Must NOT end with suffix -s3alias
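The naming rules listed above can be checked with a short validator. This sketch covers only those rules. The allowed character set (lowercase letters, digits, dots, hyphens) is an assumption taken from the general S3 naming rules rather than from these notes, and `is_valid_bucket_name` is a hypothetical helper.

```python
import re

# Starts with a lowercase letter or digit, 3-63 characters in total.
# Assumed character set: lowercase letters, digits, dots, hyphens.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")
# Four dot-separated groups of digits, i.e. formatted like an IPv4 address.
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the rules listed in these notes."""
    if not _NAME_RE.fullmatch(name):
        return False  # wrong length, uppercase, underscore, or bad first character
    if _IP_RE.fullmatch(name):
        return False  # must not look like an IP address
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False  # reserved prefix / suffix
    return True


for candidate in ["my-bucket-01", "My_Bucket", "ab", "192.168.0.1", "xn--bucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```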
Use Athena to query data in S3 with standard SQL.
To achieve more requests per second, increase the number of prefixes in your bucket.
Adding a "aws:SecureTransport":"false" condition with a Deny effect to a bucket policy forces requests to use SSL or TLS -> forces HTTPS.
When an application running on an EC2 instance calls the S3 ListObjects API, IAM allows the request if the attached role has the s3:ListBucket permission for the relevant bucket.
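The `aws:SecureTransport` rule mentioned above might look like the bucket policy below, built here as a Python dictionary. The bucket name and the `Sid` are placeholders chosen for illustration.

```python
import json

BUCKET = "amzn-s3-demo-bucket"  # placeholder bucket name

https_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",  # hypothetical statement ID
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",    # the bucket itself
                f"arn:aws:s3:::{BUCKET}/*",  # every object in the bucket
            ],
            # Matches requests that did NOT arrive over SSL/TLS, so they are denied.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

print(json.dumps(https_only_policy, indent=2))
```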
Concepts
Key: FULL path of an object within its bucket
ex: in s3://mybucket/myfolder/subfolder/myfile.txt, the key is myfolder/subfolder/myfile.txt
Bucket policy: JSON-based access policy that determines who can access your bucket & what operations they can perform.
Hot storage: frequently accessed data
Warm storage: less frequently accessed data
Cold storage: rarely accessed data
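To make the Key concept concrete, the sketch below splits an `s3://` URI into its bucket and key, then splits the key into prefix and object name. `split_s3_uri` is a hypothetical helper, not part of any AWS SDK.

```python
def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key...' into (bucket, key)."""
    scheme, separator, rest = uri.partition("://")
    if scheme != "s3" or not separator:
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, key


bucket, key = split_s3_uri("s3://mybucket/myfolder/subfolder/myfile.txt")
prefix, _, object_name = key.rpartition("/")

print(bucket)       # mybucket
print(key)          # myfolder/subfolder/myfile.txt
print(prefix)       # myfolder/subfolder
print(object_name)  # myfile.txt
```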