S3


Overview

Features

Supports LIST/GET/PUT/COPY/POST/DELETE operations.

S3 - Static Website Hosting

S3 can host static websites and make them accessible on the Internet.
The website endpoint will be: http://bucket-name.s3-website-aws-region.amazonaws.com
- No SSL: website endpoints support HTTP only.
- The endpoint includes the bucket name and the Region name.
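
As a minimal sketch of the endpoint format above, the function below builds the URL from a bucket name and Region. The function name and the sample values are hypothetical. It assumes the dash form (`s3-website-region`) shown in these notes; some Regions use a dot (`s3-website.region`) instead, so check the endpoint listed for your Region.

```python
def website_endpoint(bucket_name: str, region: str) -> str:
    """Build a static-website endpoint in the dash form used in these notes."""
    # Website endpoints are HTTP only, so the scheme is always http://.
    return f"http://{bucket_name}.s3-website-{region}.amazonaws.com"


# Hypothetical bucket and Region, for illustration only.
print(website_endpoint("example-bucket", "us-east-1"))
```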

S3 - Bucket Versioning

Helps you recover from accidental overwrites and deletes.
Enabled at the bucket level.
Turning ON versioning is a best practice.
Once enabled, versioning cannot be turned off, only suspended.
Objects stored before versioning was enabled have a version ID of null.

# list all objects
aws s3 ls s3://bucketname

# list all object versions
aws s3api list-object-versions --bucket bucketname

After enabling Bucket Versioning, you might need to update your lifecycle rules to manage previous (noncurrent) versions of objects. Versioning applies only to newly created objects; existing objects keep their null version ID.

After suspending Bucket Versioning

Lifecycle rules set for previous object versions will still apply.
Existing objects in your bucket do not change.
A newly added object gets a null version ID and replaces any existing null version with the same name; versions that already have a version ID are kept.

S3 - Analytics

Used to analyze storage access patterns to help you decide when to transition the right data to the right storage class.

S3 - Replication

Versioning MUST be enabled on both the source and destination buckets.
Replication is asynchronous (ASYNC).

S3 - CORS

Configure CORS on the bucket if a client makes cross-origin requests to it.
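
A minimal sketch of what a CORS configuration might look like, built as a Python dict and printed as JSON. The structure follows the `CORSRules` document that `aws s3api put-bucket-cors` accepts; the origin, the allowed methods, and the max-age value are hypothetical and should be adapted to the site that calls the bucket.

```python
import json

# Hypothetical origin: the website that issues cross-origin requests to the bucket.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET", "HEAD"],   # read-only access in this sketch
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,               # how long browsers may cache the preflight
        }
    ]
}

print(json.dumps(cors_configuration, indent=2))
```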

S3 - Event Notifications

Lets you receive notifications when an event happens in your S3 bucket.

S3 - Pre-signed URLs

Provide a temporary URL that grants access to an object in your private bucket.

URL expiration limits vary by how you generate the URL:

S3 console: 1 minute to 12 hours
AWS CLI: default 3,600 seconds (1 hour), up to 604,800 seconds (168 hours)

Use cases: allow only logged-in users to see your premium videos.

S3 - Object Lock

Object versioning must be enabled.
Blocks deletion of an object version for a specified amount of time.
Retention modes
- Compliance (strict mode): a protected object version can't be overwritten or deleted by any user, including the root user.
- Governance (softer): most users can't overwrite or delete an object version; users with special permissions can.

S3 Glacier - Vault Lock

Refer to Vault Lock.

S3 Storage Lens

A fully managed S3 storage analytics solution that provides a comprehensive view of:

- object storage usage
- activity trends
- recommendations to optimize costs

Storage Lens allows you to analyze object access patterns across all of your S3 buckets and generate detailed metrics and reports.

S3 - Object Lambda

Allows you to add your own code to S3 GET requests to modify and process data as it's being returned to an application.

Use cases: transforming data on the fly, for example redacting PII before it reaches the application.

Pricing

S3 Classes

Below is the pricing order. The first one is the most expensive one.

Standard
Standard Infrequent Access
- lower cost than Standard
- use cases: disaster recovery, backups
Intelligent-Tiering: automatically moves your data between access tiers (for example, from frequent to infrequent access) based on access patterns
One Zone IA
- 11 9's of durability within a single AZ, but data is lost if the AZ is destroyed.
- use cases: storing backup copies of your on-premises data, or data that can be recreated.
Glacier: low-cost storage used for archiving/backup
- Glacier Instant Retrieval
- Glacier Flexible Retrieval
Glacier Deep Archive: long term storage

S3 Lifecycle policy

Helps optimize S3 storage cost.
Ex: transition all objects in a bucket from the Standard class -> Standard-IA 6 months after upload.
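
The example rule above could be sketched as the following lifecycle configuration, built as a Python dict and printed as JSON. The structure follows the document accepted by `aws s3api put-bucket-lifecycle-configuration`. The rule ID is hypothetical, and "6 months" is approximated as 180 days, which is an assumption.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "standard-to-ia-after-180-days",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},               # empty prefix matches every object
            "Transitions": [
                # Assumption: 6 months is treated as 180 days.
                {"Days": 180, "StorageClass": "STANDARD_IA"}
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```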

Storage class analysis

Helps you decide when to transition data to the right class.
Gives recommendations for Standard & Standard-IA. Does not work for One Zone-IA or Glacier.

S3 Requester Pays

With Requester Pays, the requester pays for requests and data transfer, while the bucket owner is still charged for storage.

Performance

S3 scales per prefix. Requests per second per prefix:
- 3,500 PUT/COPY/POST/DELETE
- 5,500 GET/HEAD
Latency is between 100 and 200 ms.
Supports a FOLDER concept (key prefixes) to group objects.
Spread objects across as many prefixes as needed to achieve the required throughput and desired performance.
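
Since the limits above apply per prefix, a rough back-of-the-envelope estimate of how many prefixes a workload needs can be sketched as below. The function name and the target numbers are hypothetical, and the sketch assumes requests are spread evenly across prefixes.

```python
PUT_PER_PREFIX = 3_500  # PUT/COPY/POST/DELETE requests per second per prefix
GET_PER_PREFIX = 5_500  # GET/HEAD requests per second per prefix


def prefixes_needed(target_rps: int, per_prefix_rps: int) -> int:
    """Smallest number of prefixes whose combined limit covers target_rps."""
    # Integer ceiling division; at least one prefix is always needed.
    return max(1, -(-target_rps // per_prefix_rps))


# Hypothetical workload: 55,000 reads/s and 10,000 writes/s.
print(prefixes_needed(55_000, GET_PER_PREFIX))
print(prefixes_needed(10_000, PUT_PER_PREFIX))
```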

Objects are replicated across AZs within a single Region. One Zone-IA objects are stored in a single AZ.

S3 - Transfer acceleration

A bucket-level feature.
Use S3 Transfer Acceleration for fast, easy, and secure transfer of files over LONG distances. Files are sent to a nearby AWS edge location, then routed to the target S3 bucket over the AWS network. -> Speeds up transfers by 50-500%
Use cases:
- Need to collect data from various locations.

S3 Select & Glacier Select

Uses server-side filtering (simple SQL) for better performance, less data transfer, and lower CPU cost on the client. With S3 Select you query only the necessary data inside an object (for example, a CSV file), identified by the bucket's name and the object's key.
For more complex queries, consider using Athena.
Supports CSV, JSON, and Parquet.
Use cases: retrieve only a subset of data, best for simple SQL (no JOIN, no function, no array...)

S3 Batch operation

Perform batch operations on existing S3 objects.
You can use S3 Inventory to get the list of objects, then use S3 Select to filter them.

S3 Inventory

S3 Inventory provides a report of your S3 objects and their corresponding metadata on a daily or weekly basis for a specified S3 bucket or a shared prefix.

These reports include the type of server-side encryption each object is using, along with its replication status.

Features

Configurable to include all or specific object versions.
Can report on various metadata fields such as size, last modified date, storage class, and encryption status.
Supports output in CSV, ORC, and Parquet formats, enabling straightforward integration with analytics tools.

Security

User-based: IAM policies
Resource-based:
- Bucket policy: ex: allow cross-account access
- Object ACL (can be disabled)
- Bucket ACL (can be disabled)
Encryption of data at rest: SSE-S3, SSE-KMS, SSE-C (customer-provided key)
Encryption of data in transit: TLS
Strong read-after-write consistency: a read immediately after a write (create or overwrite) returns the most recent data.

S3 - MFA delete

To avoid accidental deletion in S3 bucket:

Enable versioning
Enable MFA delete

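S3 - Encryption

Each method is compared by key management, encryption process, and extras.

Client-side
- Key management: you
- Encryption process: you (data is encrypted before upload)
- Extras: None

SSE-C
- Key management: you
- Encryption process: S3, using the key you send with each request
- Extras: None

SSE-S3
- Key management: S3
- Encryption process: S3, AES-256

SSE-KMS
- Key management: KMS
- Encryption process: S3 & KMS
- Extras: rotation control, role separation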

With SSE-C, you download an encrypted object by calling the GetObject API with the S3 object key and the same encryption key that was used for the upload.

Amazon S3 now applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for every bucket in Amazon S3. You only need to use the x-amz-server-side-encryption header if you want to override the default SSE-S3 encryption and use a different encryption option like SSE-KMS or SSE-C

Note

Server-side encryption encrypts only the object data, not the object metadata.
SSE-KMS supports symmetric keys, not asymmetric keys.

SSE-KMS limitation

KMS limits the number of requests per second, and SSE-KMS traffic counts toward that limit:
- On download, S3 calls the KMS Decrypt API
- On upload, S3 calls the KMS GenerateDataKey API
The KMS quota differs by Region: 5,500, 10,000, or 30,000 requests per second
- Request a quota increase in the Service Quotas console

Best practices

Encrypt your data, either on the client side or on the server side
Turn on versioning
Consider using multipart uploads (split into parts & upload in parallel) for
- objects that are over 100MB.
- objects over 5GB, where multipart upload is a MUST.
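
The size thresholds above can be sketched as a small helper that picks an upload strategy and counts parts. The function names are hypothetical. The sketch assumes "MB" and "GB" mean MiB and GiB, and it adds the S3 limit of 10,000 parts per multipart upload, which these notes do not mention.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MULTIPART_RECOMMENDED = 100 * MIB  # above this, multipart upload is recommended
SINGLE_PUT_LIMIT = 5 * GIB         # above this, multipart upload is required
MAX_PARTS = 10_000                 # S3 limit on parts per multipart upload


def upload_strategy(size_bytes: int) -> str:
    """Classify an object by size using the thresholds from the notes."""
    if size_bytes > SINGLE_PUT_LIMIT:
        return "multipart required"
    if size_bytes > MULTIPART_RECOMMENDED:
        return "multipart recommended"
    return "single PUT"


def part_count(size_bytes: int, part_size: int) -> int:
    """Number of parts needed; the last part may be smaller than part_size."""
    parts = -(-size_bytes // part_size)  # integer ceiling division
    if parts > MAX_PARTS:
        raise ValueError("part_size too small: more than 10,000 parts")
    return parts


print(upload_strategy(6 * GIB), part_count(6 * GIB, 100 * MIB))
```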

Trivia

Bucket names are GLOBALLY unique, but buckets are created at the REGIONAL level.
Maximum object size is 5TB.
Multi-part upload is recommended for files > 100MB. If a file is larger than 5GB, multi-part upload is a must.
S3 provides no API that can search for objects based on object metadata.
Naming convention
- No UPPERCASE letters, no underscores
- 3-63 characters long
- Not formatted as an IP address
- Must start with a lowercase letter or number
- Must NOT start with prefix xn--
- Must NOT end with suffix -s3alias
Use Athena to query data in S3 with standard SQL.
To achieve more requests per second, increase prefixes in your bucket.
Adding a Deny statement with the condition "aws:SecureTransport":"false" to a bucket policy rejects any request that is not sent over SSL or TLS -> forces HTTPS.
When an application running on the EC2 instance makes a call to the S3 ListObjects API, the request will be allowed by IAM if the attached role has the s3:ListBucket permission for the relevant bucket.
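
The naming convention listed above can be sketched as a validator. The function name is hypothetical. Restricting names to lowercase letters, digits, dots, and hyphens is an assumption added here on top of the listed rules; the listed rules themselves only forbid uppercase letters and underscores.

```python
import ipaddress
import re

# First char: lowercase letter or digit; total length 3-63.
# Assumption: the remaining characters are limited to a-z, 0-9, '.', and '-'.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the rules listed in these notes."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False
    try:
        ipaddress.IPv4Address(name)  # parses only if the name looks like an IP
    except ValueError:
        return True
    return False


print(is_valid_bucket_name("my-bucket-01"))
print(is_valid_bucket_name("192.168.5.4"))
```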

Concepts

Key: FULL path of an object within the bucket
- ex: for s3://mybucket/myfolder/subfolder/myfile.txt, the key is myfolder/subfolder/myfile.txt
Bucket policy: JSON-based access policy that determines who can access your bucket & what operations they can perform.
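
As an illustration of a JSON bucket policy, the sketch below builds the "aws:SecureTransport" Deny policy mentioned in the trivia and prints it as JSON. The bucket name and the statement ID are hypothetical.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",  # hypothetical statement ID
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",    # the bucket itself
                f"arn:aws:s3:::{BUCKET}/*",  # every object in the bucket
            ],
            # Matches requests that were NOT sent over SSL/TLS.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

print(json.dumps(policy, indent=2))
```

Because the statement is a Deny, it overrides any Allow for plain-HTTP requests while leaving HTTPS requests to be evaluated by the other policies.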

Hot storage: frequently accessed data
Warm storage: less frequently accessed data
Cold storage: rarely accessed data

