S3


Last updated 1 year ago


Overview

Features

  • Supports LIST/GET/PUT/COPY/POST/DELETE operations.

S3 - Static Website Hosting

  • S3 can host static websites and have them accessible on the Internet.

  • The website endpoint will be: http://bucket-name.s3-website-aws-region.amazonaws.com

    • No SSL: the website endpoint supports HTTP only.

    • The endpoint includes the bucket name & the region name.
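The endpoint pattern above can be sketched as a small helper. This is a minimal illustration, not an AWS API: the function name and the `dotted` flag are hypothetical, and the flag exists because some Regions separate `s3-website` and the Region with a dot instead of a dash.

```python
def website_endpoint(bucket: str, region: str, dotted: bool = False) -> str:
    """Build the HTTP-only static website endpoint for a bucket.

    `dotted` is an illustrative switch: some Regions use
    "s3-website.<region>" rather than "s3-website-<region>".
    """
    separator = "." if dotted else "-"
    return f"http://{bucket}.s3-website{separator}{region}.amazonaws.com"


print(website_endpoint("my-bucket", "us-east-1"))
```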

S3 - Bucket Versioning

  • Helps you recover from accidental actions, such as overwrites & deletes.

  • Can be enabled at bucket level.

  • Turning ON versioning is a best practice.

  • Once enabled, versioning cannot be turned off, only suspended.

  • Objects stored before versioning was enabled have a Version ID of NULL.

# list all objects
aws s3 ls s3://bucketname

# list all versioning objects
aws s3api list-object-versions --bucket bucketname

After enabling Bucket Versioning, you might need to update your lifecycle rules to manage previous versions of objects -> Only applies to newly created objects.

After suspending Bucket Versioning:

  • Lifecycle rules set for previous object versions will still apply.

  • Existing objects in your bucket do not change.

  • Newly added objects with the same name as an existing object replace the existing object.

S3 - Analytics

Used to analyze storage access patterns to help you decide when to transition the right data to the right storage class.

S3 - Replication

  • Copying is asynchronous (ASYNC).

S3 - CORS

  • CORS headers need to be enabled if a client makes cross-origin requests.

S3 - Event Notifications

Used to receive notifications when an event happens in your S3 bucket.

S3 - Pre-signed URLs

Provide temporary access URL to your private bucket.

URL expiration depends on how you generate the URL:

  • S3 console: 1 min ~ 12 hours

  • AWS CLI: 3600 sec by default, up to 168 hours (604800 sec)

Use cases: allow only logged-in users to see your premium videos.

S3 - Object Lock

  • Object versioning must be enabled.

  • Block an object version deletion for a specified amount of time.

  • Retention mode

    • Compliance (strict mode): an object version can't be overwritten or deleted by any user, even the root user.

    • Governance (softer): most users can't overwrite or delete an object version unless they have special permissions.

S3 Glacier - Vault Lock

S3 Storage Lens

A fully managed S3 storage analytics solution that provides a comprehensive view of

  • object storage usage

  • activity trends

  • recommendations to optimize costs.

Storage Lens allows you to analyze object access patterns across all of your S3 buckets and generate detailed metrics and reports.

S3 - Object Lambda

Allows you to add your own code to S3 GET requests to modify and process data as it's being returned to an application.

Use cases: transforming data on the fly, e.g. redacting PII from the data.

Pricing

S3 Classes

Below is the pricing order. The first one is the most expensive one.

  • Standard

  • Standard Infrequent Access

    • lower cost than Standard

    • use cases: disaster recovery, backups

  • Intelligent-Tiering: automatically moves your data between access tiers (e.g. frequent -> infrequent access) based on access patterns

  • One Zone IA

    • 11 9's of durability within a single AZ, but data is lost if the AZ is destroyed.

    • use cases: storing backup copies of your on-premises data, or data that can be recreated.

  • Glacier: low cost storage used for archiving/backup

    • Glacier Instant Retrieval

    • Glacier Flexible Retrieval

  • Glacier Deep Archive: long term storage

S3 Lifecycle policy

  • Help optimize S3 storage cost

  • Ex: transition all objects of a bucket from the Standard class -> Standard-IA 6 months after upload.
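The example rule above can be sketched as a lifecycle configuration document. This is a minimal sketch that assumes 6 months is approximated as 180 days; the helper name and rule ID are illustrative. The resulting dict follows the shape that boto3's `put_bucket_lifecycle_configuration` takes as its `LifecycleConfiguration` argument.

```python
import json


def standard_to_ia_rule(days: int = 180, prefix: str = "") -> dict:
    """Lifecycle configuration that transitions objects to Standard-IA.

    Assumption: "6 months" is approximated as 180 days.
    An empty prefix means the rule applies to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": f"standard-to-ia-after-{days}-days",  # illustrative name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days, "StorageClass": "STANDARD_IA"}
                ],
            }
        ]
    }


config = standard_to_ia_rule()
print(json.dumps(config, indent=2))
```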

Storage class analysis

  • Helps decide when to transition objects to the right class

  • Recommendations cover Standard -> Standard-IA only. Does not work for One Zone-IA or Glacier.

S3 Requester Pay

Performance

  • S3 scales per prefix. Requests per second per prefix:

    • 3,500 PUT/COPY/POST/DELETE

    • 5,500 GET/HEAD

  • Latency between 100-200ms

  • Supports a FOLDER concept (key prefixes) to group objects.

  • Use as many prefixes as possible to achieve the required throughput and desired performance
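The per-prefix limits above translate into simple arithmetic: divide the target request rate by the per-prefix rate and round up. A minimal sketch follows; the helper name is illustrative and it assumes requests are spread evenly across prefixes.

```python
import math

# Per-prefix request rates quoted in the list above.
WRITE_RPS_PER_PREFIX = 3500  # PUT/COPY/POST/DELETE
READ_RPS_PER_PREFIX = 5500   # GET/HEAD


def prefixes_needed(target_rps: int, per_prefix_rps: int) -> int:
    """Smallest number of prefixes that can serve target_rps.

    Assumes the load is distributed evenly across the prefixes.
    """
    if target_rps < 0 or per_prefix_rps <= 0:
        raise ValueError("rates must be positive")
    return max(1, math.ceil(target_rps / per_prefix_rps))


print(prefixes_needed(22000, READ_RPS_PER_PREFIX))   # reads
print(prefixes_needed(10000, WRITE_RPS_PER_PREFIX))  # writes
```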

Objects may be replicated across AZs, but within a single region. S3 One Zone-IA objects are stored in 1 AZ.

S3 - Transfer acceleration

  • A bucket-level feature.

  • Use S3 Transfer Acceleration to enable fast, easy and secure transfer of files over LONG distances. Files are sent to the nearest AWS edge location, then routed to the target S3 bucket over the AWS network. -> Speed up 50-500%

  • Use cases:

    • Need to collect data from various locations.

S3 Select & Glacier select

  • Uses server-side filtering (simple SQL) for better performance, less data transfer, and lower CPU cost at the client. You identify the object by the bucket's name and the object's key, and S3 Select returns only the necessary data inside it (e.g. a CSV file).

  • For more complex queries, consider using Athena.

  • Supports CSV, JSON, and Parquet.

  • Use cases: retrieve only a subset of data, best for simple SQL (no JOIN, no function, no array...)

S3 Batch operation

  • Perform batch operations on existing S3 objects.

S3 Inventory

S3 Inventory provides a report of your S3 objects and their corresponding metadata on a daily or weekly basis for a specified S3 bucket or a shared prefix.

These reports include the type of server-side encryption each object is using, along with its replication status.

Features

  • Configurable to include all or specific object versions.

  • Can report on various metadata fields such as size, last modified date, storage class, and encryption status.

  • Supports output in CSV, ORC, and Parquet formats, enabling straightforward integration with analytics tools.

Security

  • User-based: IAM policies

  • Resource-based:

    • Object ACL (can be disabled)

    • Bucket ACL (can be disabled)

  • Encryption of data at rest: SSE-S3, SSE-KMS, SSE-C (Customer provided key)

  • Encryption of data in transit: TLS

  • Strong read-after-write consistency: the most recent data is accessible immediately after a write (create or overwrite)
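To illustrate the resource-based side of this list, here is a sketch of a bucket policy that enforces TLS in transit by denying any request where `aws:SecureTransport` is false (the condition mentioned in the Trivia section). The helper name, the `Sid`, and the bucket name are illustrative assumptions.

```python
import json


def deny_insecure_transport_policy(bucket: str) -> dict:
    """Bucket policy that rejects every request not sent over HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",  # illustrative statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",  # every object in it
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


print(json.dumps(deny_insecure_transport_policy("my-bucket"), indent=2))
```

A Deny statement is used rather than an Allow because an explicit Deny overrides any Allow granted elsewhere.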

S3 - MFA delete

To avoid accidental deletion in an S3 bucket:

  • Enable MFA delete

S3 - Encryption

| Method | Key management | Encryption process | Extras |
|---|---|---|---|
| Client-side | you | you | None |
| SSE-C | you | S3 | None |
| SSE-S3 | S3 | S3 | AES-256 |
| SSE-KMS | S3 & KMS | S3 | Rotation control, role separation |

With SSE-C, you download an encrypted object through the GetObject API by providing the S3 object key and the encryption key.

Amazon S3 now applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for every bucket in Amazon S3. You only need to use the x-amz-server-side-encryption header if you want to override the default SSE-S3 encryption and use a different encryption option like SSE-KMS or SSE-C
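The header-based override described above can be sketched as follows. This is a minimal illustration of the request headers per encryption mode; the function name and the mode labels are assumptions. Note that SSE-C is selected through the `x-amz-server-side-encryption-customer-*` headers rather than `x-amz-server-side-encryption` itself.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str,
                kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the S3 request headers for a server-side encryption mode."""
    if mode == "SSE-S3":
        # Already the default; sending the header is optional.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        digest = hashlib.md5(customer_key).digest()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(digest).decode("ascii"),
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-KMS", kms_key_id="alias/my-key"))
```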

Note

  • Server-side encryption encrypts only the object data, not the object metadata.

  • SSE-KMS supports symmetric keys, not asymmetric keys.

SSE-KMS limitation

  • KMS has a requests-per-second quota

    • On download, S3 calls the KMS Decrypt API

    • On upload, S3 calls the KMS GenerateDataKey API

  • The KMS quota differs between regions: 5,500, 10,000, or 30,000 requests per second

    • Request a quota increase in the Service Quotas console
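The limitation above can be checked with back-of-the-envelope arithmetic. The sketch below assumes, as in the notes, one KMS call per upload (GenerateDataKey) and one per download (Decrypt); the function name is illustrative, and real workloads (multipart uploads, other KMS users in the account) can consume more.

```python
def kms_headroom(quota_rps: int, uploads_per_sec: int, downloads_per_sec: int) -> int:
    """Remaining KMS requests per second after SSE-KMS traffic.

    Assumes one GenerateDataKey call per upload and one Decrypt call
    per download. A negative result means requests will be throttled.
    """
    used = uploads_per_sec + downloads_per_sec
    return quota_rps - used


print(kms_headroom(5500, uploads_per_sec=3000, downloads_per_sec=2000))
print(kms_headroom(5500, uploads_per_sec=4000, downloads_per_sec=2000))
```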

Best practices

  • Encrypt your data, either on the client side or on the server side

  • Turn on versioning

  • Consider using multipart uploads (divide in parts & parallel uploads) for

    • objects that are over 100MB.

    • if > 5GB, multi-part upload is a MUST.
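The multipart guidance above can be sketched as a part planner. This is a minimal sketch under stated assumptions: sizes are binary (MiB/GiB), the default part size of 100 MiB is an arbitrary choice, and the function names are illustrative. The 5 MiB minimum part size and the 10,000-part maximum are S3 limits.

```python
import math
from typing import List, Tuple

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART_SIZE = 5 * MIB            # S3 minimum for every part except the last
MAX_PARTS = 10_000                 # S3 maximum number of parts per upload
SINGLE_PUT_LIMIT = 5 * GIB         # above this, multipart is a MUST
RECOMMENDED_THRESHOLD = 100 * MIB  # above this, multipart is recommended


def upload_strategy(size: int) -> str:
    """Classify an object size according to the guidance above."""
    if size > SINGLE_PUT_LIMIT:
        return "multipart (required)"
    if size > RECOMMENDED_THRESHOLD:
        return "multipart (recommended)"
    return "single PUT"


def plan_parts(size: int, part_size: int = 100 * MIB) -> List[Tuple[int, int]]:
    """Split an object into (offset, length) parts that can upload in parallel."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    if math.ceil(size / part_size) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    return [(offset, min(part_size, size - offset))
            for offset in range(0, size, part_size)]


print(upload_strategy(6 * GIB))
print(plan_parts(250 * MIB))
```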

Trivia

  • Bucket names are GLOBALLY unique, but buckets are defined at the REGIONAL level.

  • Maximum object size is 5TB.

  • Multi-part upload is recommended for files > 100MB. If a file is larger than 5GB, multi-part upload is a must.

  • S3 provides no API that can search for objects based on object metadata.

  • Naming convention

    • No UPPERCASE, no underscore

    • 3-63 chars long

    • Not an IP

    • Must start with lowercase letter or number

    • Must NOT start with prefix xn--

    • Must NOT end with suffix -s3alias

  • Use Athena to query data in S3 with standard SQL.

  • To achieve more requests per second, increase prefixes in your bucket.

  • Adding a Deny statement with the condition "aws:SecureTransport":"false" to a bucket policy forces requests to use SSL or TLS -> forces HTTPS.

  • When an application running on the EC2 instance makes a call to the S3 ListObjects API, the request will be allowed by IAM if the attached role has the s3:ListBucket permission for the relevant bucket.
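The naming convention listed in this section can be sketched as a validator. The function name is illustrative. Beyond the rules listed above, the sketch also enforces a few that are part of the general-purpose bucket naming rules: only lowercase letters, numbers, dots, and hyphens; the name must end with a letter or number; no two adjacent dots.

```python
import re

# 3-63 chars, lowercase letters / digits / dots / hyphens,
# starting and ending with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
# Shape of an IPv4 address, e.g. 192.168.0.1
_IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the naming convention above."""
    if not _NAME_RE.match(name):
        return False  # wrong length, uppercase, underscore, bad first/last char
    if ".." in name:
        return False  # adjacent dots are not allowed
    if _IP_RE.match(name):
        return False  # must not look like an IP address
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False  # reserved prefix / suffix
    return True


for candidate in ["my-bucket", "My_Bucket", "ab", "192.168.0.1", "xn--bucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```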

Concepts

  • Key: FULL path of an object

    • ex: s3://mybucket/myfolder/subfolder/myfile.txt

  • Bucket policy: JSON-based access policy that determines who can access your bucket & what operations they can perform.

    • ex: allow cross-account access

  • Hot storage: frequently accessed data

  • Warm storage: less frequently accessed data

  • Cold storage: rarely accessed data

Pricing

  • Cost-effective classes: refer to S3 Classes.

  • S3 Requester Pay: in the requester-pays pattern, the bucket owner is still in charge of the storage cost.

  • S3 - Replication: Versioning MUST be enabled in both the source & destination buckets.

  • S3 Glacier - Vault Lock: refer to Vault Lock.

  • S3 Batch operation: you can use S3 Inventory to get the list of objects, then use S3 Select to filter the objects.

  • S3 - MFA delete: enable Versioning first.