S3 Storage Class Optimization: Stop Paying Hot Prices for Cold Data
You're Probably Paying 10x Too Much for Storage
Here's a stat that still blows my mind: the average company stores 70-80% of their S3 data in S3 Standard, but only 20-30% of that data gets accessed regularly. You're paying $0.023/GB/month for data that nobody's touched in six months.
Let me show you what that looks like in dollars.
| S3 Data Volume | % in Standard (Typical) | Monthly Overpay | Annual Waste |
|---|---|---|---|
| 1 TB | 80% | $12.80 | $154 |
| 10 TB | 80% | $128 | $1,536 |
| 100 TB | 80% | $1,280 | $15,360 |
| 1 PB | 80% | $12,800 | $153,600 |
At the petabyte scale, you're leaving $150K+ on the table every year. And the fix takes about an hour.
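The overpay column is simple arithmetic. Here's a minimal sketch of it, assuming 1 TB = 1,000 GB and a blended cold-tier landing rate of about $0.007/GB/month (both are my illustrative assumptions, not AWS published figures):

```python
# Monthly overpay from keeping cold data in S3 Standard.
# Assumptions (illustrative, not AWS pricing): 1 TB = 1,000 GB, and cold data
# moved off Standard lands at a blended rate of ~$0.007/GB/month.
STANDARD_RATE = 0.023      # $/GB/month, S3 Standard
BLENDED_COLD_RATE = 0.007  # assumed blend of IA / Glacier tiers
COLD_FRACTION = 0.80       # share of the bucket that is rarely accessed

def monthly_overpay(total_gb: float) -> float:
    """Dollars per month wasted by leaving cold data in Standard."""
    return total_gb * COLD_FRACTION * (STANDARD_RATE - BLENDED_COLD_RATE)

for label, gb in [("1 TB", 1_000), ("10 TB", 10_000),
                  ("100 TB", 100_000), ("1 PB", 1_000_000)]:
    print(f"{label:>7}: ${monthly_overpay(gb):>10,.2f}/mo  ${monthly_overpay(gb) * 12:>11,.0f}/yr")
```

Swap in your own cold fraction and landing rate; the shape of the curve is what matters, and it's linear in data volume.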
The S3 Storage Class Cheat Sheet
Let's cut through the marketing. Here's what each class actually costs and when to use it.
| Storage Class | $/GB/Month | Min Duration | Retrieval Cost | Best For |
|---|---|---|---|---|
| S3 Standard | $0.023 | None | Free | Active data, accessed weekly+ |
| S3 Intelligent-Tiering | $0.023 + $0.0025/1K objects | None | Free | Unpredictable access patterns |
| S3 Standard-IA | $0.0125 | 30 days | $0.01/GB | Accessed < 1x/month |
| S3 One Zone-IA | $0.01 | 30 days | $0.01/GB | Reproducible infrequent data |
| S3 Glacier Instant | $0.004 | 90 days | $0.03/GB | Quarterly access, millisecond retrieval |
| S3 Glacier Flexible | $0.0036 | 90 days | $0.01/GB (hours) | Annual access, can wait hours |
| S3 Glacier Deep Archive | $0.00099 | 180 days | $0.02/GB (12hrs) | Compliance, rarely if ever accessed |
The spread from Standard to Deep Archive is 23x. That's the kind of number that gets you a raise in FinOps.
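For quick what-if math, those per-GB rates drop straight into a small estimator (a sketch using the table's rates; request, retrieval, and transfer fees excluded):

```python
# Storage-only monthly cost per class, using the $/GB rates from the table.
RATES = {  # $/GB/month
    "STANDARD":     0.023,
    "STANDARD_IA":  0.0125,
    "ONEZONE_IA":   0.01,
    "GLACIER_IR":   0.004,    # Glacier Instant Retrieval
    "GLACIER":      0.0036,   # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 0.00099,
}

def monthly_storage_cost(gb: float, storage_class: str) -> float:
    return gb * RATES[storage_class]

# 10 TB (10,240 GB) in each class:
for cls in RATES:
    print(f"{cls:>12}: ${monthly_storage_cost(10_240, cls):9,.2f}/mo")

# The Standard-to-Deep-Archive spread mentioned above:
print(round(RATES["STANDARD"] / RATES["DEEP_ARCHIVE"]))   # 23
```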
Step 1: Analyze Your Access Patterns
Before you move anything, understand what you actually have. S3 Storage Lens gives you the overview, but for granular bucket-level analysis:
```bash
# Get storage breakdown by class for a specific bucket.
# The CLI auto-paginates past 1,000 keys; for very large buckets,
# S3 Inventory is the cheaper and faster option.
aws s3api list-objects-v2 \
  --bucket my-app-data \
  --query "Contents[].{Key:Key,Size:Size,LastModified:LastModified,StorageClass:StorageClass}" \
  --output json | jq '
    group_by(.StorageClass) |
    map({
      class: .[0].StorageClass,
      count: length,
      total_gb: (map(.Size) | add / 1073741824 | . * 100 | round / 100)
    })'
```
For access pattern analysis, enable S3 Server Access Logging or use CloudTrail data events:
```bash
# Enable S3 access logging
aws s3api put-bucket-logging \
  --bucket my-app-data \
  --bucket-logging-status '{
    "LoggingEnabled": {
      "TargetBucket": "my-access-logs",
      "TargetPrefix": "s3-access/my-app-data/"
    }
  }'
```
Run logging for 30 days, then query it with Athena to find objects that haven't been accessed.
```sql
-- Athena query: find objects not accessed in 90+ days.
-- Column names follow the AWS-documented access-log table DDL;
-- requestdatetime is a string like '06/Feb/2024:00:00:38 +0000',
-- so it must be parsed before comparing against a timestamp.
SELECT key,
       MAX(parse_datetime(requestdatetime, 'dd/MMM/yyyy:HH:mm:ss Z')) AS last_access
FROM s3_access_logs
WHERE bucket_name = 'my-app-data'
  AND operation = 'REST.GET.OBJECT'
GROUP BY key
HAVING MAX(parse_datetime(requestdatetime, 'dd/MMM/yyyy:HH:mm:ss Z'))
       < date_add('day', -90, current_timestamp)
ORDER BY last_access ASC;
```
Step 2: Build Lifecycle Policies
This is where the savings happen. A well-designed lifecycle policy automates the entire tiering process.
Terraform Lifecycle Configuration
```hcl
resource "aws_s3_bucket_lifecycle_configuration" "cost_optimized" {
  bucket = aws_s3_bucket.app_data.id

  # Rule 1: Application logs — aggressive tiering
  rule {
    id     = "logs-lifecycle"
    status = "Enabled"

    filter {
      prefix = "logs/"
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }

  # Rule 2: User uploads — moderate tiering
  rule {
    id     = "uploads-lifecycle"
    status = "Enabled"

    filter {
      prefix = "uploads/"
    }

    transition {
      days          = 60
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 180
      storage_class = "GLACIER_IR"
    }
  }

  # Rule 3: Backups — deep archive fast
  rule {
    id     = "backups-lifecycle"
    status = "Enabled"

    filter {
      prefix = "backups/"
    }

    transition {
      days          = 1
      storage_class = "GLACIER"
    }

    transition {
      days          = 90
      storage_class = "DEEP_ARCHIVE"
    }

    expiration {
      days = 2555 # 7 years for compliance
    }
  }

  # Rule 4: Clean up incomplete multipart uploads
  rule {
    id     = "abort-multipart"
    status = "Enabled"

    filter {
      prefix = ""
    }

    abort_incomplete_multipart_upload {
      days_after_initiation = 7
    }
  }
}
```
That last rule — aborting incomplete multipart uploads — is free money. I've seen buckets with hundreds of GBs of orphaned multipart fragments. You're paying Standard rates for literal garbage.
Step 3: Use Intelligent-Tiering for the Unpredictable Stuff
When you genuinely don't know the access pattern, Intelligent-Tiering is the move. It automatically shifts objects between tiers and you never pay retrieval fees. The optional archive tiers are retrieval-fee-free too, but restores from them take hours, so opt in only for data that can tolerate the wait.
```bash
# Enable the opt-in archive tiers for objects stored in Intelligent-Tiering.
# Note: this does NOT change the bucket's default storage class; objects must
# be uploaded with (or lifecycle-transitioned to) INTELLIGENT_TIERING first.
aws s3api put-bucket-intelligent-tiering-configuration \
  --bucket my-app-data \
  --id "full-tiering" \
  --intelligent-tiering-configuration '{
    "Id": "full-tiering",
    "Status": "Enabled",
    "Tierings": [
      {
        "AccessTier": "ARCHIVE_ACCESS",
        "Days": 90
      },
      {
        "AccessTier": "DEEP_ARCHIVE_ACCESS",
        "Days": 180
      }
    ]
  }'
```
The monitoring fee is $0.0025 per 1,000 objects/month. For objects under 128 KB, there's no monitoring fee and they always stay in the frequent access tier. At scale, the monitoring cost is negligible compared to the savings.
| Scenario (10 TB, 50M objects) | Standard (all hot) | Intelligent-Tiering |
|---|---|---|
| Storage cost | $235/mo | $47-$140/mo |
| Monitoring fee | $0 | $125/mo |
| Retrieval cost | $0 | $0 |
| Total | $235/mo | $172-$265/mo |
For data with unpredictable access patterns, Intelligent-Tiering saves 25-40% on average.
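The scenario above is easy to model yourself. A sketch where the frequent/infrequent/archive split is an illustrative assumption (your access logs give the real distribution):

```python
# Intelligent-Tiering cost model for the 10 TB / 50M object scenario above.
# Tier rates mirror the Standard / Standard-IA / Glacier Instant rows of the
# cheat sheet; the tier mixes below are assumed for illustration.
GB = 10_240
OBJECTS = 50_000_000          # assume all objects are >= 128 KB (monitored)
MONITORING_PER_1K = 0.0025    # $/1,000 monitored objects/month

def it_monthly_cost(frequent: float, infrequent: float, archive: float) -> float:
    """Storage + monitoring for a given tier mix (fractions must sum to 1)."""
    assert abs(frequent + infrequent + archive - 1.0) < 1e-9
    storage = GB * (frequent * 0.023 + infrequent * 0.0125 + archive * 0.004)
    monitoring = OBJECTS / 1_000 * MONITORING_PER_1K   # $125/mo here
    return storage + monitoring

# Both mixes land inside the $172-$265 range from the table:
print(f"mostly archived: ${it_monthly_cost(0.0, 0.1, 0.9):,.0f}/mo")
print(f"warmer blend:    ${it_monthly_cost(0.3, 0.3, 0.4):,.0f}/mo")
```

Notice the monitoring fee is a fixed $125/month regardless of mix; the savings all come from the storage side, which is why tiny objects (unmonitored, always billed hot) don't benefit.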
Step 4: Watch for the Gotchas
Minimum Storage Duration Charges
Move an object to Glacier and delete it after 30 days? You still pay for 90 days. Factor this into your lifecycle rules.
```bash
# BAD:  transition to Glacier at 60 days, expire at 80 days.
#       You pay for 90 days of Glacier storage even though the object
#       is deleted at day 80.

# GOOD: transition to Glacier at 60 days, expire at 150+ days.
#       The object lives past the 90-day minimum, so no wasted spend.
```
Minimum Object Size
Objects smaller than 128 KB in Standard-IA or One Zone-IA get billed as 128 KB. If you have millions of tiny files, Standard or Intelligent-Tiering is cheaper.
| Actual Object Size | Standard Cost (1M objects) | Standard-IA Cost (1M objects) |
|---|---|---|
| 1 KB | $0.023 | $1.60 (billed as 128 KB each!) |
| 10 KB | $0.23 | $1.60 |
| 128 KB | $2.94 | $1.60 |
| 1 MB | $23.00 | $12.50 |
See that? For 1 KB objects, Standard-IA is 70x more expensive than Standard. Size matters.
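The penalty is mechanical: IA classes bill each object at max(actual size, 128 KB). A sketch reproducing the table's numbers (decimal units, matching the table's 1 GB = 1,000,000 KB):

```python
# Small-object billing: Standard bills actual size; Standard-IA bills at
# least 128 KB per object.
RATES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125}   # $/GB/month
IA_MIN_KB = 128

def monthly_cost_1m_objects(object_kb: float, storage_class: str) -> float:
    billed_kb = object_kb
    if storage_class == "STANDARD_IA":
        billed_kb = max(object_kb, IA_MIN_KB)
    billed_gb = 1_000_000 * billed_kb / 1_000_000   # 1M objects -> GB
    return billed_gb * RATES[storage_class]

print(monthly_cost_1m_objects(1, "STANDARD"))        # 0.023
print(monthly_cost_1m_objects(1, "STANDARD_IA"))     # 1.6
print(monthly_cost_1m_objects(1000, "STANDARD_IA"))  # 12.5  (1 MB objects)
```

The crossover sits exactly at 128 KB: above it, IA's lower per-GB rate wins; below it, the 128 KB floor makes IA strictly worse.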
Retrieval Costs Add Up
Before moving everything to Glacier, model the retrieval costs:
```bash
# Estimate monthly cost per class, 100 GB retrieved in full each month.
# Note: internet data transfer ($0.09/GB) is charged the same in EVERY
# storage class, so leave it out of the comparison.
#
# Glacier Flexible, standard retrieval (hours):
#   Storage:   100 GB * $0.0036 = $0.36
#   Retrieval: 100 GB * $0.01   = $1.00   -> $1.36/month
# Glacier Flexible, expedited retrieval (minutes):
#   Storage:   $0.36
#   Retrieval: 100 GB * $0.03   = $3.00   -> $3.36/month
# S3 Standard (no retrieval fee):
#   Storage:   100 GB * $0.023  = $2.30   -> $2.30/month
#
# Cheap standard retrievals keep Glacier ahead even at 100% monthly retrieval;
# expedited (or Glacier Instant) retrievals flip the math in Standard's favor.
```
The break-even depends on the retrieval tier. At $0.01/GB standard retrievals, Glacier Flexible stays cheaper than Standard-IA on fees at any retrieval volume (the price you pay is hours of restore latency). At $0.03/GB (Glacier Instant, or expedited retrievals from Glacier Flexible), retrieving more than roughly 40% of your archived data per month makes Standard-IA the cheaper home.
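That break-even is worth computing for your own numbers. A minimal sketch using the cheat-sheet rates (fees only; per-request charges and restore latency ignored):

```python
# Break-even retrieval fraction: the share of data you can retrieve per month
# before a cold class's retrieval fees eat its storage savings vs Standard-IA.
def breakeven_fraction(cold_rate: float, cold_retrieval: float,
                       warm_rate: float = 0.0125,    # Standard-IA storage
                       warm_retrieval: float = 0.01  # Standard-IA retrieval
                       ) -> float:
    storage_saving = warm_rate - cold_rate              # $/GB/month saved
    retrieval_penalty = cold_retrieval - warm_retrieval # extra $/GB retrieved
    if retrieval_penalty <= 0:
        # Retrievals cost no more than the warm class: cold wins at any volume.
        return float("inf")
    return storage_saving / retrieval_penalty

# Glacier Instant ($0.004 storage, $0.03/GB retrieval) vs Standard-IA:
print(f"{breakeven_fraction(0.004, 0.03):.1%}")   # roughly 42% per month

# Glacier Flexible, standard retrievals (same $0.01/GB as IA):
print(breakeven_fraction(0.0036, 0.01))           # inf
```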
Real-World Savings Breakdown
Here's what a lifecycle optimization project looked like for a 50 TB bucket I worked on last quarter:
| Data Category | Volume | Before (Standard) | After (Optimized) | Monthly Savings |
|---|---|---|---|---|
| Active app data | 10 TB | $235 | $235 (Standard) | $0 |
| Logs (30-90 days) | 15 TB | $353 | $191 (Standard-IA) | $162 |
| Logs (90+ days) | 10 TB | $235 | $36 (Glacier) | $199 |
| Old backups | 12 TB | $282 | $12 (Deep Archive) | $270 |
| Multipart fragments | 3 TB | $71 | $0 (deleted) | $71 |
| Totals | 50 TB | $1,176/mo | $474/mo | $702/mo |
That's $8,424/year saved from a single bucket. Multiply that across your org and the numbers get serious fast.
The Action Plan
- Today: Enable S3 Storage Lens across all accounts. It's free for the dashboard-level metrics.
- This week: Add the abort-incomplete-multipart-upload rule to every bucket. Zero risk, immediate savings.
- Next two weeks: Analyze access patterns and deploy lifecycle policies for your top 5 buckets by size.
- Ongoing: Review S3 costs monthly. Access patterns change, and your lifecycle policies should evolve with them.
Storage costs are the silent killer of cloud budgets. They grow linearly with your data, and most teams just accept the number on the bill. Don't be that team.
Related Articles
The Complete AWS Cost Optimization Playbook: Compute, Storage, Networking, and Reserved Capacity
A data-driven playbook for cutting AWS costs across compute, storage, networking, and reserved capacity with real numbers and actions.
AWS Lambda Cost Optimization: Memory Tuning, Provisioned Concurrency, and ARM
Cut your AWS Lambda costs by 40-70% with memory right-sizing, ARM/Graviton migration, and smart provisioned concurrency strategies.
Reserved Instances vs Savings Plans: Which to Buy When
A data-driven comparison of AWS Reserved Instances vs Savings Plans — with decision frameworks, break-even math, and real purchase recommendations.