AWS Lambda Cost Optimization: Memory Tuning, Provisioned Concurrency, and ARM
Your Lambda Bill Is Probably 2x What It Should Be
I audited a fintech company's Lambda spend last quarter. They were running 14 functions processing payment webhooks, each configured with the default 128 MB memory. Their monthly bill: $7,000 ($4,200 in duration charges plus $2,800 in Provisioned Concurrency). After optimization — memory tuning, ARM migration, and selective provisioned concurrency — they dropped to $3,650/month. That's a 48% reduction.
Most teams set Lambda memory once during development and never touch it again. That's leaving money on the table. Here's exactly how to fix it.
Understanding Lambda Pricing
Lambda charges on two axes: requests and duration. Duration is where the money hides.
| Pricing Component | x86 Price | ARM (Graviton2) Price | Savings |
|---|---|---|---|
| Requests (per 1M) | $0.20 | $0.20 | 0% |
| Duration (per GB-second) | $0.0000166667 | $0.0000133334 | 20% |
| Provisioned Concurrency (per GB-second) | $0.0000041667 | $0.0000033334 | 20% |
The key insight: Lambda bills duration in GB-seconds, metered per millisecond. A 128 MB function running for 1,000 ms costs exactly the same as a 1,024 MB function running for 125 ms — both consume 0.125 GB-seconds. But more memory means proportionally more CPU, and for CPU-bound code that often means proportionally faster execution.
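A quick sanity check of that equivalence, using the x86 duration rate from the pricing table (per-request charges ignored):

```python
X86_RATE = 0.0000166667  # USD per GB-second, x86 duration rate from the table above

def duration_cost(memory_mb: int, duration_ms: float, rate: float = X86_RATE) -> float:
    """Duration cost of one invocation: GB allocated x seconds run x rate."""
    return (memory_mb / 1024) * (duration_ms / 1000) * rate

# Both consume 0.125 GB-seconds, so they cost exactly the same
print(duration_cost(128, 1000))   # small and slow
print(duration_cost(1024, 125))   # 8x the memory, 8x faster
```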
Step 1: Memory Tuning with AWS Lambda Power Tuning
The single highest-ROI optimization. AWS provides an open-source tool that tests your function at every memory configuration and finds the sweet spot.
Deploy the Power Tuning Tool
```bash
# Deploy via AWS SAR (Serverless Application Repository)
aws serverlessrepo create-cloud-formation-change-set \
  --application-id arn:aws:serverlessrepo:us-east-1:451282441545:applications/aws-lambda-power-tuning \
  --stack-name lambda-power-tuning \
  --capabilities CAPABILITY_IAM

# Or deploy via SAM
git clone https://github.com/alexcasalboni/aws-lambda-power-tuning.git
cd aws-lambda-power-tuning
sam build && sam deploy --guided
```
Run a Power Tuning Test
```bash
# Start the Step Functions execution
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:powerTuningStateMachine \
  --input '{
    "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:payment-processor",
    "powerValues": [128, 256, 512, 1024, 1536, 2048, 3008],
    "num": 50,
    "payload": {"orderId": "test-123", "amount": 99.99},
    "parallelInvocation": true,
    "strategy": "cost"
  }'
```
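When the execution completes, its output is a JSON string containing the winning configuration. A small parser sketch — `power`, `cost`, and `duration` are the fields the tool emits, though check your version's output format; the sample string here is made up:

```python
import json

def parse_tuning_result(output: str) -> dict:
    """Pull the recommended memory setting out of a Power Tuning execution output."""
    result = json.loads(output)
    return {
        "recommended_memory_mb": result["power"],
        "cost_per_invocation": result["cost"],
        "avg_duration_ms": result["duration"],
    }

# Illustrative output shape (values invented for the example)
sample = '{"power": 2048, "cost": 4.1e-05, "duration": 155.0}'
print(parse_tuning_result(sample))
```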
Real Results: Payment Processor Function
Here's what the power tuning output looked like for a payment processing function:
| Memory (MB) | Avg Duration (ms) | Cost per Invocation | Monthly Cost (2M invocations) |
|---|---|---|---|
| 128 | 2,340 | $0.0000487 | $97.40 |
| 256 | 1,180 | $0.0000491 | $98.20 |
| 512 | 620 | $0.0000516 | $103.20 |
| 1024 | 340 | $0.0000566 | $113.20 |
| 1536 | 210 | $0.0000525 | $105.00 |
| 2048 | 155 | $0.0000410 | $82.00 |
| 3008 | 148 | $0.0000575 | $115.00 |
The sweet spot was 2048 MB — not 128 MB. The function ran 15x faster and cost 16% less. This is counterintuitive but common: at higher memory, Lambda allocates proportionally more CPU, and CPU-bound functions complete so much faster that the total GB-seconds actually drops.
Automate Memory Tuning in CI/CD
```yaml
# .github/workflows/lambda-power-tune.yml
name: Lambda Power Tuning

on:
  workflow_dispatch:
    inputs:
      function_name:
        description: 'Lambda function to tune'
        required: true

jobs:
  power-tune:
    runs-on: ubuntu-latest
    steps:
      - name: Run Power Tuning
        run: |
          EXECUTION_ARN=$(aws stepfunctions start-execution \
            --state-machine-arn ${{ secrets.POWER_TUNING_ARN }} \
            --input "{
              \"lambdaARN\": \"arn:aws:lambda:us-east-1:${{ secrets.AWS_ACCOUNT_ID }}:function:${{ inputs.function_name }}\",
              \"powerValues\": [128, 256, 512, 1024, 1536, 2048, 3008],
              \"num\": 100,
              \"strategy\": \"cost\"
            }" \
            --query 'executionArn' --output text)

          # Poll for completion
          while true; do
            STATUS=$(aws stepfunctions describe-execution \
              --execution-arn "$EXECUTION_ARN" \
              --query 'status' --output text)
            if [ "$STATUS" != "RUNNING" ]; then break; fi
            sleep 10
          done

          # Get optimal config
          aws stepfunctions describe-execution \
            --execution-arn "$EXECUTION_ARN" \
            --query 'output' --output text
```
Step 2: Migrate to ARM (Graviton2)
This is the easiest 20% savings you'll ever get. AWS Graviton2 Lambda functions cost 20% less per GB-second and, in most workloads, run 10-15% faster.
Migration Checklist
```bash
# Check if your function uses native dependencies —
# these need ARM-compatible layers
aws lambda get-function-configuration \
  --function-name my-function \
  --query '{Runtime: Runtime, Layers: Layers[].Arn, Architecture: Architectures[0]}'
```
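To run that check across every function in the account, a boto3 sketch (hypothetical helper names; assumes AWS credentials are configured — `list_functions` returns an `Architectures` field on each function configuration):

```python
def still_on_x86(config: dict) -> bool:
    """True if a function configuration is not yet arm64.
    Architectures defaults to x86_64 when the field is absent."""
    return config.get("Architectures", ["x86_64"]) != ["arm64"]

def list_x86_functions():
    # Requires AWS credentials; paginates through every function in the region
    import boto3
    paginator = boto3.client("lambda").get_paginator("list_functions")
    for page in paginator.paginate():
        for fn in page["Functions"]:
            if still_on_x86(fn):
                yield fn["FunctionName"]
```

Run `print(list(list_x86_functions()))` in a credentialed environment to get your migration backlog.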
Terraform Configuration
```hcl
resource "aws_lambda_function" "processor" {
  function_name = "payment-processor"
  runtime       = "nodejs20.x"
  handler       = "index.handler"
  memory_size   = 2048
  timeout       = 30

  # Switch to ARM — this is literally one line
  architectures = ["arm64"]

  filename         = data.archive_file.lambda.output_path
  source_code_hash = data.archive_file.lambda.output_base64sha256
  role             = aws_iam_role.lambda_exec.arn
}
```
ARM Migration Savings (Real Numbers)
| Function | x86 Monthly Cost | ARM Monthly Cost | Savings |
|---|---|---|---|
| payment-processor | $82.00 | $64.20 | $17.80 (21.7%) |
| webhook-handler | $156.00 | $121.40 | $34.60 (22.2%) |
| report-generator | $340.00 | $265.00 | $75.00 (22.1%) |
| auth-validator | $94.00 | $73.30 | $20.70 (22.0%) |
| Total | $672.00 | $523.90 | $148.10/mo |
That's $1,777/year in savings from changing one line per function. No code changes, no refactoring.
Watch Out For
- Native binary dependencies — if your function uses compiled C/C++ extensions (`sharp` for image processing, `bcrypt`, etc.), you need ARM-compiled versions
- Lambda Layers — any layer with native code needs an ARM variant
- Container-based Lambdas — rebuild your Docker image with `--platform linux/arm64`

```dockerfile
# ARM Lambda container — build with:
#   docker build --platform linux/arm64 -t payment-processor .
FROM public.ecr.aws/lambda/nodejs:20
COPY package*.json ./
RUN npm ci --omit=dev
COPY src/ ./src/
CMD ["src/index.handler"]
```
Step 3: Provisioned Concurrency — Only Where It Pays Off
Provisioned Concurrency eliminates cold starts by keeping function instances warm. But it's expensive if applied blindly. You're paying for idle compute.
When Provisioned Concurrency Makes Financial Sense
| Scenario | Cold Start Impact | Provisioned Concurrency? | Why |
|---|---|---|---|
| API Gateway backend (p99 SLA) | 1-3s added latency | Yes | SLA violations cost more than PC |
| Async event processor | Invisible to users | No | Nobody notices cold starts |
| Scheduled cron job | Runs once, who cares | No | Waste of money |
| Payment processor | Failed transactions | Yes | Revenue impact justifies cost |
| Low-traffic internal tool | < 10 req/hour | No | Cold starts are rare anyway |
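The table reduces to two questions: does anyone feel the latency, and does a cold start cost money? A hypothetical helper encoding that rule of thumb (the 10 req/hour cutoff is the table's, not an AWS constant):

```python
def should_provision(latency_sensitive: bool, revenue_critical: bool,
                     requests_per_hour: float) -> bool:
    """Rule of thumb from the table above: keep instances warm only when
    cold starts are user-visible or cost revenue, and traffic is high
    enough for cold starts to actually occur."""
    if requests_per_hour < 10:  # low-traffic: cold starts are rare anyway
        return False
    return latency_sensitive or revenue_critical

print(should_provision(True, False, 5000))   # API Gateway backend with an SLA -> True
print(should_provision(False, False, 900))   # async event processor -> False
```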
Cost Calculation: Is It Worth It?
```python
# provisioned_concurrency_calculator.py

def calculate_pc_cost(
    concurrency_units: int,
    memory_mb: int,
    hours_per_month: int = 730,  # ~30.4 days
    architecture: str = "arm64",
) -> float:
    gb = memory_mb / 1024
    rate = 0.0000033334 if architecture == "arm64" else 0.0000041667  # per GB-second
    hourly_rate = rate * 3600  # convert per-second to per-hour
    return concurrency_units * gb * hourly_rate * hours_per_month

# Example: 10 provisioned instances at 2048 MB on ARM
pc_cost = calculate_pc_cost(
    concurrency_units=10,
    memory_mb=2048,
    architecture="arm64",
)
print(f"Monthly PC cost: ${pc_cost:.2f}")
# Output: Monthly PC cost: $175.20

# Compare with the cost of cold starts:
# if cold starts cause 0.1% of transactions to fail,
# and you process 500K transactions/month at $50 average,
# lost revenue is 500K * 0.001 * $50 = $25,000/month.
# PC cost: $175.20/month
# ROI: ~142x
```
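That back-of-envelope generalizes: PC pays for itself whenever the cold-start failure rate it prevents exceeds a break-even threshold. A sketch using the figures above:

```python
def breakeven_failure_rate(pc_monthly_cost: float,
                           transactions_per_month: int,
                           avg_transaction_value: float) -> float:
    """Failure rate above which Provisioned Concurrency pays for itself."""
    return pc_monthly_cost / (transactions_per_month * avg_transaction_value)

# 10 warm instances at 2048 MB on ARM (same math as the calculator above)
pc_cost = 10 * (2048 / 1024) * 0.0000033334 * 3600 * 730
rate = breakeven_failure_rate(pc_cost, 500_000, 50.0)
print(f"PC pays off above a {rate:.6%} cold-start failure rate")
```

For a payment processor, any measurable cold-start failure rate clears that bar by orders of magnitude.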
Terraform with Auto Scaling
```hcl
resource "aws_lambda_provisioned_concurrency_config" "api" {
  function_name                     = aws_lambda_function.processor.function_name
  provisioned_concurrent_executions = 10
  qualifier                         = aws_lambda_alias.live.name

  # NOTE: when Application Auto Scaling manages this alias (below), the scaler
  # will overwrite this static value; many teams drop this resource and let
  # min_capacity set the floor instead.
}

# Scale PC with demand using Application Auto Scaling
resource "aws_appautoscaling_target" "lambda" {
  max_capacity       = 50
  min_capacity       = 5
  resource_id        = "function:${aws_lambda_function.processor.function_name}:${aws_lambda_alias.live.name}"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"
}

resource "aws_appautoscaling_policy" "lambda" {
  name               = "lambda-pc-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value = 0.7 # scale up when 70% of provisioned capacity is in use
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
  }
}
```
The Combined Savings Summary
Here's what the fintech company's bill looked like after applying all three optimizations:
| Optimization | Before | After | Monthly Savings |
|---|---|---|---|
| Memory right-sizing (14 functions) | $4,200 | $2,740 | $1,460 |
| ARM migration (all functions) | $2,740 | $2,135 | $605 |
| Remove unnecessary Provisioned Concurrency | $2,800 | $1,315 | $1,485 |
| Add targeted PC (2 critical functions) | — | +$200 | -$200 |
| Total | $7,000 | $3,650 | $3,350/mo |
Annual savings: $40,200. Implementation time: one sprint. That's one of the highest ROI FinOps wins you can get.
Quick-Start Checklist
- Deploy Lambda Power Tuning and test your top 10 highest-cost functions
- Change `architectures = ["arm64"]` for all functions without native x86 dependencies
- Audit Provisioned Concurrency — remove it from async/batch functions
- Add auto-scaled Provisioned Concurrency only for latency-sensitive, revenue-critical paths
- Set up a monthly review of Lambda cost in Cost Explorer, grouped by function name
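For the Provisioned Concurrency audit step, a boto3 sketch that prices every PC configuration in the account (hypothetical helper names; rates from the pricing table; assumes credentials are configured):

```python
ARM_PC_RATE = 0.0000033334  # USD per GB-second (pricing table above)
X86_PC_RATE = 0.0000041667

def pc_monthly_cost(units: int, memory_mb: int, arm: bool = True,
                    hours: int = 730) -> float:
    """Monthly cost of keeping `units` instances of a function warm."""
    rate = ARM_PC_RATE if arm else X86_PC_RATE
    return units * (memory_mb / 1024) * rate * 3600 * hours

def audit_pc():
    # Requires AWS credentials; prints each function's PC spend so the
    # expensive idle capacity is easy to spot and remove
    import boto3
    client = boto3.client("lambda")
    for page in client.get_paginator("list_functions").paginate():
        for fn in page["Functions"]:
            configs = client.list_provisioned_concurrency_configs(
                FunctionName=fn["FunctionName"])
            for cfg in configs.get("ProvisionedConcurrencyConfigs", []):
                units = cfg["RequestedProvisionedConcurrentExecutions"]
                arm = fn.get("Architectures", ["x86_64"]) == ["arm64"]
                print(f'{fn["FunctionName"]}: {units} warm units '
                      f'~ ${pc_monthly_cost(units, fn["MemorySize"], arm):.2f}/mo')
```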
Don't try to optimize all functions at once. Sort by cost, start with the top five, and work your way down. The Pareto principle applies hard here — 20% of your functions are generating 80% of the bill.
Related Articles
The Complete AWS Cost Optimization Playbook: Compute, Storage, Networking, and Reserved Capacity
A data-driven playbook for cutting AWS costs across compute, storage, networking, and reserved capacity with real numbers and actions.
Automated Cloud Cost Anomaly Detection and Alerting
Set up automated cloud cost anomaly detection with AWS Cost Anomaly Detection and custom Lambda monitors to catch runaway spend early.
Reserved Instances vs Savings Plans: Which to Buy When
A data-driven comparison of AWS Reserved Instances vs Savings Plans — with decision frameworks, break-even math, and real purchase recommendations.