The Problem: Oversized Instances Everywhere
In almost every AWS review we conduct, EC2 right-sizing is the single biggest cost-saving opportunity. The pattern is always the same: someone picked an instance size during initial setup, the application worked fine, and nobody ever revisited the decision.
The result? Instances running at 5-15% CPU utilization, costing 2-4x more than they need to.
How to Find Oversized Instances
AWS Cost Explorer
Cost Explorer has a built-in right-sizing recommendation feature. Open Cost Explorer and choose "Rightsizing recommendations" in the left navigation pane. It analyzes the previous 14 days of CloudWatch metrics and suggests cheaper instance types, or termination for instances that look idle.
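The same recommendations are available through the Cost Explorer API, which helps when you want them in a spreadsheet or a ticket queue. Here is a minimal sketch using boto3's get_rightsizing_recommendation call. The helper names (summarize_rightsizing, fetch_rightsizing) are ours, the SAME_INSTANCE_FAMILY setting is just one reasonable choice, and the sketch reads only the first page of results.

```python
from typing import Any


def summarize_rightsizing(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a GetRightsizingRecommendation response into simple rows."""
    rows = []
    for rec in response.get("RightsizingRecommendations", []):
        action = rec.get("RightsizingType")  # "MODIFY" or "TERMINATE"
        row = {
            "instance_id": rec.get("CurrentInstance", {}).get("ResourceId"),
            "action": action,
            "target_type": None,
            "monthly_savings": 0.0,
        }
        if action == "MODIFY":
            targets = rec.get("ModifyRecommendationDetail", {}).get("TargetInstances", [])
            if targets:
                # Keep the option with the largest estimated saving.
                best = max(targets, key=lambda t: float(t.get("EstimatedMonthlySavings", 0)))
                details = best.get("ResourceDetails", {}).get("EC2ResourceDetails", {})
                row["target_type"] = details.get("InstanceType")
                row["monthly_savings"] = float(best.get("EstimatedMonthlySavings", 0))
        elif action == "TERMINATE":
            detail = rec.get("TerminateRecommendationDetail", {})
            row["monthly_savings"] = float(detail.get("EstimatedMonthlySavings", 0))
        rows.append(row)
    return rows


def fetch_rightsizing() -> dict[str, Any]:
    """Call Cost Explorer. Needs boto3, credentials, and ce:GetRightsizingRecommendation."""
    import boto3  # imported lazily so the parsing helper works without AWS access

    ce = boto3.client("ce")
    return ce.get_rightsizing_recommendation(
        Service="AmazonEC2",
        Configuration={
            "RecommendationTarget": "SAME_INSTANCE_FAMILY",  # assumption: stay in-family
            "BenefitsConsidered": True,
        },
    )
```

With credentials in place, summarize_rightsizing(fetch_rightsizing()) gives one row per flagged instance. Sort by monthly_savings to decide where to start.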
CloudWatch Metrics
For a more hands-on approach, check these metrics for each instance:
- CPUUtilization: If consistently below 20%, you're likely oversized
- NetworkIn/NetworkOut: Low network usage might mean you don't need a network-optimized instance
- Memory utilization: EC2 doesn't publish this by default, so it requires the CloudWatch agent, but it's critical for memory-heavy workloads
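To pull these numbers without clicking through the console, something like the following works as a starting point. It is a sketch, assuming boto3 is installed and credentials are configured. The function names and the one-hour period are our choices, not anything AWS prescribes.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def summarize_cpu(datapoints: list[dict[str, Any]]) -> dict[str, float]:
    """Reduce CloudWatch datapoints to an overall average and peak CPU percentage."""
    if not datapoints:
        raise ValueError("no datapoints returned; check the instance ID and time range")
    averages = [dp["Average"] for dp in datapoints]
    peaks = [dp["Maximum"] for dp in datapoints]
    return {
        "avg_cpu": sum(averages) / len(averages),  # mean of the per-period averages
        "peak_cpu": max(peaks),                    # highest sample seen in any period
    }


def fetch_cpu_datapoints(instance_id: str, days: int = 30) -> list[dict[str, Any]]:
    """Fetch hourly CPUUtilization statistics for one instance."""
    import boto3  # imported lazily so summarize_cpu can be used without AWS access

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(days=days),
        EndTime=end,
        Period=3600,  # hourly: 30 days is 720 datapoints, under the per-call limit
        Statistics=["Average", "Maximum"],
    )
    return response["Datapoints"]
```

Call summarize_cpu(fetch_cpu_datapoints("i-...")) with a real instance ID. For memory, the same call should work against the agent's metrics, typically namespace CWAgent and metric mem_used_percent on Linux, though the exact names depend on how your agent is configured.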
AWS Compute Optimizer
This service, free for its default recommendations, analyzes your instance usage patterns and recommends optimal instance types. It considers CPU, network, and storage together, plus memory when the CloudWatch agent is publishing memory metrics.
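If you'd rather script it, Compute Optimizer exposes its findings through boto3. The sketch below assumes the account is already opted in to Compute Optimizer. The helper names are ours, and the finding value is normalized before comparison because we don't want to depend on its exact casing.

```python
from typing import Any


def overprovisioned(recommendations: list[dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (instance ARN, current type, top-ranked suggestion) for over-provisioned instances."""
    results = []
    for rec in recommendations:
        # Normalize so "Overprovisioned" and "OVER_PROVISIONED" both match.
        finding = str(rec.get("finding", "")).replace("_", "").lower()
        if finding != "overprovisioned":
            continue
        options = sorted(rec.get("recommendationOptions", []), key=lambda o: o.get("rank", 0))
        if options:
            results.append(
                (rec["instanceArn"], rec["currentInstanceType"], options[0]["instanceType"])
            )
    return results


def fetch_recommendations() -> list[dict[str, Any]]:
    """Page through every EC2 recommendation. Needs boto3 and credentials."""
    import boto3  # imported lazily so the filter above works without AWS access

    client = boto3.client("compute-optimizer")
    recs: list[dict[str, Any]] = []
    token = None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.get_ec2_instance_recommendations(**kwargs)
        recs.extend(page.get("instanceRecommendations", []))
        token = page.get("nextToken")
        if not token:
            return recs
```

overprovisioned(fetch_recommendations()) gives you a shortlist to compare against what Cost Explorer suggests.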
The Right-Sizing Process
Step 1: Identify Candidates
Look for instances with average CPU below 20% and peak CPU below 50% over the last 30 days. These are good candidates for downsizing, provided memory, network, and disk usage also leave headroom.
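The thresholds above translate directly into a filter. This is a rough sketch with made-up instance IDs. The projected_peak helper assumes CPU load scales linearly with vCPU count, a simplification that ignores memory, burst credits, and differences between instance generations.

```python
from dataclasses import dataclass


@dataclass
class InstanceUsage:
    instance_id: str
    avg_cpu: float   # average CPU % over the lookback window
    peak_cpu: float  # highest CPU % seen over the same window


def is_candidate(usage: InstanceUsage, max_avg: float = 20.0, max_peak: float = 50.0) -> bool:
    """True when both average and peak CPU sit below the thresholds."""
    return usage.avg_cpu < max_avg and usage.peak_cpu < max_peak


def projected_peak(peak_cpu: float, current_vcpus: int, target_vcpus: int) -> float:
    """Estimate peak CPU % on the smaller instance, assuming linear scaling."""
    return peak_cpu * current_vcpus / target_vcpus


# Hypothetical fleet, for illustration only.
fleet = [
    InstanceUsage("i-web-1", avg_cpu=12.0, peak_cpu=35.0),
    InstanceUsage("i-batch-1", avg_cpu=10.0, peak_cpu=95.0),
]
candidates = [u.instance_id for u in fleet if is_candidate(u)]
print(candidates)                   # ['i-web-1']
print(projected_peak(35.0, 8, 4))   # 70.0
```

The second call is the useful sanity check: a 35% peak on 8 vCPUs becomes roughly 70% on 4, which still leaves headroom. Dropping to 2 vCPUs would project to 140%, so that target is off the table.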
Step 2: Test in Staging
Never right-size production instances without testing first. Create a staging environment with the smaller instance type and run your workload against it.
Step 3: Use the Latest Generation
When right-sizing, also consider moving to a newer instance generation. A t3.large, for example, costs less per hour than a t2.large with the same vCPU count and memory, and it runs on newer hardware. You often save money AND get better performance.
Step 4: Consider Savings Plans
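Once you know your baseline needs, lock in savings with Reserved Instances or Savings Plans. Commit after right-sizing, not before, so you don't lock in capacity you are about to remove. Depending on term and payment option, this can save an additional 30-60% on top of right-sizing savings.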
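Note that the two discounts compound rather than add: the commitment discount applies to the already-reduced bill. A quick sketch of the arithmetic, with illustrative percentages and a function name of our own:

```python
def combined_savings(rightsizing: float, commitment: float) -> float:
    """Total fractional saving when a commitment discount applies to the right-sized bill."""
    remaining = (1 - rightsizing) * (1 - commitment)  # fraction of the original bill still paid
    return 1 - remaining


# Illustrative inputs: halve the bill by right-sizing, then take 30% off what is left.
print(f"{combined_savings(0.50, 0.30):.0%}")  # 65%
```

So a 50% right-sizing cut plus a 30% commitment discount comes to 65% overall, not 80%.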
Common Mistakes to Avoid
- Right-sizing based on average usage only: Always check peak usage too. An instance at 10% average but 95% peak needs that capacity.
- Forgetting about memory: CPU isn't everything. Some workloads are memory-bound, and downsizing CPU won't help if you run out of RAM.
- Not monitoring after changes: Set up CloudWatch alarms for CPU and memory after right-sizing so you catch any performance issues early.
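For the monitoring point, here is one way to create a CPU alarm with boto3. Treat it as a sketch: the alarm name, the 80% threshold, and the 3 x 5-minute evaluation window are assumptions to tune for your workload, and no notification target is attached.

```python
from typing import Any


def cpu_alarm_params(instance_id: str, threshold: float = 80.0) -> dict[str, Any]:
    """Build put_metric_alarm arguments for a sustained-high-CPU alarm."""
    return {
        "AlarmName": f"rightsizing-high-cpu-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "CPU stayed high after right-sizing",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # 5-minute periods
        "EvaluationPeriods": 3,  # alarm after 15 minutes above the threshold
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_cpu_alarm(instance_id: str, threshold: float = 80.0) -> None:
    """Create or update the alarm. Needs boto3 and cloudwatch:PutMetricAlarm."""
    import boto3  # imported lazily so cpu_alarm_params works without AWS access

    boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params(instance_id, threshold))
```

To get notified, add an "AlarmActions" entry with your SNS topic ARN. A memory alarm follows the same shape once the CloudWatch agent is publishing memory metrics.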
Real Example
One of our clients was running 8 x m5.2xlarge instances for a web application. After analysis, we found average CPU at 12% and peak at 35%. We moved them to 8 x m5.xlarge instances, each with half the vCPUs and memory: same performance, 50% cost reduction on compute alone. Combined with Reserved Instances, total savings were 65%.
That's the power of right-sizing: it's not glamorous, but it's where the money is.