Guide·11 min read·December 10, 2024

How to Reduce Kubernetes Costs: 10 Proven Strategies

Practical techniques that cut K8s bills by 30-50% without sacrificing performance. Battle-tested on EKS, GKE, and AKS.

Kubernetes makes it easy to deploy applications but hard to control costs. Most clusters waste 30-50% of their budget on over-provisioned resources.

Here are 10 proven strategies to reduce your Kubernetes costs, ordered roughly by impact.

1. Right-Size Resource Requests (30-50% savings)

This is the #1 cost reduction lever. Most pods request 3-5x more CPU and memory than they actually use.

How to fix it:

  1. Analyze actual usage over 7-14 days
  2. Set requests to P95 of actual usage
  3. Set limits to 1.5-2x requests

# Find over-provisioned pods instantly
curl -sL wozz.io/audit.sh | bash
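
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the usage samples are made up, the function names are hypothetical, and it assumes you have already exported per-pod CPU samples (in millicores) from your metrics system.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Integer ceiling of pct% of the sample count (avoids float rounding).
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def recommend(samples_millicores, limit_factor=1.5):
    """Request = P95 of observed usage; limit = request * limit_factor."""
    request = percentile(samples_millicores, 95)
    limit = int(round(request * limit_factor))
    return {"request_m": request, "limit_m": limit}


# Hypothetical CPU samples in millicores; the 400m reading is a one-off spike.
usage = [120, 135, 110, 150, 145, 130, 400, 125, 140, 138,
         132, 128, 142, 137, 133, 129, 131, 136, 139, 141]
rec = recommend(usage)
print(rec)  # {'request_m': 150, 'limit_m': 225}
```

Using P95 rather than the maximum keeps the single 400m spike from inflating the request, while the limit still leaves headroom for bursts.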

2. Use Spot/Preemptible Instances (60-80% savings)

Spot instances cost 60-80% less than on-demand because the provider can reclaim them at short notice. Use them for:

  • Stateless workloads
  • Batch jobs
  • Dev/staging environments
  • Any workload that can handle restarts

Implementation: Taint the spot node pool so nothing lands there by default, then give eligible workloads a matching toleration plus a node selector (or node affinity) for that pool.
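
As a rough sketch of that routing, the script below builds a patch that adds a node selector and a toleration to a Deployment's pod template. The label `node-pool=spot` and the taint `spot=true:NoSchedule` are assumptions for illustration; each provider applies its own labels to spot nodes, so substitute the ones your node pool actually carries.

```python
import json

# Hypothetical label and taint for the spot node pool; replace with the
# values your provider or node-pool configuration actually applies.
SPOT_LABEL_KEY, SPOT_LABEL_VALUE = "node-pool", "spot"
SPOT_TAINT_KEY, SPOT_TAINT_VALUE = "spot", "true"


def spot_patch():
    """Patch body that steers a Deployment's pods onto the tainted spot pool."""
    return {
        "spec": {"template": {"spec": {
            # Only schedule onto nodes carrying the spot label.
            "nodeSelector": {SPOT_LABEL_KEY: SPOT_LABEL_VALUE},
            # Allow scheduling despite the pool's NoSchedule taint.
            "tolerations": [{
                "key": SPOT_TAINT_KEY,
                "operator": "Equal",
                "value": SPOT_TAINT_VALUE,
                "effect": "NoSchedule",
            }],
        }}}
    }


patch_json = json.dumps(spot_patch())
print(patch_json)
```

If the script were saved as spot_patch.py (a hypothetical name), its output could be applied with kubectl patch deployment my-app -p "$(python3 spot_patch.py)".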

3. Delete Orphaned Resources (Immediate savings)

Orphaned resources are pure waste: you pay for them and get nothing in return. Check for:

  • Load balancers without backend pods (~$20/month each)
  • Persistent volumes not bound to any PVC
  • Snapshots from deleted volumes
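  • Unused node pools

# List all LoadBalancer services, then check each one for backing pods
kubectl get svc -A | grep LoadBalancer

# Find PVs not bound to a claim (Available = never claimed, Released = claim deleted)
kubectl get pv | grep -E 'Available|Released'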
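
The grep commands only list candidates. A sketch like the following can narrow them down by cross-checking services against their endpoints. It assumes you have saved the "items" lists from kubectl get svc -A -o json, kubectl get endpoints -A -o json, and kubectl get pv -o json; the function names and sample data are hypothetical.

```python
def lb_services_without_backends(services, endpoints):
    """Return (namespace, name) of LoadBalancer services with no endpoint addresses."""
    backed = set()
    for ep in endpoints:
        # An Endpoints object has backends if any subset lists ready addresses.
        if any(s.get("addresses") for s in ep.get("subsets") or []):
            backed.add((ep["metadata"]["namespace"], ep["metadata"]["name"]))
    return [
        (svc["metadata"]["namespace"], svc["metadata"]["name"])
        for svc in services
        if svc["spec"].get("type") == "LoadBalancer"
        and (svc["metadata"]["namespace"], svc["metadata"]["name"]) not in backed
    ]


def unbound_volumes(volumes):
    """Return names of PVs whose phase is Available or Released."""
    return [
        pv["metadata"]["name"]
        for pv in volumes
        if pv.get("status", {}).get("phase") in ("Available", "Released")
    ]


# Hypothetical sample data shaped like the kubectl JSON output.
services = [
    {"metadata": {"namespace": "web", "name": "frontend"}, "spec": {"type": "LoadBalancer"}},
    {"metadata": {"namespace": "old", "name": "legacy-api"}, "spec": {"type": "LoadBalancer"}},
    {"metadata": {"namespace": "web", "name": "cache"}, "spec": {"type": "ClusterIP"}},
]
endpoints = [
    {"metadata": {"namespace": "web", "name": "frontend"},
     "subsets": [{"addresses": [{"ip": "10.0.0.12"}]}]},
    {"metadata": {"namespace": "old", "name": "legacy-api"}},
]
volumes = [
    {"metadata": {"name": "pv-a"}, "status": {"phase": "Bound"}},
    {"metadata": {"name": "pv-b"}, "status": {"phase": "Released"}},
]

print(lb_services_without_backends(services, endpoints))  # [('old', 'legacy-api')]
print(unbound_volumes(volumes))                           # ['pv-b']
```

Treat the result as a review list rather than a delete list: a service may be temporarily without pods during a rollout.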

4. Implement Cluster Autoscaler (10-30% savings)

The cluster autoscaler adds nodes when needed and removes them when idle. Without it, you're paying for peak capacity 24/7.

Key settings:

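  • scale-down-unneeded-time: 5m – How long a node must stay unneeded before it is removed (the default is 10m)
  • scale-down-utilization-threshold: 0.5 – A node becomes a scale-down candidate when the pod requests on it total less than 50% of its allocatable capacity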
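
To make the threshold concrete, here is a simplified model of the utilization check. It is a sketch under stated assumptions, not the autoscaler's code: utilization is taken as requests divided by allocatable for the more heavily requested of CPU and memory, and the node data is invented. The real autoscaler also verifies that the node's pods can be rescheduled elsewhere before removing it.

```python
def node_utilization(requested, allocatable):
    """Requests / allocatable, taking the more heavily requested resource."""
    return max(requested[r] / allocatable[r] for r in ("cpu", "memory"))


def scale_down_candidates(nodes, threshold=0.5):
    """Names of nodes whose utilization is below the threshold."""
    return [
        n["name"] for n in nodes
        if node_utilization(n["requested"], n["allocatable"]) < threshold
    ]


# Hypothetical nodes: CPU in cores, memory in GiB.
nodes = [
    {"name": "node-a", "requested": {"cpu": 1, "memory": 4}, "allocatable": {"cpu": 4, "memory": 16}},
    {"name": "node-b", "requested": {"cpu": 3, "memory": 4}, "allocatable": {"cpu": 4, "memory": 16}},
    {"name": "node-c", "requested": {"cpu": 1, "memory": 12}, "allocatable": {"cpu": 4, "memory": 16}},
]
candidates = scale_down_candidates(nodes)
print(candidates)  # ['node-a']
```

Because the check uses requests rather than live usage, inflated requests keep nodes alive that are mostly idle, which is another reason to right-size first.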

5. Use Horizontal Pod Autoscaler (10-20% savings)

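HPA adjusts the number of pod replicas based on CPU, memory, or custom metrics, so you only run what current load requires. CPU and memory utilization targets are measured as a percentage of each pod's requests, so right-size requests (strategy 1) before tuning the target.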

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
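
To see what this manifest does under load, the sketch below applies the scaling rule from the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change while the ratio is within a tolerance of 1 (0.1 by default). The function name is hypothetical, and the model ignores stabilization windows and pods that are not yet ready.

```python
import math


def desired_replicas(current, current_util, target_util,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Replica count the HPA would aim for, clamped to min/max."""
    ratio = current_util / target_util
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current
    desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 70))   # 6  (scale up)
print(desired_replicas(4, 72, 70))   # 4  (within tolerance)
print(desired_replicas(4, 35, 70))   # 2  (scale down)
print(desired_replicas(8, 140, 70))  # 10 (capped at maxReplicas)
```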

6. Schedule Non-Critical Workloads Off-Peak (20-40% savings)

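Batch jobs, backups, and reports don't need to run during business hours. Schedule them for nights and weekends, when interactive traffic is low: the jobs then use capacity that would otherwise sit idle, and the cluster can be sized for a lower peak.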
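
A small worked example shows why the timing matters. The hourly load profile and batch size below are invented for illustration; the point is only that peak demand, which determines how many nodes you need, drops when batch work moves out of the busy window.

```python
def peak_demand(interactive, batch_cores, batch_start, batch_hours):
    """Peak total cores over a day when a batch job runs in a given window."""
    total = list(interactive)
    for offset in range(batch_hours):
        total[(batch_start + offset) % 24] += batch_cores
    return max(total)


# Hypothetical interactive load in cores per hour of day (index 0 = midnight):
# quiet overnight, busy from 08:00 to 18:00, moderate in the evening.
interactive = [10] * 8 + [40] * 10 + [20] * 6

daytime_peak = peak_demand(interactive, batch_cores=30, batch_start=10, batch_hours=3)
night_peak = peak_demand(interactive, batch_cores=30, batch_start=1, batch_hours=3)
print(daytime_peak, night_peak)  # 70 40
```

In this example the same work needs 70 cores at peak when run mid-morning but only 40 when run at 01:00.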

7. Use Namespace Resource Quotas

Without quotas, any team can deploy unlimited resources. Set limits per namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-payments
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
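
The sketch below illustrates how the quota above behaves: a new workload is rejected once used plus requested exceeds any hard limit. It is an approximation for reasoning about headroom, not the API server's admission logic, and the quantity parsers handle only the forms used here (plain numbers, the "m" CPU suffix, and binary memory suffixes). The usage figures are hypothetical.

```python
def parse_cpu(q):
    """'500m' -> 0.5 cores, '2' -> 2.0 cores."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)


_MEM_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(q):
    """'512Mi' -> bytes. Only binary suffixes and plain byte counts are handled."""
    for suffix, factor in _MEM_UNITS.items():
        if q.endswith(suffix):
            return int(q[:-2]) * factor
    return int(q)


def fits_quota(used, requested, hard):
    """True if used + requested stays within every hard limit."""
    for key, limit in hard.items():
        parse = parse_cpu if key.endswith("cpu") else parse_memory
        if parse(used.get(key, "0")) + parse(requested.get(key, "0")) > parse(limit):
            return False
    return True


# Hard limits copied from the team-quota manifest above.
hard = {"requests.cpu": "10", "requests.memory": "20Gi",
        "limits.cpu": "20", "limits.memory": "40Gi"}
# Hypothetical current usage in the namespace.
used = {"requests.cpu": "8", "requests.memory": "12Gi",
        "limits.cpu": "16", "limits.memory": "24Gi"}

small = {"requests.cpu": "1500m", "requests.memory": "4Gi",
         "limits.cpu": "3", "limits.memory": "8Gi"}
large = {"requests.cpu": "2500m", "requests.memory": "4Gi",
         "limits.cpu": "3", "limits.memory": "8Gi"}
print(fits_quota(used, small, hard))  # True
print(fits_quota(used, large, hard))  # False
```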

8. Optimize Container Images (5-10% savings)

Smaller images mean faster pulls, lower registry egress costs, and faster scale-up.

  • Use multi-stage builds
  • Start from alpine or distroless base images
  • Remove dev dependencies from production images

9. Use Reserved Instances for Baseline (30-60% savings)

For predictable, always-on workloads, reserved instances (1-year or 3-year) save 30-60% vs on-demand.

Strategy: Use reserved for your baseline, on-demand for variable load, and spot for spikes.
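
Here is a rough worked example of that blended strategy. Every number is an assumption chosen for easy arithmetic: a $0.10 on-demand rate per node-hour, a 40% reserved discount, a 70% spot discount, and an invented usage profile. Real prices vary by provider, region, and instance type.

```python
def monthly_cost(node_hours, rate, discount=0.0):
    """Cost of a block of node-hours at a rate, after a fractional discount."""
    return node_hours * rate * (1.0 - discount)


RATE = 0.10               # hypothetical on-demand $ per node-hour
RESERVED_DISCOUNT = 0.40  # assumed reserved discount
SPOT_DISCOUNT = 0.70      # assumed spot discount

baseline_hours = 10 * 730  # 10 nodes running all month
variable_hours = 4 * 300   # 4 extra nodes for 300 hours
spike_hours = 6 * 50       # 6 extra nodes for 50 hours of spikes

all_on_demand = monthly_cost(baseline_hours + variable_hours + spike_hours, RATE)
blended = (
    monthly_cost(baseline_hours, RATE, RESERVED_DISCOUNT)
    + monthly_cost(variable_hours, RATE)
    + monthly_cost(spike_hours, RATE, SPOT_DISCOUNT)
)
savings_pct = 100 * (all_on_demand - blended) / all_on_demand
print(f"{all_on_demand:.2f} {blended:.2f} {savings_pct:.1f}%")  # 880.00 567.00 35.6%
```

Most of the saving comes from the reserved baseline because it covers the largest share of node-hours, which is why an accurate baseline estimate matters more than the spike handling.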

10. Monitor and Alert on Cost Anomalies

Set up alerts for:

  • Daily spend exceeding threshold
  • New namespaces consuming resources
  • Pods scaling beyond expected limits
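
The first of these alerts can be prototyped in a few lines. This sketch flags any day that exceeds a fixed threshold or runs well above the trailing average; the spend figures, the 1.5x spike factor, and the 7-day window are all assumptions to tune against your own billing export.

```python
from statistics import mean


def spend_alerts(daily_spend, threshold, spike_factor=1.5, window=7):
    """Flag days over a fixed threshold or well above the trailing average."""
    alerts = []
    for day, spend in enumerate(daily_spend):
        if spend > threshold:
            alerts.append((day, "over threshold"))
        # Compare against up to `window` preceding days, excluding today.
        history = daily_spend[max(0, day - window):day]
        if history and spend > spike_factor * mean(history):
            alerts.append((day, "spike vs trailing average"))
    return alerts


# Hypothetical daily spend in dollars.
spend = [100, 102, 98, 101, 99, 160, 100, 260]
alerts = spend_alerts(spend, threshold=250)
for day, reason in alerts:
    print(day, reason)
```

With this data, day 5 is flagged as a spike even though it stays under the fixed threshold, and day 7 trips both checks.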

Get Automated Cost Alerts

Wozz monitors your cluster daily and alerts you when waste increases.

curl -sL wozz.io/audit.sh | bash -s -- --push

Quick Wins vs Long-Term Strategies

Strategy                    Effort   Impact      Timeline
Delete orphaned resources   Low      Immediate   Today
Right-size top offenders    Medium   High        This week
Enable cluster autoscaler   Medium   Medium      This week
Add spot instances          Medium   High        This month
Reserved instances          Low      High        This quarter

Summary

Reducing Kubernetes costs doesn't require complex tools or major architecture changes. Start with visibility (audit your current waste), implement quick wins (delete orphaned resources, right-size top offenders), then build sustainable practices (autoscaling, spot instances, monitoring).

Most teams can cut 30-50% of their Kubernetes bill within 30 days using these strategies.