The Complete Guide to Kubernetes Cost Optimization
A comprehensive, actionable guide to reducing Kubernetes costs by 40%+. From resource requests to autoscaling, learn the exact techniques used by top engineering teams.
Why Kubernetes is So Expensive
The average company wastes 30-50% of its Kubernetes spend. That's $50K-$200K annually for mid-market companies, and millions for enterprises.
The root cause? Kubernetes makes it too easy to over-provision. Developers set resource requests and limits "to be safe," and those values stick around forever, even when actual usage is 10x lower.
The 6 Biggest Sources of Kubernetes Waste
1. Over-Provisioned Memory Limits (40% of waste)
This is the #1 cost killer. A pod requests 4GB of RAM but uses 400MB. The scheduler reserves the full 4GB on the node whether it's used or not, so you're paying for 3.6GB of nothing.
❌ Bad Example
resources:
  requests:
    memory: "4Gi"
    cpu: "2000m"
  limits:
    memory: "4Gi"
    cpu: "2000m"
# Actual usage: 400Mi memory, 200m CPU
# Waste: $1,200/year per pod
✅ Optimized
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi" # Headroom for spikes
    cpu: "500m"
# Aligned with actual usage
# Savings: $1,200/year per pod
2. Orphaned Load Balancers (20% of waste)
You deleted a Service but the cloud load balancer stayed alive. At $20-50/month per LB, 10 orphaned LBs add up to $2,400-6,000/year.
How to find them:
# List all LoadBalancer services
kubectl get svc --all-namespaces -o json | \
  jq '.items[] | select(.spec.type=="LoadBalancer") | .metadata.name'
# Cross-reference with your cloud provider console
# Delete orphaned LBs that aren't in kubectl output
3. Idle Development Clusters (15% of waste)
Dev/staging clusters running 24/7 when they're only used 9-5 weekdays.
Solution: Scale dev/staging workloads to zero outside working hours and let Karpenter or cluster-autoscaler remove the now-empty nodes (a minimal scheduled scale-down sketch follows). Or use tools like DevSpace/Tilt for local dev.
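One pragmatic way to automate this is a CronJob that zeroes out every Deployment in the dev namespace each weekday evening, with a mirror-image CronJob scaling things back up in the morning. This is a minimal sketch, not a drop-in solution: the dev namespace, the schedule, and the dev-downscaler ServiceAccount (which needs RBAC permission to scale Deployments, not shown here) are assumptions to adapt to your environment.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dev-downscale
  namespace: dev              # assumed dev/staging namespace
spec:
  schedule: "0 19 * * 1-5"    # 19:00 on weekdays
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: dev-downscaler   # assumed SA allowed to scale Deployments
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - kubectl
                - scale
                - deployment
                - --all
                - --replicas=0
                - -n
                - dev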
4. No Autoscaling (10% of waste)
Fixed replica counts sized for peak load, running 24/7.
Implement HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
5. Oversized Nodes (10% of waste)
Running 8xlarge instances when 4xlarge would suffice. Bin-packing efficiency matters: oversized nodes leave big slices of idle, paid-for capacity and make scale-down coarser, since removing one node drops a large chunk of capacity at once.
6. No Spot/Preemptible Instances (5% of waste)
Spot instances are typically 60-90% cheaper than on-demand. Use them for stateless workloads, batch jobs, and dev/test; a sketch of steering a Deployment onto spot nodes follows.
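A common pattern is to label and taint the spot nodes, then opt workloads in explicitly. The names below (a node-type: spot label, a spot=true:NoSchedule taint, the batch-worker app and image) are placeholders for this sketch; in practice you would use whatever labels your provider or Karpenter exposes (Karpenter, for example, labels nodes with karpenter.sh/capacity-type).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker                    # hypothetical stateless workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        node-type: spot                 # assumed label on spot nodes
      tolerations:
        - key: "spot"                   # assumed taint on spot nodes
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      containers:
        - name: worker
          image: my-registry/batch-worker:latest   # placeholder image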
Step-by-Step Optimization Plan
Week 1: Discovery
- Run a Wozz audit (2 min) to find all waste
- Identify top 10 over-provisioned pods (these are quick wins)
- Find orphaned load balancers and volumes
Week 2: Quick Wins
- Delete orphaned load balancers → save $500-2K/month
- Right-size top 10 pods → save $2-5K/month
- Scale down dev clusters at night → save $1-3K/month
Week 3-4: Systematic Optimization
- Implement HPA for top 20 services
- Enable cluster autoscaling
- Move batch jobs to spot instances
- Set up VPA (Vertical Pod Autoscaler) for automatic right-sizing (a minimal manifest sketch follows this list)
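VPA is a separate add-on you install in the cluster, not a built-in API. A minimal sketch targeting the my-app Deployment from the HPA example, starting in recommendation-only mode so it reports suggested requests without evicting pods:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # recommendation-only; switch to "Auto" once you trust the suggestions
Once it is running, kubectl describe verticalpodautoscaler my-app shows the recommended requests; compare them with what the Deployment currently asks for before letting VPA act automatically.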
Advanced Techniques
Use Karpenter for Better Node Provisioning
Karpenter is a major step up from cluster-autoscaler: instead of scaling fixed node groups, it provisions the exact instance types your pending pods need, when they need them, and consolidates underutilized nodes, reducing waste from poor bin-packing.
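For illustration, here is a minimal NodePool sketch assuming Karpenter v1 on AWS with an EC2NodeClass named default already configured; field names and defaults vary between Karpenter releases, so check the docs for your version.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:                      # assumes an EC2NodeClass named "default" exists
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: "1000"                          # cap total CPU this pool may provision
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m                 # reclaim underutilized nodes quickly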
Implement Pod Disruption Budgets
PDBs let you aggressively scale down without risking availability. They ensure minimum replicas stay running during node drains.
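A minimal PDB for the my-app Deployment used above; the app label is assumed to match the Deployment's pod labels.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app
spec:
  minAvailable: 1          # keep at least one replica running during voluntary disruptions
  selector:
    matchLabels:
      app: my-app          # assumed pod label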
Use Namespace Resource Quotas
Prevent teams from over-provisioning by setting namespace-level CPU/memory quotas.
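A sketch of a per-namespace quota; the team-a namespace and the numbers are placeholders you would size per team based on actual usage.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-a        # hypothetical team namespace
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi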
Measuring Success
Track these metrics monthly:
- Total cluster cost (from cloud bill)
- Cost per pod (cluster cost / total pods)
- Resource utilization (actual usage / requested resources)
- Waste reduction % (vs. baseline month)
Target: 70%+ utilization and 40% cost reduction within 3 months.