How to set up horizontal pod scaling on DigitalOcean
Horizontal Pod Autoscaler (HPA) on DigitalOcean Kubernetes automatically scales your application pods based on CPU, memory, or custom metrics. Enable the metrics server, create an HPA resource with target metrics, and Kubernetes will automatically add or remove pods to meet demand.
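At its core, the autoscaler compares the observed metric to the target and scales proportionally. A minimal sketch of that rule (the real controller adds tolerances and stabilization on top of this):

```python
import math

def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    # HPA's core rule: scale in proportion to how far the observed
    # metric is from its target, rounding up.
    return math.ceil(current_replicas * (current_value / target_value))

# 2 pods averaging 140% CPU utilization against a 70% target -> 4 pods
print(desired_replicas(2, 140, 70))
```

So doubling the distance from the target roughly doubles the replica count, bounded by minReplicas and maxReplicas.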
Prerequisites
- Active DigitalOcean Kubernetes cluster
- kubectl configured for your DOKS cluster
- Basic understanding of Kubernetes pods and deployments
- Application already deployed to the cluster
Step-by-Step Instructions
Verify metrics server is enabled
```
kubectl get deployment metrics-server -n kube-system
```

If the deployment is not found, enable it in the DigitalOcean control panel under Kubernetes > Your Cluster > Settings > Add-ons.
Configure resource requests for your deployment
The HPA calculates utilization as a percentage of these requests, so they must be defined:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

Apply the changes:

```
kubectl apply -f deployment.yaml
```

Create the Horizontal Pod Autoscaler
hpa.yaml:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Apply the HPA:

```
kubectl apply -f hpa.yaml
```

Verify HPA status and configuration
```
kubectl get hpa my-app-hpa
```

View detailed HPA information:

```
kubectl describe hpa my-app-hpa
```

You should see current CPU utilization, target utilization, and current replica count. It may take a few minutes for metrics to appear.
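The utilization figure reported here is measured against the container's resource requests, not its limits, which is why the resource-requests step above is mandatory. A simplified sketch, ignoring averaging across pods:

```python
def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    # Utilization targets are evaluated against the pod's resource
    # *request*, not its limit -- hence requests must be defined.
    return 100.0 * usage_millicores / request_millicores

# A pod using 350m CPU with a 100m request reports 350% utilization
print(cpu_utilization_percent(350, 100))
```

This is why a pod can report well over 100% utilization while still being within its CPU limit.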
Configure memory-based scaling (optional)
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```

When multiple metrics are defined, the HPA computes a desired replica count for each metric and scales to the highest of them. Apply the updated configuration:

```
kubectl apply -f hpa.yaml
```

Test the autoscaling behavior
```
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh
```

Inside the pod, generate requests:

```
while true; do wget -q -O- http://my-app-service/; done
```

Monitor the HPA response:

```
kubectl get hpa my-app-hpa --watch
```

Monitor and adjust scaling parameters
- Lower averageUtilization for more aggressive scaling
- Increase maxReplicas if you hit the limit during peak traffic
- Adjust minReplicas based on baseline traffic requirements

Update your HPA configuration and reapply:

```
kubectl apply -f hpa.yaml
```

Common Issues & Troubleshooting
HPA shows 'unknown' for current CPU utilization
Ensure the metrics server is running (kubectl get pods -n kube-system | grep metrics-server) and your deployment has CPU resource requests defined. Wait 2-3 minutes after deployment for metrics to populate.
Pods are not scaling up despite high CPU usage
Check if you've reached the maxReplicas limit and verify your DigitalOcean node pool has sufficient capacity. Use kubectl describe nodes to check resource availability and consider adding more nodes if needed.
Scaling happens too frequently (flapping)
Add stabilization windows to your HPA configuration:
```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 60
  scaleDown:
    stabilizationWindowSeconds: 300
```

This prevents rapid scaling changes.
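Conceptually, during the scale-down window the controller acts on the highest replica recommendation it has seen recently, so a short dip in load does not immediately remove pods. A simplified sketch of the mechanism:

```python
def stabilized_scale_down(window_recommendations: list[int]) -> int:
    # Scale-down stabilization: act on the *highest* recommendation
    # observed inside the stabilization window, smoothing out dips.
    return max(window_recommendations)

# Recommendations over the window dipped briefly to 3, but the HPA holds at 8
print(stabilized_scale_down([8, 7, 3, 7, 8]))
```

Only when every recommendation in the window has dropped does the replica count actually fall.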
HPA cannot scale below minimum replicas during low traffic
This is expected behavior. If you want fewer replicas during off-peak hours, consider using Vertical Pod Autoscaler (VPA) alongside HPA or manually adjust minReplicas in your HPA configuration for different time periods.