The Kubernetes HorizontalPodAutoscaler automatically scales Pods under ReplicationController, Deployment, or ReplicaSet controllers based on their CPU, memory, or other metrics.
It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let’s go deeper and check all the options available for scaling.
For the HPA you can use three metrics API types:

- metrics.k8s.io: default resource metrics, basically provided by the metrics-server
- custom.metrics.k8s.io: metrics provided by adapters running inside the cluster, for example the Microsoft Azure Adapter, Google Stackdriver, or the Prometheus Adapter (the Prometheus Adapter will be used later in this post), check the full list here>>>
- external.metrics.k8s.io: similar to the Custom Metrics API, but the metrics are provided by an external system, such as AWS CloudWatch

Documentation: Support for metrics APIs, and Custom and external metrics for autoscaling workloads.
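To see which of these API groups are actually registered in your cluster, you can list the available API versions and the APIService objects that serve them (the output depends on what is installed in the cluster):

[simterm]
$ kubectl api-versions | grep metrics
$ kubectl get apiservices | grep metrics
[/simterm]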
Besides the HorizontalPodAutoscaler (HPA) you can also use the Vertical Pod Autoscaler (VPA), and they can be used together, although with some limitations, see Horizontal Pod Autoscaling Limitations.
Create HorizontalPodAutoscaler
Let’s start with a simple HPA which will scale pods based on CPU usage:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 10
Here:

- apiVersion: autoscaling/v1 – the autoscaling API group; pay attention to the API version, as in v1, at the time of writing, scaling was available by CPU metrics only, thus memory and custom metrics can be used only with the v2beta2 API (still, you can use v1 with annotations), see API Object and the v2beta2 sketch after this list
- spec.scaleTargetRef: specifies for the HPA which controller will be scaled (ReplicationController, Deployment, ReplicaSet); in this case, the HPA will look for the Deployment object called deployment-example
- spec.minReplicas, spec.maxReplicas: the minimal and maximum number of pods to be run by this HPA
- targetCPUUtilizationPercentage: the CPU usage, as a percentage of the requests, at which the HPA will add or remove pods
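For reference, here is a minimal sketch of the same HPA written against the autoscaling/v2beta2 API, where CPU becomes just one entry of the metrics list (field names as per the v2beta2 spec; adjust to the API versions supported by your cluster):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10

For now, let’s continue with the autoscaling/v1 manifest above.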
Create it:
[simterm]
$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example created
[/simterm]
Check:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   <unknown>/10%   1         5         0          89s
[/simterm]
Currently, its TARGETS column has the <unknown> value as there are no pods created yet, but the metrics API is already available:
[simterm]
$ kubectl get --raw "/apis/metrics.k8s.io/" | jq { "kind": "APIGroup", "apiVersion": "v1", "name": "metrics.k8s.io", "versions": [ { "groupVersion": "metrics.k8s.io/v1beta1", "version": "v1beta1" } ], "preferredVersion": { "groupVersion": "metrics.k8s.io/v1beta1", "version": "v1beta1" } }
[/simterm]
Add the Deployment called deployment-example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-example
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      application: deployment-example
  template:
    metadata:
      labels:
        application: deployment-example
    spec:
      containers:
      - name: deployment-example-pod
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
Here we defined a Deployment which will spin up one pod with NGINX, with requests of 100 millicores CPU and 100 mebibytes of memory, see Kubernetes best practices: Resource requests and limits.
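To make the numbers behind our HPA target explicit: targetCPUUtilizationPercentage: 10 is calculated against this request, so 10% of the 100m CPU request = 10 millicores per pod. The HPA will start adding pods once the average CPU usage across the Deployment’s pods goes above roughly 10m.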
Create it:
[simterm]
$ kubectl apply -f hpa-deployment-example.yaml
deployment.apps/deployment-example created
[/simterm]
Check the HPA now:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   0%/10%    1         5         1          14m
[/simterm]
Our HPA found the deployment and started checking its pods’ metrics.
Let’s check those metrics – find a pod:
[simterm]
$ kubectl get pod | grep example | cut -d " " -f 1
deployment-example-86c47f5897-2mzjd
[/simterm]
And run the following API request:
[simterm]
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd | jq
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "deployment-example-86c47f5897-2mzjd",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd",
    "creationTimestamp": "2020-08-07T10:41:21Z"
  },
  "timestamp": "2020-08-07T10:40:39Z",
  "window": "30s",
  "containers": [
    {
      "name": "deployment-example-pod",
      "usage": {
        "cpu": "0",
        "memory": "2496Ki"
      }
    }
  ]
}
[/simterm]
CPU usage – zero, memory – about 2 megabytes, let’s confirm with kubectl top:
[simterm]
$ kubectl top pod deployment-example-86c47f5897-2mzjd
NAME                                  CPU(cores)   MEMORY(bytes)
deployment-example-86c47f5897-2mzjd   0m           2Mi
[/simterm]
“Alright, these guys!” (c)
Okay – we got our metrics, we’ve created the HPA and the Deployment – let’s see how the scaling will work here.
Load testing HorizontalPodAutoscaler scaling
For load testing, we can use the loadimpact/loadgentest-wrk utility image.
Now, set up port forwarding from the local workstation to the pod with NGINX, as we didn’t add any LoadBalancer (see Kubernetes: ClusterIP vs NodePort vs LoadBalancer, Services, and Ingress – an overview with examples):
[simterm]
$ kubectl port-forward deployment-example-86c47f5897-2mzjd 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
[/simterm]
Check resources once again:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   0%/10%    1         5         1          33m
[/simterm]
0% CPU is used, one pod is running (REPLICAS 1).
Run the test:
[simterm]
$ docker run --rm --net=host loadimpact/loadgentest-wrk -c 100 -t 100 -d 5m http://127.0.0.1:8080
Running 5m test @ http://127.0.0.1:8080
[/simterm]
Here:
- open 100 connections using 100 threads
- run the test for 5 minutes
Check the pod:
[simterm]
$ kubectl top pod deployment-example-86c47f5897-2mzjd
NAME                                  CPU(cores)   MEMORY(bytes)
deployment-example-86c47f5897-2mzjd   49m          2Mi
[/simterm]
CPU usage is now 49m, while our scaling threshold is 10% of the 100m CPU request, i.e. 10 millicores – check the HPA:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   49%/10%   1         5         4          42m
[/simterm]
TARGETS shows 49% against the 10% target, so our HPA started new pods – REPLICAS is now 4:
[simterm]
$ kubectl get pod | grep example
deployment-example-86c47f5897-2mzjd   1/1     Running   0          31m
deployment-example-86c47f5897-4ntd4   1/1     Running   0          24s
deployment-example-86c47f5897-p7tc7   1/1     Running   0          8s
deployment-example-86c47f5897-q49gk   1/1     Running   0          24s
deployment-example-86c47f5897-zvdvz   1/1     Running   0          24s
[/simterm]
Multi-metrics scaling
Okay, we were able to scale by CPU usage, but what if you want to scale by both CPU and memory usage?

Add another Resource entry with the memory target set to the same 10%:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 10
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 10
autoscaling API group versions
Let’s go back to the API versions.

In the first manifest for our HPA, we used the autoscaling/v1 API, which has only the targetCPUUtilizationPercentage parameter.
Check the autoscaling/v2beta1: it adds the metrics field, which is a MetricSpec array that can hold four new types – external, object, pods, and resource.

In its turn, resource holds the ResourceMetricSource, which has two fields – targetAverageUtilization and targetAverageValue – and these are now used in the metrics instead of targetCPUUtilizationPercentage.
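For example, an External metric (the kind served by the external.metrics.k8s.io API, e.g. from AWS CloudWatch) would be described in the v2beta1 API roughly like this – the metric name and selector below are made-up placeholders for illustration, not something available out of the box:

  metrics:
  - type: External
    external:
      # hypothetical metric exposed by an external metrics adapter
      metricName: sqs_messages_visible
      metricSelector:
        matchLabels:
          # hypothetical label to select the right time series
          queue: my-queue
      targetAverageValue: 30

For now, we’ll continue with the two Resource metrics.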
Apply the HPA update:
[simterm]
$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example configured
[/simterm]
Check it:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   3%/10%, 0%/10%   1         5         2          126m
[/simterm]
The TARGETS column now displays two metrics – CPU and memory.
It’s hard to make NGINX consume a lot of memory, so let’s see how much it uses now with the following kubectl command:
[simterm]
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-c4d6f96db-jv6nm | jq '.containers[].usage.memory'
"2824Ki"
[/simterm]
About 2.8 megabytes.
Let’s update our HPA once again and set a new limit, but this time we’ll use a raw value instead of a percentage – 1024Ki, 1 megabyte – using targetAverageValue instead of the previously used targetAverageUtilization:
...
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 10
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 1024Ki
Apply and check:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS               MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   2551808/1Mi, 0%/10%   1         5         3          2m8s
[/simterm]
REPLICAS is now 3, pods were scaled; check the value from the TARGETS column by converting it to kibibytes:
[simterm]
$ echo 2551808/1024 | bc
2492
[/simterm]
And check real memory usage:
[simterm]
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-c4d6f96db-fldl2 | jq '.containers[].usage.memory'
"2496Ki"
[/simterm]
2492 ~= 2496Ki.
Okay, so – we are able now to scale the Deployment by both CPU and memory usage.
Custom Metrics
Memory metrics scaling
Apart from the metrics provided by the API server and cAdvisor, we can use any other metrics, for example metrics collected by Prometheus.
These can be metrics collected by a CloudWatch exporter, Prometheus’ node_exporter, or metrics from an application.
Documentation is here>>>.
As we are using Prometheus (see Kubernetes: monitoring with Prometheus – exporters, a Service Discovery, and its roles and Kubernetes: a cluster’s monitoring with the Prometheus Operator), let’s add its adapter.
If you try to access the external or custom metrics API endpoints now, you’ll get an error:
[simterm]
$ kubectl get --raw /apis/custom.metrics.k8s.io/
Error from server (NotFound): the server could not find the requested resource

$ kubectl get --raw /apis/external.metrics.k8s.io/
Error from server (NotFound): the server could not find the requested resource
[/simterm]
Install the adapter from the Helm chart:
[simterm]
$ helm install prometheus-adapter stable/prometheus-adapter
NAME: prometheus-adapter
LAST DEPLOYED: Sat Aug 8 13:27:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):

  kubectl get --raw /apis/custom.metrics.k8s.io/v1beta
[/simterm]
Wait for a minute or two and check the API again:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq . { "kind": "APIResourceList", "apiVersion": "v1", "groupVersion": "custom.metrics.k8s.io/v1beta1", "resources": [] }
[/simterm]
Well, but why is the "resources": [] list empty?
Check the adapter’s pod logs:
[simterm]
$ kubectl logs -f prometheus-adapter-b8945f4d8-q5t6x
I0808 10:45:47.706771       1 adapter.go:94] successfully using in-cluster auth
E0808 10:45:47.752737       1 provider.go:209] unable to update list of all metrics: unable to fetch metrics for query "{__name__=~\"^container_.*\",container!=\"POD\",namespace!=\"\",pod!=\"\"}": Get http://prometheus.default.svc:9090/api/v1/series?match%5B%5D=%7B__name__%3D~%22%5Econtainer_.%2A%22%2Ccontainer%21%3D%22POD%22%2Cnamespace%21%3D%22%22%2Cpod%21%3D%22%22%7D&start=1596882347.736: dial tcp: lookup prometheus.default.svc on 172.20.0.10:53: no such host
I0808 10:45:48.032873       1 serving.go:306] Generated self-signed cert (/tmp/cert/apiserver.crt, /tmp/cert/apiserver.key)
...
[/simterm]
Here is the error:
dial tcp: lookup prometheus.default.svc on 172.20.0.10:53: no such host
Let’s try to access our Prometheus Operator from a pod by its Service DNS name:
[simterm]
$ kubectl exec -ti deployment-example-c4d6f96db-fldl2 curl prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090/metrics | head -5
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.0078e-05
go_gc_duration_seconds{quantile="0.25"} 3.8669e-05
go_gc_duration_seconds{quantile="0.5"} 6.956e-05
[/simterm]
Okay, we are able to reach it using the prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090 URL.
Edit the adapter’s Deployment:
[simterm]
$ kubectl edit deploy prometheus-adapter
[/simterm]
Update its prometheus-url:
...
    spec:
      affinity: {}
      containers:
      - args:
        - /adapter
        - --secure-port=6443
        - --cert-dir=/tmp/cert
        - --logtostderr=true
        - --prometheus-url=http://prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090
...
Apply changes and check again:
[simterm]
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . |grep "pods/" | head -5 "name": "pods/node_load15", "name": "pods/go_memstats_next_gc_bytes", "name": "pods/coredns_forward_request_duration_seconds_count", "name": "pods/rest_client_requests", "name": "pods/node_ipvs_incoming_bytes",
[/simterm]
Nice – we’ve got our metrics and can use them now in the HPA.
Check the API server for the memory_usage_bytes metric:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/memory_usage_bytes" | jq . { "kind": "MetricValueList", "apiVersion": "custom.metrics.k8s.io/v1beta1", "metadata": { "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/memory_usage_bytes" }, "items": [ { "describedObject": { "kind": "Pod", "namespace": "default", "name": "deployment-example-c4d6f96db-8tfnw", "apiVersion": "/v1" }, "metricName": "memory_usage_bytes", "timestamp": "2020-08-08T11:18:53Z", "value": "11886592", "selector": null }, ...
[/simterm]
Update the HPA’s manifest:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: memory_usage_bytes
      targetAverageValue: 1024000
Check the HPA’s values now:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS               MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   4694016/1Mi, 0%/10%   1         5         5          69m
[/simterm]
Apply the latest changes:
[simterm]
$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example configured
[/simterm]
Check again:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   11853824/1024k   1         5         1          16s
[/simterm]
Still have 1 replica, check events:
[simterm]
...
43s     Normal   ScalingReplicaSet   Deployment                Scaled up replica set deployment-example-c4d6f96db to 1
16s     Normal   ScalingReplicaSet   Deployment                Scaled up replica set deployment-example-c4d6f96db to 4
1s      Normal   ScalingReplicaSet   Deployment                Scaled up replica set deployment-example-c4d6f96db to 5
16s     Normal   SuccessfulRescale   HorizontalPodAutoscaler   New size: 4; reason: pods metric memory_usage_bytes above target
1s      Normal   SuccessfulRescale   HorizontalPodAutoscaler   New size: 5; reason: pods metric memory_usage_bytes above target
...
[/simterm]
And the HPA again:
[simterm]
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS             MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   6996787200m/1024k   1         5         5          104s
[/simterm]
Great – “It works!” (c)
And this works well for metrics which are already present in the cluster, like memory_usage_bytes, which is collected by cAdvisor from all containers in the cluster by default.
Let’s try to use a more custom metric, for example, let’s scale a Gorush server by using its own application metrics, see Kubernetes: running a push-server with Gorush behind an AWS LoadBalancer.
Application-based metrics scaling
So, we have the Gorush server running in our cluster which is used to send push-notifications to mobile clients.
It has a built-in /metrics endpoint which returns standard time-series metrics that can be used in Prometheus.
To run a test Gorush server, we can use the following Service, ConfigMap, and Deployment:
apiVersion: v1
kind: Service
metadata:
  name: gorush
  labels:
    app: gorush
    tier: frontend
spec:
  selector:
    app: gorush
    tier: frontend
  type: ClusterIP
  ports:
  - name: gorush
    protocol: TCP
    port: 80
    targetPort: 8088
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: gorush-config
data:
  stat.engine: memory
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gorush
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gorush
      tier: frontend
  template:
    metadata:
      labels:
        app: gorush
        tier: frontend
    spec:
      containers:
      - image: appleboy/gorush
        name: gorush
        imagePullPolicy: Always
        ports:
        - containerPort: 8088
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8088
          initialDelaySeconds: 3
          periodSeconds: 3
        env:
        - name: GORUSH_STAT_ENGINE
          valueFrom:
            configMapKeyRef:
              name: gorush-config
              key: stat.engine
Create a dedicated namespace:
[simterm]
$ kubectl create ns eks-dev-1-gorush
namespace/eks-dev-1-gorush created
[/simterm]
Create the application:
[simterm]
$ kubectl -n eks-dev-1-gorush apply -f my-gorush.yaml
service/gorush created
configmap/gorush-config created
deployment.apps/gorush created
[/simterm]
Check pods:
[simterm]
$ kubectl -n eks-dev-1-gorush get pod
NAME                      READY   STATUS    RESTARTS   AGE
gorush-5c6775748b-6r54h   1/1     Running   0          83s
[/simterm]
Gorush Service:
[simterm]
$ kubectl -n eks-dev-1-gorush get svc
NAME     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
gorush   ClusterIP   172.20.186.251   <none>        80/TCP    103s
[/simterm]
Run the port-forward to its pod:
[simterm]
$ kubectl -n eks-dev-1-gorush port-forward gorush-5c6775748b-6r54h 8088:8088
Forwarding from 127.0.0.1:8088 -> 8088
Forwarding from [::1]:8088 -> 8088
[/simterm]
Check the metrics:
[simterm]
$ curl -s localhost:8088/metrics | grep gorush | head
# HELP gorush_android_fail Number of android fail count
# TYPE gorush_android_fail gauge
gorush_android_fail 0
# HELP gorush_android_success Number of android success count
# TYPE gorush_android_success gauge
gorush_android_success 0
# HELP gorush_ios_error Number of iOS fail count
# TYPE gorush_ios_error gauge
gorush_ios_error 0
# HELP gorush_ios_success Number of iOS success count
[/simterm]
Or another way: we can reach it directly by using its Service name.
Find the Service:
[simterm]
$ kubectl -n eks-dev-1-gorush get svc
NAME     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
gorush   ClusterIP   172.20.186.251   <none>        80/TCP    26m
[/simterm]
Open a proxy to the API server:
[simterm]
$ kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080
[/simterm]
And connect to the Service:
[simterm]
$ curl -sL localhost:8080/api/v1/namespaces/eks-dev-1-gorush/services/gorush:gorush/proxy/metrics | head
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.194e-06
go_gc_duration_seconds{quantile="0.25"} 1.2092e-05
go_gc_duration_seconds{quantile="0.5"} 2.1812e-05
go_gc_duration_seconds{quantile="0.75"} 5.1794e-05
go_gc_duration_seconds{quantile="1"} 0.000145631
go_gc_duration_seconds_sum 0.001080551
go_gc_duration_seconds_count 32
# HELP go_goroutines Number of goroutines that currently exist.
[/simterm]
Kubernetes ServiceMonitor
The next thing to do is to add a ServiceMonitor to the Kubernetes cluster so our Prometheus Operator will collect those metrics, check the Adding Kubernetes ServiceMonitor post.
Check whether the metrics are already in Prometheus – run a port-forward to its pod:
[simterm]
$ kk -n monitoring port-forward prometheus-prometheus-prometheus-oper-prometheus-0 9090:9090
Forwarding from [::1]:9090 -> 9090
Forwarding from 127.0.0.1:9090 -> 9090
[/simterm]
Try to access them:
[simterm]
$ curl "localhost:9090/api/v1/series?match[]=gorush_total_push_count&start=1597141864" {"status":"success","data":[]}
[/simterm]
The "data":[]
is empty now – or Prometheus doesn’t collect those metrics yet.
Define the ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    serviceapp: gorush-servicemonitor
    release: prometheus
  name: gorush-servicemonitor
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: gorush
  namespaceSelector:
    matchNames:
    - eks-dev-1-gorush
  selector:
    matchLabels:
      app: gorush
Note: the Prometheus resource includes a field called serviceMonitorSelector, which defines a selection of ServiceMonitors to be used. By default, and before the version v0.19.0, ServiceMonitors must be installed in the same namespace as the Prometheus instance. With the Prometheus Operator v0.19.0 and above, ServiceMonitors can be selected outside the Prometheus namespace via the serviceMonitorNamespaceSelector field of the Prometheus resource.
See Prometheus Operator
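As a rough sketch (field names as in the Prometheus Operator’s Prometheus CRD; the release: prometheus label matches the one set on the ServiceMonitor above), the relevant part of the Prometheus resource looks like this:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # pick up ServiceMonitors labeled "release: prometheus"
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  # an empty selector allows ServiceMonitors from any namespace (Operator v0.19.0+)
  serviceMonitorNamespaceSelector: {}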
Create this ServiceMonitor:
[simterm]
$ kubectl apply -f ../../gorush-service-monitor.yaml
servicemonitor.monitoring.coreos.com/gorush-servicemonitor created
[/simterm]
Check it in the Targets:
UP, good.
And in a couple of minutes check for metrics again:
[simterm]
$ curl "localhost:9090/api/v1/series?match[]=gorush_total_push_count&start=1597141864" {"status":"success","data":[{"__name__":"gorush_total_push_count","endpoint":"gorush","instance":"10.3.35.14:8088","job":"gorush","namespace":"eks-dev-1-gorush","pod":"gorush-5c6775748b-6r54h","service":"gorush"}]}
[/simterm]
Or in this way:
[simterm]
$ curl -s localhost:9090/api/v1/label/__name__/values | jq | grep gorush
  "gorush_android_fail",
  "gorush_android_success",
  "gorush_ios_error",
  "gorush_ios_success",
  "gorush_queue_usage",
  "gorush_total_push_count",
[/simterm]
Nice, we’ve got our metrics, let’s go ahead and use them in the HorizontalPodAutoscaler of this Deployment.
Check the metrics groups available here:
[simterm]
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep "gorush" "name": "services/gorush_android_success", "name": "pods/gorush_android_fail", "name": "namespaces/gorush_total_push_count", "name": "namespaces/gorush_queue_usage", "name": "pods/gorush_ios_success", "name": "namespaces/gorush_ios_success", "name": "jobs.batch/gorush_ios_error", "name": "services/gorush_total_push_count", "name": "jobs.batch/gorush_queue_usage", "name": "pods/gorush_queue_usage", "name": "jobs.batch/gorush_android_fail", "name": "services/gorush_queue_usage", "name": "services/gorush_ios_success", "name": "jobs.batch/gorush_android_success", "name": "jobs.batch/gorush_total_push_count", "name": "pods/gorush_ios_error", "name": "pods/gorush_total_push_count", "name": "pods/gorush_android_success", "name": "namespaces/gorush_android_success", "name": "namespaces/gorush_android_fail", "name": "namespaces/gorush_ios_error", "name": "jobs.batch/gorush_ios_success", "name": "services/gorush_ios_error", "name": "services/gorush_android_fail",
[/simterm]
Add a new manifest with an HPA which will use the gorush_total_push_count metric from the pods group:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: gorush-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gorush
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: gorush_total_push_count
      targetAverageValue: 2
With such settings, the HPA has to scale pods once the gorush_total_push_count value goes over 2.
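As a reminder, the HPA computes the desired number of replicas from the ratio between the current and target metric values (the standard formula from the Kubernetes documentation):

desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / targetValue ) ]

So, for example, with one replica running and the metric at 3 against the target of 2, the HPA will want ceil(1 * 3/2) = 2 replicas.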
Create it:
[simterm]
$ kubectl -n eks-dev-1-gorush apply -f my-gorush.yaml
service/gorush unchanged
configmap/gorush-config unchanged
deployment.apps/gorush unchanged
horizontalpodautoscaler.autoscaling/gorush-hpa created
[/simterm]
Check its value now:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_total_push_count" | jq '.items[].value' "0"
[/simterm]
Check the HPA:
[simterm]
$ kubectl -n eks-dev-1-gorush get hpa
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gorush-hpa   Deployment/gorush   0/1       1         5         1          17s
[/simterm]
TARGETS is 0/1, okay.
Send a push:
[simterm]
$ curl -X POST a6095d18859c849889531cf08baa6bcf-531932299.us-east-2.elb.amazonaws.com/api/push -d '{"notifications":[{"tokens":["990004543798742"],"platform":2,"message":"Hello Android"}]}'
{"counts":1,"logs":[],"success":"ok"}
[/simterm]
Check the gorush_total_push_count metric again:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_total_push_count" | jq '.items[].value' "1"
[/simterm]
One push was sent.
Check the HPA one more time:
[simterm]
$ kubectl -n eks-dev-1-gorush get hpa
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gorush-hpa   Deployment/gorush   1/2       1         5         1          9m42s
[/simterm]
TARGETS is 1/2 and REPLICAS is still 1; send another push and check the events:
[simterm]
$ kubectl -n eks-dev-1-gorush get events --watch
LAST SEEN   TYPE     REASON              KIND                      MESSAGE
18s         Normal   Scheduled           Pod                       Successfully assigned eks-dev-1-gorush/gorush-5c6775748b-x8fjs to ip-10-3-49-200.us-east-2.compute.internal
17s         Normal   Pulling             Pod                       Pulling image "appleboy/gorush"
17s         Normal   Pulled              Pod                       Successfully pulled image "appleboy/gorush"
17s         Normal   Created             Pod                       Created container gorush
17s         Normal   Started             Pod                       Started container gorush
18s         Normal   SuccessfulCreate    ReplicaSet                Created pod: gorush-5c6775748b-x8fjs
18s         Normal   SuccessfulRescale   HorizontalPodAutoscaler   New size: 2; reason: pods metric gorush_total_push_count above target
18s         Normal   ScalingReplicaSet   Deployment                Scaled up replica set gorush-5c6775748b to 2
[/simterm]
And the HPA:
[simterm]
$ kubectl -n eks-dev-1-gorush get hpa
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gorush-hpa   Deployment/gorush   3/2       1         5         2          10m
[/simterm]
Great! So, scaling by the gorush_total_push_count is working.
But here is a trap: gorush_total_push_count is a cumulative metric (a counter that only grows), so on a Production graph it looks like an ever-increasing line. In such a case, our HPA will keep scaling pods up until the end of time.
Prometheus Adapter ConfigMap – seriesQuery and metricsQuery
To mitigate this, let’s add our own metric.
The Prometheus Adapter has its own ConfigMap:
[simterm]
$ kubectl get cm prometheus-adapter
NAME                 DATA   AGE
prometheus-adapter   1      46h
[/simterm]
Which contains the config.yaml, see its example here>>>.
Create a PromQL query which will return the number of pushes per second:
rate(gorush_total_push_count{instance="push.server.com:80",job="push-server"}[5m])
Update the ConfigMap and add a new rule with this query:
apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"gorush_total_push_count"}'
      seriesFilters: []
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: ""
        as: "gorush_push_per_second"
      metricsQuery: rate(<<.Series>>{<<.LabelMatchers>>}[5m])
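When the adapter serves a request for this metric, the <<.Series>> and <<.LabelMatchers>> placeholders are rendered from the series name and the namespace/pod labels of the requested object, so the query it sends to Prometheus should look roughly like this (the pod name here is just an example from our cluster):

rate(gorush_total_push_count{namespace="eks-dev-1-gorush",pod="gorush-5c6775748b-6r54h"}[5m])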
Save and exit, check it:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_push_per_second" | jq Error from server (NotFound): the server could not find the metric gorush_push_per_second for pods
[/simterm]
Re-create the adapter’s pod so it picks up the changes (see Kubernetes: ConfigMap and Secrets – data auto-reload in pods):
[simterm]
$ kubectl delete pod prometheus-adapter-7c56787c5c-kllq6
pod "prometheus-adapter-7c56787c5c-kllq6" deleted
[/simterm]
Check it:
[simterm]
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_push_per_second" | jq { "kind": "MetricValueList", "apiVersion": "custom.metrics.k8s.io/v1beta1", "metadata": { "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/%2A/gorush_push_per_second" }, "items": [ { "describedObject": { "kind": "Pod", "namespace": "eks-dev-1-gorush", "name": "gorush-5c6775748b-6r54h", "apiVersion": "/v1" }, "metricName": "gorush_push_per_second", "timestamp": "2020-08-11T12:28:03Z", "value": "0", "selector": null }, ...
[/simterm]
Update the HPA to use gorush_push_per_second (the targetAverageValue of 1m is one milli-unit in Kubernetes quantity notation, i.e. 0.001 pushes per second):
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: gorush-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gorush
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: gorush_push_per_second
      targetAverageValue: 1m
Check it:
[simterm]
$ kubectl -n eks-dev-1-gorush get hpa
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gorush-hpa   Deployment/gorush   0/1m      1         5         1          68m
[/simterm]
Events:
[simterm]
...
0s      Normal   SuccessfulRescale   HorizontalPodAutoscaler   New size: 4; reason: pods metric gorush_push_per_second above target
0s      Normal   ScalingReplicaSet   Deployment                Scaled up replica set gorush-5c6775748b to 4
...
[/simterm]
And the HPA now:
[simterm]
$ kubectl -n eks-dev-1-gorush get hpa
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gorush-hpa   Deployment/gorush   11m/1m    1         5         5          70m
[/simterm]
Done.
Useful links
- Kubernetes Autoscaling in Production: Best Practices for Cluster Autoscaler, HPA and VPA
- Ultimate Kubernetes Resource Planning Guide
- Horizontal Autoscaling in Kubernetes #2 – Custom Metrics
- Kubernetes HPA : ExternalMetrics+Prometheus
- Prometheus Custom Metrics Adapter
- Horizontal Pod Autoscaling (HPA) triggered by Kafka event
- Custom and external metrics for autoscaling workloads
- Prometheus Metrics Based Autoscaling in Kubernetes
- Kubernetes best practices: Resource requests and limits
- Getting acquainted with Kubernetes. Part 19: HorizontalPodAutoscaler
- How to Use Kubernetes for Autoscaling
- Horizontal Pod Autoscaling by memory
- Autoscaling apps on Kubernetes with the Horizontal Pod Autoscaler
- Horizontally autoscale Kubernetes deployments on custom metrics
- Kubernetes pod autoscaler using custom metrics
- Kubernetes HPA Autoscaling with Custom and External Metrics
- Horizontal pod auto scaling by using custom metrics
- Horizontal Pod Autoscale with Custom Prometheus Metrics
- Kubernetes HPA using Custom Metrics
- Kubernetes HPA Autoscaling with Custom Metrics
- Building a K8s Autoscaler with Custom Metrics