Grafana Loki: alerts from Ruler and labels from logs

For general information about Grafana Loki, see the Grafana Loki: architecture and running in Kubernetes with AWS S3 storage and boltdb-shipper post.

Among other Loki components, there is a separate service called Ruler that is responsible for alerts, which can be generated directly from logs.

The idea is very simple:

  • create a file with alerts in a Prometheus-like format
  • connect it to the Ruler Pods (the loki-read Pods in the case of the simple-scalable deployment)
  • the Ruler evaluates the logs against the rules specified in that file, and if an expression fires, it sends an alert via the Alertmanager's API

The alerts will be described in a ConfigMap, which will then be mounted into the Pods with the Ruler.

Documentation – Rules and the Ruler.

Test Pod for OOM-Killed

I want to check how an OOM Kill can be caught, so let's create a Pod with deliberately low limits that will be killed "on the fly":

---
apiVersion: v1
kind: Pod
metadata:
  name: oom-test
  labels:
    test: "true"
spec:
  containers:
    - name: oom-test
      image: openjdk
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      resources:
        limits:
          memory: "1Mi"
  nodeSelector:
    kubernetes.io/hostname: eks-node-dev_data_services-i-081719890438d467f

Specify a node name in the nodeSelector to make it easier to search in Loki.
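
The value must match the node's kubernetes.io/hostname label, which can be listed with, for example:

[simterm]

$ kk get nodes -L kubernetes.io/hostname

[/simterm]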

When this Pod is started, Kubernetes will kill it for exceeding its memory limit, and journald on the WorkerNode will write an event to the system log, which is collected by Promtail:

[simterm]

$ kk -n monitoring get cm logs-promtail -o yaml
...
    - job_name: journal
      journal:
        labels:
          job: systemd-journal
        max_age: 12h
        path: /var/log/journal
      relabel_configs:
      - source_labels:
        - __journal__systemd_unit
        target_label: unit
      - source_labels:
        - __journal__hostname
        target_label: hostname

[/simterm]

Create the Pod:

[simterm]

$ kk apply -f test-oom.yaml 
pod/oom-test created

[/simterm]

Check its Events:

[simterm]

$ kk describe pod oom-test
...
Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               91s                 default-scheduler  Successfully assigned default/oom-test to ip-10-0-0-27.us-west-2.compute.internal
  Normal   SandboxChanged          79s (x12 over 90s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  78s (x13 over 90s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "oom-test": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown

[/simterm]

And Loki logs:
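
Here, the entries were found with a simple LogQL query over the journald stream:

{hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*"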

Okay, now we have an oom-killed Pod for tests – let’s build a query for a future alert.

Forming a query in Loki

In the logs above, we used the  {hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" query – let's use it for a test alert.

First, let's check what Loki itself draws for us – use rate() and sum(), see Log range aggregations:

sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname)

Good!

Now, we can work with this – create a test alert.

Creating an alert for Loki Ruler

Create a file with a ConfigMap:

kind: ConfigMap
apiVersion: v1
metadata:
  name: rules-alerts
  namespace: monitoring
data:
  rules.yaml: |-
    groups:
      - name: systemd-alerts
        rules:
          - alert: TESTLokiRuler Systemd journal
            expr: |
              sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname) > 1
            for: 1s
            labels:
              severity: info
            annotations:
              summary: Test Loki OOM Killer Alert

Deploy it:

[simterm]

$ kk apply -f rule-cm.yaml 
configmap/rules-alerts created

[/simterm]

Ruler, and a ConfigMap volume

Next, we need to mount this ConfigMap into the Ruler's rules directory that is specified in the Loki config for the ruler component:

...
    ruler:
      storage:
        local:
          directory: /var/loki/rules
...

Our Ruler runs in the loki-read Pods – open their StatefulSet:

[simterm]

$ kk -n monitoring edit sts loki-read

[/simterm]

Describe a new volume:

...
      volumes:
      - configMap:
          defaultMode: 420
          name: rules-alerts
        name: rules
...

And its mount into the Pod as /var/loki/rules/fake/rules.yaml, where fake is the tenant_id (the default value when multi-tenancy is not enabled):

...
        volumeMounts:
        - mountPath: /etc/loki/config
          name: config
        - mountPath: /tmp
          name: tmp
        - mountPath: /var/loki
          name: data
        - mountPath: /var/loki/rules/fake/rules.yaml
          name: rules
          subPath: rules.yaml
...

In the subPath, specify the key from the ConfigMap to mount the ConfigMap's content as a file.
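
To verify the mount, the file can be checked directly in one of the loki-read Pods – for example:

[simterm]

$ kk -n monitoring exec loki-read-0 -- cat /var/loki/rules/fake/rules.yaml

[/simterm]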

Ruler alerting settings

Find the Alertmanager URL:

[simterm]

$ kk -n monitoring get svc
NAME                                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
...
prometheus-kube-prometheus-alertmanager          ClusterIP   172.20.240.159   <none>        9093/TCP                     110d
...

[/simterm]

In the Loki ConfigMap, specify the Alertmanager's address for the ruler component:

...
    ruler:
      storage:
        local:
          directory: /var/loki/rules
        type: local
      alertmanager_url: http://prometheus-kube-prometheus-alertmanager:9093
...

All the options for the ruler are described in the documentation.
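
A slightly fuller ruler block might look like this – just a sketch: the rule_path value is illustrative, and enable_api is optional, so check both against your Loki version:

...
    ruler:
      storage:
        type: local
        local:
          directory: /var/loki/rules
      rule_path: /tmp/loki/rules-temp
      alertmanager_url: http://prometheus-kube-prometheus-alertmanager:9093
      enable_api: true
...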

Open access to the Alertmanager to check alerts:

[simterm]

$ kk -n monitoring port-forward svc/prometheus-kube-prometheus-alertmanager 9093:9093

[/simterm]

Restart the loki-read Pods, which can simply be done with kubectl delete pod.
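
For example (the Pod names and replica count are from my deployment – adjust them to yours):

[simterm]

$ kk -n monitoring delete pod loki-read-0 loki-read-1 loki-read-2

[/simterm]

Then check their logs: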

[simterm]

$ kk -n monitoring  logs -f loki-read-0
...
level=info ts=2022-12-13T16:37:33.837173256Z caller=metrics.go:133 component=ruler org_id=fake latency=fast query="(sum by(hostname)(rate({hostname=\"eks-node-dev_data_services-i-081719890438d467f\"} |~ \".*OOM-killed.*\"[5m])) > 1)" query_type=metric range_type=instant length=0s step=0s duration=120.505858ms status=200 limit=0 returned_lines=0 throughput=48MB total_bytes=5.8MB total_entries=1 queue_time=0s subqueries=1
...

[/simterm]

Check Alerts in the Alertmanager – http://localhost:9093:
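
Besides the web UI, the active alerts can also be listed via the Alertmanager API – for example, with curl (jq here is optional, just for readability):

[simterm]

$ curl -s http://localhost:9093/api/v2/alerts | jq '.[].labels'

[/simterm]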

Loki, and additional labels

In the alerts, I'd like to display a bit more information than just the message "Test Loki OOM Killer Alert" – for example, the name of the Pod that was killed.

Adding labels by Promtail

The first option is to create new labels at the log-collection stage, in Promtail itself, via pipeline_stages – see Grafana: Loki – Prometheus-like counters and aggregation functions in LogQL and graphs of DNS requests to dnsmasq – for example:

- job_name: journal
  pipeline_stages:
  - match:
      selector: '{job="systemd-journal"}'
      stages:
      - regex:
          expression: '.*level=(?P<level>[a-zA-Z]+).*'
      - labels:
          level:
      - regex:
          expression: '.*source="(?P<source>[a-zA-Z]+)".*'
      - labels:
          source:
  journal:
    labels:
      job: systemd-journal
    max_age: 12h
    path: /var/log/journal
  relabel_configs:
  - source_labels:
    - __journal__systemd_unit
    target_label: unit
  - source_labels:
    - __journal__hostname
    target_label: hostname

Here, for testing, I've created new labels attached to the logs – source and level.

Another option with Promtail is using static_labels.
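
A minimal sketch of such a stage, assuming we just want to attach a fixed label to all journal entries (the env label is purely illustrative):

- job_name: journal
  pipeline_stages:
  - static_labels:
      env: dev
  journal:
    labels:
      job: systemd-journal
    max_age: 12h
    path: /var/log/journal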

But there is a big problem here: Loki creates a separate log stream for each unique set of labels, and each stream gets its own indexes and data chunks. As a result, high label cardinality leads, firstly, to performance problems, and secondly, to higher costs, because read and write requests for every index and data chunk go to the shared store – in our case AWS S3, where each request costs money.

See a great post on this topic here – Grafana Loki and what can go wrong with label cardinality .

Adding labels from queries in Loki

Instead, we can create new labels directly in the query using Loki's LogQL.

Let’s take an entry from the log, which tells about the OOM Killer:

E1213 16:52:25.879626 3382 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\\\": rpc error: code = Unknown desc = failed to start sandbox container for pod \\\"oom-test\\\": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown\"" pod="default/oom-test" podUID=f02523a9-43a7-4370-85dd-1da7554496e6

Here we have the pod field with the name of the Pod that was killed – pod="default/oom-test".

We can use a regex like pod=".*/(?P<pod>[a-zA-Z].*)".* to create a Named Capturing Group – check it, for example, on the https://regex101.com website:

Update the query in Loki:

{hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*`

And as a result, we get a label named pod with the value "oom-test":

Check the alert query with sum() and rate():

sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [5m])) by (pod)

And the result is:

Update the alert – add a description using the {{ $labels.pod }} template variable:

- alert: TESTLokiRuler Systemd journal
  expr: |
    sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `.*pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [15m])) by (pod) > 1
  for: 1s
  labels:
    severity: info
  annotations:
    summary: Test Loki OOM Killer Alert
    description: "Killed pod: `{{ $labels.pod }}`"

Wait for it to fire:

And in a Slack:

Grafana Loki, and 502/504 errors

I can't reproduce it now, but sometimes Grafana doesn't get a response from Loki in time, and the query fails with a 502 or 504 error.

There is a thread on GitHub about this; for me, increasing the HTTP timeouts in the Loki ConfigMap helped:

...
server:
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
...

Done.

Useful links