Grafana Loki: alerts from the Loki Ruler and labels from logs


For general information on Grafana Loki, see the Grafana Loki: architecture and running in Kubernetes with AWS S3 storage and boltdb-shipper post.

Among the services that make up Loki there is a dedicated one called ruler, which is responsible for alerts that can be generated directly from logs.

The idea is very simple:

  • create a file with alerting rules in a Prometheus-like format
  • mount it into the ruler Pod (the loki-read Pods in the case of the simple-scalable deployment)
  • the ruler evaluates the rules from this file against the logs, and if an expression fires, it sends an alert to the Alertmanager API

The alerts will be described in a ConfigMap, which will then be mounted into the Pods running the Ruler.

Documentation – Rules and the Ruler.

Test Pod for the OOM-Killed message

I want to check how the OOM Killed alert will work, so let's create a Kubernetes Pod with deliberately low limits that will be killed right after it starts:

---
apiVersion: v1
kind: Pod
metadata:
  name: oom-test
  labels:
    test: "true"
spec:
  containers:
    - name: oom-test
      image: openjdk
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      resources:
        limits:
          memory: "1Mi"
  nodeSelector:
    kubernetes.io/hostname: eks-node-dev_data_services-i-081719890438d467f

Specify a node in the nodeSelector to make it easier to search in Loki.

When this Pod is started, Kubernetes will kill it for exceeding its limits, and journald on the WorkerNode will write an event to the system log, which is collected by Promtail:

[simterm]

$ kk -n monitoring get cm logs-promtail -o yaml
...
    - job_name: journal
      journal:
        labels:
          job: systemd-journal
        max_age: 12h
        path: /var/log/journal
      relabel_configs:
      - source_labels:
        - __journal__systemd_unit
        target_label: unit
      - source_labels:
        - __journal__hostname
        target_label: hostname

[/simterm]
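
With the relabel_configs above, the journald records get a hostname label, so a quick sanity check in Grafana Explore might look like this (the node name is the same one used in the nodeSelector):

{job="systemd-journal", hostname="eks-node-dev_data_services-i-081719890438d467f"}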

Let’s start our pod:

[simterm]

$ kk apply -f test-oom.yaml 
pod/oom-test created

[/simterm]

Check it:

[simterm]

$ kk describe pod oom-test
...
Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               91s                 default-scheduler  Successfully assigned default/oom-test to ip-10-0-0-27.us-west-2.compute.internal
  Normal   SandboxChanged          79s (x12 over 90s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  78s (x13 over 90s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "oom-test": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown

[/simterm]

And check the Loki logs:
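
In Grafana Explore, the OOM-Killer events can be found with a query like this (the same query is reused for the alert below):

{hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*"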

Ok, now we have an oom-killed Pod for tests – let’s build a query for a future alert.

Building a query in Loki

We found these log records with the query {hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" – let's use it for a test alert.

First, let's check what Loki itself will draw for us – we'll use rate() and sum(), see Log range aggregations:

sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname)

Good!

We can already work with this – create a test alert.

Creating an alert for Loki Ruler

Create a file with the ConfigMap:

kind: ConfigMap
apiVersion: v1
metadata:
  name: rules-alerts
  namespace: monitoring
data:
  rules.yaml: |-
    groups:
      - name: systemd-alerts
        rules:
          - alert: TESTLokiRuler Systemd journal
            expr: |
              sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname) > 1
            for: 1s
            labels:
                severity: info
            annotations:
                summary: Test Loki OOM Killer Alert

Deploy it:

[simterm]

$ kk apply -f rule-cm.yaml 
configmap/rules-alerts created

[/simterm]

The Ruler and its ConfigMap volume

Next, we need to mount this ConfigMap into the rules directory that is specified in Loki's config for the ruler component:

...
    ruler:
      storage:
        local:
          directory: /var/loki/rules
...
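
With the local rules storage, the Ruler picks up rule files from the <directory>/<tenant_id>/ path, so for the default fake tenant the expected layout inside the Pod is roughly the following:

/var/loki/rules/
└── fake/
    └── rules.yaml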

Our Ruler runs in the loki-read Pods – edit their StatefulSet:

[simterm]

$ kk -n monitoring edit sts loki-read

[/simterm]

Add a new volume:

...
      volumes:
      - configMap:
          defaultMode: 420
          name: rules-alerts
        name: rules
...

And mount it into the Pod as /var/loki/rules/fake/rules.yaml, where fake is the tenant_id (the default tenant name when multi-tenancy is not enabled):

...
        volumeMounts:
        - mountPath: /etc/loki/config
          name: config
        - mountPath: /tmp
          name: tmp
        - mountPath: /var/loki
          name: data
        - mountPath: /var/loki/rules/fake/rules.yaml
          name: rules
          subPath: rules.yaml
...

In the subPath we set the key from the ConfigMap so that it is mounted as a single file.
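
To make sure the file actually appears inside the Pod after the change, it can be checked with kubectl exec (assuming the first replica is named loki-read-0, as in the logs below):

[simterm]

$ kk -n monitoring exec loki-read-0 -- cat /var/loki/rules/fake/rules.yaml

[/simterm]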

Configuring Ruler alerting

Find the Alertmanager URL:

[simterm]

$ kk -n monitoring get svc
NAME                                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
...
prometheus-kube-prometheus-alertmanager          ClusterIP   172.20.240.159   <none>        9093/TCP                     110d
...

[/simterm]

In Loki's ConfigMap, specify this address for the ruler:

...
    ruler:
      storage:
        local:
          directory: /var/loki/rules
        type: local
      alertmanager_url: http://prometheus-kube-prometheus-alertmanager:9093
...

All the options for the ruler are described here>>>.
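
As a side note, if the Ruler API is enabled with the enable_api option in the ruler block, the loaded rules can also be listed over HTTP – for example, via a port-forward to a loki-read Pod on Loki's default HTTP port 3100:

[simterm]

$ curl -s localhost:3100/loki/api/v1/rules

[/simterm]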

To check the alerts, open access to the Alertmanager with a port-forward:

[simterm]

$ kk -n monitoring port-forward svc/prometheus-kube-prometheus-alertmanager 9093:9093

[/simterm]

Restart the loki-read Pods – this can simply be done with kubectl delete pod – and check their logs:

[simterm]

$ kk -n monitoring  logs -f loki-read-0
...
level=info ts=2022-12-13T16:37:33.837173256Z caller=metrics.go:133 component=ruler org_id=fake latency=fast query="(sum by(hostname)(rate({hostname=\"eks-node-dev_data_services-i-081719890438d467f\"} |~ \".*OOM-killed.*\"[5m])) > 1)" query_type=metric range_type=instant length=0s step=0s duration=120.505858ms status=200 limit=0 returned_lines=0 throughput=48MB total_bytes=5.8MB total_entries=1 queue_time=0s subqueries=1
...

[/simterm]

Check the alerts in the Alertmanager UI at http://localhost:9093:
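
Besides the UI, the active alerts can also be checked through the Alertmanager API over the same port-forward – jq here is used only for readability:

[simterm]

$ curl -s localhost:9093/api/v2/alerts | jq '.[].labels'

[/simterm]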

Loki and additional labels

In the alerts, I'd like to display a bit more information than just the message "Test Loki OOM Killer Alert" – for example, the name of the Pod that was killed.

Adding labels to Promtail

The first option is to create new labels at the log collection stage, in Promtail itself, via the pipeline_stages – see the Grafana: Loki – the LogQL's Prometheus-like counters, aggregation functions, and dnsmasq's requests graphs post. For example:

- job_name: journal
  pipeline_stages:
  - match:
      selector: '{job="systemd-journal"}'
      stages:
      - regex:
          expression: '.*level=(?P<level>[a-zA-Z]+).*'
      - labels:
          level:
      - regex:
          expression: '.*source="(?P<source>[a-zA-Z]+)".*'
      - labels:
          source:
  journal:
    labels:
      job: systemd-journal
    max_age: 12h
    path: /var/log/journal
  relabel_configs:
  - source_labels:
    - __journal__systemd_unit
    target_label: unit
  - source_labels:
    - __journal__hostname
    target_label: hostname

Here, for testing, I created new labels that are attached to the log records – source and level.

Another option with Promtail is to use the static_labels stage.
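
A minimal sketch of that approach (the logtype label and its value here are purely illustrative):

- job_name: journal
  pipeline_stages:
  - static_labels:
      logtype: journal
  journal:
    labels:
      job: systemd-journal
    max_age: 12h
    path: /var/log/journal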

But there is a problem here: Loki creates a separate log stream for every unique set of labels, with its own indexes and data chunks. As a result, we get problems first with performance and then with cost, because every index and data chunk generates read-write requests to the shared store – in our case AWS S3, where each request costs money.

See a great post on this topic here – Grafana Loki and what can go wrong with label cardinality.

Adding labels from queries in Loki

Instead, we can create new labels directly in the query, using Loki itself.

Let's take a log entry that describes the OOM Killer in action:

E1213 16:52:25.879626 3382 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\\\": rpc error: code = Unknown desc = failed to start sandbox container for pod \\\"oom-test\\\": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown\"" pod="default/oom-test" podUID=f02523a9-43a7-4370-85dd-1da7554496e6

Here we have a pod field with the name of the Pod that was killed – pod="default/oom-test".

We can use a regex like pod=".*/(?P<pod>[a-zA-Z].*)".* with a Named Capturing Group – check it, for example, at https://regex101.com:

Update the query in Loki:

{hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*`

And as a result, we get a new label called pod with the value “oom-test“:

Check the alert query with sum() and rate():

sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [5m])) by (pod)

The result:

Update the alert – add a description with {{ $labels.pod }}:

- alert: TESTLokiRuler Systemd journal
  expr: |
    sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `.*pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [15m])) by (pod) > 1
  for: 1s
  labels:
      severity: info
  annotations:
      summary: Test Loki OOM Killer Alert
      description: "Killed pod: `{{ $labels.pod }}`"

Wait for it to fire:

And in Slack:

Grafana Loki and 502/504 errors

I can't reproduce it now, but sometimes Grafana doesn't get a response from Loki in time and the query fails with a 502 or 504 error.

There is a thread on GitHub; in my case, increasing the HTTP timeouts in the Loki ConfigMap helped:

...
server:
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
...

In general, that’s all for now.
