For general information on Grafana Loki, see the post Grafana Loki: architecture and running in Kubernetes with AWS S3 storage and boltdb-shipper.
Among other services that make up Loki, there is a separate service called Ruler that is responsible for alerts generated directly from logs.
The idea is very simple:
- create a file with alerting rules in a Prometheus-like format
- mount it into the ruler Pod (loki-read in the case of the simple-scalable deployment)
The Ruler then parses the logs according to the rules specified in that file, and if some expression triggers, the Ruler sends an alert to the Alertmanager API.
The alerts will be described in a ConfigMap, which will then be mounted into the Pods with the Ruler.
Documentation – Rules and the Ruler.
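In its simplest form, such a rules file is just a Prometheus-style rules file with a LogQL expression instead of PromQL. A minimal sketch – the group name, alert name, and threshold here are only an illustration, the real rule for our case is created below:

groups:
  - name: example
    rules:
      - alert: TooManyJournalErrors
        expr: |
          sum(rate({job="systemd-journal"} |= "error" [5m])) by (hostname) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Too many errors in the systemd journal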
Test Pod for the OOM-Killed message
I want to test how OOM Killed works, so let's create a Kubernetes Pod with deliberately low limits that will be killed "on the fly":
---
apiVersion: v1
kind: Pod
metadata:
  name: oom-test
  labels:
    test: "true"
spec:
  containers:
    - name: oom-test
      image: openjdk
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      resources:
        limits:
          memory: "1Mi"
  nodeSelector:
    kubernetes.io/hostname: eks-node-dev_data_services-i-081719890438d467f
A node is specified in the nodeSelector to make it easier to find the logs in Loki.
When this Pod is started, Kubernetes will kill it for exceeding its limits, and journald on the WorkerNode will write an event to the system log, which is collected by Promtail:
[simterm]
$ kk -n monitoring get cm logs-promtail -o yaml
...
    - job_name: journal
      journal:
        labels:
          job: systemd-journal
        max_age: 12h
        path: /var/log/journal
      relabel_configs:
        - source_labels:
            - __journal__systemd_unit
          target_label: unit
        - source_labels:
            - __journal__hostname
          target_label: hostname
[/simterm]
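With this config, the journal records get the job, unit, and hostname labels, so they can be found in Loki with a query like the following (the hostname here is the node from our nodeSelector):

{job="systemd-journal", hostname="eks-node-dev_data_services-i-081719890438d467f"}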
Let’s start our pod:
[simterm]
$ kk apply -f test-oom.yaml
pod/oom-test created
[/simterm]
Check it:
[simterm]
$ kk describe pod oom-test
...
Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               91s                 default-scheduler  Successfully assigned default/oom-test to ip-10-0-0-27.us-west-2.compute.internal
  Normal   SandboxChanged          79s (x12 over 90s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  78s (x13 over 90s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "oom-test": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown
[/simterm]
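If you have SSH access to the WorkerNode, the same message can also be found directly in the systemd journal, for example (assuming kubelet runs as a systemd unit and writes to journald, as it does on the standard EKS AMI):

[simterm]
$ journalctl -u kubelet --since "10 minutes ago" | grep -i oom-killed
[/simterm]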
And check the Loki logs:
Ok, now we have an oom-killed Pod for tests – let’s build a query for a future alert.
Building a query in Loki
We searched for the logs with the query {hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" – let's use it for a test alert.
First, let's check what Loki itself will draw for us – use rate() and sum(), see Log range aggregations:
sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname)
Good!
We can already work with this – create a test alert.
Creating an alert for Loki Ruler
Create a file with the ConfigMap:
kind: ConfigMap
apiVersion: v1
metadata:
  name: rules-alerts
  namespace: monitoring
data:
  rules.yaml: |-
    groups:
      - name: systemd-alerts
        rules:
          - alert: TESTLokiRuler Systemd journal
            expr: |
              sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" [5m])) by (hostname) > 1
            for: 1s
            labels:
              severity: info
            annotations:
              summary: Test Loki OOM Killer Alert
Deploy it:
[simterm]
$ kk apply -f rule-cm.yaml
configmap/rules-alerts created
[/simterm]
The Ruler and its ConfigMap volume
Next, we need to mount this ConfigMap into the directory specified in Loki's config for the ruler component:
...
ruler:
  storage:
    local:
      directory: /var/loki/rules
...
Our Ruler works in the loki-read Pods – edit their StatefulSet:
[simterm]
$ kk -n monitoring edit sts loki-read
[/simterm]
Describe a new volume:
...
      volumes:
        - configMap:
            defaultMode: 420
            name: rules-alerts
          name: rules
...
And its mount into the Pod as /var/loki/rules/fake/rules.yaml, where fake is the tenant_id, if used:
...
        volumeMounts:
          - mountPath: /etc/loki/config
            name: config
          - mountPath: /tmp
            name: tmp
          - mountPath: /var/loki
            name: data
          - mountPath: /var/loki/rules/fake/rules.yaml
            name: rules
            subPath: rules.yaml
...
In the subPath, we set a key from the ConfigMap so that it is mounted exactly as a file.
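After the StatefulSet is updated and the Pods are recreated, it is worth checking that the ConfigMap's key is really mounted as a file, for example:

[simterm]
$ kk -n monitoring exec loki-read-0 -- cat /var/loki/rules/fake/rules.yaml
[/simterm]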
Configuring Ruler alerting
Find the Alertmanager URL:
[simterm]
$ kk -n monitoring get svc
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
...
prometheus-kube-prometheus-alertmanager   ClusterIP   172.20.240.159   <none>        9093/TCP   110d
...
[/simterm]
In Loki's ConfigMap, specify this address for the ruler:
...
ruler:
  storage:
    local:
      directory: /var/loki/rules
    type: local
  alertmanager_url: http://prometheus-kube-prometheus-alertmanager:9093
...
All options for the ruler are described here>>>.
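For reference, a slightly fuller ruler block could look like the sketch below. The extra fields here – rule_path, enable_api, evaluation_interval – are optional and not used in this setup, so check the documentation for your Loki version:

ruler:
  storage:
    type: local
    local:
      directory: /var/loki/rules
  # directory for temporary rule files processed by the Ruler (optional)
  rule_path: /tmp/loki/rules-temp
  alertmanager_url: http://prometheus-kube-prometheus-alertmanager:9093
  # enable the Ruler HTTP API, e.g. /loki/api/v1/rules (optional)
  enable_api: true
  # how often the rules are evaluated (optional)
  evaluation_interval: 1m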
Open access to the Alertmanager to check alerts:
[simterm]
$ kk -n monitoring port-forward svc/prometheus-kube-prometheus-alertmanager 9093:9093
[/simterm]
Restart the loki-read Pods – this can simply be done with kubectl delete pod – and check their logs:
[simterm]
$ kk -n monitoring logs -f loki-read-0
...
level=info ts=2022-12-13T16:37:33.837173256Z caller=metrics.go:133 component=ruler org_id=fake latency=fast query="(sum by(hostname)(rate({hostname=\"eks-node-dev_data_services-i-081719890438d467f\"} |~ \".*OOM-killed.*\"[5m])) > 1)" query_type=metric range_type=instant length=0s step=0s duration=120.505858ms status=200 limit=0 returned_lines=0 throughput=48MB total_bytes=5.8MB total_entries=1 queue_time=0s subqueries=1
...
[/simterm]
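If the Ruler API is enabled (the enable_api option mentioned above), the loaded rules can also be checked directly on one of the loki-read Pods over its HTTP port (3100 by default) – run the port-forward in one terminal and the curl in another:

[simterm]
$ kk -n monitoring port-forward loki-read-0 3100:3100
$ curl -s localhost:3100/loki/api/v1/rules
[/simterm]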
Check Alerts in the Alertmanager – http://localhost:9093 :
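The same alerts can be checked from the command line via the Alertmanager API, using the port-forward opened above (jq here is only for readability):

[simterm]
$ curl -s localhost:9093/api/v2/alerts | jq '.[].labels'
[/simterm]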
Loki and additional labels
In the alerts, I would like to display a little more information than just the message “Test Loki OOM Killer Alert”. For example, let’s display the name of a Pod that was killed.
Adding labels to Promtail
The first option is to create new labels at the log collection stage, in Promtail itself, via pipeline_stages – see the post Grafana: Loki – the LogQL's Prometheus-like counters, aggregation functions, and dnsmasq's requests graphs. For example:
    - job_name: journal
      pipeline_stages:
        - match:
            selector: '{job="systemd-journal"}'
            stages:
              - regex:
                  expression: '.*level=(?P<level>[a-zA-Z]+).*'
              - labels:
                  level:
              - regex:
                  expression: '.*source="(?P<source>[a-zA-Z]+)".*'
              - labels:
                  source:
      journal:
        labels:
          job: systemd-journal
        max_age: 12h
        path: /var/log/journal
      relabel_configs:
        - source_labels:
            - __journal__systemd_unit
          target_label: unit
        - source_labels:
            - __journal__hostname
          target_label: hostname
Here, for testing, I created new labels that are attached to the logs – source and level.
Another option with Promtail is to use static_labels.
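A minimal sketch of that approach – the log_source label name here is just an example:

    - job_name: journal
      pipeline_stages:
        - static_labels:
            log_source: journal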
But there is a problem here: Loki creates a separate log stream for each unique set of labels, with its own indexes and data chunks. As a result we get, firstly, performance problems, and secondly, extra cost, because every index and data chunk produces read-write requests to the shared store – in our case AWS S3, where you pay for each request.
See a great post on this topic here – Grafana Loki and what can go wrong with label cardinality.
Adding labels from queries in Loki
Instead, we can create new labels directly from the query using Loki itself.
Let's take an entry from the log that describes the work of the OOM Killer:
E1213 16:52:25.879626 3382 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"oom-test_default(f02523a9-43a7-4370-85dd-1da7554496e6)\\\": rpc error: code = Unknown desc = failed to start sandbox container for pod \\\"oom-test\\\": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown\"" pod="default/oom-test" podUID=f02523a9-43a7-4370-85dd-1da7554496e6
Here, we have a field pod with the name of the Pod that was killed – pod="default/oom-test".
We can use a regex in the form pod=".*/(?P<pod>[a-zA-Z].*)".* to create a Named Capturing Group – check it, for example, at https://regex101.com:
Update the query in Loki:
{hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*`
And as a result, we get a new label called pod with the value "oom-test":
Check the alert query with the sum() and rate():
sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [5m])) by (pod)
The result:
Update the alert – add a description with the {{ $labels.pod }}:
          - alert: TESTLokiRuler Systemd journal
            expr: |
              sum(rate({hostname="eks-node-dev_data_services-i-081719890438d467f"} |~ ".*OOM-killed.*" | regexp `.*pod=".*/(?P<pod>[a-zA-Z].*)".*` | pod!="" [15m])) by (pod) > 1
            for: 1s
            labels:
              severity: info
            annotations:
              summary: Test Loki OOM Killer Alert
              description: "Killed pod: `{{ $labels.pod }}`"
Wait for it to fire:
And in Slack:
Grafana Loki and 502/504 errors
I can't reproduce it now, but sometimes Grafana does not wait for a response from Loki to a query and fails with 502 or 504 errors.
There is a thread on GitHub about it; what helped me was increasing the HTTP timeouts in Loki's ConfigMap:
...
server:
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
...
In general, that’s all for now.
Useful links
- How to Setup Alerting With Loki
- Loki 2.0 released: Transform logs as you’re querying them, and set up alerts within Loki
- Labels from Logs
- The concise guide to labels in Loki
- How labels in Loki can make log queries faster and easier
- Grafana Loki and what can go wrong with label cardinality
- How to use LogQL range aggregations in Loki