Tag Archives: monitoring

Logz.io: collection logs from Kubernetes – fluentd vs filebeat

1 February 2021
 

 We use Logz.io to collect logs from our Kubernetes cluster (there is also a local Loki instance). Logs are collected and processed by a Fluentd pod on every WorkerNode; the pods are deployed by a DaemonSet in its default configuration, see the documentation here – logzio-k8s. The problem we faced is that those pods are consuming… Read More »

Prometheus: Alertmanager Web UI alerts Silence

26 January 2021
 

 How often Alertmanager re-sends notifications for active alerts is configured with the repeat_interval parameter in the /etc/alertmanager/config.yml file. We have this interval set to 15 minutes, and as a result we get alert notifications in our Slack every fifteen minutes. Still, some alerts are “known issues”, where we have already started investigating or fixing them, but… Read More »

Kubernetes: a cluster’s monitoring with the Prometheus Operator

13 August 2020
 

 This continues the Kubernetes: monitoring with Prometheus – exporters, a Service Discovery, and its roles post, where we configured Prometheus manually to see how it works. Now, let’s try the Prometheus Operator installed via a Helm chart. So, the task is to spin up a Prometheus server and all the necessary exporters in an AWS Elastic Kubernetes… Read More »

Kubernetes: HorizontalPodAutoscaler – an overview with examples

12 August 2020
 

 Kubernetes HorizontalPodAutoscaler automatically scales the Pods managed by a ReplicationController, Deployment, or ReplicaSet controller based on their CPU, memory, or other metrics. It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let’s go deeper and check all the options available for scaling. For HPA you can use three… Read More »

Kubernetes: monitoring with Prometheus – exporters, a Service Discovery, and its roles

26 April 2020
 

 The next task with our Kubernetes cluster is to set up its monitoring with Prometheus. This task is complicated by the fact that there is a whole bunch of resources that need to be monitored: from the infrastructure side – EC2 WorkerNode instances, their CPU, memory, network, disks, etc.; key services of Kubernetes itself – its… Read More »

Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler

15 February 2020
 

 Let’s assume we already have an AWS EKS cluster with worker nodes. In this post we will connect to a newly created cluster, create a test deployment with an HPA (Kubernetes Horizontal Pod AutoScaler), and try to get information about resource usage using kubectl top. Kubernetes cluster. Create a test cluster using… Read More »

Grafana: Loki – the LogQL’s Prometheus-like counters, aggregation functions and dnsmasq’s requests graphs

17 November 2019
 

 The last time I configured Loki for log collection and monitoring was in February 2019, almost a year ago, when Loki was still in its Beta state; see the Grafana Labs: Loki – logs collecting and monitoring system post. Now we are facing outgoing traffic issues in our Production environments and can’t find who is responsible for… Read More »

Debian: logrotate won’t rotate logs with an “unknown group ‘syslog’” error

9 October 2019
 

 We have an AWS EC2 instance with Debian and logrotate. One day its root partition ran out of space, and when I started investigating I found a bunch of files like /var/log/syslog.N.gz. At the same time, by default logrotate creates a config file to rotate syslog log files: root@monitoring-dev:~# cat /etc/logrotate.d/syslog # Ansible… Read More »

OpsGenie: Uptrends integration

24 September 2019
 

 Uptrends is a simple ping-based monitoring service already used for the RTFM blog (see the Prometheus: RTFM blog monitoring set up with Ansible – Grafana, Loki, and promtail post for more details). Now I’d like to add it as an additional alerting service for the project’s API endpoints and configure its alerts to be sent… Read More »