Tag Archives: monitoring

Grafana: Loki – the LogQL’s Prometheus-like counters, aggregation functions and dnsmasq’s requests graphs

17 November 2019

The last time I configured Loki for log collection and monitoring was in February 2019, almost a year ago, when Loki was still in beta (see the Grafana Labs: Loki – logs collecting and monitoring system post). Now we are faced with outgoing traffic issues in our Production environments and can’t find who is guilty for… Read More »

Debian: logrotate won’t rotate logs with an “unknown group ‘syslog'” error

9 October 2019

We have an AWS EC2 instance with Debian and logrotate. One day its root partition ran out of space, and when I started investigating I found a bunch of files like /var/log/syslog.N.gz. At the same time, logrotate by default creates a config file to rotate the syslog log files: root@monitoring-dev:~# cat /etc/logrotate.d/syslog # Ansible… Read More »

OpsGenie: Uptrends integration

24 September 2019

Uptrends is a simple ping-style monitoring service that I already use for the RTFM blog (see the Prometheus: RTFM blog monitoring set up with Ansible – Grafana, Loki, and promtail post for more details). Now I’d like to add it as an additional alerting service for the project’s API endpoints and configure its alerts to be sent… Read More »

Prometheus: Alertmanager’s alerts receivers and routing based on severity level and tags

26 March 2019

We have three working environments: Dev, Stage, and Production. There are also a bunch of alerts with different severities: info, warning, and critical. For example: … - name: SSLexpiry.rules rules: - alert: SSLCertExpiring30days expr: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 86400 * 30 for: 10m labels: severity: info annotations: summary: "SSL certificate warning" description: "SSL certificate… Read More »

Prometheus: Alertmanager – send alerts to a “/dev/null”

26 March 2019

A follow-up to the Prometheus: Alertmanager’s alerts receivers and routing based on severity level and tags post. We have an Alertmanager config with routes. The task is to send all alerts from the Dev environment to “/dev/null”. To do this, create an empty receiver: … receivers: - name: 'blackhole' - name: 'default' slack_configs: - send_resolved:… Read More »

Monit: email alerting on an SSH logins

18 March 2019

The task is to send an email alert when an SSH login is made from a non-whitelisted IP. I’ll use Monit here. Install it: root@jenkins-dev:/home/admin# apt update && apt -y install monit Configure the email settings: set the mail server to localhost (we have a local Exim here), the email format, and the email recipient. Edit the /etc/monit/monitrc file: … set mailserver localhost… Read More »

Prometheus: RTFM blog monitoring set up with Ansible – Grafana, Loki, and promtail

10 March 2019

After implementing Loki on a project at work, I decided to add it for myself as well, to see my RTFM blog server’s logs. I also want to add node_exporter and alertmanager to be notified about high disk usage. In this post, I’ll describe the Prometheus, node_exporter, Grafana, Loki, and promtail setup process… Read More »

Prometheus: blackbox-exporter probe_http_status_code == 0 and its debug

6 March 2019

Today I decided to upgrade Grafana to the already released version 6.0, along with all the other Docker images. The upgrade was successful: Loki finally started displaying the previously missing log file names and other tags. But I immediately got a bunch of CRITICAL alerts in our Slack from the blackbox-exporter, which is used to check every… Read More »
