Category Archives: Prometheus

Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.

Prometheus: Alertmanager’s alerts receivers and routing based on severity level and tags

26 March 2019
 

We have three working environments: Dev, Stage, and Production. There are also a number of alerts with different severities: info, warning, and critical. For example: … - name: SSLexpiry.rules rules: - alert: SSLCertExpiring30days expr: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 86400 * 30 for: 10m labels: severity: info annotations: summary: "SSL certificate warning" description: "SSL certificate… Read More »
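As a rough illustration of severity-based routing, an Alertmanager routing tree could look like the sketch below. It is not the configuration from the post: the receiver names are hypothetical, and each receiver's notification settings are left out.

```yaml
# Hypothetical sketch of routing on the "severity" label.
# Receiver names are illustrative, not taken from the post.
route:
  receiver: 'default'            # fallback when no child route matches
  group_by: ['alertname']
  routes:
    - match:
        severity: critical
      receiver: 'critical-receiver'
    - match:
        severity: warning
      receiver: 'warning-receiver'
    # alerts with severity: info fall through to 'default'

receivers:
  - name: 'default'
  - name: 'critical-receiver'
  - name: 'warning-receiver'
  # each receiver would carry its own slack_configs, email_configs, etc.
```

Newer Alertmanager releases deprecate `match` in favor of `matchers`; the sketch uses `match` because it was the common form when the post was written.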

Prometheus: Alertmanager – send alerts to a “/dev/null”

26 March 2019
 

This post follows up on Prometheus: Alertmanager’s alerts receivers and routing based on severity level and tags. We have an Alertmanager config with routes, and the task is to send all alerts from the Dev environment to “/dev/null”. To do this, create an empty receiver: … receivers: - name: 'blackhole' - name: 'default' slack_configs: - send_resolved:… Read More »
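A minimal sketch of the empty-receiver idea follows. It assumes the alerts carry an `env` label with the value `dev`; that label name is an assumption, not taken from the post. A receiver that has only a name and no notification settings sends nothing, so any alert routed to it is dropped.

```yaml
# Hypothetical sketch: drop Dev alerts by routing them to an empty receiver.
# The "env: dev" label is an assumption; use whatever label marks the environment.
route:
  receiver: 'default'
  routes:
    - match:
        env: dev
      receiver: 'blackhole'

receivers:
  - name: 'blackhole'   # no *_configs defined, so nothing is sent
  - name: 'default'     # notification settings omitted in this sketch
```

Child routes are checked in order and the first match wins unless `continue: true` is set, so a blackhole route like this is usually placed before any more general routes.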

Prometheus: RTFM blog monitoring set up with Ansible – Grafana, Loki, and promtail

10 March 2019
 

After implementing Loki on a project at work, I decided to set it up for myself as well, to see my RTFM blog server’s logs. I also want to add node_exporter and Alertmanager, to be notified about high disk usage. In this post, I’ll describe the Prometheus, node_exporter, Grafana, Loki, and promtail setup process… Read More »

Prometheus: blackbox-exporter probe_http_status_code == 0 and its debug

6 March 2019
 

Today I decided to upgrade Grafana to version 6.0, which has already been released, along with all other Docker images. The upgrade was successful, and Loki eventually started displaying previously missing log file names and other tags, but I immediately got a bunch of CRITICAL alerts in our Slack from the blackbox-exporter, which is used to check every… Read More »

Grafana Labs: Loki – using AWS S3 as a data storage and AWS DynamoDB for indexes

13 February 2019
 

Let’s proceed with the Loki system. The first post of this series is Grafana Labs: Loki – logs collecting and monitoring system, and the second is Grafana Labs: Loki – distributed system, labels and filters. Grafana has a Slack community with a dedicated #loki channel where you can ask for help (and it’s really helpful).… Read More »