Category Archives: Monitoring

Hardware, services and network monitoring systems

AWS: CloudWatch – Multi source query: collecting metrics from an external Prometheus

13 December 2023
 

Another interesting announcement from the last re:Invent is that CloudWatch has added the ability to collect metrics from external resources (see the very interesting talk AWS re:Invent 2023 – Cloud operations for today, tomorrow, and beyond (COP227)). That is, we can now create graphs and/or alerts not only from the default metrics of CloudWatch itself,… Read More »

Grafana Loki: collecting AWS LoadBalancer logs from S3 with Promtail Lambda

25 November 2023
 

Currently, we are able to collect our API Gateway logs from CloudWatch Logs into Grafana Loki, see Loki: collecting logs from CloudWatch Logs using Lambda Promtail. But in the process of migrating to Kubernetes, we have Application Load Balancers that can only write logs to S3, and we need to learn how to collect… Read More »

VictoriaMetrics: pushing metrics without Prometheus Pushgateway

18 November 2023
 

 In the Prometheus: running Pushgateway on Kubernetes with Helm and Terraform post I wrote about how to add Pushgateway to Prometheus, which allows using the Push model instead of Pull, that is, an Exporter can send metrics directly to the database instead of waiting for Prometheus or VMAgent to come to it. With VictoriaMetrics, it’s… Read More »

VictoriaMetrics: VMAuth – Proxy, Authentication, and Authorization

27 August 2023
 

We continue to develop our monitoring stack. See the first part – VictoriaMetrics: creating a Kubernetes monitoring stack with its own Helm chart. What we want to do next: give developers access so that they can set Silences for alerts in Alertmanager themselves to avoid spamming Slack, see Prometheus: Alertmanager Web UI alerts… Read More »

Grafana: values from records in Loki logs, and dual-Y-axes panels in Grafana

19 August 2023
 

We have an AWS Lambda function that writes logs to CloudWatch Logs, from where lambda-promtail ships them to a Grafana Loki instance so that we can use them in Grafana graphs. The task: in the logs, we have records about “Init duration” and “Max Memory Used” by Lambdas. There are… Read More »

Grafana Loki: performance optimization with Recording Rules, caching, and parallel queries

19 August 2023
 

So, we have Loki installed from the chart in simple-scale mode, see Grafana Loki: architecture and running in Kubernetes with AWS S3 storage and boltdb-shipper. Loki is running on an AWS Elastic Kubernetes Service cluster, installed with the Loki Helm chart, AWS S3 is used as a long-term store, and BoltDB Shipper is used to… Read More »

AWS: Grafana Loki, InterZone traffic in AWS, and Kubernetes nodeAffinity

19 August 2023
 

Traffic in AWS is generally quite an interesting and sometimes complicated thing. I once wrote about it in AWS: Cost optimization – services expenses overview and traffic costs in AWS. Now, it’s time to return to this topic again. So, what’s the problem: in AWS Cost Explorer, I’ve noticed that we have an… Read More »

VictoriaMetrics: deploying a Kubernetes monitoring stack

23 July 2023
 

Now we have VictoriaMetrics + Grafana on a regular EC2 instance, launched with Docker Compose, see VictoriaMetrics: an overview and its use instead of Prometheus. It was kind of a Proof of Concept, and it’s time to launch it “in an adult way” – in Kubernetes, with all the configurations stored in a… Read More »

VictoriaMetrics: an overview and its use instead of Prometheus

11 June 2023
 

I’ve heard a lot about VictoriaMetrics for a long time, and finally, it’s time to try it out. So, in a nutshell – VictoriaMetrics is “Prometheus on steroids” and is fully compatible with it: it can use its configuration files, exporters, PromQL, etc. So for me, as someone who has always used Prometheus, the first question… Read More »