Category Archives: Monitoring

Hardware, services and network monitoring systems

AWS: VPC Flow Logs, NAT Gateways, and Kubernetes Pods – a detailed overview

5 May 2024
 

 We have a relatively large spending on AWS NAT Gateway Processed Bytes, and it became interesting to know what exactly is processed through it. It would seem that everything is simple – just turn on VPC Flow Logs and see what’s what. But when it comes to AWS Elastic Kubernetes Service and NAT Gateways, things… Read More »

Kubernetes: tracing requests with AWS X-Ray, and Grafana data source

2 March 2024
 

 Tracing allows you to track requests between components, that is, for example, when using AWS and Kubernetes we can trace the entire path of a request from AWS Load Balancer to Kubernetes Pod and to DynamoDB or RDS. This helps us both to track performance issues – where and which requests are taking a long… Read More »

Terraform: creating a module for collecting AWS ALB logs in Grafana Loki

24 February 2024
 

 An example of creating a Terraform module to automate log collection from AWS Load Balancers in Grafana Loki. See how the scheme works in the Grafana Loki: collecting AWS LoadBalancer logs from S3 with Promtail Lambda blog. In short, ALB writes logs to an S3 bucket, from where they are picked up by a Lambda… Read More »

Grafana Loki: LogQL and Recording Rules for metrics from AWS Load Balancer logs

24 February 2024
 

 I didn’t plan this post at all as I thought I would do it quickly, but it didn’t work out quickly, and I need to dig a little deeper into this topic. So, what we are talking about: we have AWS Load Balancers, logs from which are collected to Grafana Loki, see. Grafana Loki: collecting… Read More »

Karpenter: its monitoring, and Grafana dashboard for Kubernetes WorkerNodes

18 February 2024
 

 We have an AWS Elastic Kubernetes Service cluster with Karpenter which is responsible for EC2 auto-scaling, see AWS: Getting started with Karpenter for autoscaling in EKS, and its installation with Helm. In general, there are no problems with it so far, but in any case we need to monitor it. For its monitoring, Karpenter provides… Read More »

AWS: CloudWatch – Multi source query: collecting metrics from an external Prometheus

13 December 2023
 

 Another interesting announcement from the last re:Invent is that CloudWatch has added the ability to collect metrics from external resources (see a very interesting report AWS re:Invent 2023 – Cloud operations for today, tomorrow, and beyond (COP227)). That is, we can now create graphs and/or alerts not only from the default metrics of CloudWatch itself,… Read More »

Grafana Loki: collecting AWS LoadBalancer logs from S3 with Promtail Lambda

25 November 2023
 

 Currently, we are able to collect our API Gateway logs from the CloudWatch Logs to Grafana Loki, see. Loki: collecting logs from CloudWatch Logs using Lambda Promtail. But in the process of migrating to Kubernetes, we have Application Load Balancers that can only write logs to S3, and we need to learn how to collect… Read More »

VictoriaMetrics: pushing metrics without Prometheus Pushgateway

18 November 2023
 

 In the Prometheus: running Pushgateway on Kubernetes with Helm and Terraform post I wrote about how to add Pushgateway to Prometheus, which allows using the Push model instead of Pull, that is, an Exporter can send metrics directly to the database instead of waiting for Prometheus or VMAgent to come to it. With VictoriaMetrics, it’s… Read More »

VictoriaMetrics: VMAuth – Proxy, Authentication, and Authorization

27 August 2023
 

  We continue to develop our monitoring stack. See the first part – VictoriaMetrics: creating a Kubernetes monitoring stack with its own Helm chart. What do we want to do next: give access to developers so that they can set Silence for alerts themselves in Alertmanager to avoid spamming Slack, see Prometheus: Alertmanager Web UI alerts… Read More »