Category Archives: Monitoring

Hardware, services and network monitoring systems

AWS: Simple Email Service Bounce rate and monitoring with Prometheus

14 July 2021
 

 Recently, AWS blocked our AWS Simple Email Service because of its high bounce rate. This can be checked in AWS SES > Reputation Dashboard, where our account currently has the Under review status. After we contacted AWS Tech Support, they enabled it again, but we must solve the issue ASAP and have to monitor AWS SES… Read More »

Kubernetes: namespace hangs in Terminating and metrics-server non-obviousness

1 April 2021
 

 I ran into a very interesting issue while removing a Kubernetes Namespace. After kubectl delete namespace NAMESPACE is executed, the namespace hangs in the Terminating state, and no attempt to forcibly remove it helped. First, let’s see how such a force-removal can be done, and then we will check the real cause and a… Read More »

Opsgenie: integration with AWS RDS and alerting

18 March 2021
 

 Let’s configure Opsgenie with AWS RDS. The idea is to get notifications about events from RDS and send them to Opsgenie, which will forward them to our Slack. To do so, we need to configure AWS Simple Notification Service and AWS RDS Event subscriptions. The official documentation is here. Opsgenie configuration: Go to the Integrations… Read More »

Logz.io: collecting logs from Kubernetes – fluentd vs filebeat

1 February 2021
 

 We are using Logz.io to collect our Kubernetes cluster logs (there is also a local Loki instance). Logs are collected and processed by Fluentd pods, one on every WorkerNode, which are deployed from a DaemonSet in its default configuration; see the documentation here – logzio-k8s. The problem we faced is that those pods are consuming… Read More »

Prometheus: Alertmanager Web UI alerts Silence

26 January 2021
 

 How often Alertmanager re-sends notifications for active alerts is configured with repeat_interval in the /etc/alertmanager/config.yml file. We have this interval set to 15 minutes and, as a result, we get notifications about alerts in our Slack every fifteen minutes. Still, some alerts are “known issues”, where we have already started investigating or fixing them, but… Read More »

Linux: LEMP set up – NGINX, PHP, MySQL, SSL, monitoring, logs, and a WordPress blog migration

6 November 2020
 

 Finally got time to migrate the RTFM.CO.UA blog to a new server with Debian 10. This time I will set up a LEMP stack manually, without any automation. I wrote a similar post in 2016 – Debian: LEMP installation – NGINX + PHP-FPM + MariaDB (Rus), but this time the post is a more complete description of the process and… Read More »

AWS Elastic Kubernetes Service: load-testing and high-load tuning – problems and solutions

4 September 2020
 

 Actually, this post was planned as a short note about using NodeAffinity for a Kubernetes Pod. But then, as often happens, after starting to write about one thing, I faced another, and then another one, and as a result I made this long-read post about Kubernetes load-testing. So, I started with NodeAffinity, but then wondered how… Read More »

Kubernetes: a cluster’s monitoring with the Prometheus Operator

13 August 2020
 

 Continuing from the Kubernetes: monitoring with Prometheus – exporters, a Service Discovery, and its roles post, where we configured Prometheus manually to see how it works, let’s now try the Prometheus Operator installed via a Helm chart. So, the task is to spin up a Prometheus server and all necessary exporters in an AWS Elastic Kubernetes… Read More »

Kubernetes: HorizontalPodAutoscaler – an overview with examples

12 August 2020
 

 Kubernetes HorizontalPodAutoscaler automatically scales Kubernetes Pods under ReplicationController, Deployment, or ReplicaSet controllers based on their CPU, memory, or other metrics. It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let’s go deeper and check all the options available for scaling. For HPA you can use three… Read More »