Category Archives: Troubleshooting

Solutions to various problems

Kubernetes: a cluster’s monitoring with the Prometheus Operator

13 August 2020
 

 Continuing from the Kubernetes: monitoring with Prometheus – exporters, a Service Discovery, and its roles post, where we configured Prometheus manually to see how it works, let’s now try the Prometheus Operator installed via a Helm chart. So, the task is to spin up a Prometheus server and all necessary exporters in an AWS Elastic Kubernetes… Read More »

Kubernetes: HorizontalPodAutoscaler – an overview with examples

12 August 2020
 

 Kubernetes HorizontalPodAutoscaler automatically scales Kubernetes Pods under ReplicationController, Deployment, or ReplicaSet controllers based on their CPU, memory, or other metrics. It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let’s go deeper and check all the options available for scaling. For HPA you can use three… Read More »

Prometheus: yet-another-cloudwatch-exporter – collecting AWS CloudWatch metrics

23 July 2020
 

 Currently, to collect metrics from AWS CloudWatch we are using AWS’s own cloudwatch-exporter (see the Prometheus: CloudWatch exporter – collecting metrics from AWS and graphs in Grafana post, in Russian), but it has a few gaps: it’s written in Java, so it uses CPU/memory of the monitoring host; it doesn’t scrape AWS tags from resources; it uses… Read More »

Kubernetes: 503 no endpoints available for service – causes and solutions

15 June 2020
 

 We have a Redis service running behind a Service with the ClusterIP type. This Redis must be accessible by pods from the same namespace (a Gorush service). The problem is that those pods can’t connect to the Redis service using its gorush-server-redis-svc:6379 name and report “Can’t connect redis server: connection refused”: $ kk -n gorush-test… Read More »

Docker: configure tzdata and timezone during build

17 May 2020
 

 During a Docker image build, the process stops and asks to configure tzdata. The Dockerfile at this moment is as follows: FROM ubuntu:18.04 RUN apt update && apt install -y python-pip python-dev ssh python-boto3 RUN pip install ansible==2.4.3.0 Let’s reproduce it by running the build: admin@jenkins-production:~$ docker build -t proj/proj-ansible:1.1 . Sending build context to Docker… Read More »

Helm: helm-secrets – sensitive data encryption with AWS KMS and use it with Jenkins

16 May 2020
 

 So, as a follow-up to the Helm: Kubernetes package manager – an overview, getting started post, let’s discuss sensitive data in our Helm charts. What I want is to store a chart’s files in a repository, but even if that repo is a private GitHub repo, I still don’t want… Read More »

AWS Elastic Kubernetes Service: a cluster creation automation, part 2 – Ansible, eksctl

1 May 2020
 

 The first part is AWS Elastic Kubernetes Service: a cluster creation automation, part 1 – CloudFormation. As a reminder, the whole idea is to automate the creation of an EKS cluster: Ansible uses the cloudformation module to create the infrastructure; by using the Outputs of the created CloudFormation stack, Ansible from a template will… Read More »

Linux: no sound after suspend/sleep – solution

30 April 2020
 

 I have a laptop with Arch Linux. It is suspended with systemctl suspend. The issue: after it wakes up, there is no sound. From the information found during the investigation (see links below), the issue happens because of the NVIDIA drivers and isn’t specific to Arch Linux; it can happen with any other Linux… Read More »

Steam: Civilization V, Arch Linux and ERROR:Invalid resolutions constraints: 0x0 must not be greater than 0x0

27 April 2020
 

 I’m using Steam on Arch Linux (see the Arch Linux: Steam installation post). I also have the Civilization V game here, my favorite game for the last N years. I have NVIDIA drivers installed and most of the games work fine, even World of Tanks, but I haven’t been able to play Civilization since the moment I installed Steam… Read More »

AWS: eksctl – “Put http://169.254.169.254/latest/api/token: net/http: request canceled”

26 April 2020
 

 We have a Docker image with the eksctl tool included. We also have a Linux EC2 instance with eksctl. There is an AWS IAM Instance Profile attached to this EC2 with the AdminAccess policy assigned. On this EC2 we have Jenkins running in a Docker container, and it spawns its jobs inside additional… Read More »