Kubernetes: namespace hangs in Terminating and metrics-server non-obviousness

04/01/2021

I faced a very interesting issue while removing a Kubernetes Namespace.

After kubectl delete namespace NAMESPACE is executed, the namespace hangs in the Terminating state, and no attempt to forcibly remove it helps.

First, let’s see how such a force-removal can be attempted, and then we’ll find the real cause of this behavior and its solution.

Create a test namespace:

kubectl create namespace test-ns
namespace/test-ns created

Try to remove it – and it hangs:

kubectl delete namespace test-ns
namespace "test-ns" deleted

Check its state – it’s Terminating:

kubectl get ns test-ns
NAME      STATUS        AGE
test-ns   Terminating   50s

During this, nothing was printed to the API server logs.
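Also worth knowing: on reasonably recent Kubernetes versions the namespace itself usually records why its deletion is stuck in the status conditions, so it makes sense to check them first (the exact condition types and messages depend on the cluster version):

kubectl describe namespace test-ns
# or only the conditions:
kubectl get namespace test-ns -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'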

Ways to remove a Namespace

--force and --grace-period

Okay, maybe there are some resources left, and the namespace is waiting for them? Let’s find them all:

kubectl -n test-ns get all
No resources found.

No, nothing.
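Keep in mind that kubectl get all covers only a small, hardcoded subset of resource types. A more thorough sweep (a generic recipe, not something specific to this case) is to ask the API for every namespaced type and list them one by one:

kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n test-ns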

And deleting with --force and --grace-period=0 didn’t help either:

kubectl delete namespace test-ns --force --grace-period=0

The namespace is still present, and still in the Terminating state.

Clean up finalizers

After some googling, almost every solution found suggested removing the kubernetes entry from the finalizers – edit the namespace:

kubectl edit ns test-ns

And remove the kubernetes line there:

...
spec:
  finalizers:
  - kubernetes
...

Save the changes – and nothing happens. The namespace stays in the same state, and the finalizers=kubernetes entry comes back.
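For completeness: the way that actually drops this finalizer is to send the change to the namespace’s finalize subresource instead of editing the object directly (a sketch, assuming jq is installed). Keep in mind that this only bypasses the cleanup – it can leave orphaned resources behind and does not fix the underlying problem, which we’ll get to below:

kubectl get namespace test-ns -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/test-ns/finalize" -f -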

APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

But keep reading, it gets better.

If you execute the kubectl api-resources command, you may see an error about custom.metrics.k8s.io:

kubectl api-resources
error: unable to retrieve the complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

That pointed me to the idea that something was wrong with the metrics-server.

The metrics-server version?

The first thought was about the version being used, as we had installed it a long time ago and were still using 0.3.6:

grep -r metrics-server ../roles/controllers/ | grep github
../roles/controllers/tasks/main.yml:  command: "kubectl --kubeconfig={{ kube_config_path }} apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml"

Let’s try to install the latest one, 0.4.2:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.2/components.yaml
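And make sure the new version actually rolled out (the default components.yaml installs the Deployment into kube-system):

kubectl -n kube-system rollout status deployment/metrics-server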

And still nothing…

metrics-server arguments?

Well, maybe the issue is in the connection to the metrics-server?

Try adding --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP, and even enabling hostNetwork: true, like so:

...
    spec:
      hostNetwork: true
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
...
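If you don’t want to edit the whole manifest by hand, such an argument can also be appended with a JSON patch (assuming the default Deployment name and namespace from components.yaml):

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'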

Nope…

And all this time the kubectl top command works, so the metrics-server service itself is fine:

kubectl top node
NAME                                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-10-22-35-66.eu-west-3.compute.internal    136m         7%     989Mi           13%
ip-10-22-53-197.eu-west-3.compute.internal   125m         6%     925Mi           12%

So, everything is good? Kubernetes can connect to it, and can receive metrics?
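Indeed – metrics-server serves the metrics.k8s.io aggregated API, not custom.metrics.k8s.io. Its own registration can be checked directly (given that kubectl top works, it should report Available=True):

kubectl get apiservices v1beta1.metrics.k8s.io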

Kubernetes apiservices

Now, let’s go to the error message:

custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

And check the apiservice v1beta1.custom.metrics.k8s.io resource:

kubectl get apiservices v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                         AVAILABLE                 AGE
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   False (ServiceNotFound)   9m7s

Aha, here it is!

monitoring/prometheus-adapter False (ServiceNotFound)
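By the way, to spot every broken aggregated API at once, it is enough to filter the full APIService list:

kubectl get apiservices | grep False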

Kubernetes has to call the monitoring/prometheus-adapter Service – but let’s check whether it exists at all (kk below is just an alias for kubectl):

kk -n monitoring get svc
NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
prometheus-grafana                      ClusterIP   172.20.167.133   <none>        80/TCP             12m
prometheus-kube-state-metrics           ClusterIP   172.20.32.255    <none>        8080/TCP           12m
prometheus-operated                     ClusterIP   None             <none>        9090/TCP           11m
prometheus-prometheus-node-exporter     ClusterIP   172.20.228.75    <none>        9100/TCP           12m
prometheus-prometheus-oper-operator     ClusterIP   172.20.131.64    <none>        8080/TCP,443/TCP   12m
prometheus-prometheus-oper-prometheus   ClusterIP   172.20.205.138   <none>        9090/TCP           12m

No, prometheus-adapter is absent.

And that’s why the custom.metrics.k8s.io API was broken.
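A side note: if custom metrics were not needed at all, the stale APIService object could simply be deleted, which also unblocks namespace deletion:

kubectl delete apiservice v1beta1.custom.metrics.k8s.io

In our case we do want the Prometheus Adapter, though.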

Install the adapter:

helm -n monitoring install prometheus-adapter stable/prometheus-adapter
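Note: the old stable Helm repository has since been deprecated; the same chart now lives in the prometheus-community repository, so the equivalent would be:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm -n monitoring install prometheus-adapter prometheus-community/prometheus-adapter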

Check services again:

kk -n monitoring get svc
NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
prometheus-adapter                      ClusterIP   172.20.59.130    <none>        443/TCP            21s
...

And apiservices:

kubectl get apiservices v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                         AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True        56s
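To double-check that the aggregated API really responds now, you can query it directly (jq here is just for readability):

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .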

And our namespace:

kk get ns test-ns
NAME      STATUS        AGE
test-ns   Terminating   19m

And after a few seconds:

kk get ns test-ns
Error from server (NotFound): namespaces "test-ns" not found

Done.


