Kubernetes: namespace hangs in Terminating and metrics-server non-obviousness

By | 04/01/2021
 

I ran into a very interesting issue while removing a Kubernetes Namespace.

After kubectl delete namespace NAMESPACE is executed, the namespace hangs in the Terminating state, and no attempt to remove it forcibly helps.

First, let’s see how such a force-removal can be done, and then check the real cause of this behavior and its solution.

Create a test namespace:

[simterm]

$ kubectl create namespace test-ns
namespace/test-ns created

[/simterm]

Try to remove it – and it hangs:

[simterm]

$ kubectl delete namespace test-ns
namespace "test-ns" deleted

[/simterm]

Check its state – it’s Terminating:

[simterm]

$ kubectl get ns test-ns
NAME      STATUS        AGE
test-ns   Terminating   50s

[/simterm]

During this, nothing was printed to the API server logs.
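Besides the API server logs, the namespace object itself is worth a look. Here is a minimal sketch, assuming a reasonably recent Kubernetes version that reports deletion problems in .status.conditions. The namespace_conditions helper name is made up for this example.

```shell
# Hypothetical helper: print type, status and message of every
# condition set on a namespace, one per line, tab-separated.
namespace_conditions() {
  kubectl get namespace "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}
```

Running namespace_conditions test-ns on a stuck namespace may show a condition such as NamespaceDeletionDiscoveryFailure, if the cluster version sets it. On older clusters the output can simply be empty.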

Ways to remove a namespace

--force and --grace-period

Okay, maybe there are some resources and the namespace is waiting for them? Find them all:

[simterm]

$ kubectl -n test-ns get all
No resources found.

[/simterm]

No, nothing.
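Keep in mind that get all covers only a subset of resource types. A more thorough check, sketched below, walks over every namespaced resource type that the API server reports. The list_namespace_contents name is only an illustration.

```shell
# Hypothetical helper: list objects of every listable, namespaced
# resource type in the given namespace.
list_namespace_contents() {
  # Ask the API server which namespaced resource types support "list",
  # then query each of them in the namespace passed as $1.
  kubectl api-resources --verbs=list --namespaced -o name |
    while read -r resource; do
      kubectl get "$resource" -n "$1" --ignore-not-found --show-kind --no-headers </dev/null
    done
}
```

Call it as list_namespace_contents test-ns. If kubectl api-resources itself prints an error here, that is already a hint that API discovery is not healthy.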

Deleting with --force and --grace-period=0 didn’t help either:

[simterm]

$ kubectl delete namespace test-ns --force --grace-period=0

[/simterm]

The namespace is still present, and still in the Terminating state.

Clean up finalizers

After googling, almost every solution I found suggested removing kubernetes from the finalizers list by editing the namespace:

[simterm]

$ kubectl edit ns test-ns

[/simterm]

And remove the kubernetes line there:

...
spec:
  finalizers:
  - kubernetes
...

Save the changes, and nothing happens. The namespace stays in the same state, and the kubernetes finalizer comes back.
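To confirm that the finalizer is still there without opening the editor again, the field can be read directly. A small sketch; namespace_finalizers is a hypothetical helper name.

```shell
# Hypothetical helper: print the spec.finalizers list of a namespace.
namespace_finalizers() {
  kubectl get namespace "$1" -o jsonpath='{.spec.finalizers}'
}
```

For example, namespace_finalizers test-ns prints the current list, which here would still contain kubernetes.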

APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

But keep reading, it gets better.

If you execute the kubectl api-resources command, you may see an error about custom.metrics.k8s.io:

[simterm]

$ kubectl api-resources
error: unable to retrieve the complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

[/simterm]

That gave me the idea that something was wrong with the metrics-server.

A metrics-server version?

My first thought was about the version in use, as we installed it a long time ago and are still using 0.3.6:

[simterm]

$ grep -r metrics-server ../roles/controllers/ | grep github
../roles/controllers/tasks/main.yml:  command: "kubectl --kubeconfig={{ kube_config_path }} apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml"

[/simterm]

Let’s try to install the latest one, 0.4.2:

[simterm]

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.2/components.yaml

[/simterm]

And still nothing…
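To make sure the new version is really the one running, the image of the Deployment can be checked. This sketch assumes metrics-server runs as a Deployment named metrics-server in the kube-system namespace, which is how the upstream components.yaml sets it up as far as I know. The helper name is made up.

```shell
# Hypothetical helper: print the image of the first container in the
# metrics-server Deployment. Namespace defaults to kube-system (assumption).
metrics_server_image() {
  kubectl -n "${1:-kube-system}" get deployment metrics-server \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Call metrics_server_image without arguments, or pass another namespace if your installation differs.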

metrics-server arguments?

Well, maybe the problem is in the connection to the metrics-server?

Try setting --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP, and even enabling hostNetwork: true, like so:

...
    spec:
      hostNetwork: true
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
...

Nope…

Meanwhile, the kubectl top command works, so the metrics-server service itself is working:

[simterm]

$ kubectl top node
NAME                                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
ip-10-22-35-66.eu-west-3.compute.internal    136m         7%     989Mi           13%       
ip-10-22-53-197.eu-west-3.compute.internal   125m         6%     925Mi           12%

[/simterm]

So, everything is good? Kubernetes can connect to it, and can receive metrics?
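Note that kubectl top reads from the metrics.k8s.io API, while the error above mentions custom.metrics.k8s.io, which is a separate aggregated API. A small sketch to probe both of them directly; probe_api is a hypothetical helper name.

```shell
# Hypothetical helper: request the discovery document of an API
# group/version and report whether the API server could serve it.
probe_api() {
  if kubectl get --raw "/apis/$1" >/dev/null 2>&1; then
    echo "$1: OK"
  else
    echo "$1: FAILED"
  fi
}
```

For example, run probe_api metrics.k8s.io/v1beta1 and then probe_api custom.metrics.k8s.io/v1beta1 to compare the two.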

Kubernetes apiservices

Now, let’s go back to the error message:

custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

And check the apiservice v1beta1.custom.metrics.k8s.io resource:

[simterm]

$ kubectl get apiservices v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                         AVAILABLE                 AGE
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   False (ServiceNotFound)   9m7s

[/simterm]
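The name checked above is not arbitrary: APIService objects are named <version>.<group>, so the group/version from the error message can be turned into the resource name mechanically. A tiny sketch; apiservice_name is a hypothetical helper.

```shell
# Hypothetical helper: convert "group/version" (as printed in the
# discovery error) into the matching APIService name "version.group".
apiservice_name() {
  case "$1" in
    */*) printf '%s.%s\n' "${1##*/}" "${1%/*}" ;;
    *)   printf '%s.\n' "$1" ;;   # core group has no group part: "v1" -> "v1."
  esac
}
```

So apiservice_name custom.metrics.k8s.io/v1beta1 gives v1beta1.custom.metrics.k8s.io, which is the object to pass to kubectl get apiservices.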

Aha, here it is!

monitoring/prometheus-adapter False (ServiceNotFound)

Kubernetes has to call the monitoring/prometheus-adapter service, so check whether it’s available:

[simterm]

$ kubectl -n monitoring get svc
NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
prometheus-grafana                      ClusterIP   172.20.167.133   <none>        80/TCP             12m
prometheus-kube-state-metrics           ClusterIP   172.20.32.255    <none>        8080/TCP           12m
prometheus-operated                     ClusterIP   None             <none>        9090/TCP           11m
prometheus-prometheus-node-exporter     ClusterIP   172.20.228.75    <none>        9100/TCP           12m
prometheus-prometheus-oper-operator     ClusterIP   172.20.131.64    <none>        8080/TCP,443/TCP   12m
prometheus-prometheus-oper-prometheus   ClusterIP   172.20.205.138   <none>        9090/TCP           12m

[/simterm]

No, prometheus-adapter is absent.

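And that’s why custom.metrics.k8s.io was broken. It also explains the stuck namespace: before removing a namespace, Kubernetes has to discover every API resource type so it can clean up the objects of each type, and that discovery fails while an aggregated API is unavailable.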
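Any unavailable aggregated API can break API discovery in the same way, so it may be worth checking all of them at once instead of one by one. A minimal sketch, assuming the default column layout of kubectl get apiservices (NAME, SERVICE, AVAILABLE, AGE). The helper name is made up.

```shell
# Hypothetical helper: print every APIService whose AVAILABLE column
# is anything other than "True", together with its backing service.
unavailable_apiservices() {
  kubectl get apiservices --no-headers |
    awk '$3 != "True" { print $1 " -> " $2 }'
}
```

An empty output from unavailable_apiservices means every registered API reports itself as available.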

Install the adapter:

[simterm]

$ helm -n monitoring install prometheus-adapter stable/prometheus-adapter

[/simterm]

Check services again:

[simterm]

$ kubectl -n monitoring get svc
NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
prometheus-adapter                      ClusterIP   172.20.59.130    <none>        443/TCP            21s
...

[/simterm]

And apiservices:

[simterm]

$ kubectl get apiservices v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                         AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True        56s

[/simterm]

And our namespace:

[simterm]

$ kubectl get ns test-ns
NAME      STATUS        AGE
test-ns   Terminating   19m

[/simterm]

And after a few seconds:

[simterm]

$ kubectl get ns test-ns
Error from server (NotFound): namespaces "test-ns" not found

[/simterm]

Done.