Kubernetes: 503 no endpoints available for service – causes and solutions

By | 06/15/2020
 

We have a Redis service running behind a Service with the ClusterIP type.

This Redis must be accessible to pods in the same namespace (a Gorush service).

The problem is that those pods can’t connect to the Redis service by its gorush-server-redis-svc:6379 name and report “Can’t connect redis server: connection refused”:

kk -n gorush-test logs gorush-server-55cd659dff-9wfbw
time="2020/06/12 - 13:44:02" level=debug msg="Init App Status Engine as redis"
2020/06/12 13:44:03 Can't connect redis server: dial tcp 172.20.198.35:6379: connect: connection refused
time="2020/06/12 - 13:44:03" level=error msg="storage error: dial tcp 172.20.198.35:6379: connect: connection refused"

Although the Service is present and should be working:

kk -n gorush-test get svc
NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP                                                                       PORT(S)        AGE
gorush-server-redis-svc   ClusterIP      172.20.198.35    <none>                                                                            6379/TCP       102s

The “no endpoints available for service” error

Okay, let’s try to debug the connection. First, run a proxy from your local system to the Kubernetes API server, as described in the post Kubernetes: ClusterIP vs NodePort vs LoadBalancer, Services, and Ingress: an overview with examples:

kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080

Then try to connect to the Service with curl:

curl localhost:8080/api/v1/namespaces/gorush-test/services/gorush-server-redis-svc/proxy
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"gorush-server-redis-svc\"",
"reason": "ServiceUnavailable",
"code": 503
}
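When such checks are scripted, the Status object above can be told apart from a real answer of the proxied service by its fields. A minimal sketch, assuming the response body is already in a string; the explain_status name and the wording of its summaries are made up for this example:

```python
import json


def explain_status(body: str) -> str:
    """Summarize a response body received through the API server proxy."""
    try:
        status = json.loads(body)
    except ValueError:
        # Not JSON at all: the backend itself answered with its own payload.
        return "not a Status object: the proxied service answered"
    if not isinstance(status, dict) or status.get("kind") != "Status":
        return "not a Status object: the proxied service answered"
    code = status.get("code")
    reason = status.get("reason")
    message = status.get("message", "")
    if code == 503 and "no endpoints available" in message:
        return f"{reason}: the Service exists but has no usable endpoints"
    return f"{reason} ({code}): {message}"


# The same body that curl printed above.
sample = '''{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "no endpoints available for service \\"gorush-server-redis-svc\\"",
  "reason": "ServiceUnavailable",
  "code": 503
}'''

print(explain_status(sample))
# ServiceUnavailable: the Service exists but has no usable endpoints
```

The point is that the 503 here is produced by the API server itself and not by Redis: the request never reached a pod.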

Although the pod with Redis is running and is able to accept connections. Check it with port-forward.

Run it:

kk -n gorush-test port-forward gorush-server-redis-6566764cb7-9s8qk 6379:6379
Forwarding from 127.0.0.1:6379 -> 6379
Forwarding from [::1]:6379 -> 6379

Check:

curl localhost:6379
curl: (1) Received HTTP/0.9 when not allowed
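curl speaks HTTP, so the error above only tells us that something non-HTTP answered on the port. A more direct check is to send a Redis PING over the forwarded port. A minimal sketch, assuming the port-forward from the previous step is still running and Redis accepts inline commands; the redis_ping helper is a name made up for this example:

```python
import socket


def redis_ping(host: str = "127.0.0.1", port: int = 6379, timeout: float = 3.0) -> str:
    """Send an inline PING command and return the first line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"PING\r\n")
        reply = conn.recv(128)
    # Redis replies are terminated with CRLF; keep only the first line.
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")


# Usage, with the port-forward running:
#   print(redis_ping())
# A "+PONG" reply means Redis is up; an error reply starting with "-"
# (for example, when a password is required) still shows that Redis answered.
```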

Okay, Redis is really working (curl’s error only means that the reply was not HTTP, which is expected from Redis), so what is the issue with its Service?

And why does it say “no endpoints”?

Check them:

kk -n gorush-test get endpoints
NAME                      ENDPOINTS   AGE
gorush-server-redis-svc   <none>      3m3s

Indeed – <none>.
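In a namespace with many Services, the same check can be done over the JSON output of kubectl get endpoints -o json. A minimal sketch, assuming the usual layout of Endpoints objects (subsets with addresses); the function name is made up for this example, and the sample data simply mirrors the output above:

```python
from typing import List


def services_without_endpoints(endpoints_list: dict) -> List[str]:
    """Return names of Endpoints objects that have no ready addresses."""
    empty = []
    for item in endpoints_list.get("items", []):
        # "subsets" is absent when nothing matched the Service's selector.
        subsets = item.get("subsets") or []
        ready = sum(len(s.get("addresses") or []) for s in subsets)
        if ready == 0:
            empty.append(item["metadata"]["name"])
    return empty


# Hand-written sample: the first object has no subsets, like our Redis Service.
sample = {
    "items": [
        {"metadata": {"name": "gorush-server-redis-svc"}},
        {
            "metadata": {"name": "gorush"},
            "subsets": [{"addresses": [{"ip": "10.3.35.14"}],
                         "ports": [{"port": 8088}]}],
        },
    ]
}

print(services_without_endpoints(sample))
# ['gorush-server-redis-svc']
```

With real data, the dictionary could be loaded with json.load(sys.stdin) from kubectl -n gorush-test get endpoints -o json.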

Cause #1: labels mismatch

Now, check which labels are set for the Service to choose pods by. Find its selector:

kk -n gorush-test get svc gorush-server-redis-svc -o jsonpath={.spec.selector}
map[applicaton:gorush-server-redis]

Or check its manifest:

apiVersion: v1
kind: Service
metadata:
  name: {{ .Chart.Name }}-redis-svc
  labels:
    application: {{ .Chart.Name }}-redis-svc
spec:
  ports:
  - port: 6379
  selector:
    applicaton: {{ .Chart.Name }}-redis

And now, check which labels are set on the pod itself:

kk -n gorush-test get deploy gorush-server-redis -o jsonpath={.spec.template.metadata.labels}
map[application:gorush-server-redis]

Its deployment manifest:

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}-redis
spec:
  replicas: 1
  template:
    metadata:
      labels:
        application: {{ .Chart.Name }}-redis
    spec:
      containers:
      - name: redis-master
        image: redis
        ports:
        - containerPort: 6379

And compare them:

  • the Service’s selector: map[applicaton:gorush-server-redis]
  • and the pod’s label: map[application:gorush-server-redis]

Here is the issue: applicaTOn vs applicaTIOn.

So, the Service was created, as its manifest has no syntax errors. But because it was looking for pods with the wrong label, it couldn’t find any pods to act as its backends, had nowhere to send the traffic, and the request returned the 503 error.
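The matching rule itself is simple: every key/value pair of the selector must be present in the pod’s labels. A minimal sketch of that rule with a typo hint on top of it; both function names and the 0.8 similarity cutoff are choices made for this example and not anything Kubernetes provides:

```python
import difflib


def selector_matches(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every key/value pair must be present in labels."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(labels.get(key) == value for key, value in selector.items())


def near_miss_keys(selector: dict, labels: dict) -> dict:
    """Map selector keys missing from labels to similarly spelled label keys."""
    hints = {}
    for key in selector:
        if key not in labels:
            close = difflib.get_close_matches(key, list(labels), n=1, cutoff=0.8)
            if close:
                hints[key] = close[0]
    return hints


selector = {"applicaton": "gorush-server-redis"}     # from the Service
pod_labels = {"application": "gorush-server-redis"}  # from the Deployment template

print(selector_matches(selector, pod_labels))  # False
print(near_miss_keys(selector, pod_labels))    # {'applicaton': 'application'}
```

A check like this could run against rendered manifests before a deploy, as the API server accepts such a Service without any warning.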

Fix the typo, redeploy and check the Service:

kk -n gorush-test describe svc gorush-server-redis-svc
Name:              gorush-server-redis-svc
Namespace:         gorush-test
Labels:            app.kubernetes.io/managed-by=Helm
                   application=gorush-server-redis
Annotations:       meta.helm.sh/release-name: gorush-server
                   meta.helm.sh/release-namespace: gorush-test
Selector:          application=gorush-server-redis
Type:              ClusterIP
IP:                172.20.124.195
Port:              <unset>  6379/TCP
TargetPort:        6379/TCP
Endpoints:         10.3.47.94:6379
...

Or:

kk -n gorush-test get endpoints
NAME                      ENDPOINTS         AGE
gorush-server-redis-svc   10.3.47.94:6379   77m

Endpoints: 10.3.47.94:6379 appeared – all done, fixed.

Cause #2: named port

The same error can appear when using named ports for a Service, for example:

apiVersion: v1
kind: Service
metadata:
  name: gorush
  labels:
    app: gorush
    tier: frontend
spec:
  selector:
    app: gorush
    tier: frontend
  type: LoadBalancer
  ports:
  - name: gorush
    protocol: TCP
    port: 80
    targetPort: 8088

Note the port’s name here: ports: - name: gorush.

This time, an endpoint is created:

kubectl -n eks-dev-1-gorush get ep
NAME     ENDPOINTS         AGE
gorush   10.3.35.14:8088   44m

But a request through the API server still fails with the same “no endpoints available for service” message:

curl -sL localhost:8080/api/v1/namespaces/eks-dev-1-gorush/services/gorush/proxy/metrics | grep gorush | head
"message": "no endpoints available for service \"gorush\"",

In this case, the cause is the port’s name, which needs to be added to the Service name in the request, separated by “:”:

curl -sL localhost:8080/api/v1/namespaces/eks-dev-1-gorush/services/gorush:gorush/proxy/metrics | head
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.194e-06
go_gc_duration_seconds{quantile="0.25"} 1.2092e-05
go_gc_duration_seconds{quantile="0.5"} 2.1812e-05
go_gc_duration_seconds{quantile="0.75"} 5.1794e-05
go_gc_duration_seconds{quantile="1"} 0.000145631
go_gc_duration_seconds_sum 0.001080551
go_gc_duration_seconds_count 32
# HELP go_goroutines Number of goroutines that currently exist.

services/gorush:gorush – here it is, and now it’s working.
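Why the port name matters can be shown with a simplified model of the lookup. This is a sketch built on an assumption that fits the behavior seen above: the part after “:” in the proxy URL is compared with the port names stored in the Endpoints object, and an empty name matches only an unnamed port. It is not the API server’s actual code, and both function names are made up:

```python
from typing import Optional, Tuple


def split_service_id(service_id: str) -> Tuple[str, str, str]:
    """Split '[scheme:]name[:port]' from the proxy URL into its parts."""
    parts = service_id.split(":")
    if len(parts) == 1:
        return "", parts[0], ""
    if len(parts) == 2:
        return "", parts[0], parts[1]
    if len(parts) == 3 and parts[0] in ("", "http", "https"):
        return parts[0], parts[1], parts[2]
    raise ValueError(f"invalid service id: {service_id!r}")


def find_backend(subsets: list, port_name: str) -> Optional[str]:
    """Return 'ip:port' from the first subset exposing a port with this exact name."""
    for subset in subsets:
        addresses = subset.get("addresses") or []
        if not addresses:
            continue
        for port in subset.get("ports") or []:
            if port.get("name", "") == port_name:
                return f"{addresses[0]['ip']}:{port['port']}"
    return None


# Endpoints data mirroring the "kubectl get ep" output above; the port carries
# the name taken from the Service manifest.
subsets = [{
    "addresses": [{"ip": "10.3.35.14"}],
    "ports": [{"name": "gorush", "port": 8088, "protocol": "TCP"}],
}]

for service_id in ("gorush", "gorush:gorush"):
    _, _, port_name = split_service_id(service_id)
    print(service_id, "->", find_backend(subsets, port_name))
# gorush -> None
# gorush:gorush -> 10.3.35.14:8088
```

In this model, services/gorush asks for a port with an empty name, finds nothing, and ends up with “no endpoints available”, although the endpoint itself exists.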