Istio: the cause of and solution for the “SQLSTATE Connection refused” error

04/23/2021

While starting pods, we got “SQLSTATE[HY000] [2002] Connection refused” errors in two different applications: one PHP/Yii and one NodeJS.

In the PHP/Yii application, the error appears when Helm runs a pre-install hook during deployment and the MySQL migration Job is executed:

Yii Migration Tool (based on Yii v2.0.38)

Exception 'yii\db\Exception' with message 'SQLSTATE[HY000] [2002] Connection refused'
in /app/vendor/yiisoft/yii2/db/Connection.php:642

Error Info:
Array
(
    [0] => HY000
    [1] => 2002
    [2] => Connection refused
)

Caused by: Exception 'PDOException' with message 'SQLSTATE[HY000] [2002] Connection refused'

The cause

The cause is simple enough: once Istio is added to an environment, it injects its sidecar container into all our pods. As a result, an application’s container can sometimes start before the Envoy sidecar is ready to proxy traffic, and any connection the application opens during that window is refused.

See the discussions: Delaying application start until sidecar is ready, App container unable to connect to network before sidecar is fully running, and Pod fails to start: Application container unable to access network before sidecar ready.

The solution

In general, the Istio project is waiting for Kubernetes itself to implement a proper solution:

The full solution to this in Kubernetes is for k8s to support Sidecar containers as a first class concept, starting them up entirely before starting up the application container.

Check this comment.

But as this is still not implemented in Kubernetes 1.19, and our AWS Elastic Kubernetes Service cluster runs 1.18, we need to find another way.

Still, Istio 1.7 added an option that can help:

Added config option values.global.proxy.holdApplicationUntilProxyStarts, which causes the sidecar injector to inject the sidecar at the start of the pod’s container list and configures it to block the start of all other containers until the proxy is ready.

It can be set globally for the whole mesh:

apiVersion: v1
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      # holdApplicationUntilProxyStarts is a direct field of the proxy config (defaultConfig)
      holdApplicationUntilProxyStarts: true
...
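
If the mesh is installed with istioctl or the Istio operator rather than by editing the ConfigMap by hand, the same mesh-wide setting can be sketched as an IstioOperator overlay. This is a minimal sketch that assumes that installation method; the file name is hypothetical and the rest of the installation options are omitted:

```yaml
# hold-app.yaml (hypothetical file name)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      # Inject the sidecar first in the container list and block the
      # other containers until the proxy is ready
      holdApplicationUntilProxyStarts: true
```

An overlay like this would typically be applied with istioctl install -f hold-app.yaml. Since the setting takes effect at sidecar injection time, only pods created after the change pick it up.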

Or at the Pod level via an annotation:

annotations:
  proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
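
To show where that annotation has to live, here is a minimal sketch of a Deployment. The name, labels, and image are hypothetical placeholders and not taken from our actual setup:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        # Must be set on the pod template, not on the Deployment's own metadata,
        # because the sidecar injector reads annotations of the pod being created
        proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
    spec:
      containers:
        - name: app
          image: example/app:latest  # hypothetical image
```

The same placement applies to a Job, such as a migration Job run from a Helm pre-install hook: the annotation goes under the Job's spec.template.metadata.annotations.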

Also, there are utilities like envoy-preflight that check the istio-proxy container and start the application only once the proxy is alive.

But in our case the problem was different: istio-proxy could not start at all because of an error caused by the pod’s DNS settings; see more in Kubernetes: NodeLocal DNS and the “lookup istiod.istio-system.svc on lookup: no such host” error.

So, when an application started and tried to connect to MySQL (AWS RDS Aurora), it got the “Connection refused” error, as Envoy wasn’t able to proxy its traffic.