Kubernetes: NodeLocal DNS and the “lookup istiod.istio-system.svc on lookup: no such host” error

Author: | 19/04/2021

Our Deployments use a custom NodeLocal DNS as a local caching DNS server to reduce the number of requests to the AWS VPC DNS; see Kubernetes: load testing and high-load tuning - problems and solutions.

The relevant part of the Deployment manifest looks like this:

...
      dnsPolicy: "None"
      dnsConfig:
        nameservers:
          - 169.254.20.10
...

The problem is that when the Istio sidecar container, istio-proxy, starts, it cannot resolve the name istiod.istio-system.svc, which it needs in order to fetch its configuration from Istio's central Pilot agent:

[simterm]

$ kk -n test-ns logs test-deploy-dns-f8f5659b5-jpx8g istio-proxy
...
2021-03-29T11:46:44.939174Z     warn    Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-03-29T11:46:45.409507Z     warn    sds     failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 169.254.20.10:53: no such host"
...

[/simterm]

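With dnsPolicy: "None", the Pod's /etc/resolv.conf contains only what dnsConfig lists. Without a search domain, the name istiod.istio-system.svc is sent to the resolver as-is and is never expanded to istiod.istio-system.svc.cluster.local. To fix this, add searches to dnsConfig and set it to the cluster domain; see Pod’s DNS Config and Namespaces of Services: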

...
      dnsPolicy: "None"
      dnsConfig:
        nameservers:
          - 169.254.20.10
        searches:
          - cluster.local
...
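To illustrate why the searches entry matters, here is a minimal Python sketch of how a stub resolver builds the list of names it tries from the search list. It is a simplified model of the resolv.conf search behaviour, not the actual resolver code used by istio-proxy. The function name candidate_names is hypothetical, and ndots=1 is an assumption: it is the usual resolver default, and the manifest above does not set any options.

```python
def candidate_names(name, searches, ndots=1):
    """Return the names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in searches]
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search domains.
        return [absolute] + expanded
    # Too few dots: try the search domains first, the bare name last.
    return expanded + [absolute]


NAME = "istiod.istio-system.svc"

# dnsConfig without "searches": only the bare name is ever queried.
print(candidate_names(NAME, searches=[]))

# dnsConfig with "searches: [cluster.local]": the cluster FQDN is tried as well.
print(candidate_names(NAME, searches=["cluster.local"]))
```

In this model, the first call yields only istiod.istio-system.svc., which the cluster DNS does not know, while the second also yields istiod.istio-system.svc.cluster.local., the name under which the Service is actually registered.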

Redeploy, and istio-proxy comes up:

[simterm]

$ kk -n test-ns logs -f test-deploy-dns-5448c4996d-8m5rk istio-proxy
...
2021-03-29T12:44:28.853178Z     info    Proxy role      ips=[10.22.46.129] type=sidecar id=test-deploy-dns-5448c4996d-8m5rk.test-ns domain=test-ns.svc.cluster.local
2021-03-29T12:44:28.853187Z     info    JWT policy is third-party-jwt
2021-03-29T12:44:28.853198Z     info    Pilot SAN: [istiod.istio-system.svc]
2021-03-29T12:44:28.853206Z     info    CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2021-03-29T12:44:28.853244Z     info    Using CA istiod.istio-system.svc:15012 cert with certs: var/run/secrets/istio/root-cert.pem
2021-03-29T12:44:28.853341Z     info    citadelclient   Citadel client using custom root cert: istiod.istio-system.svc:15012
2021-03-29T12:44:28.886213Z     info    ads     All caches have been synced up in 35.295685ms, marking server ready
2021-03-29T12:44:28.886528Z     info    sds     SDS server for workload certificates started, listening on "./etc/istio/proxy/SDS"
2021-03-29T12:44:28.886554Z     info    xdsproxy        Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2021-03-29T12:44:28.886943Z     info    Starting proxy agent
2021-03-29T12:44:28.887121Z     info    sds     Start SDS grpc server
2021-03-29T12:44:28.887211Z     info    Opening status port 15020
2021-03-29T12:44:28.887359Z     info    Received new config, creating new Envoy epoch 0
2021-03-29T12:44:28.887407Z     info    Epoch 0 starting
2021-03-29T12:44:28.892991Z     info    Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster test-app-dns.test-ns --service-node sidecar~10.22.46.129~test-deploy-dns-5448c4996d-8m5rk.test-ns~test-ns.svc.cluster.local --local-address-ip-version v4 --bootstrap-version 3 --log-format %Y-%m-%dT%T.%fZ    %l      envoy %n        %v -l warning --component-log-level misc:error --concurrency 2]

[/simterm]

Done.