We have a Kubernetes Cron Job which failed on its last run.
Let's find the root cause and then see how to restart such a failed job.
List the current Cron Jobs:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns get cronjobs
NAME                                        SCHEDULE     SUSPEND   ACTIVE   LAST SCHEDULE   AGE
bttrm-apps-backend-reccuring-payment-cron   0 10 * * *   False     1        22m             45d
[/simterm]
Check pods of the bttrm-apps-backend-reccuring-payment-cron Cron Job:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns get pod | grep cron
bttrm-apps-backend-reccuring-payment-cron-1595757600-7n8rh   0/1   Completed          0   24h
bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl   0/1   ImagePullBackOff   0   22m
[/simterm]
The 1595844000-jzhrl pod has failed, so check its events:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns describe pod bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl
...
Normal   Pulling  21m (x4 over 23m)     kubelet, ip-10-4-55-187.us-east-2.compute.internal  Pulling image "projectname/projectname-apps:45.aa2416fb"
Warning  Failed   21m (x4 over 23m)     kubelet, ip-10-4-55-187.us-east-2.compute.internal  Failed to pull image "projectname/projectname-apps:45.aa2416fb": rpc error: code = Unknown desc = Error response from daemon: pull access denied for projectname/projectname-apps, repository does not exist or may require 'docker login'
Warning  Failed   21m (x4 over 23m)     kubelet, ip-10-4-55-187.us-east-2.compute.internal  Error: ErrImagePull
Warning  Failed   7m58s (x65 over 23m)  kubelet, ip-10-4-55-187.us-east-2.compute.internal  Error: ImagePullBackOff
...
[/simterm]
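The same events can also be listed on their own, without the rest of the describe output. This is a minimal sketch that assumes plain kubectl rather than the kk alias; the pod_events function name is made up for illustration:

```shell
# Hypothetical helper: print only the events that belong to one pod.
# $1 is the namespace, $2 is the pod name.
pod_events() {
  kubectl -n "$1" get events \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}

# Usage:
# pod_events eks-prod-1-bttrm-apps-ns bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl
```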
The root cause is visible in the Warning events:
Failed to pull image … Error response from daemon: pull access denied
The kubelet wasn't able to pull the image from a private repository because the Cron Job has no imagePullSecrets set:
[simterm]
$ kubectl explain CronJob.spec.jobTemplate.spec.template.spec.imagePullSecrets
KIND:     CronJob
VERSION:  batch/v1beta1

RESOURCE: imagePullSecrets <[]Object>

DESCRIPTION:
     ImagePullSecrets is an optional list of references to secrets in the same
     namespace to use for pulling any of the images used by this PodSpec. If
     specified, these secrets will be passed to individual puller
     implementations for them to use. For example, in the case of docker, only
     DockerConfig type secrets are honored. More info:
     https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

     LocalObjectReference contains enough information to let you locate the
     referenced object inside the same namespace.
[/simterm]
Edit the Cron Job:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns edit cronjobs bttrm-apps-backend-reccuring-payment-cron
[/simterm]
Add the Docker Hub authentication setting:
...
imagePullSecrets:
- name: bttrm-docker-secret
...
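The same change can be applied without opening an editor, for example from a script. This is a sketch under assumptions, not the method used above: it calls plain kubectl instead of the kk alias, the add_pull_secret function name is hypothetical, and the JSON merge patch replaces any existing imagePullSecrets list (there was none here):

```shell
# Hypothetical helper: set imagePullSecrets on a CronJob's pod template.
# $1 is the namespace, $2 is the CronJob name, $3 is the Secret name.
# Note: --type=merge replaces the whole imagePullSecrets list.
add_pull_secret() {
  patch=$(printf '{"spec":{"jobTemplate":{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"%s"}]}}}}}}' "$3")
  kubectl -n "$1" patch cronjob "$2" --type=merge -p "$patch"
}

# Usage:
# add_pull_secret eks-prod-1-bttrm-apps-ns bttrm-apps-backend-reccuring-payment-cron bttrm-docker-secret
```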
Then rerun the Cron Job manually with create job and the --from option, followed by the Cron Job to run and then a name for the new Job. Kubernetes will create a new Job, and its pod, from the template of your original Cron Job:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns create job --from=cronjob/bttrm-apps-backend-reccuring-payment-cron reccuring-payment-cron-manual
job.batch/reccuring-payment-cron-manual created
[/simterm]
Check pods:
[simterm]
$ kk -n eks-prod-1-bttrm-apps-ns get pod
NAME                                                         READY   STATUS             RESTARTS   AGE
...
bttrm-apps-backend-reccuring-payment-cron-1595757600-7n8rh   0/1     Completed          0          24h
bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl   0/1     ImagePullBackOff   0          32m
reccuring-payment-cron-manual-6lm62                          0/1     Completed          0          13s
[/simterm]
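Instead of polling get pod by hand, the wait can be scripted. This is a rough sketch assuming plain kubectl; the wait_for_job name and the 120s default timeout are arbitrary choices for illustration. If the Job fails rather than completes, the command simply runs until the timeout expires:

```shell
# Hypothetical helper: block until a Job reports the Complete condition.
# $1 is the namespace, $2 is the Job name, $3 is an optional timeout
# (defaults to 120s, an arbitrary value).
wait_for_job() {
  kubectl -n "$1" wait --for=condition=complete "job/$2" --timeout="${3:-120s}"
}

# Usage:
# wait_for_job eks-prod-1-bttrm-apps-ns reccuring-payment-cron-manual
```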
The reccuring-payment-cron-manual-6lm62 pod is now in the Completed status, so we are done.