We have a Kubernetes Cron Job which failed on its last run.
Let’s find the root cause and then see how to restart such a failed job.
List current jobs:
kk -n eks-prod-1-bttrm-apps-ns get cronjobs
NAME                                        SCHEDULE     SUSPEND   ACTIVE   LAST SCHEDULE   AGE
bttrm-apps-backend-reccuring-payment-cron   0 10 * * *   False     1        22m             45d
Check pods of the bttrm-apps-backend-reccuring-payment-cron Cron Job:
kk -n eks-prod-1-bttrm-apps-ns get pod | grep cron
bttrm-apps-backend-reccuring-payment-cron-1595757600-7n8rh   0/1   Completed          0   24h
bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl   0/1   ImagePullBackOff   0   22m
The 1595844000-jzhrl pod failed. Check its events with describe:
kk -n eks-prod-1-bttrm-apps-ns describe pod bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl
...
Normal Pulling 21m (x4 over 23m) kubelet, ip-10-4-55-187.us-east-2.compute.internal Pulling image "projectname/projectname-apps:45.aa2416fb"
Warning Failed 21m (x4 over 23m) kubelet, ip-10-4-55-187.us-east-2.compute.internal Failed to pull image "projectname/projectname-apps:45.aa2416fb": rpc error: code = Unknown desc = Error response from daemon: pull access denied for projectname/projectname-apps, repository does not exist or may require 'docker login'
Warning Failed 21m (x4 over 23m) kubelet, ip-10-4-55-187.us-east-2.compute.internal Error: ErrImagePull
Warning Failed 7m58s (x65 over 23m) kubelet, ip-10-4-55-187.us-east-2.compute.internal Error: ImagePullBackOff
...
The cause is stated in the events:
Failed to pull image … Error response from daemon: pull access denied
The image could not be pulled from the private repository because the Cron Job has no imagePullSecrets set:
kubectl explain CronJob.spec.jobTemplate.spec.template.spec.imagePullSecrets
KIND: CronJob
VERSION: batch/v1beta1
RESOURCE: imagePullSecrets <[]Object>
DESCRIPTION:
ImagePullSecrets is an optional list of references to secrets in the same
namespace to use for pulling any of the images used by this PodSpec. If
specified, these secrets will be passed to individual puller
implementations for them to use. For example, in the case of docker, only
DockerConfig type secrets are honored. More info:
https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod
LocalObjectReference contains enough information to let you locate the
referenced object inside the same namespace.
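To confirm that the field really is missing, the Cron Job's pod template can be queried directly. The helper below is a minimal sketch: the function name cronjob_pull_secrets is made up for illustration, and it calls kubectl directly on the assumption that kk in this post is just an alias for it.

```shell
# Print the names of the imagePullSecrets set in a CronJob's pod template.
# Arguments: $1 = namespace, $2 = CronJob name.
# Empty output means no imagePullSecrets are configured.
cronjob_pull_secrets() {
  kubectl -n "$1" get cronjob "$2" \
    -o jsonpath='{.spec.jobTemplate.spec.template.spec.imagePullSecrets[*].name}'
}
```

For the Cron Job in this post it would be called as cronjob_pull_secrets eks-prod-1-bttrm-apps-ns bttrm-apps-backend-reccuring-payment-cron.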
Edit the Cron Job:
kk -n eks-prod-1-bttrm-apps-ns edit cronjobs bttrm-apps-backend-reccuring-payment-cron
Add the Docker Hub authentication setting:
...
imagePullSecrets:
- name: bttrm-docker-secret
...
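The same change could also be applied without opening an editor, using kubectl patch. This is a sketch of an alternative, not what was done above: the function name patch_cronjob_pull_secret is hypothetical, and it assumes the Secret (bttrm-docker-secret here) already exists in the same namespace as a Docker registry credentials Secret.

```shell
# Set imagePullSecrets in a CronJob's pod template via a JSON merge patch.
# Arguments: $1 = namespace, $2 = CronJob name, $3 = Secret name.
# Note: a merge patch replaces the whole imagePullSecrets list,
# so any entries already present would be overwritten.
patch_cronjob_pull_secret() {
  kubectl -n "$1" patch cronjob "$2" --type=merge -p \
    "{\"spec\":{\"jobTemplate\":{\"spec\":{\"template\":{\"spec\":{\"imagePullSecrets\":[{\"name\":\"$3\"}]}}}}}}"
}
```

Here it would be called as patch_cronjob_pull_secret eks-prod-1-bttrm-apps-ns bttrm-apps-backend-reccuring-payment-cron bttrm-docker-secret. Only Jobs created after the patch pick up the new setting. Pods that already exist are not changed.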
Then re-run the Cron Job manually with create job: pass the source Cron Job in --from, followed by a name for the new Job. Kubernetes will create a new Job, and with it a new pod, from the Cron Job's template:
kk -n eks-prod-1-bttrm-apps-ns create job --from=cronjob/bttrm-apps-backend-reccuring-payment-cron reccuring-payment-cron-manual
job.batch/reccuring-payment-cron-manual created
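Instead of polling the pod list by hand, one option is to wait for the Job to complete and then print its logs. The helper below is a sketch only: the name wait_and_show_job_logs and the timeout value in the usage example are arbitrary choices for illustration.

```shell
# Block until a Job reports the Complete condition, then print its pod logs.
# Arguments: $1 = namespace, $2 = Job name, $3 = timeout (for example 120s).
# If the Job fails or never finishes, wait exits non-zero after the timeout
# and the logs command is skipped.
wait_and_show_job_logs() {
  kubectl -n "$1" wait --for=condition=complete "job/$2" --timeout="$3" &&
    kubectl -n "$1" logs "job/$2"
}
```

For the Job created above: wait_and_show_job_logs eks-prod-1-bttrm-apps-ns reccuring-payment-cron-manual 120s.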
Check pods:
kk -n eks-prod-1-bttrm-apps-ns get pod
NAME                                                         READY   STATUS             RESTARTS   AGE
...
bttrm-apps-backend-reccuring-payment-cron-1595757600-7n8rh   0/1     Completed          0          24h
bttrm-apps-backend-reccuring-payment-cron-1595844000-jzhrl   0/1     ImagePullBackOff   0          32m
reccuring-payment-cron-manual-6lm62                          0/1     Completed          0          13s
The reccuring-payment-cron-manual pod is now in the Completed status, so we are done.