KubeJobFailed #
Meaning #
Job failed complete.
Impact #
Failure of processing of a scheduled task.
Diagnosis #
- Check job via
kubectl -n $NAMESPACE describe jobs $JOB
. - Check pod events via
kubectl -n $NAMESPACE describe pod $POD_FROM_JOB
. - Check pod logs via
kubectl -n $NAMESPACE log pod $POD_FROM_JOB
.
Mitigation #
See Debugging Pods
See Job patterns
redesign job so that it is idempotent (can be re-run many times which will always produce the same output even if input differes)
clean up failed jobs, e.g. using
kubectl
:kubectl -n $NAMESPACE delete job --field-selector status.successful!=1