Kubernetes Jobs: Unlocking the Potential of Containerized Applications
A Kubernetes Job is a workload controller object that performs one or more finite tasks in a cluster. This finite nature differentiates Jobs from most other controller objects, such as Deployments, ReplicaSets, StatefulSets, and DaemonSets, which manage workloads that are meant to run indefinitely.
A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (i.e., the Job) is complete. Deleting a Job will clean up the Pods it created.
A Job is useful for running finite tasks that need to be executed once or periodically, such as batch processing, backups, migrations, and other workloads that don’t require continuous service.
A Job can have different completion modes, depending on how many pods it needs to run and how many of them need to succeed. The completion modes are:
- Non-parallel: This is the default mode, where only one Pod is created and run by the Job. The Job is considered complete when the Pod exits successfully
- Parallel with a fixed completion count: In this mode, the Job specifies a number of completions (`spec.completions`) that indicates how many Pods need to exit successfully for the Job to be complete. The Job controller creates as many Pods as the parallelism (`spec.parallelism`) parameter allows, and replaces any Pods that fail or are deleted
- Parallel with a work queue: In this mode, the Job does not specify a number of completions, but instead relies on the Pods to coordinate with each other or an external service to determine when the work is done. The Job controller creates as many Pods as the parallelism parameter allows, and does not replace any Pods that exit successfully. The Job is considered complete when there are no more active Pods
A Job can run multiple Pods in parallel or sequentially. You can specify how many Pods run at the same time (parallelism) and how many Pods must complete for the task to finish (completions). You can also specify how many times a failed Pod is retried (backoff limit). A Job can also be suspended and resumed via the `spec.suspend` field, either manually or by a higher-level controller.
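As a sketch of the work-queue mode described above, the manifest below sets `parallelism` but omits `completions`; each worker Pod is expected to pull items from a shared queue and exit successfully once the queue is empty. The image name is hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker
spec:
  parallelism: 3    # run up to 3 worker Pods at once
  # no completions field: work-queue mode; the Job is done
  # when all active Pods have exited successfully
  template:
    spec:
      containers:
      - name: worker
        image: example.com/queue-worker:latest   # hypothetical worker image
      restartPolicy: Never
```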
Benefits of using Kubernetes Jobs
Kubernetes Jobs are useful for running batch processes or important ad-hoc operations in a cluster. For example, you can use a Job to run a database migration, a backup operation, a data processing task, or any other short-lived computation that you want to run to completion. Jobs play an important role in Kubernetes, especially for running tasks that are not part of the normal operation of your application, but are essential for its functionality or maintenance.
Common use cases include:
- Batch processing: run a series of computations or transformations on a set of data, such as processing log files, generating reports, or performing machine learning tasks
- Backup and restore: create snapshots of your persistent volumes, copy data to a cloud storage service, or restore data from a backup file
- Migration and initialization: migrate data from one database to another, initialize a cache system, or populate a search index
- Notifications: send an email or a message to a group of users when a certain event occurs, such as when a new version of an application is deployed, when a backup operation completes, or when a critical error is detected
- Testing and validation: run unit tests, integration tests, or performance tests on your applications, or validate the integrity of your data
- Queue processing: consume items such as messages, tasks, or events from a message queue, task queue, or event queue
- Periodic or scheduled tasks: perform daily backups, weekly reports, or monthly maintenance tasks (typically via CronJobs)
How Do CronJobs Work?
- The user deploys a CronJob manifest that specifies two key details: the execution schedule and the task to be performed
- The CronJob controller, which is part of the `kube-controller-manager`, implements the primary logic. By default, it checks every 10 seconds to see if a CronJob should be run
- When the controller identifies a CronJob that needs to be executed (i.e., the current time matches the schedule specified in cron syntax), it creates a new object called a Job. Each Job tracks a specific invocation and inherits from the CronJob the information about what to do
- Once Kubernetes detects the new Job object, it immediately attempts to execute it. It schedules the creation of a new Pod based on the configuration passed down from the CronJob via the Job's `jobTemplate`. When the Pod is running, it performs the cron task
Kubernetes Jobs in Action
Non-Parallel Job
To create a Job, we need to define a YAML manifest that specifies the pod template, the number of completions, the number of retries, and other optional parameters:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-world
spec:
  completions: 1
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello World"]
      restartPolicy: Never
```
This Job will create a single Pod that runs a `busybox` image and executes the command `echo Hello World`. The Job will be completed when the Pod successfully terminates.
To create the Job, you can use the command:

```shell
kubectl apply -f hello-world.yaml
```
You can check the status of the Job using the commands:

```shell
kubectl describe job hello-world
kubectl get job hello-world -o yaml
```
You can also view the logs of the Pod created by the Job (the Pod name carries a random suffix):

```shell
kubectl logs hello-world-xxxxx
```
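If you want to block until the Job finishes, for example in a CI script, `kubectl wait` can watch for the Job's Complete condition (the 60-second timeout here is just an illustrative choice):

```shell
kubectl wait --for=condition=complete --timeout=60s job/hello-world
```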
Parallel Job
A Kubernetes Job can also run multiple Pods in parallel to perform the same task. This can be useful for speeding up the execution time or distributing the workload across different nodes.
To run multiple Pods in parallel with a Kubernetes Job, you need to specify the `parallelism` parameter in the spec section of the Job manifest file. The `parallelism` parameter defines how many Pods can run concurrently at any given time. The default value is 1, which means only one Pod can run at a time.
For example, if you want to run 4 Pods in parallel with a Kubernetes Job, you can modify the previous example as follows:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-world-parallel
spec:
  completions: 4
  parallelism: 4
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello World"]
      restartPolicy: Never
```
This Job will create 4 Pods that run in parallel and execute the same command, `echo Hello World`. The Job will be completed when all 4 Pods successfully terminate.
Handle Failures and Errors With a Kubernetes Job
Sometimes, a Pod created by a Kubernetes Job may fail or encounter an error during its execution. This can happen due to various reasons, such as network issues, resource constraints, application bugs, etc.
A Kubernetes Job can handle failures and errors by retrying to create and run new Pods until a specified number of successful completions is reached. The number of retries is controlled by the `backoffLimit` parameter in the spec section of the Job manifest file. The `backoffLimit` parameter defines how many times the Job will retry before giving up. The default value is 6, which means the Job will retry up to 6 times.
For example, if you want to create a Job that retries up to 3 times before failing, you can modify the previous example as follows:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-world-retry
spec:
  completions: 1
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["sh", "-c", "echo Hello World; exit 1"]
      restartPolicy: Never
```
This Job will create a single Pod that runs a `busybox` image and executes the command `echo Hello World; exit 1`. The Pod will always fail with exit code 1, so the Job will retry creating and running new Pods up to 3 times before the Job is marked as failed.
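You can confirm the outcome from the Job's status. The `failed` counter and the conditions list are part of the `batch/v1` Job status; for a Job that exhausts its retries, the Failed condition's reason should indicate the backoff limit was exceeded:

```shell
kubectl get job hello-world-retry -o jsonpath='{.status.failed}'
kubectl get job hello-world-retry -o jsonpath='{.status.conditions[?(@.type=="Failed")].reason}'
```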
Schedule a Kubernetes Job to Run Periodically: CronJobs
A Kubernetes Job can also be scheduled to run periodically at a specified time or interval. This can be useful for running recurring tasks, such as backups, reports, maintenance, etc.
To schedule a Kubernetes Job to run periodically, you need to use a CronJob object instead of a Job object. A CronJob is a special type of Job that creates and runs Jobs based on a cron expression. A cron expression is a string that defines when and how often a task should be executed.
A CronJob object has a similar structure to a Job object, except that it wraps the Job spec in a `jobTemplate` and adds a `schedule` parameter in the spec section that defines the cron expression. The `schedule` parameter follows the standard cron format, which consists of five fields:
- Minute: 0–59
- Hour: 0–23
- Day of month: 1–31
- Month: 1–12 or JAN-DEC
- Day of week: 0–6 or SUN-SAT
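These range rules can be illustrated with a small validator. This is only a sketch for intuition, not how Kubernetes actually parses schedules; it handles just `*`, `*/n` steps, and comma lists of plain numbers, and leaves out names (JAN-DEC, SUN-SAT) and ranges (e.g. 1-5):

```python
# Minimal sketch: validate the five fields of a cron expression
# against the documented numeric ranges.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]  # min, hour, dom, mon, dow

def is_valid_schedule(expr: str) -> bool:
    fields = expr.split()
    if len(fields) != 5:          # cron format requires exactly five fields
        return False
    for field, (lo, hi) in zip(fields, FIELD_RANGES):
        if field == "*":          # wildcard: any value
            continue
        if field.startswith("*/"):  # step values like */15
            step = field[2:]
            if not (step.isdigit() and int(step) >= 1):
                return False
            continue
        for part in field.split(","):  # comma list of plain numbers
            if not (part.isdigit() and lo <= int(part) <= hi):
                return False
    return True

print(is_valid_schedule("0 * * * *"))    # True  (hourly, at minute 0)
print(is_valid_schedule("0 2 * * 0"))    # True  (02:00 every Sunday)
print(is_valid_schedule("61 * * * *"))   # False (minute out of range)
```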
For example, if you want to create a CronJob that runs a Job every hour, you can modify the previous example as follows:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-world-cron
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      completions: 1
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "Hello World"]
          restartPolicy: Never
```
This CronJob will create and run a Job every hour at minute 0. Each Job will create a single Pod that runs a `busybox` image and executes the command `echo Hello World`. The Job will be completed when the Pod successfully terminates.
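Once the CronJob is created, you can watch it spawn Jobs on schedule (the Job names get a timestamp-derived suffix):

```shell
kubectl get cronjob hello-world-cron
kubectl get jobs --watch
```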
Kubernetes Dashboard
Alternatively, we can use the Kubernetes dashboard to create, manage, and monitor Jobs. The dashboard provides a graphical interface that allows us to perform various operations on Kubernetes resources.
To access the dashboard, we need to run the following command:

```shell
kubectl proxy
```
This will start a proxy server that will forward requests to the Kubernetes API server. We can then open a browser and navigate to http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
The dashboard will ask us to authenticate using a token or a kubeconfig file. We can obtain a token by running the following command:

```shell
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
```
This will output a token that we can copy and paste into the dashboard. Once we are authenticated, we can see an overview of our cluster and its resources. We can click on the Workloads tab and then on Jobs to see a list of Jobs in our cluster. We can also create a new Job by clicking on the Create button and filling in the required fields.
The dashboard also allows us to view the details of each Job, such as its status, events, pods, logs, and YAML definition. We can also perform actions such as scaling, editing, or deleting Jobs from the dashboard.
Jobs Troubleshooting
If a Job fails or doesn’t complete as expected, you can use the following steps to troubleshoot it:
- Check the status of the Job and its pods using the `kubectl describe` command. Look for any errors or warnings in the output, such as pod failures, restarts, or the backoff limit being exceeded
- Check the logs of the pods using the `kubectl logs` command. Look for any error messages or exceptions in the output, such as syntax errors, configuration errors, or runtime errors
- Check the events related to the Job and its pods using the `kubectl get events` command. Look for any events that indicate problems, such as pod creation failures, pod scheduling failures, or pod termination failures
- If possible, run the pod command or script locally or in a different environment to verify its functionality and correctness
- If necessary, modify the Job manifest or the pod command or script to fix any errors or improve the performance of the Job
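For the `hello-world` Job from the earlier example, these steps map to commands like the following; Job controllers label their Pods with `job-name`, which makes the selector-based forms convenient:

```shell
kubectl describe job hello-world
kubectl get pods --selector=job-name=hello-world
kubectl logs --selector=job-name=hello-world
kubectl get events --field-selector involvedObject.kind=Job,involvedObject.name=hello-world
```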
Conclusion
Kubernetes Jobs and CronJobs are powerful tools for running finite tasks in your cluster. They allow you to perform batch processing, backup and restore, migration and initialization, and periodic or recurring tasks with ease and efficiency. You can use kubectl to create, monitor, and manage your Jobs and CronJobs. You can also use other tools and frameworks that integrate with Kubernetes Jobs and CronJobs, such as Helm, Argo Workflows, Tekton Pipelines, and Airflow.