Kubernetes Health Probes: Liveness, Readiness, and Startup
In a real-world system, software is never perfect. A container might start successfully but hang internally after a few hours. A new replica might be scheduled but not yet connected to its database. Kubernetes handles these situations through health probes — active checks that the Kubelet runs against every container on its node.
The Kubelet is the agent running on every Kubernetes node. It is responsible for starting, stopping, and monitoring containers. To monitor them, it must be able to reach each container over the network or by running commands inside it. This is why the Kubelet, and the node it runs on, must have direct connectivity to every container scheduled on that node.
Kubernetes offers three distinct probe types:
- Startup Probe — Has the application finished starting up?
- Liveness Probe — Is the application still alive and functioning?
- Readiness Probe — Is the application ready to serve user traffic?
Each probe serves a different purpose and triggers a different action when it fails. Understanding the difference between them is essential for building reliable, self-healing Kubernetes workloads.
Core Concepts
The Kubelet and Its Role in Health Checking
Think of the Kubelet as a node-level supervisor. It watches every Pod assigned to the node and continuously polls each container using the probes you define in your manifest. If a probe fails enough times in a row, the Kubelet takes corrective action — either restarting the container or temporarily removing it from load balancing rotation.
Because the Kubelet connects directly from the node to the container's IP address, there is no extra hop through a Service or Ingress. This direct reach also means that network policies must not block traffic from the node's own IP to the container port used by the probe.
Why Health Probes Matter
Without probes, Kubernetes only knows whether a container process is running. It cannot tell whether that process is healthy, deadlocked, or still initialising. With probes, you teach Kubernetes exactly what "healthy" means for your specific application, and it will automatically enforce that definition.
- A deadlocked application process stays running but stops responding → liveness probe catches this.
- A Pod starts but its connection pool is not ready → readiness probe keeps traffic away until it is.
- A slow-starting application takes 60 seconds to load data → startup probe prevents premature restarts.
Probe Execution Mechanisms
Every probe — regardless of type (liveness, readiness, startup) — uses one of four mechanisms to check the container:
| Mechanism | How It Works | Success Condition |
|---|---|---|
| httpGet | Kubelet sends an HTTP GET to a path and port on the container. | HTTP response code between 200 and 399. |
| tcpSocket | Kubelet tries to open a TCP connection to the container's port. | The TCP connection is established successfully. |
| exec | Kubelet runs a command inside the container. | The command exits with code 0. |
| grpc | Kubelet calls the gRPC Health Checking Protocol. | Response status is SERVING. |
Common Probe Configuration Fields
All three probe types share the same set of configuration fields. Understanding each field helps you tune probes to match your application's actual behaviour.
| Field | Default | Meaning |
|---|---|---|
| initialDelaySeconds | 0 | Seconds to wait after the container starts before the first probe runs. |
| periodSeconds | 10 | How often (in seconds) the probe is executed. |
| timeoutSeconds | 1 | Seconds to wait for a probe response before treating it as a failure. |
| failureThreshold | 3 | Number of consecutive failures before the probe is considered failed. |
| successThreshold | 1 | Number of consecutive successes required to mark the probe as passed again. Must be 1 for liveness and startup probes; only readiness probes may set a higher value. |
The Three Probe Types in Detail
Startup Probe
The startup probe answers the question: "Has the application finished its startup routine?" While the startup probe is running and has not yet succeeded, Kubernetes disables both the liveness and readiness probes entirely. This is important for applications that take a long time to initialise — for example, a service that runs database migrations on startup or pre-loads a large model into memory.
If you set failureThreshold: 30 and periodSeconds: 10, you are allowing up to 300 seconds (5 minutes) for the application to finish starting before Kubernetes gives up and restarts the container. Once the startup probe succeeds once, it is never checked again for that container instance.
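As a sketch, a startup probe with that five-minute budget looks like this (the `/healthz/startup` path and port are example values):

```yaml
startupProbe:
  httpGet:
    path: /healthz/startup
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # 30 attempts x 10s = up to 300 seconds to start
```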
What happens on failure? Kubernetes kills and restarts the container, just like a failed liveness probe.
Liveness Probe
The liveness probe answers the question: "Is the application still functioning?" Think of it as a heartbeat check. If the application enters a deadlock, an infinite loop, or a broken state from which it cannot recover on its own, the liveness probe will detect the silence and restart the container automatically.
A liveness probe endpoint should be lightweight. It must not check downstream dependencies like databases — if the database goes down, you do not want all your containers to restart, because that would make a bad situation much worse.
What happens on failure? The container is killed and restarted according to the Pod's restartPolicy (which is Always by default for Deployments).
Readiness Probe
The readiness probe answers the question: "Is the application ready to receive traffic?" This probe controls whether the Pod's IP address is included in the Endpoints object of any Service that selects it. If a readiness probe fails, the Pod is removed from the Service's endpoints and stops receiving new requests — but the container itself is not restarted.
Unlike liveness, the readiness probe is appropriate for checking external dependencies such as a database connection or a cache. It is also re-evaluated continuously throughout the Pod's lifetime. A Pod can become "not ready" and then "ready" again multiple times.
What happens on failure? The Pod is removed from Service load balancing. No restart occurs.
Liveness vs Readiness — The Key Distinction
| Aspect | Liveness Probe | Readiness Probe |
|---|---|---|
| Failure action | Container is restarted | Pod removed from Service endpoints |
| Check external dependencies? | No — keep it minimal | Yes — safe and recommended |
| Re-evaluated after pass? | Yes, continuously | Yes, continuously |
| Typically checks | Internal app state (deadlock, freeze) | DB connection, cache, warm-up status |
Hands-On: Kubernetes Commands
View probe configuration for a running Pod
Use kubectl describe pod to see the probe definitions and recent probe events for a running Pod.
Look for the Liveness, Readiness, and Startup sections in the output:
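For example (substitute a Pod name from your cluster):

```shell
kubectl describe pod <pod-name>
```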
Watch probe-triggered restarts
The RESTARTS column in kubectl get pods increments each time a liveness or startup probe failure causes a container restart.
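Use the watch flag to see the counter change live:

```shell
kubectl get pods -w
```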
Inspect probe failure events
When a probe fails, Kubernetes records a Warning event. Use the following command to see these events for a specific Pod:
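For example, filtering events to Warnings for one Pod (substitute the Pod name):

```shell
kubectl get events --field-selector involvedObject.name=<pod-name>,type=Warning
```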
Check Pod readiness condition
The Ready condition in a Pod's status reflects the readiness probe result. A Pod shows 0/1 in the READY column when its readiness probe is failing.
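You can extract just the Ready condition with a JSONPath query (substitute the Pod name):

```shell
kubectl get pod <pod-name> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
```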
Test a health endpoint manually from inside the cluster
Run a temporary busybox Pod to test that an endpoint is reachable from within the cluster:
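A sketch using busybox's built-in wget (the Pod IP, port, and path are placeholders for your own values):

```shell
kubectl run probe-test --rm -it --image=busybox --restart=Never -- \
  wget -qO- http://<pod-ip>:8080/healthz/ready
```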
Step-by-Step Example
The Application: inventory-health-api
We will deploy an ASP.NET Core 10 API that exposes dedicated health check endpoints. The application is a simple inventory service. It has three health endpoints:
- /healthz/startup — Returns 200 once the application has finished loading initial data.
- /healthz/live — Returns 200 as long as the application process is not deadlocked.
- /healthz/ready — Returns 200 only when the database connection is verified.
Step 1 — Add Health Checks in .NET 10
In your ASP.NET Core 10 Program.cs, register and map the health check endpoints. The Microsoft.Extensions.Diagnostics.HealthChecks package is included out of the box.
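A minimal Program.cs sketch using health check tags so that each Kubernetes probe endpoint evaluates only its own subset of checks. The AppState flags are illustrative placeholders — your real checks would read actual application state and verify the actual database connection:

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Tag each check so the three endpoints can filter to their own subset.
builder.Services.AddHealthChecks()
    .AddCheck("startup-data", () => AppState.InitialDataLoaded
            ? HealthCheckResult.Healthy()
            : HealthCheckResult.Unhealthy("initial data still loading"),
        tags: new[] { "startup" })
    .AddCheck("self", () => HealthCheckResult.Healthy(),
        tags: new[] { "live" })
    .AddCheck("database", () => AppState.DatabaseReachable
            ? HealthCheckResult.Healthy()
            : HealthCheckResult.Unhealthy("database unreachable"),
        tags: new[] { "ready" });

var app = builder.Build();

// One endpoint per Kubernetes probe, filtered by tag.
app.MapHealthChecks("/healthz/startup",
    new HealthCheckOptions { Predicate = r => r.Tags.Contains("startup") });
app.MapHealthChecks("/healthz/live",
    new HealthCheckOptions { Predicate = r => r.Tags.Contains("live") });
app.MapHealthChecks("/healthz/ready",
    new HealthCheckOptions { Predicate = r => r.Tags.Contains("ready") });

app.Run();

// Illustrative placeholder for application state flags.
static class AppState
{
    public static volatile bool InitialDataLoaded;
    public static volatile bool DatabaseReachable;
}
```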
Step 2 — Containerise the Application
Build the container image using this multi-stage Dockerfile. The final image is based on the lightweight ASP.NET Core 10 runtime image.
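One possible multi-stage Dockerfile, assuming the project is named InventoryHealthApi and uses the official .NET 10 images:

```dockerfile
# Build stage: the SDK image compiles and publishes the app.
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app/publish

# Runtime stage: the lightweight ASP.NET Core runtime image.
FROM mcr.microsoft.com/dotnet/aspnet:10.0
WORKDIR /app
COPY --from=build /app/publish .
EXPOSE 8080
ENTRYPOINT ["dotnet", "InventoryHealthApi.dll"]
```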
Step 3 — Create the Deployment with All Three Probes
The Deployment below configures all three probe types on the container. Notice how the startup probe is given the most generous budget (failureThreshold: 30 × periodSeconds: 10 = up to 300 seconds to start), protecting the container from liveness restarts during slow initialisation.
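A sketch of such a Deployment — the image reference is a placeholder; the paths and intervals follow the endpoints and timeline described elsewhere in this article:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inventory-health-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inventory-health-api
  template:
    metadata:
      labels:
        app: inventory-health-api
    spec:
      containers:
        - name: api
          image: inventory-health-api:1.0   # placeholder image reference
          ports:
            - containerPort: 8080
          # Generous startup budget: 30 x 10s = up to 300 seconds.
          startupProbe:
            httpGet:
              path: /healthz/startup
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 30
          # Lightweight heartbeat; no external dependencies checked.
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 8080
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          # Checks the database connection; failure removes the Pod
          # from Service endpoints without restarting it.
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: 8080
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
```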
Step 4 — Create the Service
Expose the Deployment inside the cluster using a ClusterIP Service. Kubernetes will only route traffic to Pods whose readiness probe is currently passing.
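A minimal ClusterIP Service for the Deployment (the selector assumes the Pods carry an app=inventory-health-api label):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: inventory-health-api-service
spec:
  type: ClusterIP
  selector:
    app: inventory-health-api
  ports:
    - port: 80
      targetPort: 8080
```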
Step 5 — Observe the Probes in Action
Watch the Pods start and become ready. During the startup phase, the READY column shows 0/1. Once the startup probe passes, liveness and readiness probes take over, and the Pod becomes 1/1.
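Assuming the Pods are labelled app=inventory-health-api:

```shell
kubectl get pods -l app=inventory-health-api -w
```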
Describe a Pod to confirm all three probe configurations are set correctly:
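For example, filtering the describe output to the probe lines (substitute the Pod name):

```shell
kubectl describe pod <pod-name> | grep -E 'Liveness|Readiness|Startup'
```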
To simulate a liveness failure, you can temporarily modify your health endpoint to return HTTP 500 and observe Kubernetes incrementing the RESTARTS counter and eventually restarting the container.
Step 6 — Understanding the Probe Lifecycle Timeline
Here is the sequence of probe activity from the moment a container starts:
- Container process starts.
- After initialDelaySeconds (5s), the startup probe begins polling /healthz/startup every 10 seconds.
- Liveness and readiness probes are suspended while the startup probe is still running.
- The startup probe succeeds → it is deactivated permanently for this container instance.
- The liveness probe begins polling /healthz/live every 10 seconds.
- The readiness probe begins polling /healthz/ready every 5 seconds.
- Once the readiness probe passes, the Pod's IP is added to the inventory-health-api-service Endpoints and it starts receiving traffic.
Bonus: TCP Socket and Exec Probe Examples
When an application does not expose an HTTP endpoint, use a TCP socket probe to verify the port is open. This is common for databases and message brokers:
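For example, a sketch of a TCP liveness probe for a PostgreSQL container (port 5432 is PostgreSQL's default):

```yaml
livenessProbe:
  tcpSocket:
    port: 5432
  initialDelaySeconds: 15
  periodSeconds: 10
```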
For scripted health checks (for example, verifying a file exists or a process responds to a signal), use an exec probe:
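A sketch using the classic file-existence pattern — the /tmp/healthy path is illustrative, and the probe passes as long as cat exits with code 0:

```yaml
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy
  periodSeconds: 10
  failureThreshold: 3
```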
Summary
Kubernetes health probes are the mechanism through which the Kubelet enforces your definition of a "healthy" container. By combining all three probe types, you build a robust self-healing system:
- The startup probe protects slow-starting containers from being killed before they are ready. Configure a generous failureThreshold to match your worst-case startup time.
- The liveness probe detects containers that are running but stuck. Keep liveness endpoints lightweight — they should only reflect the internal state of the process, not external dependencies.
- The readiness probe controls traffic routing. A failing readiness probe removes the Pod from load balancing without a restart. It is safe to check external dependencies here.
- Choose the right probe mechanism (httpGet, tcpSocket, exec, or grpc) based on what your application exposes.
- Always set timeoutSeconds to a realistic value — the default of 1 second is often too aggressive for cloud environments with variable latency.
Together, startup, liveness, and readiness probes turn your Kubernetes cluster into a self-healing platform that automatically recovers from broken containers and gracefully manages traffic during deployments and scale events.