Kubernetes | Created: 10 Apr 2026 | Updated: 10 Apr 2026

Kubernetes Health Probes: Liveness, Readiness, and Startup

In a real-world system, software is never perfect. A container might start successfully but hang internally after a few hours. A new replica might be scheduled but not yet connected to its database. Kubernetes handles these situations through health probes — active checks that the Kubelet runs against every container on its node.

The Kubelet is the agent running on every Kubernetes node. It is responsible for starting, stopping, and monitoring containers. To monitor them, it must be able to reach each container over the network or by running commands inside it. This is why the Kubelet, and the node it runs on, must always have direct connectivity to all containers scheduled on that node.

Kubernetes offers three distinct probe types:

  1. Startup Probe — Has the application finished starting up?
  2. Liveness Probe — Is the application still alive and functioning?
  3. Readiness Probe — Is the application ready to serve user traffic?

Each probe serves a different purpose and triggers a different action when it fails. Understanding the difference between them is essential for building reliable, self-healing Kubernetes workloads.

Core Concepts

The Kubelet and Its Role in Health Checking

Think of the Kubelet as a node-level supervisor. It watches every Pod assigned to the node and continuously polls each container using the probes you define in your manifest. If a probe fails enough times in a row, the Kubelet takes corrective action — either restarting the container or temporarily removing it from load balancing rotation.

Because the Kubelet connects directly from the node to the container's IP address, there is no extra hop through a Service or Ingress. This direct reach also means that network policies must not block traffic from the node's own IP to the container port used by the probe.

Why Health Probes Matter

Without probes, Kubernetes only knows whether a container process is running. It cannot tell whether that process is healthy, deadlocked, or still initialising. With probes, you teach Kubernetes exactly what "healthy" means for your specific application, and it will automatically enforce that definition.

  1. A deadlocked application process stays running but stops responding → liveness probe catches this.
  2. A Pod starts but its connection pool is not ready → readiness probe keeps traffic away until it is.
  3. A slow-starting application takes 60 seconds to load data → startup probe prevents premature restarts.

Probe Execution Mechanisms

Every probe — regardless of type (liveness, readiness, startup) — uses one of four mechanisms to check the container:

  httpGet: The Kubelet sends an HTTP GET request to a path and port on the container. Succeeds when the HTTP response code is between 200 and 399.
  tcpSocket: The Kubelet tries to open a TCP connection to the container's port. Succeeds when the TCP connection is established.
  exec: The Kubelet runs a command inside the container. Succeeds when the command exits with code 0.
  grpc: The Kubelet calls the gRPC Health Checking Protocol. Succeeds when the response status is SERVING.
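The grpc mechanism is the only one of the four not demonstrated elsewhere in this lesson. A minimal sketch, assuming the container serves the gRPC Health Checking Protocol on port 9090 (both the port and the service name here are illustrative):

```yaml
# Sketch: gRPC liveness probe (requires Kubernetes 1.24+).
livenessProbe:
  grpc:
    port: 9090           # port serving the gRPC health service
    service: inventory   # optional: named service within the health server
  periodSeconds: 10
```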

Common Probe Configuration Fields

All three probe types share the same set of configuration fields. Understanding each field helps you tune probes to match your application's actual behaviour.

  initialDelaySeconds (default 0): Seconds to wait after the container starts before the first probe runs.
  periodSeconds (default 10): How often, in seconds, the probe is executed.
  timeoutSeconds (default 1): Seconds to wait for a probe response before treating it as a failure.
  failureThreshold (default 3): Number of consecutive failures before the probe is considered failed.
  successThreshold (default 1): Number of consecutive successes required to mark the probe as passing again. Must be 1 for liveness and startup probes, so it is only tunable for readiness.
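To see how these fields combine in practice, here is an illustrative readiness probe with its worst-case reaction time worked out in the comments (values chosen for demonstration, not a recommendation):

```yaml
# With these values, a Pod is marked not-ready roughly
# 4 x 5s = 20 seconds after its first failed check.
readinessProbe:
  httpGet:
    path: /healthz/ready
    port: 8080
  periodSeconds: 5       # check every 5 seconds
  timeoutSeconds: 3      # each check may take up to 3 seconds
  failureThreshold: 4    # 4 consecutive failures flip the Pod to not-ready
```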

The Three Probe Types in Detail

Startup Probe

The startup probe answers the question: "Has the application finished its startup routine?" While the startup probe is running and has not yet succeeded, Kubernetes disables both the liveness and readiness probes entirely. This is important for applications that take a long time to initialise — for example, a service that runs database migrations on startup or pre-loads a large model into memory.

If you set failureThreshold: 30 and periodSeconds: 10, you are allowing up to 300 seconds (5 minutes) for the application to finish starting before Kubernetes gives up and restarts the container. Once the startup probe succeeds once, it is never checked again for that container instance.

What happens on failure? Kubernetes kills and restarts the container, just like a failed liveness probe.

Liveness Probe

The liveness probe answers the question: "Is the application still functioning?" Think of it as a heartbeat check. If the application enters a deadlock, an infinite loop, or a broken state from which it cannot recover on its own, the liveness probe will detect the silence and restart the container automatically.

A liveness probe endpoint should be lightweight. It must not check downstream dependencies like databases — if the database goes down, you do not want all your containers to restart, because that would make a bad situation much worse.

What happens on failure? The container is killed and restarted according to the Pod's restartPolicy (which is Always by default for Deployments).

Readiness Probe

The readiness probe answers the question: "Is the application ready to receive traffic?" This probe controls whether the Pod's IP address is included in the Endpoints object of any Service that selects it. If a readiness probe fails, the Pod is removed from the Service's endpoints and stops receiving new requests — but the container itself is not restarted.

Unlike liveness, the readiness probe is appropriate for checking external dependencies such as a database connection or a cache. It is also re-evaluated continuously throughout the Pod's lifetime. A Pod can become "not ready" and then "ready" again multiple times.

What happens on failure? The Pod is removed from Service load balancing. No restart occurs.

Liveness vs Readiness — The Key Distinction

  Failure action: a failed liveness probe restarts the container; a failed readiness probe removes the Pod from Service endpoints.
  External dependencies: do not check them in liveness (keep it minimal); checking them in readiness is safe and recommended.
  Re-evaluation after passing: both probes keep running continuously for the life of the container.
  Typical checks: liveness covers internal app state (deadlock, freeze); readiness covers the DB connection, cache, and warm-up status.
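In the C# style used later in this lesson, the distinction can be sketched as two health checks: a dependency-free liveness check and a readiness check that opens a database connection. The connection string name "InventoryDb" and the Microsoft.Data.SqlClient dependency are assumptions for illustration:

```csharp
// Sketch only: liveness stays in-process, readiness may touch the DB.
using Microsoft.Data.SqlClient;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    // Liveness: in-process only; never touches external dependencies.
    .AddCheck("liveness", () => HealthCheckResult.Healthy("process responsive"))
    // Readiness: safe to verify the database connection here.
    .AddCheck("readiness", () =>
    {
        try
        {
            using var conn = new SqlConnection(
                builder.Configuration.GetConnectionString("InventoryDb"));
            conn.Open();
            return HealthCheckResult.Healthy("DB reachable");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("DB unreachable", exception: ex);
        }
    });
```

If the database goes down, only the readiness check fails: Pods stop receiving traffic but are not restarted, which is exactly the behaviour the table above describes.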

Hands-On: Kubernetes Commands

View probe configuration for a running Pod

Use kubectl describe pod to see the probe definitions and recent probe events for a running Pod.

kubectl describe pod <pod-name>

Look for the Liveness, Readiness, and Startup sections in the output:

Liveness: http-get http://:8080/healthz/live delay=0s timeout=3s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/healthz/ready delay=0s timeout=3s period=5s #success=1 #failure=3
Startup: http-get http://:8080/healthz/startup delay=5s timeout=3s period=10s #success=1 #failure=30

Watch probe-triggered restarts

The RESTARTS column in kubectl get pods increments each time a liveness or startup probe failure causes a container restart.

kubectl get pods -w

Inspect probe failure events

When a probe fails, Kubernetes records a Warning event. Use the following command to see these events for a specific Pod:

kubectl describe pod <pod-name> | grep -A 20 Events

Check Pod readiness condition

The Ready condition in a Pod's status reflects the readiness probe result. A Pod shows 0/1 in the READY column when its readiness probe is failing.

kubectl get pod <pod-name> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'

Test a health endpoint manually from inside the cluster

Run a temporary busybox Pod to test that an endpoint is reachable from within the cluster:

kubectl run probe-test --image=busybox:1.37 --restart=Never -it --rm -- \
wget -qO- http://inventory-health-api-service/healthz/ready

Step-by-Step Example

The Application: inventory-health-api

We will deploy an ASP.NET Core 10 API that exposes dedicated health check endpoints. The application is a simple inventory service. It has three health endpoints:

  1. /healthz/startup — Returns 200 once the application has finished loading initial data.
  2. /healthz/live — Returns 200 as long as the application process is not deadlocked.
  3. /healthz/ready — Returns 200 only when the database connection is verified.

Step 1 — Add Health Checks in .NET 10

In your ASP.NET Core 10 Program.cs, register and map the health check endpoints. The Microsoft.Extensions.Diagnostics.HealthChecks package is included out of the box.

// Program.cs
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    .AddCheck("liveness", () => HealthCheckResult.Healthy("App is alive"))
    .AddCheck("readiness", () => HealthCheckResult.Healthy("DB connection OK"))
    .AddCheck("startup", () => HealthCheckResult.Healthy("Startup complete"));

var app = builder.Build();

app.MapHealthChecks("/healthz/live", new HealthCheckOptions { Predicate = c => c.Name == "liveness" });
app.MapHealthChecks("/healthz/ready", new HealthCheckOptions { Predicate = c => c.Name == "readiness" });
app.MapHealthChecks("/healthz/startup", new HealthCheckOptions { Predicate = c => c.Name == "startup" });

app.Run();
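The static checks above always return Healthy, which is enough to wire up the endpoints. A real startup check is usually stateful; here is a hedged sketch of one common pattern, where a background loader flips a flag once initial data is in memory (the class and property names are illustrative):

```csharp
// Sketch: a stateful startup check. A background loader sets
// StartupCompleted = true once initial data has been loaded; until
// then /healthz/startup reports Unhealthy and the Kubelet keeps waiting.
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class StartupHealthCheck : IHealthCheck
{
    private volatile bool _completed;

    public bool StartupCompleted { get => _completed; set => _completed = value; }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
        => Task.FromResult(_completed
            ? HealthCheckResult.Healthy("Startup complete")
            : HealthCheckResult.Unhealthy("Still loading initial data"));
}

// Registration: share one instance between the check and the loader.
// builder.Services.AddSingleton<StartupHealthCheck>();
// builder.Services.AddHealthChecks()
//     .AddCheck<StartupHealthCheck>("startup");
```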

Step 2 — Containerise the Application

Build the container image using this multi-stage Dockerfile. The final image is based on the lightweight ASP.NET Core 10 runtime image.

# Dockerfile
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app/publish

FROM mcr.microsoft.com/dotnet/aspnet:10.0
WORKDIR /app
COPY --from=build /app/publish .
EXPOSE 8080
ENTRYPOINT ["dotnet", "InventoryHealthApi.dll"]

Build and push the image:

docker build -t your-registry/inventory-health-api:1.0.0 .
docker push your-registry/inventory-health-api:1.0.0

Step 3 — Create the Deployment with All Three Probes

The Deployment below configures all three probe types on the container. Notice how the startup probe is given the most generous budget (failureThreshold: 30 × periodSeconds: 10 = up to 300 seconds to start), protecting the container from liveness restarts during slow initialisation.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inventory-health-api
  labels:
    app: inventory-health-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inventory-health-api
  template:
    metadata:
      labels:
        app: inventory-health-api
    spec:
      containers:
        - name: inventory-health-api
          image: your-registry/inventory-health-api:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          startupProbe:
            httpGet:
              path: /healthz/startup
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 30
            timeoutSeconds: 3
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
            timeoutSeconds: 3

Apply the manifest:

kubectl apply -f inventory-health-api-deployment.yaml

Step 4 — Create the Service

Expose the Deployment inside the cluster using a ClusterIP Service. Kubernetes will only route traffic to Pods whose readiness probe is currently passing.

apiVersion: v1
kind: Service
metadata:
  name: inventory-health-api-service
spec:
  type: ClusterIP
  selector:
    app: inventory-health-api
  ports:
    - port: 80
      targetPort: 8080

Apply the manifest:

kubectl apply -f inventory-health-api-service.yaml

Step 5 — Observe the Probes in Action

Watch the Pods start and become ready. During the startup phase, the READY column shows 0/1. Once the startup probe passes, liveness and readiness probes take over, and the Pod becomes 1/1.

kubectl get pods -w

Describe a Pod to confirm all three probe configurations are set correctly:

kubectl describe pod -l app=inventory-health-api

To simulate a liveness failure, you can temporarily modify your health endpoint to return HTTP 500 and observe Kubernetes incrementing the RESTARTS counter and eventually restarting the container.

kubectl get pods --watch
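One way to trigger such a failure, sketched against the Program.cs from Step 1, is a hypothetical failure-injection endpoint that flips a flag the liveness check reads. This is for testing only, never for production:

```csharp
// Hypothetical failure-injection switch; endpoint name is illustrative.
var healthy = true; // captured by the liveness check below

builder.Services.AddHealthChecks()
    .AddCheck("liveness", () => healthy
        ? HealthCheckResult.Healthy("App is alive")
        : HealthCheckResult.Unhealthy("Forced failure for testing"));

// ... after building the app:
app.MapPost("/debug/fail-live", () =>
{
    healthy = false;
    return "liveness will now fail";
});
```

After calling the endpoint, /healthz/live starts returning HTTP 503, and within failureThreshold consecutive probe periods the Kubelet restarts the container.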

Step 6 — Understanding the Probe Lifecycle Timeline

Here is the sequence of probe activity from the moment a container starts:

  1. Container process starts.
  2. After initialDelaySeconds (5s), the startup probe begins polling /healthz/startup every 10 seconds.
  3. Liveness and readiness probes are suspended while the startup probe is still running.
  4. The startup probe succeeds → it is deactivated permanently for this container instance.
  5. The liveness probe begins polling /healthz/live every 10 seconds.
  6. The readiness probe begins polling /healthz/ready every 5 seconds.
  7. Once the readiness probe passes, the Pod's IP is added to the inventory-health-api-service Endpoints and it starts receiving traffic.

Bonus: TCP Socket and Exec Probe Examples

When an application does not expose an HTTP endpoint, use a TCP socket probe to verify the port is open. This is common for databases and message brokers:

livenessProbe:
  tcpSocket:
    port: 5432
  initialDelaySeconds: 10
  periodSeconds: 10

For scripted health checks (for example, verifying a file exists or a process responds to a signal), use an exec probe:

livenessProbe:
  exec:
    command:
      - sh
      - -c
      - "redis-cli ping | grep PONG"
  initialDelaySeconds: 5
  periodSeconds: 10

Summary

Kubernetes health probes are the mechanism through which the Kubelet enforces your definition of a "healthy" container. By combining all three probe types, you build a robust self-healing system:

  1. The startup probe protects slow-starting containers from being killed before they are ready. Configure a generous failureThreshold to match your worst-case startup time.
  2. The liveness probe detects containers that are running but stuck. Keep liveness endpoints lightweight — they should only reflect the internal state of the process, not external dependencies.
  3. The readiness probe controls traffic routing. A failing readiness probe removes the Pod from load balancing without a restart. It is safe to check external dependencies here.
  4. Choose the right probe mechanism (httpGet, tcpSocket, exec, or grpc) based on what your application exposes.
  5. Always set timeoutSeconds to a realistic value — the default of 1 second is often too aggressive for cloud environments with variable latency.

Together, startup, liveness, and readiness probes turn your Kubernetes cluster into a self-healing platform that automatically recovers from broken containers and gracefully manages traffic during deployments and scale events.

