Huguette Dora Edjangue — Platform & DevOps Engineer

Liveness and readiness probes look almost identical in a Kubernetes manifest. They are configured the same way, they hit HTTP endpoints, they have the same parameters. It is easy to assume they do the same thing, or to configure them identically and move on.

They do fundamentally different things, and confusing them causes real problems in production. Here is what I learned while implementing them.

The core difference

Readiness probe

Question it answers:

"Is this pod ready to receive traffic right now?"

When it fails:

The pod is removed from the Service's endpoints. Traffic stops being routed to it. The pod keeps running.

In short: remove from the load balancer.

Liveness probe

Question it answers:

"Is this pod still alive and not in a broken state?"

When it fails:

Kubernetes kills the container and restarts it according to the pod's restart policy.

In short: restart the container.

Readiness controls traffic routing. Liveness controls container lifecycle. They solve completely different problems.

Why using the same endpoint for both is a mistake

The most common pattern I see is configuring both probes to hit the same /health endpoint. It seems logical — if the app is healthy, it should be both live and ready, right?

The problem emerges during startup. A Next.js application, like any Node.js server, takes time to initialise. During that period the app might be alive (the process is running, the port is open) but not yet ready to serve requests. If the liveness probe starts checking too early and fails failureThreshold times in a row before the app has loaded, Kubernetes kills and restarts the container. The restart begins startup again from zero, so the container never gets a chance to finish starting up.

Crash loop pattern

A liveness probe that fires during startup causes a crash loop. The container starts, the probe fires before the app is ready, Kubernetes restarts it, the probe fires again. The pod never becomes healthy.
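To make the crash loop concrete, here is a small TypeScript sketch that models a liveness probe against a container that needs a fixed time to start. It is a simplified model, not the kubelet's actual logic: it ignores timeouts and scheduling jitter, and the names killedAtSecond and LivenessSettings, as well as the example values, are hypothetical.

```typescript
// Simplified model of a liveness probe during startup (not the kubelet's exact logic).
interface LivenessSettings {
  initialDelaySeconds: number;
  periodSeconds: number;
  failureThreshold: number;
}

// Returns the second at which the container would be killed, or null if a
// probe succeeds before failureThreshold consecutive failures are reached.
function killedAtSecond(probe: LivenessSettings, startupSeconds: number): number | null {
  let failures = 0;
  let t = probe.initialDelaySeconds;
  while (true) {
    if (t >= startupSeconds) {
      return null; // the app has finished starting, so this probe succeeds
    }
    failures += 1;
    if (failures >= probe.failureThreshold) {
      return t; // threshold reached: the container is killed and restarted
    }
    t += probe.periodSeconds;
  }
}

// Hypothetical values: an app that needs 12 seconds to start.
const tooEager: LivenessSettings = { initialDelaySeconds: 5, periodSeconds: 2, failureThreshold: 3 };
const patient: LivenessSettings = { initialDelaySeconds: 30, periodSeconds: 2, failureThreshold: 3 };

console.log(killedAtSecond(tooEager, 12)); // 9
console.log(killedAtSecond(patient, 12)); // null
```

In the first case the probes at 5, 7 and 9 seconds all fail, so the container is killed at second 9. Every restart starts from zero again, so the same thing happens each time.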

What the probes should actually check

Readiness probe → check application readiness

Hit an endpoint that verifies the app is fully initialised and ready to handle requests. For a Next.js app, this might be /api/health that returns 200 only after all startup tasks are complete. For an app that depends on a database connection, the readiness check should verify that connection exists.

Liveness probe → check for deadlocks and zombie states

Hit a simpler endpoint that just confirms the process is alive and not stuck. This should be extremely lightweight — it should not check dependencies or do any real work. If the process can respond at all, it is live. The liveness probe only needs to catch catastrophic failures: memory leaks that cause total unresponsiveness, deadlocks, or corrupted state.
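As a rough illustration of the split, the two checks might look like the TypeScript below. This is only a sketch under assumptions: AppState and its flags are hypothetical stand-ins for whatever your app tracks during startup, and the functions return a plain status and body instead of being wired into a real Next.js route or HTTP server.

```typescript
// Hypothetical startup state; a real app would set these flags as its own
// startup tasks finish and its dependencies become reachable.
interface AppState {
  startupComplete: boolean;
  databaseConnected: boolean;
}

interface ProbeResponse {
  status: number;
  body: string;
}

// Liveness: no dependency checks and no real work.
// If this code runs at all, the process is able to respond.
function handleAlive(): ProbeResponse {
  return { status: 200, body: "alive" };
}

// Readiness: 200 only when the app can actually serve requests, otherwise 503.
function handleReady(state: AppState): ProbeResponse {
  if (state.startupComplete && state.databaseConnected) {
    return { status: 200, body: "ready" };
  }
  return { status: 503, body: "not ready" };
}
```

Kubernetes treats any HTTP status from 200 to 399 as a successful probe, so the 503 is what takes the pod out of the Service's endpoints. A lost database connection then only fails readiness; the liveness check keeps passing, and the container is not restarted for a problem a restart cannot fix.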

The configuration — what to actually set

livenessProbe:
  httpGet:
    path: /api/alive      # Lightweight — just confirms process is running
    port: 3000
  initialDelaySeconds: 30  # Give the app time to start before first check
  periodSeconds: 10
  failureThreshold: 3     # 3 consecutive failures before restart
  timeoutSeconds: 5

readinessProbe:
  httpGet:
    path: /api/ready      # Checks full app readiness (DB, cache, etc.)
    port: 3000
  initialDelaySeconds: 10  # Can fire earlier than liveness
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1     # 1 success re-adds to load balancer
  timeoutSeconds: 3

The initialDelaySeconds on the liveness probe is the most important parameter. Set it too low and you get crash loops. Set it too high and a container that hangs during startup sits undetected until the delay has elapsed. For most Node.js applications, 30 seconds is a safe starting point. Profile your actual startup time and set it to roughly 1.5× that value.
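The arithmetic behind these numbers is simple enough to write down. The TypeScript below is a rough sketch: both function names are hypothetical, the 1.5 factor is just the rule of thumb from above, and the restart estimate ignores timeoutSeconds and scheduling jitter.

```typescript
// Suggested liveness initialDelaySeconds: measured startup time times a safety factor.
function suggestInitialDelaySeconds(measuredStartupSeconds: number, safetyFactor = 1.5): number {
  return Math.ceil(measuredStartupSeconds * safetyFactor);
}

// Rough time from a failure to the restart: the probe has to fail
// failureThreshold times in a row, one check per period.
function approxRestartLatencySeconds(periodSeconds: number, failureThreshold: number): number {
  return periodSeconds * failureThreshold;
}

console.log(suggestInitialDelaySeconds(20)); // 30
console.log(approxRestartLatencySeconds(10, 3)); // 30
```

A measured startup of 20 seconds gives the 30-second delay used in the configuration, and periodSeconds: 10 with failureThreshold: 3 means a hung container is restarted after roughly 30 seconds.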

Startup probes — the third option

Kubernetes 1.16 introduced a third probe type: the startup probe. It runs repeatedly while the container is starting and holds back the liveness and readiness probes until it succeeds for the first time; after that it does not run again for that container. This is the cleanest solution for applications with variable startup times.

startupProbe:
  httpGet:
    path: /api/alive
    port: 3000
  failureThreshold: 30    # Up to 5 minutes (30 × 10s) to start
  periodSeconds: 10
  # Liveness probe only activates after this succeeds

With a startup probe, you can give a slow-starting container up to 5 minutes to initialise without loosening your liveness probe thresholds. Once the startup probe succeeds, the liveness probe takes over with its normal strict settings.
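A quick way to sanity-check a startup probe is to compute its budget and compare it with the slowest startup you have measured. The TypeScript below is a sketch with hypothetical names (StartupProbeSettings, startupBudgetSeconds, fitsStartupBudget); it only reproduces the 30 × 10s calculation from the configuration above.

```typescript
interface StartupProbeSettings {
  failureThreshold: number;
  periodSeconds: number;
  initialDelaySeconds?: number; // treated as 0 when omitted
}

// Total time the container is allowed to take before the startup probe gives up.
function startupBudgetSeconds(probe: StartupProbeSettings): number {
  return (probe.initialDelaySeconds ?? 0) + probe.failureThreshold * probe.periodSeconds;
}

// True when the slowest observed startup still fits inside the budget.
function fitsStartupBudget(probe: StartupProbeSettings, slowestStartupSeconds: number): boolean {
  return slowestStartupSeconds < startupBudgetSeconds(probe);
}

const startupProbe: StartupProbeSettings = { failureThreshold: 30, periodSeconds: 10 };

console.log(startupBudgetSeconds(startupProbe)); // 300
console.log(fitsStartupBudget(startupProbe, 12)); // true
```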

What I got wrong first

My first attempt used the same /health endpoint for both probes with an initialDelaySeconds of 5 seconds. The app took 12 seconds to start. Kubernetes killed the container after 3 failed liveness checks, restarted it, and the cycle repeated. The pod showed as Running but was never Ready.

The fix was two separate endpoints, a 30-second initial delay on liveness, and a startup probe to handle the variable startup window cleanly. After those changes, the pod came up reliably on every deployment.

The rule of thumb

Readiness probe = "can you take traffic?" Liveness probe = "are you still alive?" They should point to different endpoints and have different thresholds. If they are identical, one of them is wrong.

The full Kubernetes implementation including probe configuration is documented on the Kubernetes Platform project page.

Readiness vs liveness probes in Kubernetes — what I learned
