Kubernetes ReplicaSets
Overview
Running a single Pod is fine for experiments and one-off tasks, but real production services need multiple identical copies of a Pod running simultaneously. You need redundancy so that a node failure does not take your service offline. You need scale so that you can handle more traffic by adding more Pods. You need sharding so that different Pod replicas can work on different parts of a problem in parallel.
You could create those copies by hand — write three separate Pod manifests that are almost identical — but that is tedious and error-prone. More importantly, if one of those Pods crashes, nothing brings it back. You need a controller that continuously watches the cluster and ensures the right number of healthy Pods is always running. That controller is the ReplicaSet.
A ReplicaSet combines a Pod template (the blueprint) with a desired replica count (the target) into a single API object. It continuously runs a control loop that compares the desired count with the actual count and creates or deletes Pods to close any gap. This article explains how ReplicaSets work internally, how to create and scale them, and how to configure Horizontal Pod Autoscaling (HPA) to automate scaling decisions.
Core Concepts
Step 1: Why Singleton Pods Are Not Enough
A standalone Pod object has no self-healing behaviour. If the node it runs on crashes, the Pod simply disappears and is never replaced. Three common reasons to run multiple replicas are:
| Reason | Description |
|---|---|
| Redundancy | If one Pod crashes, the others keep serving traffic while Kubernetes replaces the broken one. |
| Scale | More replicas handle more concurrent requests, spreading load across multiple processes. |
| Sharding | Different replicas can process different slices of data in parallel, reducing total processing time. |
A ReplicaSet solves all three requirements by ensuring the declared number of identical Pods is always running, no matter what happens to individual nodes or Pods.
Step 2: The Reconciliation Loop — Desired vs. Observed State
The core idea behind a ReplicaSet is the reconciliation loop. Kubernetes continuously compares two states:
- Desired state — what you declared (e.g., replicas: 3).
- Observed state — what is actually running right now (e.g., 2 healthy Pods).
The reconciliation loop runs constantly. When the observed state differs from the desired state, the controller takes corrective action: it creates new Pods when there are too few, or deletes Pods when there are too many. The loop handles user-initiated scaling, node failures, and nodes rejoining the cluster after an outage — all with the same simple logic.
Think of it like a thermostat: you set the target temperature (desired state), and the thermostat switches the heating on or off (corrective action) until the room temperature (observed state) matches. The thermostat does not care why the temperature changed — it just reacts.
Step 3: Loose Coupling Between ReplicaSets and Pods
ReplicaSets are loosely coupled to the Pods they manage. Membership is determined by label selectors, not by a hard-coded list of object references. When the ReplicaSet controller needs to know whether enough Pods are running, it queries the Kubernetes API for all Pods that match the ReplicaSet's label selector. The count of matching Pods is the observed replica count.
This loose coupling enables two important behaviours:
- Adoption — if matching Pods already exist (perhaps created manually before the ReplicaSet was defined), the ReplicaSet will adopt them and only create the remaining Pods needed to reach the desired count. Your service never goes to zero replicas.
- Quarantine — you can remove a misbehaving Pod from the ReplicaSet's awareness simply by changing its labels. The ReplicaSet will create a fresh replacement while the sick Pod keeps running in isolation for debugging.
Services are also decoupled from ReplicaSets: a Service selects Pods by labels independently of any ReplicaSet, so Services, ReplicaSets, and Pods are all separate, composable building blocks.
Step 4: The ReplicaSet Spec
Every ReplicaSet manifest has three critical sections inside spec:
| Field | Purpose |
|---|---|
| spec.replicas | The desired number of Pods to keep running at all times. |
| spec.selector | A label query that identifies which Pods belong to this ReplicaSet. |
| spec.template | The Pod blueprint used to create new Pods when the count falls below desired. |
The selector must be a subset of the labels defined in spec.template.metadata.labels. If they do not match, Kubernetes will reject the manifest at admission time. This consistency check prevents a ReplicaSet from creating Pods it will immediately fail to count.
Step 5: Pod Templates
The spec.template section is a complete Pod specification embedded inside the ReplicaSet. It contains a metadata.labels block and a spec block identical to what you would write in a standalone Pod manifest. When the ReplicaSet controller creates a new Pod, it submits this template directly to the API server — there is no separate template file to maintain.
A minimal Pod template looks like this:
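The sketch below uses the payment-api labels from this article's example; the image name is illustrative:

```yaml
template:
  metadata:
    labels:
      app: payment-api           # must include every label the selector queries
  spec:
    containers:
      - name: payment-api
        image: payment-api:1.0   # illustrative image reference
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
```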
Every container in the template should declare resources.requests and resources.limits so that the Kubernetes scheduler can place Pods correctly and the kubelet can enforce resource constraints.
Step 6: How Labels Wire a ReplicaSet to Its Pods
When the ReplicaSet controller starts its reconciliation loop, it fetches a list of all Pods in the namespace and filters them by the label selector defined in spec.selector. The number of Pods returned by that query is the observed replica count.
Each Pod that the ReplicaSet creates gets an ownerReferences entry in its metadata pointing back to the ReplicaSet. This is how you can discover which ReplicaSet manages a given Pod:
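For example, assuming a Pod named payment-api-k2mnp as in the walkthrough later in this article:

```shell
kubectl get pod payment-api-k2mnp \
  -o jsonpath='{.metadata.ownerReferences[0].name}'
```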
Conversely, to list all Pods managed by a specific ReplicaSet, query by its selector labels:
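Assuming the ReplicaSet selects on the label app=payment-api:

```shell
kubectl get pods --selector=app=payment-api
```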
Step 7: Adopting Existing Pods
Imagine you started with a single Pod deployed imperatively for testing. Later you decide to make the service highly available with three replicas. If you create a ReplicaSet whose selector matches that existing Pod's labels, the ReplicaSet will adopt the running Pod and count it toward the desired replica count. It will then create only the additional Pods needed to reach the target.
This means the transition from one Pod to a replicated set is seamless — there is no moment with zero running Pods, and no downtime.
Step 8: Quarantining a Misbehaving Pod
Health checks catch most failures, but a Pod can sometimes misbehave in ways that probes do not detect — for example, returning subtly wrong data while appearing healthy. In that case, you can quarantine the Pod by changing one of its labels so it no longer matches the ReplicaSet selector:
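For example, assuming the selector matches on app=payment-api, overwrite that label with a value of your choosing (payment-api-quarantine here is illustrative):

```shell
kubectl label pod payment-api-k2mnp app=payment-api-quarantine --overwrite
```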
Once the label no longer matches, the ReplicaSet treats the Pod as if it does not exist and immediately creates a replacement. The quarantined Pod keeps running, live traffic is removed from it (because the Service selector no longer matches either), and your engineers can kubectl exec into it for interactive debugging while the service remains fully available.
Step 9: Imperative Scaling with kubectl scale
The fastest way to change the replica count is the imperative scale command:
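For example, to grow the payment-api ReplicaSet to five replicas:

```shell
kubectl scale replicaset payment-api --replicas=5
```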
This is useful for emergency responses — for example, when traffic spikes unexpectedly. However, imperative changes do not update your source-controlled manifest file. If someone later applies the original manifest (with replicas: 3), the count will be reverted to 3. Always follow an imperative change with a matching declarative update in version control.
Step 10: Declarative Scaling with kubectl apply
The preferred approach is to edit the manifest file and apply it. Open the ReplicaSet YAML, change spec.replicas to the new value, commit the change to source control, and apply:
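```shell
kubectl apply -f payment-api-replicaset.yaml
```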
Declarative scaling is auditable (the change appears in git history), reviewable (you can open a pull request), and repeatable (applying the same file twice is idempotent). Prefer declarative changes in all non-emergency situations.
Step 11: Horizontal Pod Autoscaling (HPA)
Rather than choosing a fixed replica count, you can let Kubernetes choose it automatically using a HorizontalPodAutoscaler (HPA). The HPA continuously reads metrics (CPU, memory, or custom application metrics) from the metrics-server and scales the target ReplicaSet up or down to stay within the bounds you define.
You can create an HPA with an imperative command:
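A sketch using the bounds from this article's example (2 to 10 replicas, targeting 70% average CPU):

```shell
kubectl autoscale rs payment-api --min=2 --max=10 --cpu-percent=70
```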
Or declaratively with a manifest (preferred — see the payment-api-hpa.yaml file included with this article). Verify the autoscaler was created:
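```shell
kubectl get hpa
```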
Important: Do not manually manage spec.replicas on a ReplicaSet that is controlled by an HPA. If both you and the HPA attempt to write the replica count, they will conflict and produce unpredictable behaviour. Let the HPA own the replica count entirely.
HPA requires the metrics-server to be installed. Verify it is running:
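```shell
kubectl get pods --namespace=kube-system
```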
Look for a Pod whose name starts with metrics-server. Most managed Kubernetes clusters (AKS, EKS, GKE) include it by default.
Step 12: Deleting a ReplicaSet
When you no longer need a ReplicaSet, delete it with:
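```shell
kubectl delete rs payment-api
```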
By default this also deletes all the Pods managed by the ReplicaSet. If you want to keep the Pods running (for example, to hand them off to a new controller), add the --cascade=orphan flag (older kubectl versions used --cascade=false, which is now a deprecated alias):
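With current kubectl versions the orphaning behaviour is expressed as:

```shell
kubectl delete rs payment-api --cascade=orphan
```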
The Pods will continue running as orphaned pods — no controller will manage or replace them — until you delete them manually or a new ReplicaSet adopts them via a matching label selector.
Hands-On: Kubernetes Commands
Create a ReplicaSet from a manifest file:
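```shell
kubectl apply -f payment-api-replicaset.yaml
```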
List all ReplicaSets in the current namespace:
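```shell
kubectl get rs
```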
Describe a ReplicaSet (shows replica counts, selector, events):
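```shell
kubectl describe rs payment-api
```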
List Pods managed by a specific ReplicaSet using its label selector:
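```shell
# Assuming the ReplicaSet selects on the label app=payment-api
kubectl get pods --selector=app=payment-api
```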
Find which ReplicaSet owns a given Pod:
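```shell
# Pod name is illustrative; substitute one of your own
kubectl get pod payment-api-k2mnp \
  -o jsonpath='{.metadata.ownerReferences[0].name}'
```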
Scale a ReplicaSet imperatively:
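```shell
kubectl scale rs payment-api --replicas=5
```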
Apply a declarative change (e.g., after editing spec.replicas in the manifest):
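```shell
kubectl apply -f payment-api-replicaset.yaml
```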
Create an HPA for a ReplicaSet (imperative):
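```shell
# Bounds match this article's example HPA
kubectl autoscale rs payment-api --min=2 --max=10 --cpu-percent=70
```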
List all HPAs:
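```shell
kubectl get hpa
```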
Describe an HPA (shows current/desired replicas, metric values):
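```shell
# An imperatively created HPA is named after its target by default
kubectl describe hpa payment-api
```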
Quarantine a Pod by changing its label to remove it from the ReplicaSet:
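```shell
# Label key/value are illustrative; use any value the selector does not match
kubectl label pod payment-api-k2mnp app=payment-api-quarantine --overwrite
```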
Delete a ReplicaSet and all its Pods:
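```shell
kubectl delete rs payment-api
```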
Delete a ReplicaSet but keep its Pods running:
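```shell
kubectl delete rs payment-api --cascade=orphan
```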
Step-by-Step Example
In this example you will deploy an ASP.NET Core 10 payment API as a ReplicaSet with three replicas, verify self-healing, scale it up declaratively, and attach an HPA.
Application: Payment API
The application is a minimal ASP.NET Core 10 Web API that exposes a /health/ready endpoint for readiness checks and a /healthz endpoint for liveness checks.
The Dockerfile for the application:
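A sketch of a standard multi-stage build; the project name PaymentApi and the .NET 10 base image tags are assumptions, so adjust them to your project:

```dockerfile
# Build stage: restore, compile, and publish the app
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app

# Runtime stage: copy only the published output into the ASP.NET base image
FROM mcr.microsoft.com/dotnet/aspnet:10.0
WORKDIR /app
COPY --from=build /app .
ENTRYPOINT ["dotnet", "PaymentApi.dll"]   # assembly name assumed from project name
```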
Register health checks in Program.cs:
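A minimal sketch, assuming the default minimal-API template; the endpoint paths match the probe paths described above:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Register the health-check service
builder.Services.AddHealthChecks();

var app = builder.Build();

// Readiness probe endpoint
app.MapHealthChecks("/health/ready");

// Liveness probe endpoint
app.MapHealthChecks("/healthz");

app.Run();
```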
Step 1: Create the ReplicaSet
Apply the manifest file (see payment-api-replicaset.yaml in this folder):
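```shell
kubectl apply -f payment-api-replicaset.yaml
```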
Confirm the ReplicaSet was created and three Pods are running:
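```shell
kubectl get rs payment-api
kubectl get pods --selector=app=payment-api   # assuming the selector label app=payment-api
```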
Expected output — three Pods all in Running state:
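```
NAME                READY   STATUS    RESTARTS   AGE
payment-api-7hxr2   1/1     Running   0          20s
payment-api-k2mnp   1/1     Running   0          20s
payment-api-x9qb4   1/1     Running   0          20s
```

(Pod name suffixes are randomly generated, so yours will differ.)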
Step 2: Inspect the ReplicaSet
See replica counts, selector, Pod template summary, and recent events:
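```shell
kubectl describe rs payment-api
```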
The output will show 3 current / 3 desired and list the three managed Pods.
Step 3: Observe Self-Healing
Delete one of the running Pods manually to simulate a failure. Kubernetes should immediately replace it:
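```shell
# Substitute the name of one of your running Pods
kubectl delete pod payment-api-7hxr2
```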
Watch the Pod list — a new Pod will appear within seconds:
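```shell
kubectl get pods --selector=app=payment-api --watch
```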
The ReplicaSet reconciliation loop detected that observed replicas (2) no longer matched desired replicas (3) and created a replacement.
Step 4: Verify ownerReferences
Confirm the new Pod is owned by the ReplicaSet:
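```shell
# Substitute the name of the replacement Pod from the previous step
kubectl get pod <pod-name> \
  -o jsonpath='{.metadata.ownerReferences[0].name}'
```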
Output: payment-api
Step 5: Scale Declaratively to Five Replicas
Edit payment-api-replicaset.yaml and change spec.replicas from 3 to 5, then apply:
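```shell
kubectl apply -f payment-api-replicaset.yaml
```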
Confirm two new Pods were created:
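```shell
kubectl get pods --selector=app=payment-api
```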
Step 6: Attach an HPA
Apply the HPA manifest (see payment-api-hpa.yaml in this folder):
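```shell
kubectl apply -f payment-api-hpa.yaml
```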
Verify the autoscaler is active:
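```shell
# Assuming the HPA in payment-api-hpa.yaml is named payment-api
kubectl get hpa payment-api
```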
The HPA will maintain between 2 and 10 replicas, scaling up when average CPU exceeds 70% and scaling down when it drops well below that threshold. Remove the explicit spec.replicas field from your ReplicaSet manifest (or stop applying it) so the HPA owns the count.
Step 7: Quarantine a Misbehaving Pod
Suppose payment-api-k2mnp is logging errors but passing health checks. Remove it from the ReplicaSet and the Service without killing it:
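Assuming the ReplicaSet and Service both select on app=payment-api, overwrite that label (the quarantine value is illustrative):

```shell
kubectl label pod payment-api-k2mnp app=payment-api-quarantine --overwrite
```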
The ReplicaSet immediately creates a replacement. The quarantined Pod is still running and you can inspect it interactively:
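```shell
# Use whichever shell is available in your container image
kubectl exec -it payment-api-k2mnp -- /bin/sh
```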
Step 8: Clean Up
Delete the HPA and ReplicaSet (and all managed Pods):
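```shell
kubectl delete hpa payment-api   # assuming the HPA is named payment-api
kubectl delete rs payment-api
```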
Verify no Pods remain:
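```shell
kubectl get pods --selector=app=payment-api
```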
Summary
- A ReplicaSet ensures a specified number of identical Pods are always running. It uses a reconciliation loop to continuously compare desired vs. observed state and take corrective action.
- ReplicaSets are loosely coupled to Pods via label selectors, not ownership. This enables adopting existing Pods and quarantining sick Pods without downtime.
- The ReplicaSet spec has three key fields: replicas (desired count), selector (label query), and template (Pod blueprint). The selector must be a subset of the template's labels.
- Always use declarative scaling (edit the YAML, apply it, commit it) rather than imperative kubectl scale commands, except in emergencies.
- Use a HorizontalPodAutoscaler to let Kubernetes choose the replica count automatically based on CPU, memory, or custom metrics. Never manually manage replicas on a ReplicaSet that is controlled by an HPA.
- In practice, most workloads use Deployments instead of ReplicaSets directly, because Deployments add rolling-update and rollback capabilities. Deployments manage ReplicaSets under the hood, so understanding ReplicaSets is essential for debugging Deployments.