Kubernetes Entry
Created: 10 Mar 2026 · Updated: 10 Mar 2026

Integrating Storage in Kubernetes

In an ideal world, every microservice would be completely stateless — handling requests, returning responses, and storing nothing. Stateless services are easy to scale, replace, and redeploy. However, almost every real system has state somewhere: a relational database holding customer records, a search index, a message broker, a cache. At some point, data has to live somewhere.

Integrating that data with Kubernetes is often the most challenging part of building a distributed system. Containerized, cloud-native patterns — decoupled, immutable, declarative — apply naturally to stateless web APIs. Storage is different. Storage solutions often require imperative setup steps, direct IP addressing, or physical proximity to data. These patterns do not fit neatly into the container model.

Kubernetes offers several approaches for dealing with stateful workloads. This article covers the first and most common: importing an external storage service that already exists outside the cluster. Future articles cover reliable singletons and StatefulSets.

The Problem of Data Gravity

Most containerized systems are not built from scratch. They are adapted from existing applications that run on virtual machines, and those VMs hold years of production data. You cannot simply containerize the application and leave the database behind — the data has mass, a pull toward where it already lives. Migrating terabytes of production data is expensive, risky, and time-consuming. This tendency for existing data to resist movement is called data gravity.

Kubernetes provides a clean mechanism for dealing with this: you can represent an external service inside the cluster as if it were a native Kubernetes Service. Your applications never know the difference. The database appears to them as just another cluster service, even though it is actually running on a VM or in a cloud provider's managed database offering.

This pattern is also extremely useful for maintaining identical configuration between environments. In production your application connects to a legacy on-premises database. In testing it connects to a lightweight transient database container. You can name both my-database — one in the prod namespace, one in the test namespace. The application configuration never changes between environments; only the actual backing service differs.

Core Concepts

Services Without Selectors

When you create a normal Kubernetes Service, you provide a label selector — a query that finds the Pods that should receive traffic. But for an external service, there are no Pods. Instead, there is just a DNS hostname or IP address sitting outside the cluster.

Kubernetes supports this with two approaches depending on whether you have a DNS name or only an IP address for the external service.

ExternalName Services (DNS-based)

If your external service has a DNS name, use a Service of type ExternalName. Instead of creating an A record (a name-to-IP mapping) in the cluster's internal DNS, Kubernetes creates a CNAME record that aliases your chosen service name to the external hostname.

The key benefit is that applications inside the cluster use the short, stable name you chose (for example, analytics-db). The external DNS name — which may be long, managed by a cloud provider, or subject to change — is hidden behind that alias. If the external database moves to a different hostname, you update the Service definition once and nothing else changes.
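In manifest form, an ExternalName Service is nothing more than a name-to-hostname mapping. A minimal sketch, using the analytics-db example that appears throughout this article (the namespace is illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: analytics-db                # stable, cluster-internal alias
  namespace: reporting
spec:
  type: ExternalName
  externalName: analytics-db.databases.company.com   # external hostname being aliased
```

Because no Pods back this Service, there is no selector and no ports section is required for plain DNS aliasing.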

IP-Address Services with Endpoints (IP-based)

Sometimes you do not have a DNS name for the external service — only an IP address. In this case Kubernetes can still represent the service internally, but you must manage the mapping manually using an Endpoints resource.

Normally Kubernetes populates Endpoints automatically by watching Pods that match a Service's label selector. When there is no selector, Kubernetes allocates a virtual IP for the Service but leaves the Endpoints list empty. You are responsible for creating the Endpoints object yourself, pointing it at the external IP.
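The two resources are linked purely by name: a Service with no selector, plus an Endpoints object of the same name carrying the external IP. A minimal sketch (the name, IP, and port here are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: legacy-metrics-db
spec:
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: legacy-metrics-db   # must match the Service name exactly
subsets:
- addresses:
  - ip: 10.0.1.25           # the external server's address
  ports:
  - port: 5432
```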

Because Kubernetes will not update this Endpoints record automatically, you must ensure the IP address stays current. Either guarantee that the external server's IP never changes (a static IP assignment), or build automation that updates the Endpoints record whenever the IP changes.

Namespaces Enable Environment Parity

One of the most powerful benefits of representing external services inside Kubernetes is namespace isolation. Consider a Reporting API deployed into two namespaces:

Namespace   Service Name    Resolves To
prod        analytics-db    analytics-db.databases.company.com (production server)
test        analytics-db    analytics-db-test.databases.company.com (test server)

Both versions of the Reporting API use the connection string Host=analytics-db. Neither knows nor cares that a different backing server is used in each environment.
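This parity comes from defining a same-named Service in each Namespace; only the externalName field differs. A sketch of the two definitions side by side, using the hostnames from the table above:

```yaml
# prod namespace: points at the production server
apiVersion: v1
kind: Service
metadata:
  name: analytics-db
  namespace: prod
spec:
  type: ExternalName
  externalName: analytics-db.databases.company.com
---
# test namespace: same name, different backing server
apiVersion: v1
kind: Service
metadata:
  name: analytics-db
  namespace: test
spec:
  type: ExternalName
  externalName: analytics-db-test.databases.company.com
```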

Comparison: ExternalName vs IP-based External Service


                                     ExternalName Service              Service + Endpoints (IP-based)
Requires DNS name?                   Yes                               No; an IP address is sufficient
DNS record type created              CNAME                             A record (virtual cluster IP)
Endpoints resource                   Not needed (CNAME alias only)     Created and maintained by you
Load balancing across IPs            No (DNS round-robin only)         Yes; list multiple IPs in addresses
Update required when backend moves   Update the externalName field     Update the Endpoints resource

Hands-On: Kubernetes Commands

Inspecting Services and Endpoints

View all Services in a Namespace:

kubectl get services -n reporting

Describe a Service to see its type and how it is configured:

kubectl describe service analytics-db -n reporting

View the Endpoints for a Service. For a selector-based Service this is populated automatically. For an IP-based external Service, check that your manual Endpoints were accepted:

kubectl get endpoints -n reporting
kubectl describe endpoints legacy-metrics-db -n reporting

Testing DNS Resolution from Inside the Cluster

The most reliable way to confirm that a Service resolves correctly is to run a temporary Pod inside the same Namespace and perform a DNS lookup:

kubectl run dns-test --image=busybox:1.37 --rm -it \
--restart=Never -n reporting \
-- nslookup analytics-db

For an ExternalName Service you should see the CNAME chain leading to the external hostname. For an IP-based Service you should see the virtual cluster IP.

Testing TCP Connectivity

Confirm that traffic actually reaches the external server on the expected port:

kubectl run net-test --image=busybox:1.37 --rm -it \
--restart=Never -n reporting \
-- nc -zv analytics-db 5432

Editing an Endpoints Resource

If the external server's IP address changes, update the Endpoints record in place:

kubectl edit endpoints legacy-metrics-db -n reporting
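For automation, the same change can be made non-interactively by re-applying an updated manifest instead of editing in place. A sketch of the updated Endpoints record, with a hypothetical new IP (10.0.1.30):

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: legacy-metrics-db
  namespace: reporting
subsets:
- addresses:
  - ip: 10.0.1.30   # hypothetical new address of the external server
  ports:
  - port: 5432
```

Keeping this manifest in version control and re-applying it on change is one simple form of the automation described earlier.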

Step-by-Step Example

In this walkthrough we configure a .NET Reporting API to connect to two external PostgreSQL databases: one reached by DNS name, and one reached by IP address only. Both are made available to the application under stable, cluster-internal names.

Step 1: Create the Namespace

kubectl create namespace reporting

Step 2: Import a DNS-Named External Database (ExternalName)

The main analytics database is a managed PostgreSQL instance. Its hostname is analytics-db.databases.company.com. We import it into the cluster under the name analytics-db. Save this as analytics-db-externalname.yaml:

apiVersion: v1
kind: Service
metadata:
  name: analytics-db
  namespace: reporting
spec:
  type: ExternalName
  externalName: analytics-db.databases.company.com

Apply it:

kubectl apply -f analytics-db-externalname.yaml

From this moment, any Pod in the reporting Namespace can resolve the hostname analytics-db. Kubernetes DNS will return a CNAME pointing to analytics-db.databases.company.com, which then resolves to the cloud database's IP address. The application does not need to know the cloud provider's hostname at all.

Verify the CNAME record was created:

kubectl run dns-test --image=busybox:1.37 --rm -it \
--restart=Never -n reporting \
-- nslookup analytics-db

Step 3: Import an IP-Based External Database

A legacy metrics database runs on a VM with no DNS name — only an IP address (10.0.1.25) and port 5432. To import this, we need two resources: a Service to give it a stable cluster name, and an Endpoints resource to tell Kubernetes where traffic should go.

First, create the Service with no selector. Save this as legacy-metrics-db-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: legacy-metrics-db
  namespace: reporting
spec:
  ports:
  - port: 5432
    targetPort: 5432

Apply it:

kubectl apply -f legacy-metrics-db-service.yaml

At this point the Service exists and has a virtual IP address, but it has no endpoints — traffic sent to it will go nowhere. Now create the Endpoints resource. The name must exactly match the Service name. Save this as legacy-metrics-db-endpoints.yaml:

apiVersion: v1
kind: Endpoints
metadata:
  name: legacy-metrics-db
  namespace: reporting
subsets:
- addresses:
  - ip: 10.0.1.25
  ports:
  - port: 5432

Apply it:

kubectl apply -f legacy-metrics-db-endpoints.yaml

Now traffic sent to legacy-metrics-db:5432 inside the cluster will be forwarded to 10.0.1.25:5432.

If the database has two replicas for redundancy, list both IPs in the addresses array:

subsets:
- addresses:
  - ip: 10.0.1.25
  - ip: 10.0.1.26
  ports:
  - port: 5432

Step 4: Deploy the Reporting API

The Reporting API is a .NET 10 application. It connects to analytics-db using the short cluster-internal hostname — it never needs to know that the database is external. Save this as reporting-api-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: reporting-api
  namespace: reporting
  labels:
    app: reporting-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: reporting-api
  template:
    metadata:
      labels:
        app: reporting-api
    spec:
      containers:
      - name: reporting-api
        image: mcr.microsoft.com/dotnet/aspnet:10.0
        ports:
        - containerPort: 8080
        env:
        - name: ConnectionStrings__AnalyticsDb
          value: "Host=analytics-db;Port=5432;Database=analytics;Username=app_user"
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20

Apply it:

kubectl apply -f reporting-api-deployment.yaml

Notice that the connection string uses Host=analytics-db — just the short service name. Kubernetes DNS handles the rest. This same connection string works unchanged whether the backing database is a cloud-managed instance, a VM, or a containerized database running inside the cluster.

Step 5: Verify the Setup

Confirm that the Deployment is running:

kubectl get pods -n reporting

Check that both Services have entries in the cluster:

kubectl get services -n reporting

Verify the Endpoints were created correctly:

kubectl describe endpoints legacy-metrics-db -n reporting

Test DNS resolution from a temporary Pod inside the reporting Namespace:

kubectl run dns-test --image=busybox:1.37 --rm -it \
--restart=Never -n reporting \
-- nslookup analytics-db
kubectl run dns-test --image=busybox:1.37 --rm -it \
--restart=Never -n reporting \
-- nslookup legacy-metrics-db

Step 6: Demonstrate Environment Parity

The real power of this pattern becomes clear when you consider environment parity. Create a separate reporting-test Namespace:

kubectl create namespace reporting-test

Deploy an analytics-db ExternalName Service in the test Namespace, pointing to a different test database:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: analytics-db
  namespace: reporting-test
spec:
  type: ExternalName
  externalName: analytics-db-test.databases.company.com
EOF

Now you can deploy the exact same Reporting API Deployment (with the same connection string Host=analytics-db) into reporting-test, and it will automatically connect to the test database — zero configuration changes required.
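One practical detail: the Deployment manifest from Step 4 pins namespace: reporting in its metadata, so deploying it into the test environment means changing that one field. Everything else, including the connection string, stays identical:

```yaml
metadata:
  name: reporting-api
  namespace: reporting-test   # the only change from the Step 4 manifest
```

Alternatively, omit the namespace field from the manifest entirely and select the target Namespace with kubectl's -n flag at apply time.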

Step 7: Clean Up

kubectl delete namespace reporting
kubectl delete namespace reporting-test

Summary

Integrating external storage into Kubernetes doesn't require migrating data. By representing external services as Kubernetes Services, you gain all the benefits of cluster-native service discovery while keeping your existing infrastructure exactly where it is. Here is what we covered:

  1. Data gravity means existing data resists movement. Kubernetes provides a way to reference external services without migrating them.
  2. For external services reachable by a DNS name, use a Service of type ExternalName. Kubernetes creates a CNAME record in cluster DNS that aliases your chosen name to the external hostname.
  3. For external services reachable only by IP address, create a Service without a selector and manually create an Endpoints resource that maps the Service to the external IP. You are responsible for keeping the Endpoints record up to date.
  4. Namespace isolation makes environment parity trivial. Deploy the same application with the same configuration into prod and test namespaces, with each namespace containing a same-named Service pointing to its environment-specific backend.
  5. Applications written to use a cluster-internal service name (like analytics-db) require no code or configuration changes when the backing service is later migrated into the cluster as a native workload.
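Concretely, that later migration just means replacing the ExternalName or selectorless Service with an ordinary selector-based Service of the same name; clients keep using Host=analytics-db unchanged. A sketch, assuming a hypothetical in-cluster PostgreSQL workload labeled app: postgres:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: analytics-db        # same name the application already uses
  namespace: reporting
spec:
  selector:
    app: postgres           # hypothetical label on the in-cluster database Pods
  ports:
  - port: 5432
```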

