Microservices on Kubernetes: Deploy and Scale Services (2026)

Kubernetes doesn’t actually deploy your microservices; it orchestrates containers, and your microservices run inside those containers.

Let’s see this in action. Imagine you have a simple user-service that other services in the cluster need to reach. First, you’d build a Docker image for it.

# Dockerfile for user-service
FROM golang:1.20-alpine
WORKDIR /app
COPY . .
RUN go build -o userservice
EXPOSE 8080
CMD ["./userservice"]

Then, you build and push this image to a registry (like Docker Hub or a private registry):

docker build -t your-docker-repo/user-service:v1.0 .
docker push your-docker-repo/user-service:v1.0

Now, Kubernetes comes into play. You define a Deployment to manage your user-service pods. The Deployment tells Kubernetes which image to run, how many replicas to keep, and which labels identify the pods.

# user-service-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service-deployment
  labels:
    app: user-service
spec:
  replicas: 3 # Start with 3 instances
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: your-docker-repo/user-service:v1.0
        ports:
        - containerPort: 8080

Apply it to your cluster:

kubectl apply -f user-service-deployment.yaml

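The Deployment controller then works continuously to keep 3 pods running your user-service image. If a container crashes, the kubelet restarts it in place; if a whole pod is deleted or its node fails, the Deployment’s ReplicaSet creates a replacement pod.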
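The self-healing behavior described above can be pictured as a reconciliation loop that compares the desired replica count with the pods that actually exist. The following is a toy model in Python, not Kubernetes code: the function name `reconcile` and the pod names are made up for illustration, and real controllers watch the API server rather than operate on a list.

```python
import itertools


def reconcile(desired_replicas, running_pods, name_counter):
    """Return the pod list after one reconciliation pass.

    Toy model only: none of these names are Kubernetes APIs.
    """
    pods = list(running_pods)
    # Too few pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(f"user-service-{next(name_counter)}")
    # Too many pods: drop the surplus.
    return pods[:desired_replicas]


counter = itertools.count(1)
pods = reconcile(3, [], counter)      # initial rollout creates 3 pods
pods.remove("user-service-2")         # simulate one pod being lost
pods = reconcile(3, pods, counter)    # the loop creates a replacement
print(pods)
```

The replacement pod gets a new name rather than reusing the old one, which mirrors how a replaced pod in a Deployment is a new object and not the old pod revived.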

But how do other services talk to user-service? That’s where Services come in. A Service provides a stable IP address and DNS name for a set of pods.

# user-service-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service # Selects pods with the label 'app: user-service'
  ports:
    - protocol: TCP
      port: 80 # The port the service will be accessible on
      targetPort: 8080 # The port your container listens on
  type: ClusterIP # Internal IP, only accessible within the cluster

Apply this:

kubectl apply -f user-service-service.yaml

Now, any other microservice in the same namespace can reach user-service using the DNS name user-service and port 80. From another namespace, use the fully qualified name user-service.your-namespace.svc.cluster.local (cluster.local is the default cluster domain). Kubernetes routes traffic from the Service IP to one of the ready user-service pods.
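As a small illustration of the naming scheme above, here is a sketch of how a client might build the address it calls. The helper names `service_hostnames` and `service_url` and the `/users` path are assumptions made for this example; `cluster.local` is the default cluster domain and may differ in your cluster.

```python
def service_hostnames(service, namespace, cluster_domain="cluster.local"):
    """Return the short and fully qualified DNS names for a Service."""
    return {
        # Resolves only from pods in the same namespace as the Service.
        "short": service,
        # Resolves from any namespace in the cluster.
        "fqdn": f"{service}.{namespace}.svc.{cluster_domain}",
    }


def service_url(host, port=80, path="/"):
    """Build an HTTP URL that targets the Service port, not the container port."""
    return f"http://{host}:{port}{path}"


names = service_hostnames("user-service", "your-namespace")
url = service_url(names["fqdn"], port=80, path="/users")  # hypothetical path
print(url)
```

Note that the client uses port 80, the Service port. The container port 8080 is an internal detail that the Service hides behind targetPort.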

Scaling is equally straightforward. To scale user-service from 3 to 10 replicas, you simply edit the replicas field in your Deployment manifest:

# user-service-deployment.yaml (edited)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service-deployment
  labels:
    app: user-service
spec:
  replicas: 10 # Scaled up to 10 instances
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: your-docker-repo/user-service:v1.0
        ports:
        - containerPort: 8080

And re-apply:

kubectl apply -f user-service-deployment.yaml

Kubernetes will then create 7 new pods running your user-service, and the user-service Service will automatically add each new pod to its load balancing as soon as the pod is ready, until traffic is spread across all 10 pods.

For more advanced scaling, you can use a HorizontalPodAutoscaler (HPA). This object tells Kubernetes to adjust the number of replicas automatically based on observed metrics such as CPU utilization, memory usage, or custom metrics.

# user-service-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # Target: average CPU at 70% of each pod's CPU request

Apply it:

kubectl apply -f user-service-hpa.yaml

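Now, if the average CPU utilization across all user-service pods exceeds 70%, Kubernetes will automatically increase the number of replicas, up to a maximum of 10. If utilization drops, it will scale back down, to a minimum of 2 replicas. Two prerequisites apply: utilization is measured as a percentage of each container’s CPU request, so the pod template must set resources.requests.cpu (the Deployment manifests above omit it), and the cluster needs a metrics source such as metrics-server. Once the HPA owns the replica count, also remove the replicas field from the Deployment manifest; otherwise each kubectl apply resets the count to the value in the file.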
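The scaling decision follows the formula given in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below applies that formula with the min/max bounds from the manifest above. The function name is made up for this example, the 0.1 tolerance is the documented default and may be configured differently in your cluster, and the real controller also applies a scale-down stabilization window that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for one CPU utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds declared in the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 90, 70))   # 3 pods at 90% average CPU, target 70%
print(desired_replicas(8, 140, 70))  # a large spike is capped at max_replicas
print(desired_replicas(3, 10, 70))   # a quiet period is floored at min_replicas
```

With 3 pods at 90% against a 70% target, the ratio is about 1.29, so the result rounds up to 4 replicas. The spike case would call for 16 replicas but is capped at 10, and the quiet case would call for 1 but is held at 2.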

The core abstractions Kubernetes provides for managing your microservices are the Deployment (the desired state of your application instances) and the Service (stable network access to those instances). These two objects, combined with containerization, form the foundation of running microservices on Kubernetes.

What most people don’t realize is that the Service object itself doesn’t route any packets. It is a declarative definition that the cluster’s infrastructure enforces: the data plane, typically kube-proxy (using iptables or IPVS) or a CNI plugin that replaces it, programs network rules on each node to intercept traffic destined for the Service IP and direct it to one of the backing pods.
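To make that division of labor concrete, here is a toy model of what the data plane computes from a Service definition: match pods by label, keep the ready ones, then pick a backend for each new connection. This is an illustration only, not how kube-proxy is implemented. The pod IPs, the order-service pod, and the function names are invented for the example, and the random pick mirrors the documented default of kube-proxy in iptables mode; other modes and plugins can use different strategies.

```python
import random


def ready_endpoints(pods, selector, target_port):
    """Return 'ip:port' for each ready pod whose labels match the selector."""
    endpoints = []
    for pod in pods:
        labels_match = all(pod["labels"].get(k) == v for k, v in selector.items())
        if labels_match and pod["ready"]:
            endpoints.append(f"{pod['ip']}:{target_port}")
    return endpoints


def pick_backend(endpoints, rng=random):
    """Choose one backend at random for a new connection."""
    return rng.choice(endpoints)


# Hypothetical cluster state; the IPs are made up.
pods = [
    {"ip": "10.244.1.5", "labels": {"app": "user-service"}, "ready": True},
    {"ip": "10.244.2.7", "labels": {"app": "user-service"}, "ready": False},
    {"ip": "10.244.3.9", "labels": {"app": "order-service"}, "ready": True},
    {"ip": "10.244.4.2", "labels": {"app": "user-service"}, "ready": True},
]
endpoints = ready_endpoints(pods, {"app": "user-service"}, 8080)
print(endpoints)
print(pick_backend(endpoints))
```

The pod that is not ready and the pod with a different label never appear in the endpoint list, which is why traffic sent to the Service IP only reaches ready user-service pods.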

The next step is often exposing these internal services to the outside world using Ingress controllers.