MLOps Helm charts are not just about packaging your model; they’re about codifying the entire lifecycle of deploying and managing machine learning models as services within Kubernetes.

Let’s see this in action. Imagine you have a trained XGBoost model saved as model.pkl and you want to serve it via a Flask API.

# app.py
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)
model = joblib.load('model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    features = data['features']
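    prediction = model.predict([features])[0]
    # predict() returns a NumPy scalar, which jsonify cannot serialize;
    # .item() converts it to a native Python int or float.
    return jsonify({'prediction': prediction.item()})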

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
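
To make the request format concrete, here is a minimal client sketch using only the standard library. The URL is an assumption (app.py running locally on port 5000), and build_predict_request and predict are hypothetical helper names, not part of the service itself.

```python
import json
import urllib.request

# Hypothetical address: assumes app.py is running locally on port 5000.
PREDICT_URL = "http://localhost:5000/predict"


def build_predict_request(features, url=PREDICT_URL):
    """Build (but do not send) a POST request carrying one feature vector."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=PREDICT_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_predict_request(features, url)) as resp:
        return json.load(resp)
```

With the service running, predict([5.1, 3.5, 1.4, 0.2]) should return a dictionary with a single 'prediction' key; the feature values here are placeholders and must match whatever the model was trained on.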

Now, we’ll package this into a Helm chart.

A typical MLOps Helm chart structure looks like this:

my-model-chart/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── _helpers.tpl (named templates used by the manifests below)
│   ├── deployment.yaml
│   ├── service.yaml
│   └── ingress.yaml (optional)
└── models/
    └── model.pkl

Chart.yaml describes your chart:

apiVersion: v2
name: my-model-chart
description: A Helm chart for deploying a machine learning model
version: 0.1.0
appVersion: "1.0"

values.yaml holds configurable parameters:

replicaCount: 1

image:
  repository: your-dockerhub-username/my-model-api
  pullPolicy: IfNotPresent
  tag: "latest"

service:
  type: ClusterIP
  port: 80
  targetPort: 5000

# Resources for the pod
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi

# Optional: Ingress configuration
ingress:
  enabled: false
  className: "nginx"
  hosts:
    - host: chart-example.local
      paths:
        - path: /
          pathType: ImplementationSpecific
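
It helps to picture how overrides are layered on top of these defaults. The sketch below is a rough Python model of that layering, not Helm's actual implementation. It assumes nested maps merge key by key, while scalars and lists from the override replace the default outright. merge_values is a hypothetical helper.

```python
import copy


def merge_values(defaults, overrides):
    """Return defaults with overrides layered on top.

    Assumption: nested mappings merge key by key, while scalars and lists
    from the override replace the default outright.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A subset of the values.yaml shown above.
defaults = {
    "replicaCount": 1,
    "image": {
        "repository": "your-dockerhub-username/my-model-api",
        "pullPolicy": "IfNotPresent",
        "tag": "latest",
    },
}

# Roughly what "--set image.tag=v1.0.0 --set replicaCount=3" expresses.
overrides = {"image": {"tag": "v1.0.0"}, "replicaCount": 3}

effective = merge_values(defaults, overrides)
print(effective["image"])
```

Only image.tag and replicaCount change; image.repository and image.pullPolicy keep their defaults, which is why a per-model override file can stay short.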

templates/deployment.yaml defines how to run your model serving container:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-model-chart.fullname" . }}
  labels:
    {{- include "my-model-chart.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "my-model-chart.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "my-model-chart.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.targetPort }}
              protocol: TCP
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          # Mount the model file if it's not baked into the image
          # volumeMounts:
          #   - name: model-storage
          #     mountPath: /app/model.pkl
          #     subPath: model.pkl # If using a single file from a ConfigMap or PersistentVolume
      # volumes:
      #   - name: model-storage
      #     emptyDir: {} # Or use a PersistentVolumeClaim or ConfigMap

templates/service.yaml exposes your model:

apiVersion: v1
kind: Service
metadata:
  name: {{ include "my-model-chart.fullname" . }}
  labels:
    {{- include "my-model-chart.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: {{ .Values.service.targetPort }}
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: {{ include "my-model-chart.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
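
The Service finds its pods purely through labels: the selector here must match the labels on the Deployment's pod template. A small sketch of that equality-based matching follows. The label values are hypothetical renderings for a release named my-model, and selector_matches is an illustrative helper, not a Kubernetes API.

```python
def selector_matches(selector, pod_labels):
    """True when every key/value pair in the selector appears in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Hypothetical rendered values for a release named "my-model".
service_selector = {
    "app.kubernetes.io/name": "my-model-chart",
    "app.kubernetes.io/instance": "my-model",
}

# Pods may carry extra labels; only the selector's keys are compared.
pod_labels = {**service_selector, "pod-template-hash": "abc123"}

# A pod from a different release of the same chart must not match.
other_release_pod = {**service_selector, "app.kubernetes.io/instance": "other-model"}

print(selector_matches(service_selector, pod_labels))
print(selector_matches(service_selector, other_release_pod))
```

Including the release name in the selector is what lets several models be installed from the same chart without their Services routing traffic to each other's pods.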

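To use this, you’d first build your Docker image (including app.py and model.pkl), push it to a registry, and then deploy. The ingress settings take effect only if the chart includes templates/ingress.yaml, and the indexed key is quoted so that shells such as zsh do not treat the brackets as a glob pattern:

helm install my-model ./my-model-chart --set image.tag=v1.0.0 --set ingress.enabled=true --set 'ingress.hosts[0].host=my-model.example.com'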

This setup allows you to manage your model deployments like any other application in Kubernetes, leveraging its features for scaling, resilience, and updates. The chart acts as a blueprint, defining the Kubernetes resources (Deployments, Services, etc.) needed to run your model serving application.

The truly powerful aspect of using Helm for MLOps is not just about packaging a single model, but about creating reusable templates for common model serving patterns. You can define a base chart that includes standard logging, monitoring sidecars, feature store integration logic, and then parameterize it for different models by simply changing values.yaml and the Docker image. This turns your model deployment from a one-off task into a repeatable, auditable process.

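When you update the model, you build and push a new image, set image.tag to the new tag (in values.yaml or with --set), and run helm upgrade. Helm applies the changed Deployment, and the Kubernetes Deployment controller performs the rolling update. Downtime is avoided only if new pods are marked ready after the model has actually loaded, which in practice means adding a readinessProbe to the container; the Deployment template shown earlier does not define one.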
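
How much headroom a rolling update has depends on the replica count. The sketch below works through the arithmetic, assuming the Deployment keeps Kubernetes' default RollingUpdate strategy (maxSurge and maxUnavailable both 25%, with maxSurge rounded up and maxUnavailable rounded down). rollout_bounds is a hypothetical helper.

```python
def rollout_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Return (minimum available pods, maximum total pods) during a rollout.

    Assumes percentage-based settings: maxSurge rounds up,
    maxUnavailable rounds down.
    """
    max_surge = -(-replicas * max_surge_pct // 100)  # ceiling division
    max_unavailable = replicas * max_unavailable_pct // 100  # floor division
    return replicas - max_unavailable, replicas + max_surge


for replicas in (1, 4, 10):
    print(replicas, rollout_bounds(replicas))
```

With replicaCount: 1, the bounds come out as one pod always available and at most two running, so the old pod is kept until its replacement is ready. That guarantee is only as good as the readiness signal behind it.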

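A less obvious option is the subPath field in volumeMounts, which mounts a single file from a volume at a specific path such as /app/model.pkl. Keeping the model out of the Docker image allows for faster image pulls and model updates without rebuilding the image, provided the container entrypoint is designed to load the model from the mounted path. Two caveats apply. A ConfigMap can hold at most 1 MiB of data, so it suits only very small models; larger files belong on a PersistentVolumeClaim or should be downloaded by an init container. And a container that mounts a ConfigMap through subPath does not receive updates when the ConfigMap changes, so the pods must be restarted to pick up a new model.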
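
On the application side, loading from a mounted path only requires that the model location be configurable. A minimal sketch, assuming an environment variable named MODEL_PATH (a hypothetical name; the app.py above hardcodes 'model.pkl'):

```python
import os
from pathlib import Path

DEFAULT_MODEL_PATH = "model.pkl"  # the relative path app.py loads today


def resolve_model_path(env=None):
    """Return the model file location, preferring the MODEL_PATH variable.

    MODEL_PATH is a hypothetical name chosen for this sketch; an unset or
    empty value falls back to the default path.
    """
    env = os.environ if env is None else env
    return Path(env.get("MODEL_PATH") or DEFAULT_MODEL_PATH)


# In app.py this would replace the hardcoded path, for example:
#   model = joblib.load(str(resolve_model_path()))
print(resolve_model_path({"MODEL_PATH": "/app/model.pkl"}))
```

The Deployment would then set MODEL_PATH to the mountPath of the volume, while local runs without the variable keep working unchanged.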

The next logical step is to integrate this with CI/CD pipelines that automatically build the Docker image, update the Helm chart’s values.yaml with the new image tag, and trigger helm upgrade.
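
As a rough sketch of the final pipeline step, the helper below assembles a helm upgrade invocation for a freshly built image. It passes the tag with --set rather than editing values.yaml, which is a simplification of the flow described above. The release name, chart path, and tag are placeholders, and helm_upgrade_command is a hypothetical helper.

```python
import shlex


def helm_upgrade_command(release, chart_path, image_tag, wait=True):
    """Build the argument list for deploying a new image tag.

    --install creates the release if it does not exist yet;
    --wait blocks until the rollout's resources report ready.
    """
    cmd = [
        "helm", "upgrade", "--install", release, chart_path,
        "--set", f"image.tag={image_tag}",
    ]
    if wait:
        cmd.append("--wait")
    return cmd


# Placeholder values; a pipeline would supply the tag it just pushed.
cmd = helm_upgrade_command("my-model", "./my-model-chart", "v1.0.1")
print(shlex.join(cmd))
```

A CI job could hand this list to subprocess.run(cmd, check=True) so that a failed rollout fails the pipeline.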
