The most surprising thing about automating model training and deployment is that the "model" itself is often the least important part of the pipeline.
Let’s watch a typical CI/CD pipeline for machine learning in action. Imagine a Git repository holding our code, including data preprocessing scripts, model training scripts (e.g., a Python file using scikit-learn or TensorFlow), and a Dockerfile to containerize our model serving application.

    # .github/workflows/ci-cd.yml
    name: ML CI/CD Pipeline

    on:
      push:
        branches:
          - main

    jobs:
      build-and-deploy:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout code
            uses: actions/checkout@v3

          - name: Set up Python
            uses: actions/setup-python@v4
            with:
              python-version: '3.9'

          - name: Install dependencies
            run: |
              pip install -r requirements.txt
              pip install mlflow

          - name: Train and log model
            env:
              MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_TRACKING_URI }}
              MLFLOW_EXPERIMENT_NAME: my-model-experiment
            run: |
              python train.py --data-path data/train.csv --model-output ./model

          # Pushing needs registry credentials; these secret names are placeholders.
          - name: Log in to Docker Hub
            uses: docker/login-action@v2
            with:
              username: ${{ secrets.DOCKERHUB_USERNAME }}
              password: ${{ secrets.DOCKERHUB_TOKEN }}

          - name: Build and push Docker image
            uses: docker/build-push-action@v4
            with:
              context: .
              push: true
              tags: your-dockerhub-username/my-ml-model:latest
              file: Dockerfile

          # Example for AKS, adjust for your platform
          - name: Log in to Azure
            uses: azure/login@v1
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}

          - name: Set AKS context
            uses: azure/aks-set-context@v3
            with:
              resource-group: my-resource-group
              cluster-name: my-aks-cluster

          - name: Deploy to Kubernetes
            uses: azure/k8s-deploy@v4
            with:
              namespace: ml-models
              manifests: k8s/deployment.yaml
              images: your-dockerhub-username/my-ml-model:latest

Here’s what’s happening:
- Checkout Code: Fetches the latest version of our ML project from Git.
- Setup Python: Ensures a consistent Python environment for reproducibility.
- Install Dependencies: Installs all necessary libraries, crucially including mlflow for experiment tracking.
- Train and Log Model: This is where the magic happens. train.py not only trains a model but also logs metrics (accuracy, loss), parameters, and the trained model artifact itself to an MLflow tracking server. This makes experiments auditable and reproducible. The MLFLOW_TRACKING_URI points to where these logs are stored.
- Build and Push Docker Image: The Dockerfile defines how to package our model serving application (e.g., a Flask API wrapping the trained model). This image is then pushed to a container registry (like Docker Hub or a private registry).
- Deploy to Kubernetes: The containerized application is deployed to a Kubernetes cluster. This step typically involves updating a Kubernetes Deployment resource to use the newly built Docker image.
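
As a rough illustration of the Train and Log Model step, train.py could look something like the sketch below. The real script is not shown here, so most of this is assumption: the label column name, the choice of LogisticRegression, the hyperparameters, and the model.joblib filename are all hypothetical. Only the two command-line flags and the two MLflow environment variables come from the workflow itself.

```python
"""Hypothetical train.py: train a classifier and log the run to MLflow."""
import argparse
import os
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags that the workflow passes to the script."""
    parser = argparse.ArgumentParser(description="Train and log a model")
    parser.add_argument("--data-path", required=True, type=Path)
    parser.add_argument("--model-output", required=True, type=Path)
    return parser.parse_args(argv)


def train_and_log(data_path, model_output, label_column="label"):
    """Train on a CSV file, then log params, a metric, and the model."""
    # Heavy ML imports are deferred so argument errors surface quickly.
    import joblib
    import mlflow
    import mlflow.sklearn
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    frame = pd.read_csv(data_path)
    features = frame.drop(columns=[label_column])  # assumed column name
    labels = frame[label_column]
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, random_state=42
    )

    params = {"C": 1.0, "max_iter": 200}  # illustrative hyperparameters

    # MLFLOW_TRACKING_URI is read by MLflow directly from the environment.
    # The experiment is set explicitly so that it is created if missing.
    experiment = os.environ.get("MLFLOW_EXPERIMENT_NAME")
    if experiment:
        mlflow.set_experiment(experiment)

    with mlflow.start_run():
        model = LogisticRegression(**params).fit(x_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(x_test))

        mlflow.log_params(params)
        mlflow.log_metric("accuracy", accuracy)
        mlflow.sklearn.log_model(model, "model")

    # Also keep a local copy that the Docker build can pick up.
    model_output = Path(model_output)
    model_output.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_output / "model.joblib")
    return accuracy
```

A script entry point would call parse_args() and pass args.data_path and args.model_output to train_and_log().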
The problem this solves is the "last mile" problem of ML: getting a trained model from a data scientist’s laptop into a production environment where it can serve predictions. Traditionally, this was a manual, error-prone process involving ad-hoc scripts and significant coordination. MLOps CI/CD automates this, treating model training and deployment with the same rigor as software code.
Internally, the pipeline orchestrates several key components:
- Version Control System (VCS): Git, acting as the single source of truth for code and configuration.
- CI/CD Platform: GitHub Actions, GitLab CI, Jenkins, etc., which trigger and manage the pipeline execution.
- Experiment Tracking: MLflow, Weights & Biases, or similar, to log model training runs, parameters, and metrics. This is crucial for comparing different model versions and debugging.
- Containerization: Docker, to package the model and its serving code into a portable, reproducible unit.
- Container Registry: Docker Hub, Amazon ECR, Google Artifact Registry, etc., to store the built Docker images.
- Orchestration/Deployment Platform: Kubernetes, AWS SageMaker Endpoints, Azure ML Endpoints, etc., to host and serve the model.
The exact levers you control are primarily in the code that runs within the pipeline: the train.py script, the Dockerfile, and the Kubernetes deployment manifests (k8s/deployment.yaml). You define what gets trained, how it’s packaged, and where it’s deployed. The CI/CD platform then automates the execution of these definitions.
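
The contents of k8s/deployment.yaml are not shown in this article. As a hedged sketch of what such a manifest typically declares, the Python snippet below builds an equivalent Deployment object and prints it as JSON, which kubectl accepts alongside YAML. The app name, replica count, and container port are hypothetical placeholders; only the namespace and image name are taken from the workflow.

```python
import json


def build_deployment(image, name="my-ml-model", namespace="ml-models",
                     replicas=2, port=8080):
    """Return a Kubernetes apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # the image built by the pipeline
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


manifest = build_deployment("your-dockerhub-username/my-ml-model:latest")
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running kubectl apply -f on it would create or update the Deployment. Passing a unique tag such as the commit SHA instead of latest is one way to make sure each deployment points at a specific build.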
Most people focus on the model performance metrics during training. However, the actual artifact being deployed is the container image. If your Dockerfile has a subtle bug, like installing a different version of a library than what your training script used, your deployed model might fail in production even if your training metrics looked perfect. This is why the build and push step for the Docker image is as critical as the training step itself.
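
One way to catch this kind of drift is to run a small check inside the built image before it is pushed. The sketch below assumes that requirements.txt pins exact versions with ==, and compares those pins against what is actually installed. The function names are hypothetical, and the parser deliberately handles only simple name==version lines.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Extract {package: version} from simple 'name==version' lines."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0]   # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if line.startswith("-") or "==" not in line:
            continue  # skip blanks, pip options, and unpinned packages
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed)} for every pin that differs.

    Versions are compared as plain strings, so '1.0' and '1.0.0' count
    as different. A missing package is reported with installed=None.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

In a pipeline, a non-empty result from find_mismatches() would be a reason to fail the build, for example by exiting with a non-zero status.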
The next challenge is often setting up robust model monitoring after deployment.