MLOps Docker: Containerize Models for Reproducible Deploys (2026)

Dockerizing your ML models is the key to moving from "it works on my machine" to robust, reproducible deployments.

Let’s see it in action. Imagine we have a simple Python script for a scikit-learn model that predicts house prices.

# predict.py
import os

import joblib
import pandas as pd

# Load the trained model; the MODEL_PATH environment variable overrides the default location
model = joblib.load(os.environ.get("MODEL_PATH", "model.pkl"))

def predict_price(area_sqft, num_bedrooms):
    data = pd.DataFrame([[area_sqft, num_bedrooms]], columns=["AreaSqft", "NumBedrooms"])
    return model.predict(data)[0]

if __name__ == "__main__":
    # Example usage
    area = 1500
    bedrooms = 3
    predicted_price = predict_price(area, bedrooms)
    print(f"The predicted price for {bedrooms} bedrooms in {area} sqft is: ${predicted_price:.2f}")

Now, we need a requirements.txt file to list our dependencies:

scikit-learn==1.2.2
pandas==1.5.3
joblib==1.3.1

To containerize this, we create a Dockerfile:

# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

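# Copy only the dependency list first so the install layer stays cached
# until requirements.txt itself changes
COPY requirements.txt /app/

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the project (scripts and model.pkl) into the container at /app
COPY . /app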

# Document that a web API in this container would listen on port 80
# (EXPOSE does not publish the port; use docker run -p for that)
EXPOSE 80

# Define environment variable
ENV MODEL_PATH=/app/model.pkl

# Run predict.py when the container launches
CMD ["python", "predict.py"]

Before we build, we need the model.pkl file. For this example, let’s quickly train and save one:

# train.py (for generating model.pkl)
import joblib
from sklearn.ensemble import RandomForestRegressor
import pandas as pd

# Dummy data
data = {'AreaSqft': [1000, 1200, 1500, 1800, 2000],
        'NumBedrooms': [2, 3, 3, 4, 4],
        'Price': [250000, 300000, 375000, 450000, 500000]}
df = pd.DataFrame(data)

X = df[['AreaSqft', 'NumBedrooms']]
y = df['Price']

model = RandomForestRegressor(n_estimators=10, random_state=42)
model.fit(X, y)

joblib.dump(model, "model.pkl")
print("Model saved to model.pkl")

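Run python train.py in a local environment that has the same pinned versions installed (pip install -r requirements.txt). A model pickled under one scikit-learn version may warn or fail when it is loaded under another.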

Now, build the Docker image:

docker build -t ml-model-deployer .

And run the container:

docker run ml-model-deployer

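You’ll see output like: The predicted price for 3 bedrooms in 1500 sqft is: $375000.00. The exact figure depends on the fitted forest: each tree is trained on a bootstrap sample of only five rows, so the prediction may differ from the training price.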

The core problem this solves is dependency hell and environment drift. Without Docker, deploying your model means making sure the same Python version, library versions, and system dependencies are present on the deployment server, which is fragile and time-consuming. Docker creates a self-contained unit, an image, that bundles your code, its dependencies, and the runtime environment. Every container started from that image gets the same isolated userland, whatever is installed on the host. The host still supplies the kernel, the CPU architecture, and any GPU drivers, so the guarantee is not absolute, but it removes the most common reasons a model behaves differently in development, testing, and production.
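Drift can also be detected from inside the environment. The sketch below is one possible startup check, not part of the original project: it compares the ==-pinned lines of a requirements file with what is actually installed, using the standard library’s importlib.metadata. The function name check_pins and the get_version parameter are hypothetical and exist only for illustration.

```python
from importlib import metadata


def check_pins(requirement_lines, get_version=metadata.version):
    """Return {package: (pinned, installed)} for every pin that does not match.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned requirements
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling check_pins(open("requirements.txt")) at container start and failing when the result is non-empty would surface a mismatch before the model is loaded.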

Internally, the Dockerfile is a set of instructions Docker follows to build an image. FROM python:3.9-slim selects a lightweight base Python image. WORKDIR /app sets the working directory for the instructions that follow and for the running container. COPY brings your Python scripts, requirements.txt, and model.pkl into that directory. RUN pip install ... executes during the build and installs your libraries into the image. EXPOSE 80 documents that the container is meant to listen on port 80; it does not publish the port, so a web service still needs docker run -p (for example -p 8080:80) to be reachable from the host. CMD ["python", "predict.py"] specifies the default command to run when a container is started from this image.

The levers you control are primarily in the Dockerfile:

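Base Image (FROM): Choosing a minimal image (like slim) reduces image size and potential attack surface. Alpine images are smaller still, but they use musl instead of glibc, so some scientific Python packages may lack prebuilt wheels and have to be compiled during the build. You can also use images pre-configured with ML tools like NVIDIA’s CUDA images for GPU acceleration.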
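Dependencies (RUN pip install): Pinning exact versions in requirements.txt (e.g., scikit-learn==1.2.2) is critical for reproducibility. The pins only cover the packages you list; transitive dependencies such as numpy still float unless you pin them too, for example by generating the file with pip freeze.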
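Entrypoint/Command (ENTRYPOINT, CMD): These define what runs when the container starts. CMD is good for simple scripts, while ENTRYPOINT fixes the executable and lets CMD (or arguments passed to docker run) supply its default arguments. For a Flask API, you’d typically use CMD ["gunicorn", "-b", "0.0.0.0:80", "app:app"]; FastAPI is an ASGI framework, so it needs an ASGI server such as Uvicorn instead.
Volumes (VOLUME, docker run -v): For models that are too large to bake into the image or need frequent updates, you can mount them from the host at run time with docker run -v or --mount. The VOLUME instruction only declares a mount point; the host path is chosen when the container starts.
Environment Variables (ENV): Useful for configuration such as model paths. Values set with ENV are baked into the image and visible to anyone who can pull it, so pass secrets such as API keys at run time (docker run -e or --env-file) rather than in the Dockerfile.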
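The ENTRYPOINT/CMD split is easiest to see with a script that takes arguments. The sketch below is a hypothetical command-line wrapper (call it cli.py; the file name and the --area and --bedrooms flags are assumptions, not part of the original project) that only parses its inputs, so it runs without a trained model.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that CMD (or `docker run <image> ...`) would pass in."""
    parser = argparse.ArgumentParser(description="Predict a house price.")
    parser.add_argument("--area", type=float, default=1500.0,
                        help="area in square feet")
    parser.add_argument("--bedrooms", type=int, default=3,
                        help="number of bedrooms")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # A real script would call predict_price(args.area, args.bedrooms) here.
    print(f"Would predict for {args.bedrooms} bedrooms in {args.area} sqft")


if __name__ == "__main__":
    main()
```

With ENTRYPOINT ["python", "cli.py"] and CMD ["--area", "1500", "--bedrooms", "3"] in the Dockerfile, docker run ml-model-deployer --area 2000 --bedrooms 4 would replace only the CMD arguments and leave the executable untouched.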

A common pattern for ML deployments is to wrap your prediction script in a web framework like Flask or FastAPI and serve it with a production server: a WSGI server like Gunicorn for Flask, or an ASGI server like Uvicorn for FastAPI. Your Dockerfile would then install the framework and the server (add them to requirements.txt), and the CMD would be adjusted to start the server. The EXPOSE directive would then correspond to the port the server listens on, which you publish with docker run -p.
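To make the serving pattern concrete without adding dependencies, here is a minimal sketch of a WSGI application. Gunicorn can serve any WSGI callable, so this uses a plain one rather than Flask. The /predict route, the query parameter names, and the predict_price body are assumptions for illustration; predict_price is a stand-in so the example runs without scikit-learn or model.pkl.

```python
import json
from urllib.parse import parse_qs


def predict_price(area_sqft, num_bedrooms):
    # Hypothetical stand-in for the real model call in predict.py.
    return 250.0 * area_sqft


def _json_response(start_response, status, payload):
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def app(environ, start_response):
    """WSGI callable handling GET /predict?area_sqft=1500&num_bedrooms=3."""
    if environ.get("PATH_INFO") != "/predict":
        return _json_response(start_response, "404 Not Found",
                              {"error": "not found"})
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        area = float(params["area_sqft"][0])
        bedrooms = int(params["num_bedrooms"][0])
    except (KeyError, ValueError):
        return _json_response(start_response, "400 Bad Request",
                              {"error": "area_sqft and num_bedrooms are required"})
    return _json_response(start_response, "200 OK",
                          {"predicted_price": predict_price(area, bedrooms)})
```

Saved as app.py, this is what the app:app argument in a Gunicorn command would point at: the module name, then the callable.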

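Most people don’t realize that Docker images are built in layers, and that Docker caches the result of each instruction. RUN, COPY, and ADD each create a new filesystem layer. When an instruction, or a file it copies, changes, that layer’s cache is invalidated, and so is the cache of every instruction after it, even if those later instructions haven’t changed. This is why instruction order matters: if COPY . /app comes before pip install, editing any file in the project forces a full reinstall of the dependencies. Copying requirements.txt on its own and running pip install before copying the rest of the project keeps the dependency layer cached until requirements.txt itself changes. Combining related commands, like RUN apt-get update && apt-get install -y ..., reduces the number of layers and prevents a stale cached apt-get update from being reused with a changed package list.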
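The invalidation rule can be illustrated with a toy model. The sketch below is not how Docker is implemented; it only assumes that each layer’s cache key depends on the previous key, the instruction text, and the contents of any files the instruction copies. All names (layer_keys, rebuilt, the file contents) are made up for the illustration.

```python
import hashlib


def layer_keys(steps, files):
    """Toy cache keys: hash of parent key + instruction + copied file contents."""
    keys, parent = [], ""
    for instruction, copied in steps:
        h = hashlib.sha256(parent.encode())
        h.update(instruction.encode())
        for name in sorted(copied):
            h.update(name.encode())
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt(steps, before, after):
    """Instructions whose layers must be rebuilt when files go from before to after."""
    old, new = layer_keys(steps, before), layer_keys(steps, after)
    return [instr for (instr, _), a, b in zip(steps, old, new) if a != b]


PIP = "RUN pip install --no-cache-dir -r requirements.txt"
ALL_FILES = ["predict.py", "requirements.txt", "model.pkl"]

copy_all_first = [("COPY . /app", ALL_FILES), (PIP, [])]
requirements_first = [
    ("COPY requirements.txt /app/", ["requirements.txt"]),
    (PIP, []),
    ("COPY . /app", ALL_FILES),
]

before = {"predict.py": "v1", "requirements.txt": "scikit-learn==1.2.2", "model.pkl": "m1"}
after = {**before, "predict.py": "v2"}  # only the script changed

print(rebuilt(copy_all_first, before, after))      # COPY and pip install both rerun
print(rebuilt(requirements_first, before, after))  # only the final COPY reruns
```

In the first ordering, a one-line change to predict.py reruns the dependency install; in the second, only the final copy is rebuilt.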

The next step is often integrating this containerized model into a CI/CD pipeline for automated testing and deployment.