MLflow Autologging: Automatically Track PyTorch and TF (2026)

MLflow autologging doesn’t just log parameters and metrics; it patches supported training libraries at runtime so that details you might never think to log manually are captured without changes to your own training code.

Let’s see it in action with a simple PyTorch example. MLflow’s PyTorch autologging works through PyTorch Lightning, so the model is written as a LightningModule and trained with a Trainer.

import torch
import torch.nn as nn
import torch.optim as optim
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset
import mlflow

# Enable autologging for every supported framework that is installed
mlflow.autolog()

# Hyperparameters
learning_rate = 0.01
epochs = 5
batch_size = 32
input_features = 10
output_classes = 2

# Define a simple model as a LightningModule so autologging can observe training
class SimpleNN(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(input_features, 5)
        self.fc2 = nn.Linear(5, output_classes)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

    def training_step(self, batch, batch_idx):
        inputs, labels = batch
        loss = self.criterion(self(inputs), labels)
        # Metrics reported through self.log are what autologging records
        self.log("train_loss", loss, on_step=False, on_epoch=True)
        return loss

    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=learning_rate)

# Generate some dummy data
X_train = torch.randn(100, input_features)
y_train = torch.randint(0, output_classes, (100,))
train_loader = DataLoader(TensorDataset(X_train, y_train), batch_size=batch_size)

# Train; Trainer.fit is the call that autologging hooks into
model = SimpleNN()
trainer = pl.Trainer(max_epochs=epochs)
trainer.fit(model, train_loader)

print("Training finished. Check MLflow UI for logged experiments.")

When you run this script, MLflow automatically starts a run as soon as trainer.fit() is called. It logs parameters such as the number of epochs and the optimizer’s name and settings (including the learning rate), records the metrics the module reports through self.log (here, train_loss once per epoch), and by default saves the trained model as an artifact. Two caveats apply. First, a hand-written training loop over a plain nn.Module is not captured, which is why the example uses PyTorch Lightning. Second, values that live outside the Trainer and the optimizer, such as batch_size, are not picked up automatically and need an explicit mlflow.log_param call.

The core problem autologging solves is the tedious, error-prone process of manually instrumenting training loops. Developers often forget to log certain parameters, miss logging intermediate metrics, or fail to capture the final model artifact correctly. MLflow’s autologging acts as a universal wrapper, observing common deep learning framework operations and translating them into MLflow tracking events.

Internally, autologging works by patching specific functions and methods in the supported libraries at runtime. For PyTorch, the integration targets PyTorch Lightning: it patches Trainer.fit and attaches a callback that records the training parameters, the optimizer’s name and settings, the metrics reported through self.log, and the final model. It does not hook Optimizer.step() or nn.Module.forward() in a hand-written loop, so plain PyTorch training code is not tracked. For TensorFlow, it patches the Keras Model.fit call and injects a callback to achieve similar results.
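The patching idea can be illustrated with a small, framework-free sketch. Everything in it is hypothetical: ToyTrainer, toy_autolog, and the recorded dictionary are stand-ins invented for this example, not MLflow or Lightning APIs, and MLflow’s real implementation is considerably more involved.

```python
import functools


class ToyTrainer:
    """Hypothetical stand-in for a framework trainer; not a real library class."""

    def __init__(self, max_epochs):
        self.max_epochs = max_epochs

    def fit(self, losses):
        # Pretend each entry in `losses` is the loss at the end of one epoch.
        return list(losses[: self.max_epochs])


# Hypothetical in-memory store standing in for a tracking run.
recorded = {"params": {}, "metrics": []}


def toy_autolog():
    """Replace ToyTrainer.fit with a wrapper that records params and metrics."""
    original_fit = ToyTrainer.fit

    @functools.wraps(original_fit)
    def patched_fit(self, losses):
        # Capture configuration before training starts.
        recorded["params"]["max_epochs"] = self.max_epochs
        # Run the untouched training code.
        history = original_fit(self, losses)
        # Capture one metric per completed epoch.
        for epoch, loss in enumerate(history):
            recorded["metrics"].append(("loss", epoch, loss))
        return history

    ToyTrainer.fit = patched_fit


toy_autolog()
ToyTrainer(max_epochs=2).fit([0.9, 0.5, 0.3])
print(recorded)
```

The caller’s code never mentions logging; the wrapper observes the call from the outside, which is the same reason autologging only sees training that goes through the functions it patches.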

You control autologging mainly through configuration that is set before training starts. You can enable or disable autologging for specific frameworks:

# Enable only PyTorch autologging
mlflow.pytorch.autolog()

# Enable only TensorFlow autologging
mlflow.tensorflow.autolog()

# Disable all autologging
mlflow.autolog(disable=True)

You can also specify what gets logged. For example, to prevent MLflow from logging the model artifact itself (perhaps you manage model saving separately), you can set log_models=False:

mlflow.autolog(log_models=False)

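Or, to control how often metrics are logged, use the framework-specific call. For PyTorch, the log_every_n_epoch argument sets the epoch interval (the generic mlflow.autolog() does not accept it):

mlflow.pytorch.autolog(log_every_n_epoch=1) # Logs metrics once per epoch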

This flexibility allows you to tailor the autologging behavior to your specific workflow, avoiding unnecessary clutter in your MLflow runs while ensuring critical information is captured.
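To make the idea of an epoch-based logging interval concrete, here is a minimal sketch of the arithmetic. The should_log helper is hypothetical and written only for illustration; it assumes epochs are indexed from zero and that logging happens when the 1-based epoch count is a multiple of the interval, which may not match MLflow’s internal logic exactly.

```python
def should_log(epoch_index, every_n):
    """Return True when the 1-based epoch count is a multiple of every_n."""
    if every_n < 1:
        raise ValueError("every_n must be a positive integer")
    return (epoch_index + 1) % every_n == 0


# With five epochs (indices 0..4) and an interval of 2,
# logging happens after the 2nd and 4th epochs.
logged_epochs = [e for e in range(5) if should_log(e, every_n=2)]
print(logged_epochs)
```

An interval of 1 logs after every epoch; larger values trade metric resolution for fewer tracking calls.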

One aspect often overlooked is what autologging does and does not infer. It captures more than a single loss.item() value, but it does not inspect your loss function or the model’s output shape to derive extra metrics. With PyTorch Lightning, it records the optimizer’s class name and settings, plus whatever metrics your module reports through self.log. If you want accuracy alongside CrossEntropyLoss, compute it in the training or validation step (take the predicted class from the argmax of the output logits and compare it with the true labels) and log it yourself. The upside is that switching from Adam to SGD, or from MSELoss to NLLLoss, needs no changes to the logging setup: the new optimizer name and parameters are picked up, and the metrics you already report keep flowing.
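The accuracy calculation mentioned above is simple enough to sketch without any framework. The accuracy_from_logits helper below is hypothetical and uses plain Python lists; inside a real LightningModule you would more likely use torch.argmax on the logits tensor and pass the result to self.log.

```python
def accuracy_from_logits(logits, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not logits:
        return 0.0
    correct = 0
    for row, label in zip(logits, labels):
        # argmax: index of the largest logit in this row
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Four samples, two classes; the third sample is misclassified.
batch_logits = [[2.0, 0.1], [0.3, 1.2], [0.9, 0.4], [0.2, 0.8]]
batch_labels = [0, 1, 1, 1]
acc = accuracy_from_logits(batch_logits, batch_labels)
print(acc)
```

Once a value like this is reported from the training or validation step, autologging treats it like any other metric and records it alongside the loss.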

The next step after leveraging autologging is to explore custom logging within autologged runs, allowing you to add specific, project-level metrics or parameters that autologging doesn’t cover.