MLflow’s Python client is your direct line to tracking, packaging, and deploying machine learning experiments, but most users only scratch the surface of its capabilities.

Let’s see it in action. Imagine you’re training a scikit-learn model. Here’s how you’d log parameters, metrics, and the model itself using mlflow.start_run():

import mlflow
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Start an MLflow run
with mlflow.start_run() as run:
    # Log hyperparameters
    params = {"C": 1.0, "solver": "liblinear"}
    mlflow.log_params(params)

    # Train a model
    model = LogisticRegression(**params)
    model.fit(X_train, y_train)

    # Evaluate the model
    score = model.score(X_test, y_test)
    mlflow.log_metric("accuracy", score)

    # Log the scikit-learn model
    mlflow.sklearn.log_model(model, "iris-model")

    print(f"MLflow Run ID: {run.info.run_id}")

This code snippet demonstrates the core workflow: you define a run, log your experimental inputs (parameters) and outputs (metrics), and then save the trained artifact (the scikit-learn model). The mlflow.start_run() context manager ensures that the run is properly ended, even if errors occur.

The MLflow Python client offers a rich API beyond simple logging. It’s structured around several key components: Runs, Experiments, Models, and Model Registries.

Runs represent a single execution of your ML code. You can start them manually with mlflow.start_run() as shown above, or MLflow can automatically track them if you’re using certain integrations (like mlflow.autolog()). Within a run, you can log:

  • Parameters: mlflow.log_param("param_name", value) for single parameters or mlflow.log_params({"param1": val1, "param2": val2}) for multiple. These are key-value pairs that define your experiment’s setup.
  • Metrics: mlflow.log_metric("metric_name", value, step=step_number) for numerical values that change over time or with training progress. The step argument is crucial for logging metrics across epochs or iterations.
  • Artifacts: mlflow.log_artifact("local/path/to/file") or mlflow.log_artifacts("local/path/to/directory") to save any files produced by your experiment, such as data files, plots, or configuration files.

Experiments are logical groupings of runs. They help you organize your work, for instance, by project or by the type of model you’re experimenting with. You can create a new experiment with mlflow.create_experiment("my-new-experiment"), which returns the new experiment’s ID, or set the active experiment for subsequent runs with mlflow.set_experiment("existing-experiment-name"), which also creates the experiment if no experiment with that name exists yet. If you don’t explicitly set an experiment, MLflow defaults to an experiment named "Default".

Models can be logged in various formats. MLflow supports a wide range of flavors, including mlflow.sklearn, mlflow.tensorflow, mlflow.pytorch, mlflow.keras, mlflow.xgboost, and more. When you log a model using its specific flavor (e.g., mlflow.sklearn.log_model(...)), MLflow saves the model’s serialized state together with an MLmodel metadata file and environment specifications (such as requirements.txt and conda.yaml) that describe the dependencies needed to load and use it later, which makes the model portable. You can also log a serialized model as a generic artifact, but flavor-specific logging records much richer metadata and lets you reload the model through MLflow’s loading APIs.

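The Model Registry is where you manage the lifecycle of your logged models. Once a model is logged to a run, you can register it by creating a registry entry with mlflow.tracking.MlflowClient().create_registered_model("my-registered-model") and then adding a version with mlflow.tracking.MlflowClient().create_model_version(...). The shortcut mlflow.register_model(model_uri, "my-registered-model") performs both steps in one call. Registration lets you version models, associate each version with the run that produced it, and assign stages (e.g., "Staging", "Production"). Note that recent MLflow releases deprecate stages in favor of model version aliases and tags, so check which mechanism your version supports.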

A subtle but powerful aspect of the MLflow client is its ability to interact with the tracking server directly using mlflow.tracking.MlflowClient(). This client object provides programmatic access to all MLflow entities. For example, to list all experiments:

client = mlflow.tracking.MlflowClient()
for exp in client.search_experiments():
    print(f"Experiment: {exp.name} (ID: {exp.experiment_id})")

And to retrieve information about a specific run, including its parameters, metrics, and artifacts:

run_id = "your_run_id_here" # Replace with an actual run ID
run = client.get_run(run_id)
print(f"Run ID: {run.info.run_id}")
print(f"Parameters: {run.data.params}")
print(f"Metrics: {run.data.metrics}")
print(f"Artifact URI: {run.info.artifact_uri}")

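A common misconception is that mlflow.log_metric keeps only the latest value for a given metric name within a run. In fact, every call appends a new entry to the metric’s history: if you call mlflow.log_metric("loss", 0.5) and later mlflow.log_metric("loss", 0.4), both values are stored, each with a timestamp and the default step of 0. What shows a single value per metric is the run summary, such as run.data.metrics, which reports only the latest entry. Because entries logged without an explicit step all share step 0, a step-based chart cannot order them, so pass the step parameter when logging a series: mlflow.log_metric("loss", 0.5, step=1) followed by mlflow.log_metric("loss", 0.4, step=2). You can then read the full series back with MlflowClient().get_metric_history(run_id, "loss") to reconstruct the training curve for a metric.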

As you expand your MLflow usage, you’ll inevitably want to compare runs across different experiments programmatically.
