The MariaDB Operator on Kubernetes doesn’t just deploy MariaDB; it actively manages its lifecycle, acting as a self-healing, version-upgrading, and backup-taking automated DBA.

Let’s watch it in action. Imagine we have a simple application that needs a database. We’ll define a MariaDB custom resource in Kubernetes.

apiVersion: mariadb.open.com/v1alpha1
kind: MariaDB
metadata:
  name: my-mariadb-instance
spec:
  mariadbVersion: "10.6.8"
  replicas: 3
  storage:
    size: 10Gi
    class: standard

When we apply this YAML (kubectl apply -f mariadb-crd.yaml), the MariaDB Operator, which is already running in our cluster, notices this new MariaDB object. It then proceeds to:

  1. Create a StatefulSet: This StatefulSet will manage the MariaDB pods. Each pod gets a stable network identity and persistent storage.
  2. Provision Persistent Volumes: Based on spec.storage.size and spec.storage.class, the StatefulSet creates one PersistentVolumeClaim per MariaDB replica, and the storage class’s provisioner dynamically creates a matching PersistentVolume for each claim.
  3. Initialize MariaDB: The first pod starts, initializes the MariaDB data directory, and sets up the initial cluster configuration.
  4. Configure Replication: Subsequent pods join the cluster and establish replication between themselves, turning a single instance into a highly available setup.
  5. Create a Service: A ClusterIP service is created, providing a stable endpoint (my-mariadb-instance.default.svc.cluster.local) for applications to connect to.
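
To make steps 1, 2, and 5 concrete, here is a rough sketch of the child objects such an Operator might generate for my-mariadb-instance. It is not the Operator’s actual output: the app label, the data volume name, the my-mariadb-instance-internal headless Service, the my-mariadb-instance-root Secret, and the use of the public mariadb image are all assumptions made for illustration. Only the StatefulSet and Service fields themselves are standard Kubernetes API.

```yaml
# Sketch only: names, labels, and image are assumptions, not Operator output.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-mariadb-instance
spec:
  # Hypothetical headless Service (not shown) that gives pods stable DNS names.
  serviceName: my-mariadb-instance-internal
  replicas: 3                          # from spec.replicas
  selector:
    matchLabels:
      app: my-mariadb-instance
  template:
    metadata:
      labels:
        app: my-mariadb-instance
    spec:
      containers:
        - name: mariadb
          image: mariadb:10.6.8        # from spec.mariadbVersion
          ports:
            - containerPort: 3306
          env:
            - name: MARIADB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: my-mariadb-instance-root   # hypothetical Secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  # One PersistentVolumeClaim is created per replica from this template.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard     # from spec.storage.class
        resources:
          requests:
            storage: 10Gi              # from spec.storage.size
---
apiVersion: v1
kind: Service
metadata:
  name: my-mariadb-instance
spec:
  type: ClusterIP
  selector:
    app: my-mariadb-instance
  ports:
    - port: 3306
      targetPort: 3306
```

With this shape, the claims would be named data-my-mariadb-instance-0 through data-my-mariadb-instance-2, following the usual StatefulSet naming of claim template name plus pod name.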

The Operator doesn’t just deploy; it continuously monitors. If a MariaDB pod crashes, the StatefulSet the Operator manages brings it back, and the pod rejoins the cluster. If a PersistentVolume becomes unavailable, the Operator attempts to reattach it or, in more severe cases, might trigger a recovery process. It also watches for configuration changes in the MariaDB custom resource and applies them.

The core problem the Operator solves is the operational burden of running stateful databases like MariaDB in a dynamic, ephemeral environment like Kubernetes. Traditionally, this involved manual setup for high availability, backups, and upgrades, all of which are complex and error-prone. The Operator codifies this operational knowledge.

Internally, the Operator watches for events on MariaDB custom resources. When a new one is created or an existing one is updated, it reconciles the desired state (defined in the spec) with the actual state of the cluster. It uses Kubernetes APIs to create, update, and delete child resources like StatefulSets, Services, Secrets (for passwords), and PersistentVolumeClaims. For advanced features like backups, it might create CronJobs to trigger mysqldump or similar tools, and Jobs to manage the backup lifecycle.

A key lever you control is the mariadbVersion. The Operator handles the complexities of performing rolling upgrades. When you change spec.mariadbVersion, the Operator will update the StatefulSet’s pod template. It then performs a rolling update, taking pods down one by one, upgrading their MariaDB version, and bringing them back up, ensuring replication is re-established at each step. This minimizes downtime during version migrations.
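
As a sketch of what such an upgrade could look like, the first document below is the same custom resource with only mariadbVersion changed, and the second is an excerpt of the StatefulSet the Operator might maintain as a result. The target version "10.6.12", the container name, and the image name are assumptions chosen for illustration; the updateStrategy fields are standard StatefulSet API.

```yaml
# Edited MariaDB custom resource: only the version changes.
apiVersion: mariadb.open.com/v1alpha1
kind: MariaDB
metadata:
  name: my-mariadb-instance
spec:
  mariadbVersion: "10.6.12"    # was "10.6.8"; target chosen for illustration
  replicas: 3
  storage:
    size: 10Gi
    class: standard
---
# Excerpt of the resulting StatefulSet (assumed shape, not a complete manifest).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-mariadb-instance
spec:
  updateStrategy:
    type: RollingUpdate        # pods are replaced one at a time, highest ordinal first
    rollingUpdate:
      partition: 0             # 0 means every pod is eligible for the update
  template:
    spec:
      containers:
        - name: mariadb
          image: mariadb:10.6.12
```

After re-applying the custom resource, kubectl rollout status statefulset/my-mariadb-instance is one way to follow the pods as they are replaced.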

Another critical aspect is how the Operator manages secrets. It automatically generates a root password and stores it in a Kubernetes Secret. It also creates a user for your application if you specify one in the spec. Your application pods can then consume those credentials from the Secret, as environment variables or a mounted volume, so they never appear directly in your deployment manifests.
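
Here is a minimal sketch of how an application could consume such credentials. The Secret name my-mariadb-instance-app, its username and password keys, the my-app Deployment, and its image are all hypothetical; a real Operator will choose its own Secret names and keys, so check what it actually created before copying this.

```yaml
# Secret as an Operator might generate it (name, keys, and values are placeholders).
apiVersion: v1
kind: Secret
metadata:
  name: my-mariadb-instance-app
type: Opaque
stringData:
  username: app_user
  password: "<generated-by-operator>"   # placeholder, not a real value
---
# Application Deployment that reads the credentials at runtime.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # hypothetical image
          env:
            - name: DB_HOST
              value: my-mariadb-instance.default.svc.cluster.local
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: my-mariadb-instance-app
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: my-mariadb-instance-app
                  key: password
```

The Deployment manifest only names the Secret and its keys, so it can be committed to version control without leaking the password.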

The Operator provides capabilities for automated backups and restores. By defining a backupSchedule in the MariaDB resource, you can instruct the Operator to periodically create database dumps. These backups are typically stored in an external object storage system (like S3 or GCS), with the Operator managing the CronJob and Job resources required for this process. Restores are orchestrated the same way: you request them declaratively through the MariaDB resource instead of running restore commands by hand.
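
The sketch below shows one way this could fit together. The placement of backupSchedule directly under spec and its cron syntax are assumptions, as are the CronJob name, the my-mariadb-instance-root Secret, and the my-mariadb-instance-backup claim. For brevity the dump is written to a volume; the upload to object storage is left out.

```yaml
# MariaDB resource with a backup schedule (field placement is an assumption).
apiVersion: mariadb.open.com/v1alpha1
kind: MariaDB
metadata:
  name: my-mariadb-instance
spec:
  mariadbVersion: "10.6.8"
  replicas: 3
  storage:
    size: 10Gi
    class: standard
  backupSchedule: "0 2 * * *"          # standard cron format: daily at 02:00
---
# CronJob an Operator might create from that schedule (hypothetical names).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-mariadb-instance-backup
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid            # never run two dumps at the same time
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: dump
              image: mariadb:10.6.8
              env:
                - name: DB_ROOT_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: my-mariadb-instance-root   # hypothetical Secret
                      key: password
              command: ["sh", "-c"]
              args:
                - >-
                  mysqldump --host=my-mariadb-instance --user=root
                  --password="$DB_ROOT_PASSWORD"
                  --all-databases --single-transaction
                  > /backup/dump-`date +%F`.sql
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: my-mariadb-instance-backup  # hypothetical PVC
```

Each scheduled run creates a Job, and that Job’s pod writes one dated dump file.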

What most people don’t realize is how the Operator handles the loss of the primary node. Galera itself is multi-primary, so “primary” here typically means the node that receives writes. When multiple replicas are configured and that node fails, the Operator doesn’t just restart it. Instead, it orchestrates the promotion of a healthy replica to become the new primary, ensuring that the database remains available and that the cluster reconfigures itself to maintain quorum, all without manual intervention. This process involves careful coordination to avoid split-brain scenarios.

The next concept you’ll likely explore is integrating external tools for monitoring and advanced backup strategies.
