MariaDB partitioning isn’t just about splitting tables; it’s a powerful tool for optimizing query performance by allowing the database to intelligently ignore irrelevant data.

Let’s see this in action. Imagine a sales table partitioned by sale_date (a DATE column) into monthly partitions.

CREATE TABLE sales (
    sale_id INT NOT NULL AUTO_INCREMENT,
    product_id INT,
    sale_date DATE NOT NULL,
    amount DECIMAL(10, 2),
    -- Every unique key on a partitioned table must include the partitioning column.
    PRIMARY KEY (sale_id, sale_date)
) PARTITION BY RANGE (TO_DAYS(sale_date)) (
    PARTITION p202301 VALUES LESS THAN (TO_DAYS('2023-02-01')),
    PARTITION p202302 VALUES LESS THAN (TO_DAYS('2023-03-01')),
    PARTITION p202303 VALUES LESS THAN (TO_DAYS('2023-04-01')),
    PARTITION p202304 VALUES LESS THAN (TO_DAYS('2023-05-01'))
);

INSERT INTO sales (product_id, sale_date, amount) VALUES
(101, '2023-01-15', 150.75),
(102, '2023-01-20', 200.00),
(103, '2023-03-10', 75.50),
(104, '2023-03-15', 120.00),
(105, '2023-04-01', 300.00);

Now, consider a query to find sales in March 2023:

SELECT * FROM sales WHERE sale_date BETWEEN '2023-03-01' AND '2023-03-31';

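Without partitioning (and without a suitable index on sale_date), the database would scan the entire sales table. With partitioning, MariaDB’s optimizer analyzes the WHERE clause and determines that the requested range falls entirely inside p202303, because that partition holds every date from 2023-03-01 up to, but not including, 2023-04-01. It then prunes the other partitions, meaning it skips reading data from them. One caveat: for range predicates on a TO_DAYS() expression, the plan may also list the first partition (p202301), because invalid dates evaluate to NULL and NULL values are stored in the lowest RANGE partition.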

This pruning is the core benefit. The database only accesses the data it needs, which reduces I/O and CPU load for queries that target specific ranges of the partitioning key. This is especially impactful for large tables where full scans are prohibitively expensive. In MariaDB, EXPLAIN PARTITIONS adds a partitions column that shows which partitions a query will access. For the query above, that column should list p202303 as the target.
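To check pruning on the sales table defined earlier, something like the following can be run. It is a minimal sketch: the exact plan depends on the MariaDB version and storage engine, and TABLE_ROWS is only an estimate for InnoDB.

```sql
-- Show which partitions the optimizer plans to read for the March query.
EXPLAIN PARTITIONS
SELECT * FROM sales
WHERE sale_date BETWEEN '2023-03-01' AND '2023-03-31';

-- See how the sample rows are spread across partitions
-- (assumes the current database is the one that holds sales).
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'sales'
ORDER BY PARTITION_ORDINAL_POSITION;
```

The partitions column of the first statement is the place to look: any partition missing from it has been pruned.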

The pruning mechanism works by evaluating the predicate (the WHERE clause) against the partition definitions. For RANGE partitioning, the optimizer applies the partitioning expression (here TO_DAYS()) to the constants in the predicate and checks which partition bounds the results fall within. If the predicate can be resolved to a single partition or a subset of partitions, only those partitions are scanned. For example, a query with WHERE sale_date = '2023-03-10' targets only p202303: that partition holds values from TO_DAYS('2023-03-01'), the upper bound of p202302, up to but not including TO_DAYS('2023-04-01'), and 2023-03-10 falls inside that interval.
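The bound check described above can be reproduced by hand. This sketch only mirrors the comparison; it is not the optimizer's internal code path, and the column aliases are made up for illustration.

```sql
-- Both comparisons return 1, which places 2023-03-10 inside p202303.
SELECT
    TO_DAYS('2023-03-10') >= TO_DAYS('2023-03-01') AS at_or_above_p202302_bound,
    TO_DAYS('2023-03-10') <  TO_DAYS('2023-04-01') AS below_p202303_bound;

-- An equality predicate should resolve to a single partition.
EXPLAIN PARTITIONS
SELECT * FROM sales WHERE sale_date = '2023-03-10';

-- An IN list can resolve to several partitions that need not be adjacent.
EXPLAIN PARTITIONS
SELECT * FROM sales WHERE sale_date IN ('2023-01-15', '2023-04-01');
```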

The key to effective pruning is ensuring your queries’ WHERE clauses directly align with the partitioning strategy. If you partition by date, queries filtering by date ranges will prune. If you partition by product_id, queries filtering by specific product_id values will prune. If you use subpartitioning (e.g., RANGE on date with HASH or KEY subpartitions on region), pruning can occur on both levels. A query filtering on both date and region can potentially prune partitions and subpartitions based on both criteria.
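A two-level layout might look like the sketch below. The table regional_sales and its integer column region_id are hypothetical and are not part of the earlier example; HASH needs an integer expression, so a textual region column would call for KEY instead.

```sql
-- Hypothetical table: RANGE partitions by month, HASH subpartitions by region.
CREATE TABLE regional_sales (
    sale_id INT NOT NULL AUTO_INCREMENT,
    region_id INT NOT NULL,
    sale_date DATE NOT NULL,
    amount DECIMAL(10, 2),
    -- The key must cover the columns used by both partitioning levels.
    PRIMARY KEY (sale_id, sale_date, region_id)
)
PARTITION BY RANGE (TO_DAYS(sale_date))
SUBPARTITION BY HASH (region_id)
SUBPARTITIONS 4 (
    PARTITION p202301 VALUES LESS THAN (TO_DAYS('2023-02-01')),
    PARTITION p202302 VALUES LESS THAN (TO_DAYS('2023-03-01')),
    PARTITION p202303 VALUES LESS THAN (TO_DAYS('2023-04-01'))
);

-- Filtering on both columns gives the optimizer a chance to prune at both levels.
EXPLAIN PARTITIONS
SELECT * FROM regional_sales
WHERE sale_date = '2023-03-10' AND region_id = 2;
```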

The real magic happens when your partitioning scheme and query patterns are in sync. If you partition a massive log table by day, and your application frequently needs to query logs for a specific day or week, pruning will make those queries fly. Conversely, if you partition by year but your queries always look at individual days, pruning still narrows the search to one yearly partition, but that partition holds a full year of rows, so the benefit is much smaller. The optimizer can only prune based on the information available in the WHERE clause and the partition definitions. If the WHERE clause is too broad, doesn’t involve the partitioning key, or wraps the key in a function the optimizer cannot map back to partition bounds, pruning won’t occur.
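The contrast is easy to see on the sales table. The three queries below are illustrative, and the plans they produce may differ between versions.

```sql
-- No predicate on the partitioning key: every partition is a candidate.
EXPLAIN PARTITIONS
SELECT * FROM sales WHERE product_id = 103;

-- The key is wrapped in MONTH(), which the optimizer cannot map back
-- to the TO_DAYS() bounds, so all partitions are expected in the plan.
EXPLAIN PARTITIONS
SELECT * FROM sales WHERE MONTH(sale_date) = 3;

-- The same intent written as a plain range on the column can be pruned.
EXPLAIN PARTITIONS
SELECT * FROM sales
WHERE sale_date >= '2023-03-01' AND sale_date < '2023-04-01';
```

Rewriting function-wrapped filters as half-open ranges on the raw column is usually the simplest way to get pruning back.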

The maximum number of partitions per table is 8192 in current MariaDB releases, counting subpartitions (the limit was 1024 before MariaDB 10.0.4). While this sounds like a lot, it’s possible to hit this limit if you have very granular partitioning (e.g., daily partitions for many years) and don’t manage them. When you hit this limit, you can no longer add partitions. It’s crucial to have a strategy for managing old partitions, typically by dropping or archiving them.
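A routine maintenance pass on the sales table could look roughly like this. The partition name p202305 is an assumption that follows the naming pattern used earlier, and DROP PARTITION deletes the rows it contains, so archive first if the data is still needed.

```sql
-- How many partitions does the table have right now?
SELECT COUNT(*) AS partition_count
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'sales';

-- Add the next month at the upper end of the range.
ALTER TABLE sales ADD PARTITION (
    PARTITION p202305 VALUES LESS THAN (TO_DAYS('2023-06-01'))
);

-- Remove the oldest month. This discards every row stored in p202301.
ALTER TABLE sales DROP PARTITION p202301;
```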

One subtlety often missed is how NULL values are handled in partitioning. For RANGE partitioning, a NULL partitioning value is treated as less than any non-NULL value, so the row lands in the lowest partition. For LIST partitioning, a NULL is accepted only if one partition’s value list explicitly includes NULL; otherwise the insert fails. For KEY and HASH partitioning, NULL values are treated as zero, which can sometimes lead to unexpected data distribution if not accounted for. This means if your partitioning key column can contain NULLs, you need to decide on a strategy: either declare the column NOT NULL (often the simplest approach for predictable pruning) or accommodate NULLs deliberately, for example with a dedicated LIST partition for NULL or a lowest RANGE partition you expect to hold them.
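With LIST partitioning, NULL can be given a partition of its own by naming it in a value list. The table null_demo and its columns are hypothetical and exist only to illustrate the idea.

```sql
-- Hypothetical table with a nullable partitioning column.
CREATE TABLE null_demo (
    id INT NOT NULL,
    bucket INT NULL
)
PARTITION BY LIST (bucket) (
    PARTITION p_null VALUES IN (NULL),
    PARTITION p_low  VALUES IN (1, 2, 3),
    PARTITION p_high VALUES IN (4, 5, 6)
);

-- The NULL row is accepted because p_null names NULL in its value list.
INSERT INTO null_demo (id, bucket) VALUES (1, NULL), (2, 2), (3, 5);

-- Check where the optimizer looks for NULL values.
EXPLAIN PARTITIONS
SELECT * FROM null_demo WHERE bucket IS NULL;
```

Without the p_null partition, the first row of the INSERT would be rejected because no partition's list matches NULL.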

The next step in optimizing queries on partitioned tables is understanding how subqueries and joins interact with partition pruning.
