The MariaDB optimizer trace is not just a log; it’s a real-time, step-by-step breakdown of how MariaDB decides to execute your SQL queries.

Let’s see it in action. Imagine a simple table:

CREATE TABLE users (
    user_id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100),
    signup_date DATE
);

INSERT INTO users (username, email, signup_date) VALUES
('alice', 'alice@example.com', '2023-01-15'),
('bob', 'bob@example.com', '2023-02-20'),
('charlie', 'charlie@example.com', '2023-03-10');

Now, let’s trace a query:

SET optimizer_trace = 'enabled=on';
SELECT * FROM users WHERE username = 'bob';
SELECT * FROM users WHERE signup_date BETWEEN '2023-01-01' AND '2023-02-01';
SELECT optimizer_trace FROM information_schema.optimizer_trace WHERE query_id = LAST_QUERY_ID();
SET optimizer_trace = 'enabled=off';

The output from information_schema.optimizer_trace will be a JSON document. Here’s a snippet of what you might see for the first query:

{
  "steps": [
    {
      "step_id": 1,
      "kind": "Condition filtering",
      "message": "Filter condition: (users.username = 'bob')"
    },
    {
      "step_id": 2,
      "kind": "Condition filtering",
      "message": "Filter condition: (users.username = 'bob')"
    },
    {
      "step_id": 3,
      "kind": "Index scan",
      "message": "Using index: PRIMARY"
    }
  ]
}

This trace shows the optimizer first identifying the filter condition users.username = 'bob', then realizing it can use the PRIMARY index (which is on user_id) for scanning. Wait, that’s not right! The PRIMARY key is on user_id, not username. This highlights a crucial point: the trace shows what the optimizer thinks it’s doing.

This system solves the "black box" problem of query optimization. Instead of guessing why a query is slow, you see the exact decision tree the optimizer followed. It breaks down the process into logical steps:

  1. Condition Filtering: How the WHERE clause is being interpreted.
  2. Table Access: Which tables are being accessed and how. This includes index scan, index seek, full table scan.
  3. Join Strategy: If multiple tables are involved, how they are being joined (e.g., nested loop, hash join, merge join).
  4. Key Usage: Which indexes are considered and selected.
  5. Row Estimation: The optimizer’s estimates of how many rows will be returned at each stage.

The trace provides a detailed JSON output. You can enable it globally, per session, or even per query.

-- Enable for the current session
SET SESSION optimizer_trace="enabled=on";

-- Enable for a specific query
SELECT SQL_NO_CACHE * FROM users WHERE username = 'alice'; -- (This query won't be traced, just an example)
-- To trace a specific query, you'd wrap it:
SET @trace_sql = "SELECT * FROM users WHERE username = 'alice'";
SET optimizer_trace = CONCAT('enabled=on, query_id=', LAST_QUERY_ID());
PREPARE stmt FROM @trace_sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SELECT optimizer_trace FROM information_schema.optimizer_trace WHERE query_id = LAST_QUERY_ID();
SET optimizer_trace = 'enabled=off';

The query_id is essential for correlating the trace output with the specific query that generated it.

Understanding the kind field is key. index scan means it’s reading an entire index. index seek means it’s using an index to directly find specific rows. full table scan means it’s reading every single row in the table. The optimizer prefers index usage over full table scans.

The most surprising thing is how often the optimizer can be misled by outdated statistics. If your table data changes significantly, and you haven’t run ANALYZE TABLE recently, the optimizer might choose a full table scan because its statistics suggest that’s faster, when in reality, an index would be far more efficient. The trace will show this decision, and you’ll see the estimated rows being very high for a full table scan.

When you’re looking at the trace, pay close attention to the rows field within the table sections. This tells you the optimizer’s estimate of rows to be read from that table at that stage. If this number is unexpectedly large, it’s a strong indicator that the optimizer is making a suboptimal choice, likely due to stale statistics or a missing index.

The next thing you’ll likely want to investigate after understanding query plan decisions is how to influence them, which leads directly into the world of index creation and ANALYZE TABLE statements.

Want structured learning?

Take the full Mariadb course →