MariaDB’s utf8mb4 character set isn’t just a "better UTF-8"; it’s the only UTF-8 character set in MariaDB that covers all of Unicode, including emojis and the rarer CJK ideographs that need four bytes.

Let’s see it in action. Imagine you have a simple table:

CREATE TABLE messages (
    id INT AUTO_INCREMENT PRIMARY KEY,
    content VARCHAR(255)
);

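If your server, table, and connection aren’t configured for utf8mb4, inserting an emoji like "🚀" can be rejected with an "Incorrect string value" error, or the value can be stored damaged (a '?' in place of the emoji, or a truncated string), depending on the SQL mode and connection settings.

-- Without utf8mb4, this may fail or store '?' in place of the emoji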
INSERT INTO messages (content) VALUES ('Launching the rocket 🚀!');

But if utf8mb4 is set up right, that emoji is stored perfectly.

SELECT content FROM messages WHERE id = 1;
-- Output: Launching the rocket 🚀!

The problem utf8mb4 solves is a historical limitation of MariaDB’s older utf8 character set, which is really utf8mb3. The original UTF-8 design allowed up to six bytes per character (the current standard, RFC 3629, caps it at four), but MySQL/MariaDB’s utf8 implementation stopped at three bytes. Three bytes cover the Basic Multilingual Plane, which holds most scripts in everyday use, including the common CJK (Chinese, Japanese, Korean) characters. What they exclude is every character that needs four bytes: most notably emojis, but also rarer CJK ideographs and other supplementary-plane symbols. utf8mb4 supports the full four-byte range, matching the modern UTF-8 standard.
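
To make the three-byte versus four-byte split concrete, here is a small Python sketch (not MariaDB code). The helper names `utf8_byte_length` and `needs_utf8mb4` are made up for this illustration. It measures how many bytes a character takes in UTF-8 and flags text that would not fit in a three-byte utf8mb3 column.

```python
def utf8_byte_length(char: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character lies above U+FFFF, i.e. needs four UTF-8 bytes."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Latin letter, accented letter, common CJK character, rocket emoji
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F680"]:
    print(f"U+{ord(ch):04X}: {utf8_byte_length(ch)} byte(s)")

print(needs_utf8mb4("Launching the rocket \U0001F680!"))
```

A check like `needs_utf8mb4` can be handy in application code as a pre-flight test while a migration to utf8mb4 is still in progress.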

To use utf8mb4 correctly, you need to configure it at four levels: the server, the database, the table, and the client connection.

1. Server Configuration (my.cnf or my.ini)

This is the most crucial step. You need to tell the MariaDB server itself to default to utf8mb4.

  • Diagnosis: Check the current server default character set.

    mysql -u root -p -e "SHOW VARIABLES LIKE 'character_set_server';"
    

    If this shows utf8, utf8mb3, or latin1, the server does not default to utf8mb4.

  • Fix: Edit your MariaDB configuration file (e.g., /etc/mysql/my.cnf or /etc/my.cnf on Linux, or my.ini on Windows, which the MariaDB installer typically places in the data directory). Add or modify the following lines under the [mysqld] section:

    [mysqld]
    character-set-server=utf8mb4
    collation-server=utf8mb4_unicode_ci
    

    Restart your MariaDB server for these changes to take effect.

    sudo systemctl restart mariadb  # Or equivalent for your OS
    
  • Why it works: character-set-server sets the character set that newly created databases inherit when none is specified. collation-server defines the default rules for string comparison and sorting. utf8mb4_unicode_ci is a widely compatible, case-insensitive collation for utf8mb4.
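
If you want to check a configuration file before restarting, a rough Python sketch like the one below can read the two settings back. It is a simplification: `server_charset_settings` is a hypothetical helper, the sample text is inlined rather than read from disk, and `!include` / `!includedir` directives found in real my.cnf files are not handled.

```python
import configparser

# Inlined sample; a real script would read the file's text instead.
SAMPLE_CNF = """
[mysqld]
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
"""


def server_charset_settings(cnf_text: str) -> dict:
    """Return the [mysqld] charset and collation settings, or None where unset."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    # Option names may be written with dashes or underscores; normalize to dashes.
    parser.optionxform = lambda key: key.lower().replace("_", "-")
    parser.read_string(cnf_text)
    section = parser["mysqld"] if parser.has_section("mysqld") else {}
    return {
        "character-set-server": section.get("character-set-server"),
        "collation-server": section.get("collation-server"),
    }


print(server_charset_settings(SAMPLE_CNF))
```

The running server remains the source of truth, so the SHOW VARIABLES check above is still the final confirmation after a restart.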

2. Database Configuration

Once the server defaults are set, new databases will inherit them. However, existing databases might retain older settings.

  • Diagnosis: Check the character set and collation for your specific database.

    SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
    FROM INFORMATION_SCHEMA.SCHEMATA
    WHERE SCHEMA_NAME = 'your_database_name';
    

    If the result is anything other than utf8mb4 with a utf8mb4 collation such as utf8mb4_unicode_ci, you need to alter the database.

  • Fix: Alter the database to use utf8mb4.

    ALTER DATABASE your_database_name
    CHARACTER SET = utf8mb4
    COLLATE = utf8mb4_unicode_ci;
    
  • Why it works: This explicitly sets the default character set and collation for any new tables created within this database. It doesn’t automatically change existing tables.

3. Table Configuration

Even if the server and database are set to utf8mb4, tables created before these changes, or tables created with explicit different settings, will need to be altered.

  • Diagnosis: Check the character set and collation for your table.

    SELECT TABLE_NAME, ENGINE, ROW_FORMAT,
           TABLE_COLLATION
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA = 'your_database_name' AND TABLE_NAME = 'your_table_name';
    

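    If TABLE_COLLATION already shows a utf8mb4 collation but you still suspect issues, check the column-level settings too: individual columns can override the table default, and INFORMATION_SCHEMA.COLUMNS exposes them as CHARACTER_SET_NAME and COLLATION_NAME.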

  • Fix: Alter the table to use utf8mb4.

    ALTER TABLE your_table_name
    CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    

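    This command not only changes the table’s default character set and collation but also converts all character-based columns (like VARCHAR, TEXT, CHAR) to the new encoding. Because utf8mb4 can need more bytes per character, the server may widen a column’s data type (for example, TEXT to MEDIUMTEXT) so existing values still fit.

  • Why it works: CONVERT TO rewrites the table’s metadata and re-encodes the existing string data in utf8mb4 with the specified collation. It assumes the stored data is valid in the old character set; text that was already damaged on the way in (for example, stored as '?') is not recovered.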
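
When several tables need converting, it can help to generate the statements rather than type them. The Python sketch below only builds the SQL text for review; it does not connect to a server. The helper names and the `users` table are hypothetical, and the charset and collation default to the values used in this article.

```python
def quote_identifier(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_convert_statements(
    tables, charset="utf8mb4", collation="utf8mb4_unicode_ci"
):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    return [
        f"ALTER TABLE {quote_identifier(table)} "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        for table in tables
    ]


# 'messages' is the table from this article; 'users' is a made-up second table.
for statement in build_convert_statements(["messages", "users"]):
    print(statement)
```

Reviewing the generated statements and running them on a copy of the data first is a sensible precaution, since each conversion rewrites the table.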

4. Connection Configuration

Finally, your client connection needs to speak utf8mb4 to the server.

  • Diagnosis: Check the client character set.

    SHOW VARIABLES LIKE 'character_set_client';
    SHOW VARIABLES LIKE 'character_set_connection';
    SHOW VARIABLES LIKE 'character_set_results';
    

    All three should report utf8mb4.

  • Fix: When connecting, specify the client character set. For the mysql command-line client:

    mysql -u your_user -p --default-character-set=utf8mb4 your_database_name
    

    For application connections (e.g., PHP, Python), find the connection string or configuration option and set it to utf8mb4. For example, in PHP PDO:

    $dsn = "mysql:host=localhost;dbname=your_db;charset=utf8mb4";
    $pdo = new PDO($dsn, $user, $password);
    
  • Why it works: The connection character set tells the server how to interpret the bytes the client sends and how to encode the results it returns. Setting it to utf8mb4 on both sides prevents misinterpretation at the application boundary.
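
The effect of a mismatched connection character set can be imitated in plain Python, without a database. This is only a model of the misdeclared-charset step, under the assumption that the client sends UTF-8 bytes while the receiving side believes they are latin1. A real server performs further conversions between the connection and column character sets.

```python
text = "Launching the rocket \U0001F680!"

# What a UTF-8 client actually puts on the wire.
wire_bytes = text.encode("utf-8")

# Receiver agrees the bytes are UTF-8: the text survives intact.
correct = wire_bytes.decode("utf-8")

# Receiver wrongly assumes latin1: each byte becomes its own character,
# so the four-byte emoji turns into four unrelated characters.
misread = wire_bytes.decode("latin-1")

print(correct == text)
print(len(text), len(misread))
print(repr(misread))
```

The misread string is longer than the original because the emoji’s four bytes are each treated as a separate character, which is the typical shape of "mojibake" seen when only one side of the connection uses utf8mb4.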

When you’ve got all these in place, you can confidently store emojis and a much wider range of international characters. The next hurdle you might encounter is dealing with older, non-utf8mb4 dump files that need careful import handling.
