The most surprising thing about database schema migrations is that they’re fundamentally a version control problem, not a database problem.
Let’s see this in action. Imagine you’ve got a simple users table. Your application needs it, so you write a SQL file:
-- migrations/V1__create_users_table.sql
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(255) NOT NULL UNIQUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
You’re using Flyway. You configure it in your pom.xml (if you’re using Maven):
<plugin>
<groupId>org.flywaydb</groupId>
<artifactId>flyway-maven-plugin</artifactId>
<version>9.22.3</version>
<configuration>
<url>jdbc:mysql://localhost:3306/myappdb</url>
<user>myappuser</user>
<password>supersecret</password>
<locations>
<location>filesystem:./migrations</location>
</locations>
</configuration>
</plugin>
When you run mvn flyway:migrate, Flyway checks a special table it creates, flyway_schema_history, to see which migrations have already been applied. It finds it empty, sees V1__create_users_table.sql in your migrations directory, executes it, and records V1 as applied. Your users table now exists.
Now, you need to add an email column. You create a new file:
-- migrations/V2__add_email_to_users.sql
ALTER TABLE users
ADD COLUMN email VARCHAR(255) UNIQUE;
Run mvn flyway:migrate again. Flyway checks flyway_schema_history, sees V1 applied, then sees V2 is new, executes it, and updates its history table. Your users table now has an email column.
This is the core idea: a sequence of declarative SQL statements, managed and executed in order by a tool. The tool’s job is to know what’s been run and what hasn’t, ensuring idempotency and order.
Liquibase offers a similar concept but uses XML, YAML, or JSON for its migration files, which can include more complex logic and conditional execution. A Liquibase changelog.xml might look like this:
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">
<changeSet id="1" author="your_name">
<createTable tableName="users">
<column name="id" type="INT" autoIncrement="true">
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="username" type="VARCHAR(255)">
<constraints nullable="false" unique="true"/>
</column>
<column name="created_at" type="TIMESTAMP" defaultValueComputed="CURRENT_TIMESTAMP"/>
</createTable>
</changeSet>
<changeSet id="2" author="your_name">
<addColumn tableName="users">
<column name="email" type="VARCHAR(255)">
<constraints unique="true"/>
</column>
</addColumn>
</changeSet>
</databaseChangeLog>
You’d configure Liquibase similarly, perhaps with a liquibase.properties file:
url=jdbc:mysql://localhost:3306/myappdb
username=myappuser
password=supersecret
classpath=./target/classes
changeLogFile=changelog.xml
And execute it via its Maven plugin:
mvn liquibase:update
The mental model to build is that these tools maintain a "truth" about your database schema’s evolution. They prevent developers from manually applying changes that have already been made, or applying them out of order. The flyway_schema_history or Liquibase’s databasechangelog and databasechangeloglock tables are crucial. They act as the single source of truth, preventing race conditions and ensuring that every environment (dev, staging, prod) converges to the same schema state.
The secret sauce is that the migration files themselves are immutable once applied. If you need to change a migration, you don’t edit the old one; you create a new migration that undoes the old one and applies the correct version. This is the same principle as Git commits: you don’t rewrite history in production, you add new commits to fix things.
What most people don’t realize is how critical the databasechangeloglock table is in a multi-instance or CI/CD environment. When Liquibase or Flyway runs, it attempts to acquire a lock on this table. This prevents two instances of your application from trying to migrate the database simultaneously, which could lead to data corruption or inconsistent schema states. If this lock isn’t released (e.g., a migration fails mid-way and the process crashes), you can get stuck, and you’ll need to manually intervene to clear the lock.
The next logical step is understanding how to handle rollbacks and managing complex data transformations within these migration frameworks.