Neo4j transaction logs are the unsung heroes of data durability, and understanding how to manage them is key to keeping your database healthy and performant.
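Here’s a write that ends up in the transaction log (on Neo4j 4.x and later, the HTTP endpoint is /db/<database_name>/tx/commit):
# Run a write and commit it in a single request
# (add -u <user>:<password> if authentication is enabled)
curl -X POST -H "Content-Type: application/json" \
-d '{"statements":[{"statement":"CREATE (p:Person {name: \"Alice\"})"}]}' \
http://localhost:7474/db/neo4j/tx/commit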
This operation, along with every other write to the database, is first recorded in the transaction log. Think of it as a journal that Neo4j writes in before it updates its main data files. This journal is critical for crash recovery: if the database crashes mid-write, Neo4j can replay the transactions from the log to bring itself back to a consistent state.
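The journal idea can be illustrated with a toy model. This is only a sketch of the write-ahead principle using plain text files, not how Neo4j implements its log; the function names journal_write and journal_recover and the file names journal and store are made up for the illustration.

```shell
# Toy write-ahead journal: plain text files stand in for the log and the store.
# journal_write, journal_recover and the file names are hypothetical.

journal_write() {
    # $1 = working directory, $2 = record to write
    printf '%s\n' "$2" >> "$1/journal"   # 1. record the change in the journal
    printf '%s\n' "$2" >> "$1/store"     # 2. only then update the store
}

journal_recover() {
    # Replay every journal record that never reached the store.
    [ -f "$1/store" ] || : > "$1/store"
    applied=$(wc -l < "$1/store" | tr -d ' ')
    tail -n +"$((applied + 1))" "$1/journal" >> "$1/store"
}
```

If a crash happens between step 1 and step 2, the record exists only in the journal; running journal_recover on the directory appends the missing records so the store matches the journal again.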
By default (Neo4j 4.0 and later), the transaction logs are stored in the $NEO4J_HOME/data/transactions/<database_name>/ directory. You’ll see files named like neostore.transaction.db.0, neostore.transaction.db.1, and so on. Each neostore.transaction.db.<n> file is one segment of the transaction log.
The core of managing these logs is controlling their size and how long they are kept. This is where Neo4j’s configuration comes into play, specifically within the neo4j.conf file.
Log Rotation: When to Start a New Log File
Neo4j rotates its transaction log files based on size. When a log file reaches a certain threshold, Neo4j automatically closes it and starts a new one. This prevents any single log file from growing indefinitely, which could impact performance and disk space.
The primary configuration parameter for this is:
dbms.tx_log.rotation.size
This setting defines the size at which the current transaction log file is rotated. The default is 250 MiB in Neo4j 4.x and 256 MiB in Neo4j 5, where the key is named db.tx_log.rotation.size.
Diagnosis: To check the current rotation size, look for this line in your neo4j.conf file. If it’s commented out or missing, Neo4j is using the default.
Fix: To change the rotation size, uncomment or add the line and set it to your desired value. For example, to rotate logs at 512 megabytes:
dbms.tx_log.rotation.size=512M
Why it works: Increasing this value means log files grow larger before rotation, which reduces the number of log files and the overhead of frequent file operations. Decreasing it means more frequent rotations. Because old logs are removed one whole file at a time, smaller files let the space on disk follow your retention settings more closely, which helps if you have very strict disk space constraints.
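The diagnosis step can be scripted. The sketch below defines two hypothetical helpers: conf_get reports the effective value of a key in a properties-style file and falls back to a default you supply, and size_to_bytes converts a value such as 512M to bytes. Two assumptions are baked in: when a key appears more than once, the last uncommented line is reported, and the k/M/G suffixes are treated as binary multiples.

```shell
# conf_get FILE KEY DEFAULT
# Print the value of KEY from FILE, ignoring commented-out lines.
# Assumption: if KEY appears several times, the last occurrence is reported.
conf_get() {
    value=$(awk -v k="$2" '
        /^[ \t]*#/ { next }                      # skip comments
        {
            pos = index($0, "=")
            if (pos == 0) next                   # not a key=value line
            name = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == k) {
                val = substr($0, pos + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                found = 1
            }
        }
        END { if (found) print val }
    ' "$1")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf '%s\n' "$3"                       # unset: report the default
    fi
}

# size_to_bytes VALUE
# Convert 256M, 1G, 64k ... to bytes (binary multiples assumed).
size_to_bytes() {
    num=${1%[kKmMgG]}
    case $1 in
        *[kK]) echo $((num * 1024)) ;;
        *[mM]) echo $((num * 1024 * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}
```

A typical call would be conf_get "$NEO4J_HOME/conf/neo4j.conf" "<rotation size key for your version>" "<default for your version>", with the result passed to size_to_bytes when you need a number to compare against.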
Log Retention: How Long to Keep Old Log Files
Rotation only closes a log file to new writes. Once a checkpoint has flushed the changes it contains to the main database files, the file is no longer needed for crash recovery, but Neo4j still keeps it for a certain period or until a certain number of log files have accumulated. This is called log retention.
Retention is configured through a single setting, dbms.tx_log.rotation.retention_policy (db.tx_log.rotation.retention_policy in Neo4j 5). Its value is a number followed by a type, and there are two main ways to use it:
- By Time:
  A value of the form "<n> days" tells Neo4j how long to keep transaction log files, even after they are no longer strictly necessary for recovery. The default is time-based: 7 days in Neo4j 4.x and 2 days in Neo4j 5.
  Diagnosis: Check neo4j.conf for dbms.tx_log.rotation.retention_policy. If it’s commented out or missing, the default applies.
  Fix: To retain logs for 7 days:
  dbms.tx_log.rotation.retention_policy=7 days
  Why it works: This ensures that even if you need older transactions than a shorter retention period would cover, the necessary log files are still available. It’s a safety net for longer-term scenarios.
- By Count:
  A value of the form "<n> files" sets the maximum number of transaction log files that Neo4j will keep.
  Diagnosis: Look for dbms.tx_log.rotation.retention_policy in neo4j.conf and check whether its value ends in "files".
  Fix: To keep the last 20 log files:
  dbms.tx_log.rotation.retention_policy=20 files
  Why it works: This is a more direct way to control the disk space used by transaction logs. With a very high write volume, log files are generated and rotated quickly, so a time-based policy may keep a huge number of files. A count-based policy limits the absolute number of files, irrespective of how quickly they were generated.
Important Note: Because both forms are values of the same setting, a single value selects either a time-based or a count-based policy; setting a count replaces the time-based policy rather than combining with it. Whichever policy is active, pruning runs after a checkpoint, and Neo4j does not delete log files that are still needed for recovery.
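To get a feel for what a time-based or count-based policy would remove, a dry run over a directory of log segments can help. The sketch below only prints candidates and deletes nothing; Neo4j does its own pruning and log files should not be removed by hand. The function names are hypothetical, and the only assumption about file names is that each segment ends in a dot followed by its sequence number.

```shell
# list_segments DIR
# Print segment files in DIR, oldest first, ordered by numeric suffix.
list_segments() {
    for f in "$1"/*.[0-9]*; do
        [ -e "$f" ] || continue                  # glob matched nothing
        printf '%s %s\n' "${f##*.}" "$f"         # "<suffix> <path>"
    done | sort -n | cut -d' ' -f2-
}

# would_prune_by_count DIR KEEP
# Print the segments that fall outside the newest KEEP files.
would_prune_by_count() {
    total=$(list_segments "$1" | wc -l | tr -d ' ')
    excess=$((total - $2))
    [ "$excess" -gt 0 ] || return 0
    list_segments "$1" | head -n "$excess"
}

# would_prune_by_age DIR DAYS
# Print segments last modified more than DAYS full 24-hour periods ago.
would_prune_by_age() {
    find "$1" -type f -name '*.[0-9]*' -mtime +"$2"
}
```

Running would_prune_by_count against a copy of the log directory with different KEEP values shows how quickly a busy database would cycle through its files, which is useful when choosing between a time-based and a count-based value.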
What Happens When Logs Are Deleted?
When Neo4j determines that a transaction log file is no longer needed based on your retention policy, it deletes the file. This is crucial for managing disk space. If retention is too long or pruning is disabled, the logs can fill the disk, and a full disk can stop the database from accepting writes. Conversely, if retention is too short, older transactions disappear sooner, so anything that reads them back from the logs, such as an incremental backup, can no longer do so and has to fall back to a full copy.
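A back-of-the-envelope estimate makes the disk-space trade-off concrete. The helpers below are hypothetical and deliberately simple: the count-based estimate assumes every retained file is full and adds one file for the segment currently being written, and the time-based estimate assumes a steady write rate.

```shell
# count_policy_mib FILES MIB_PER_FILE
# Rough upper bound for a count-based policy: retained files plus the active one.
count_policy_mib() {
    echo $(( ($1 + 1) * $2 ))
}

# time_policy_mib MIB_WRITTEN_PER_DAY DAYS
# Rough estimate for a time-based policy at a steady write rate.
time_policy_mib() {
    echo $(( $1 * $2 ))
}
```

For example, count_policy_mib 20 512 gives 10752 MiB (about 10.5 GiB) for 20 retained files at a 512 MiB rotation size, while time_policy_mib 2048 7 gives 14336 MiB (14 GiB) for a database that writes about 2 GiB of log per day and keeps 7 days.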
Monitoring Transaction Log Size
You can monitor the transaction log directory ($NEO4J_HOME/data/transactions/<database_name>/ by default) to see the current log files and their sizes.
du -sh "$NEO4J_HOME"/data/transactions/neo4j/*
This command shows the disk usage of each file in the transaction log directory of the neo4j database. The file with the highest number is the active log file; you can watch it grow and see how lower-numbered files are eventually removed according to your configuration.
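For unattended monitoring, the same check can be wrapped in a small script that compares the directory's total size with a limit, for example from cron. This is a sketch; the function names and the idea of a fixed KiB limit are assumptions, not a Neo4j feature.

```shell
# tx_log_total_kib DIR
# Total disk usage of DIR in KiB, as reported by du.
tx_log_total_kib() {
    du -sk "$1" | cut -f1
}

# check_tx_log_usage DIR LIMIT_KIB
# Print a status line; return 1 when usage exceeds LIMIT_KIB.
check_tx_log_usage() {
    used=$(tx_log_total_kib "$1")
    if [ "$used" -gt "$2" ]; then
        echo "WARN: $1 uses ${used} KiB (limit $2 KiB)"
        return 1
    fi
    echo "OK: $1 uses ${used} KiB"
}
```

Called as check_tx_log_usage "<transaction log directory>" 20971520, it warns once the logs exceed 20 GiB; the non-zero exit status makes it easy to hook into an alerting tool.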
The Next Challenge: Page Cache Tuning
Once you have your transaction logs well-managed, the next performance bottleneck you’ll likely encounter is the page cache. Understanding how Neo4j uses memory to cache graph data and indexes from disk is vital for optimizing read performance.