Git pre-commit hooks let you automate quality checks before code even gets committed.
Let’s see it in action. Imagine you’re about to commit a change.
# You type:
git commit -m "Add user authentication"
# And then, BAM!
# Running pre-commit...
# flake8 .
# FAILED: flake8 . exited with 1
#
# The commit is aborted here; nothing is recorded until the hook passes.
This hook is configured to run flake8, a Python linter, and it found some style issues. The commit is blocked until you fix them.
The core problem pre-commit hooks solve is consistency. Without them, code quality degrades because:
- Manual checks are forgotten: Developers get busy, skip linters, or forget to format code.
- Inconsistent environments: Different developers have different tools or configurations, leading to "it works on my machine" problems.
- Late discovery of errors: Bugs related to style or basic code correctness are found much later, often in CI, making them more expensive to fix.
Git hooks are scripts that Git runs automatically at specific points in its workflow; the pre-commit hook runs just before a commit is created. The pre-commit framework (a popular tool for managing these hooks) makes it easy to define and install them.
Here’s a typical pre-commit configuration file, .pre-commit-config.yaml:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-json
      - id: check-toml
      - id: check-ast
      - id: detect-private-key
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
  - repo: https://github.com/pycqa/flake8
    rev: 6.1.0
    hooks:
      - id: flake8
This file tells the pre-commit framework which "repositories" (often GitHub repos) contain the tools you want to run, which version (rev) of those tools to use, and which specific "hooks" (the actual commands) to execute from each repo.
To install these hooks into your Git repository, you run:
pre-commit install
This command creates a .git/hooks/pre-commit script that orchestrates the execution of your configured hooks.
When you run git commit, Git first executes the .git/hooks/pre-commit script. If this script exits with a non-zero status code (indicating an error), Git aborts the commit. The pre-commit framework ensures that the correct tools are downloaded (if not already cached) and run against your staged files.
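The exit-code contract can be sketched in Python. This is only an illustration of the idea, not the script that pre-commit install generates; run_checks and the commands passed to it are hypothetical names chosen for this example.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command; return 0 if all succeed, 1 otherwise.

    Hypothetical helper. Git only inspects the hook's exit status,
    so a hook boils down to "exit non-zero to block the commit".
    """
    failed = False
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)} exited with {result.returncode}")
            failed = True
    return 1 if failed else 0


# A hand-written hook script would finish with something like:
#     sys.exit(run_checks([["flake8", "."]]))
# Here a stand-in command is used so the sketch runs without flake8 installed.
status = run_checks([[sys.executable, "-c", "pass"]])
```

A status of 0 lets the commit proceed; any other value makes Git abort it.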
The beauty is that these hooks operate only on the files you’ve staged for commit. This makes them fast and focused.
Let’s break down some common hooks and their mechanics:
- trailing-whitespace: This hook checks for and removes whitespace at the end of lines. It scans each line, finds any space or tab characters after the last non-whitespace character, and removes them. If it changes a file, it exits with a non-zero status, blocking the commit.
- end-of-file-fixer: This hook ensures that files end with a single newline character. If a file has no newline at the end, it adds one. If it has multiple newlines at the end, it reduces them to one. Like trailing-whitespace, it modifies the file and exits with an error if changes were made.
- check-yaml, check-json, check-toml: These hooks parse files of the respective type to ensure they are syntactically valid. Each relies on a Python parser for its format. If a file fails to parse, the hook reports the specific syntax error and exits with a failure code.
- check-ast: For Python, this hook checks that the code can be turned into an Abstract Syntax Tree. It’s a more fundamental check than flake8; it ensures the Python code is syntactically valid and can be parsed by the Python interpreter. It uses Python’s built-in ast module.
- black: This is an opinionated code formatter for Python. It rewrites your Python code to conform to a consistent style. If black modifies any files, the hook fails, and you need to review and stage the reformatted files before committing again.
- flake8: A popular linter that checks for style guide enforcement (PEP 8) and programming errors. It’s highly configurable, and by default it catches common issues such as unused imports, undefined names, and line length violations. Naming-convention checks require a plugin such as pep8-naming.
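The mechanics of the simpler hooks can be approximated in a few lines of Python. This is a rough sketch of the behavior described above, not the code shipped in pre-commit-hooks; all function names are hypothetical, and the whitespace fixer assumes LF line endings.

```python
import ast


def fix_trailing_whitespace(text):
    """Strip spaces and tabs from the end of every line (assumes LF endings)."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def fix_end_of_file(text):
    """Make non-empty text end with exactly one newline."""
    stripped = text.rstrip("\n")
    return stripped + "\n" if stripped else ""


def is_valid_python(source):
    """Return True if the source parses with the built-in ast module."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def fixer_exit_code(original, fixed):
    """Fixer hooks report failure (1) whenever they changed the file."""
    return 1 if fixed != original else 0


before = "x = 1   \ny = 2\n\n\n"
after = fix_end_of_file(fix_trailing_whitespace(before))
exit_code = fixer_exit_code(before, after)  # non-zero: the file was modified
```

The last three lines show why a fixer blocks the commit even though it already repaired the file: the staged content no longer matches what is on disk, so the change has to be staged again.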
When a hook fails, you’ll see output indicating which hook failed and often a snippet of the error. For example, flake8 might report example.py:15:80: E501 line too long (85 > 79 characters). You then fix the indicated issue in your code, stage the corrected file (git add example.py), and try to commit again.
The .pre-commit-config.yaml file itself is crucial. It specifies the exact versions of the tools to use (rev: v4.5.0, rev: 23.12.1). This is vital for reproducibility. If you didn’t pin versions, every developer might pull a different, potentially incompatible, version of a tool, defeating the purpose of automated checks. The pre-commit framework downloads and caches these tools in a virtual environment managed by the framework itself, so they don’t interfere with your system’s Python installation or other project dependencies.
One subtle but powerful aspect is how hooks are executed. By default, pre-commit passes each hook only the staged files that match the filters declared in the hook’s own definition, and it skips the hook when nothing matches. You can narrow this further in your configuration with types (file types such as python), files (a regular expression matched against file paths), and exclude. The stages key is different: it selects which Git hook stage, such as commit time or push time, a hook runs in. For instance, you might want a Python formatter to run only on .py files, or a Dockerfile linter to run only when a Dockerfile is staged. This filtering prevents unnecessary checks and keeps the commit process fast.
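Path-based filtering can be sketched as follows. The sketch assumes that a files pattern and an exclude pattern are regular expressions searched against each staged path; filter_files, should_run, and the sample paths are hypothetical, and type-based filtering (types) is left out because it depends on file-type detection.

```python
import re


def filter_files(staged, files="", exclude="^$"):
    """Keep staged paths that match `files` and do not match `exclude`.

    The defaults are assumptions for this sketch: an empty `files`
    pattern matches every path, and "^$" excludes nothing.
    """
    include_re = re.compile(files)
    exclude_re = re.compile(exclude)
    return [
        path
        for path in staged
        if include_re.search(path) and not exclude_re.search(path)
    ]


def should_run(staged, files="", exclude="^$"):
    """A hook with no matching files has nothing to do and is skipped."""
    return bool(filter_files(staged, files, exclude))


staged = ["app/main.py", "Dockerfile", "docs/readme.md", "tests/test_main.py"]
python_files = filter_files(staged, files=r"\.py$", exclude=r"^tests/")
run_docker_lint = should_run(staged, files=r"(^|/)Dockerfile$")
```

With these sample paths, the Python formatter would receive only app/main.py, and the Dockerfile linter would run because a Dockerfile is among the staged files.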
The next step after mastering pre-commit hooks is often integrating them into your Continuous Integration (CI) pipeline. Running them locally catches issues early, but CI ensures that all code, including changes pushed by others, adheres to the same standards.