.gitattributes is a file that can drastically change how Git handles line endings and merges, and most people don’t realize how much power they’re giving up by ignoring it.
Let’s see it in action. Imagine you have a project with both shell scripts and Python files. You want to ensure all shell scripts are checked into Git with Unix-style line endings (LF), and Python files can be more flexible but should also ideally use LF for consistency.
Here’s a .gitattributes file that achieves this:
# Let Git detect text files and store them with LF endings in the repository
* text=auto
# Explicitly enforce LF for shell scripts
*.sh text eol=lf
# Explicitly enforce LF for Python files
*.py text eol=lf
# For binary files, don't touch line endings
*.png binary
*.jpg binary
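Before relying on these rules, it can help to ask Git which attributes it resolves for a given path. The sketch below does that with git check-attr in a throwaway repository; the temporary directory and the names logo.png and notes.txt are illustrative assumptions, not part of the project described here.

```shell
# Sketch: resolve attributes for a few path names in a scratch repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

cat > .gitattributes <<'EOF'
* text=auto
*.sh text eol=lf
*.py text eol=lf
*.png binary
*.jpg binary
EOF

# check-attr matches on path names, so the files do not need to exist.
attrs=$(git check-attr text eol -- myscript.sh logo.png notes.txt)
printf '%s\n' "$attrs"
```

For myscript.sh this should report text as set and eol as lf, for logo.png text as unset, and for notes.txt text as auto, because later and more specific patterns override the catch-all line.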
When you commit this file and then add a new myscript.sh file with Windows line endings (CRLF) and a myprogram.py file also with CRLF, Git, guided by .gitattributes, will normalize them to LF in the repository.
Let’s simulate this.
First, create some files with CRLF endings locally.
# Create a shell script with CRLF endings (printf avoids the extra newline that echo appends)
printf "echo 'Hello'\r\n" > myscript.sh
# Create a Python script with CRLF endings
printf "print('World')\r\n" > myprogram.py
# Add them to Git
git add myscript.sh myprogram.py .gitattributes
# Commit
git commit -m "Add scripts with CRLF endings"
Now, let's inspect the result. Note that git add normalizes the copy stored in the repository; it does not rewrite the file in your working directory, which keeps its CRLF endings until Git next checks it out. So look at the committed version first, then force a fresh checkout and look at the working copy.
# Inspect the committed version: od -c should show \n with no \r before it
git show HEAD:myscript.sh | od -c
git show HEAD:myprogram.py | od -c
# The working copy still has CRLF at this point
od -c myscript.sh
# Force a fresh checkout; the working copies are now written with LF
rm myscript.sh myprogram.py
git checkout -- myscript.sh myprogram.py
od -c myscript.sh
od -c myprogram.py
# 'file' should no longer mention "CRLF line terminators"
file myscript.sh myprogram.py
The key here is that Git reads the .gitattributes file whenever it moves content between the working directory and the repository. For *.sh and *.py, it sees text eol=lf. This tells Git: "When I’m checking this file out into the working directory, ensure its line endings are LF. When I’m checking it in from the working directory into the repository, convert it to LF if it’s not already." The * text=auto line is a fallback: if a file doesn’t match any more specific pattern, Git will try to guess if it’s text and, if so, store it with LF in the repository and check it out with the line ending selected by core.eol and core.autocrlf (by default the platform’s native ending, which is LF on Linux/macOS and CRLF on Windows). By explicitly setting eol=lf for .sh and .py, we override that choice and enforce LF in the working directory too.
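To see the override in practice without a Windows machine, the following sketch sets core.eol to crlf in a scratch repository, which is an assumption made here to imitate a Windows-style native ending. The file names notes.txt and demo.sh are hypothetical.

```shell
# Sketch: compare a text=auto file with an eol=lf file after checkout.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config core.autocrlf false   # keep the demo independent of global settings
git config core.eol crlf         # imitate a platform whose native ending is CRLF

printf '* text=auto\n*.sh text eol=lf\n' > .gitattributes
printf 'line one\nline two\n' > notes.txt
printf 'echo one\necho two\n' > demo.sh
git add .

# Remove the working copies and let Git write them out again.
rm notes.txt demo.sh
git checkout -- notes.txt demo.sh

# i/ is the ending stored in the index, w/ the ending in the working tree.
eol_report=$(git ls-files --eol notes.txt demo.sh)
printf '%s\n' "$eol_report"
```

Both files should show i/lf, but notes.txt should come back as w/crlf while demo.sh stays at w/lf, since its eol=lf attribute takes precedence over core.eol.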
The problem this solves is the classic "line ending hell" that plagues cross-platform development. Different operating systems use different characters to denote the end of a line: Unix-like systems (Linux, macOS) use a Line Feed (LF, \n), while Windows uses Carriage Return + Line Feed (CRLF, \r\n). When code is edited on different OSes, these line endings can get mixed, leading to:
- Merge conflicts: Git sees changed line endings as actual code modifications, even if no logic changed.
- Syntax errors: Some interpreters or compilers might choke on unexpected line endings.
- Inconsistent behavior: A script might work on one OS but fail on another due to line ending differences.
.gitattributes acts as a configuration file that lives in your repository, ensuring these line ending policies are consistent for everyone who clones the project.
Here’s how it works internally:
- On git add (or any command that stages content, such as git commit -a): When you stage files, Git checks .gitattributes. If a file matches a pattern with text and an eol attribute (like eol=lf), Git converts CRLF line endings to LF before storing the file in the Git object database. The stored copy of a text file is always LF; the eol value only governs the working directory. If the attribute is binary, Git doesn’t touch the file. If it’s text=auto, Git tries to detect whether the file is text and, if so, normalizes it to LF as well.
- On git checkout (or git pull): When Git writes files from the repository into your working directory, it again consults .gitattributes. If a file has an eol attribute, Git converts the line endings from their stored format (which is always normalized to LF in the repository, regardless of the original OS) to the specified eol type for your working directory. If eol is not specified but text is, the ending comes from core.eol and core.autocrlf, which by default means your OS’s native line ending. If binary, it’s copied as-is.
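The two steps above can be traced by counting carriage returns at each stage. This is a minimal sketch in a scratch repository; demo.sh, data.bin, the *.bin pattern, and the count_cr helper are all hypothetical names introduced for illustration.

```shell
# Sketch: count CR bytes in the staged blob and in the working copy.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config core.autocrlf false   # keep the demo independent of global settings

printf '*.sh text eol=lf\n*.bin binary\n' > .gitattributes
printf 'echo one\r\necho two\r\n' > demo.sh
printf 'a\r\nb\r\n' > data.bin
git add .gitattributes demo.sh data.bin

# Helper: number of carriage-return bytes on standard input.
count_cr() { tr -cd '\r' | wc -c | tr -d ' '; }

staged_sh=$(git cat-file blob :demo.sh | count_cr)    # blob stored in the index
staged_bin=$(git cat-file blob :data.bin | count_cr)
worktree_sh=$(count_cr < demo.sh)                     # untouched working copy

# Remove the working copy and check it out again from the index.
rm demo.sh
git checkout -- demo.sh
checked_out_sh=$(count_cr < demo.sh)

echo "staged demo.sh: $staged_sh, staged data.bin: $staged_bin"
echo "working demo.sh before: $worktree_sh, after checkout: $checked_out_sh"
```

The staged demo.sh should contain no carriage returns while data.bin keeps both of its own, and the working copy of demo.sh should lose its carriage returns only once Git writes the file out again.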
This means that even if you’re on Windows and edit a .sh file, Git will convert your CRLF endings to LF when you git add it, and then convert it back to CRLF when you git checkout it (if your .gitattributes specified eol=crlf for .sh or if text=auto and your OS is Windows). However, by enforcing eol=lf for .sh and .py, we ensure that the files in your working directory always have LF endings, regardless of your OS, which is often desirable for cross-platform code consistency.
The text=auto setting is a bit magical. Git inspects the file content. If the content looks like text (for example, it contains no NUL bytes and only a small share of non-printable characters), Git treats it as a text file. A text file is always stored with LF in the repository; if no specific eol is set, the ending used in your working directory comes from core.eol and core.autocrlf in your Git config, which by default resolves to the platform’s native ending. Attributes in .gitattributes take precedence over core.autocrlf: a text or -text setting overrides its decision about whether to convert a file, and an eol attribute overrides the working-directory ending.
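A quick way to watch the detection at work is to stage one plainly textual file and one containing a NUL byte under the same text=auto rule. The sketch below assumes a scratch repository and the hypothetical file names readme.txt and blob.dat.

```shell
# Sketch: text=auto normalizes text content and leaves binary-looking content alone.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config core.autocrlf false   # keep the demo independent of global settings

printf '* text=auto\n' > .gitattributes
printf 'first line\r\nsecond line\r\n' > readme.txt
printf 'a\r\n\000b\r\n' > blob.dat    # the NUL byte makes Git treat this as binary

git add .

# The i/ column describes the content stored in the index.
detect_report=$(git ls-files --eol readme.txt blob.dat)
printf '%s\n' "$detect_report"
```

readme.txt should be listed with i/lf, meaning its CRLF endings were normalized on the way in, while blob.dat should be listed with i/-text and stored byte for byte.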
A common pitfall is forgetting to add .gitattributes before adding files with mixed line endings. If you add files first and then add .gitattributes, the files already in Git might retain their original line endings. To fix this, you often need to re-add them after the .gitattributes is in place:
# After adding .gitattributes
git add --renormalize .
git commit -m "Normalize line endings with .gitattributes"
The --renormalize flag tells Git to re-evaluate the line ending normalization for all tracked files based on the current .gitattributes and core.autocrlf settings, and stage any necessary changes.
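The whole pitfall and its fix can be reproduced in a scratch repository. In this sketch the file name legacy.sh and the commit identity are placeholders, and --renormalize requires a reasonably recent Git (it was added in 2.16).

```shell
# Sketch: a CRLF file committed before .gitattributes existed, then renormalized.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config core.autocrlf false            # keep the demo independent of global settings
git config user.name "Example User"       # placeholder identity for the demo commits
git config user.email "user@example.com"

printf 'echo one\r\necho two\r\n' > legacy.sh
git add legacy.sh
git commit -q -m "Add script before any attributes exist"
before=$(git ls-files --eol legacy.sh)

printf '*.sh text eol=lf\n' > .gitattributes
git add .gitattributes                    # --renormalize only touches tracked files
git add --renormalize .
git commit -q -m "Normalize line endings with .gitattributes"
after=$(git ls-files --eol legacy.sh)

printf 'before: %s\nafter:  %s\n' "$before" "$after"
```

The index entry should move from i/crlf to i/lf. The working copy is not rewritten by this step, so it stays CRLF until the file is next checked out.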
The next thing you’ll likely encounter is how to handle specific file types that shouldn’t be touched by line ending normalization, like images or compiled binaries, and how to manage diffs when line endings are the only thing that changed.