Git partial clone, paired with sparse checkout, lets you fetch only the objects and check out only the directories you need from a repository. For massive repos, that can drastically reduce download time and disk usage.
Let’s see it in action. Imagine you’re working on a huge monorepo, and you only need the frontend code.
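# On the server hosting the repository: allow clients to request filtered fetches.
# Hosted services such as GitHub and GitLab already allow this, and your own client needs no special configuration.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
# On your machine: clone without file contents and with a sparse working tree
git clone --filter=blob:none --sparse <repository_url>
cd <repository_name>
# Restrict the working tree to the frontend directory
git sparse-checkout init --cone
git sparse-checkout set frontend/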
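At this point, your working tree contains only the .git directory, the files at the repository root, and the frontend/ directory. Everything else is still tracked in the index but is not written to disk, and its file contents have not been downloaded. When you run git status, Git only has to look at the files that are actually present instead of scanning the whole monorepo.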
The core of this magic is the --filter option during git clone. When you use --filter=blob:none, you’re telling Git, "I don’t want any blobs (file contents) yet." This means Git downloads the commit history and tree objects, but not the actual file data. The --sparse option, combined with sparse-checkout, then limits the working tree to the paths you choose, and Git fetches the blobs for those paths at the moment it checks them out.
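To make the effect of the filter visible, here is a minimal sketch that builds a throwaway local repository, clones it with --filter=blob:none, and counts the objects Git reports as missing. The directory names, file names, and commit identity are made up for the demo, and it uses --no-checkout instead of --sparse so that no blob gets fetched by a checkout.

```shell
#!/bin/sh
# Sketch: show that --filter=blob:none keeps history but omits file contents.
# All paths, file names, and the commit identity are hypothetical demo values.
set -eu

work=$(mktemp -d)
cd "$work"

# A throwaway "server" repository with three files.
git init -q server
mkdir server/frontend server/backend
echo "console.log('hi')" > server/frontend/app.js
echo "print('hi')" > server/backend/app.py
echo "demo monorepo" > server/README.md
git -C server add .
git -C server -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial commit"

# The serving side must allow filters, otherwise --filter is ignored.
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true

# A file:// URL goes through the normal transport; a plain local path ignores --filter.
git clone -q --filter=blob:none --no-checkout "file://$work/server" client

# Commits and trees are present locally...
commit_count=$(git -C client rev-list --count HEAD)
# ...while omitted blobs are printed with a leading "?" by --missing=print.
missing_count=$(git -C client rev-list --objects --all --missing=print | grep -c '^?' || true)

echo "commits: $commit_count, missing objects: $missing_count"
```

With --missing=print, rev-list only reports the absent objects instead of fetching them, which makes it a convenient way to inspect what a partial clone has left on the server.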
Here’s how the mental model breaks down:
- The Problem: A traditional git clone downloads the entire repository history and all files, even if you only need a small fraction. This is a huge bottleneck for large repositories (think thousands of files, gigabytes of history).
- The Solution: Partial Clone (--filter): This tells the server not to send certain objects. blob:none is the most common filter, meaning "don’t send me file contents." Git still downloads the commit and tree objects, which represent the directory structure and history, but the actual file data (blobs) is missing.
- The Mechanism: Sparse Checkout (--sparse, sparse-checkout): Once you have the filtered clone, your working directory is mostly empty. git sparse-checkout init --cone sets up the mechanism. Then, git sparse-checkout set <path> tells Git which directories you do want to materialize. Whenever Git needs a blob it does not have yet (for a checkout, a diff, or a blame, for example), it fetches that blob on demand from the remote.
- The Levers You Control:
  - uploadpack.allowFilter true: A server-side setting that enables the filtering protocol. If it isn’t set on the server, the --filter request is ignored and you get a full clone. Setting it on your client has no effect on your own clones.
  - uploadpack.allowAnySHA1InWant true: Another server-side setting. It lets clients request objects by ID even when no ref advertises them. On-demand blob fetches ask for blobs by ID, so some server setups need it.
  - --filter=blob:none: The core of the clone command, specifying what not to download initially. Other filters exist, such as tree:0 (omit all trees and blobs, leaving only commits) and blob:limit=<size> (omit blobs at or above a size threshold).
  - --sparse: Starts the clone with a sparse working directory that contains only the files at the repository root.
  - git sparse-checkout init --cone: Sets up the sparse checkout in cone mode. Cone mode works on whole directories: each directory you name is included recursively, files at the repository root and directly inside the parent directories of your selection are included too, and everything else is excluded.
  - git sparse-checkout set <path>: This is where you define your "allowlist" of directories. You can specify multiple paths. frontend/ would include all files and subdirectories within frontend.
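The levers above leave traces in the clone's configuration, which you can inspect. The following sketch is a hedged illustration: it runs the whole sequence against a throwaway local repository (all names are invented for the demo) and then reads back the settings a recent Git records for a partial, cone-mode sparse clone.

```shell
#!/bin/sh
# Sketch: run the clone + sparse-checkout sequence and inspect what it configured.
# Repository layout and commit identity are hypothetical demo values.
set -eu

work=$(mktemp -d)
cd "$work"

git init -q server
mkdir server/frontend server/backend
echo "console.log('hi')" > server/frontend/app.js
echo "print('hi')" > server/backend/app.py
echo "demo monorepo" > server/README.md
git -C server add .
git -C server -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial commit"

# Server-side levers.
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true

# Client-side levers.
git clone -q --filter=blob:none --sparse "file://$work/server" client
git -C client sparse-checkout init --cone
git -C client sparse-checkout set frontend

# What the clone remembers about its filter and its sparse mode.
filter=$(git -C client config remote.origin.partialclonefilter)
promisor=$(git -C client config remote.origin.promisor)
cone=$(git -C client config core.sparseCheckoutCone)
dirs=$(git -C client sparse-checkout list)

echo "filter=$filter promisor=$promisor cone=$cone dirs=$dirs"
ls client
```

In this setup the working tree should hold README.md and frontend/ but no backend/, which matches the cone-mode rule that root-level files always come along.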
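When you perform a git fetch or git pull in a partial clone, Git remembers the filter you cloned with and applies it again, so it receives the new commits and trees without their blobs. The sparse checkout then determines which of those blobs are needed to update your working tree, and only those are fetched. It doesn’t download the entire repository again.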
The truly surprising part is that --filter isn’t just for clone. You can use it with git fetch too. If you run git fetch --filter=blob:none origin, you get the latest history and tree objects without downloading any new file contents, keeping your local repo lean even as the remote evolves. In a repository that was cloned with a filter, a plain git fetch already reuses the recorded filter, so the explicit option matters mostly when you want a different filter or are fetching into a repository that started as a full clone.
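As a rough illustration of a filtered fetch, the sketch below adds a second commit to a throwaway local "server", fetches it into a sparse partial clone, and checks that the new backend blob is still missing while the frontend file updates after a fast-forward. The repository layout and names are assumptions made up for the demo.

```shell
#!/bin/sh
# Sketch: fetch new history with a blob filter, then fast-forward the sparse checkout.
# Repository layout and commit identity are hypothetical demo values.
set -eu

work=$(mktemp -d)
cd "$work"

git init -q server
mkdir server/frontend server/backend
echo "console.log('v1')" > server/frontend/app.js
echo "print('v1')" > server/backend/app.py
git -C server add .
git -C server -c user.name=Demo -c user.email=demo@example.com commit -q -m "first commit"
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true
branch=$(git -C server symbolic-ref --short HEAD)

git clone -q --filter=blob:none --sparse "file://$work/server" client
git -C client sparse-checkout init --cone
git -C client sparse-checkout set frontend

# Upstream moves on: one change inside the sparse set, one outside it.
echo "console.log('v2')" > server/frontend/app.js
echo "print('v2')" > server/backend/app.py
git -C server add .
git -C server -c user.name=Demo -c user.email=demo@example.com commit -q -m "second commit"
new_backend_blob=$(git -C server rev-parse HEAD:backend/app.py)

# Filtered fetch: new commits and trees arrive, blobs do not.
git -C client fetch -q --filter=blob:none origin

if git -C client rev-list --objects --all --missing=print | grep -q "^?$new_backend_blob"; then
  backend_missing=yes
else
  backend_missing=no
fi

# Fast-forward; --no-stat avoids a diffstat, which would need blob contents.
git -C client merge -q --ff-only --no-stat "origin/$branch"
frontend_content=$(cat client/frontend/app.js)

echo "backend blob missing after fetch: $backend_missing"
echo "frontend/app.js now contains: $frontend_content"
```

The --no-stat flag is a deliberate choice here: summarizing changed files requires reading blob contents, which can trigger on-demand downloads for paths outside your sparse set.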
The next step you’ll likely encounter is needing to access files that aren’t in your current sparse checkout set.
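Two common ways to reach such files are sketched below, using the same kind of throwaway local repository as a stand-in for a real remote (all names are hypothetical). The first reads a single file with git show without touching the working tree. The second widens the sparse set with git sparse-checkout add.

```shell
#!/bin/sh
# Sketch: access a path outside the current sparse checkout set.
# Repository layout and commit identity are hypothetical demo values.
set -eu

work=$(mktemp -d)
cd "$work"

git init -q server
mkdir server/frontend server/backend
echo "console.log('hi')" > server/frontend/app.js
echo "print('hi')" > server/backend/app.py
git -C server add .
git -C server -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial commit"
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true

git clone -q --filter=blob:none --sparse "file://$work/server" client
git -C client sparse-checkout init --cone
git -C client sparse-checkout set frontend

# Option 1: read one file; Git fetches the missing blob on demand,
# and the working tree stays unchanged.
backend_source=$(git -C client show HEAD:backend/app.py)
echo "backend/app.py says: $backend_source"

# Option 2: add the directory to the sparse set so it appears on disk.
git -C client sparse-checkout add backend
git -C client sparse-checkout list
```

Option 1 suits a quick look at a file, while option 2 is the better fit once you expect to edit or build code in that directory.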