
Fix issues when cloning repositories with large blobs (>4GB) #6069

Open
LordKiRon wants to merge 6 commits into git-for-windows:main from LordKiRon:minimalFix

Conversation


@LordKiRon LordKiRon commented Jan 27, 2026

This PR fixes a couple of problems when cloning large repositories (in my case, a 67GB database containing 4GB+ versioned files). It fixes the clone/fetch, but working with the repository still fails, only much later :)

@LordKiRon LordKiRon marked this pull request as ready for review January 27, 2026 14:04
@LordKiRon LordKiRon force-pushed the minimalFix branch 2 times, most recently from bf5c415 to a9cb9de Compare January 27, 2026 16:18

dscho commented Jan 27, 2026

@LordKiRon I would like to upstream this fix, but upstream Git requires a real name in the Signed-off-by line. Would you mind providing that?


LordKiRon commented Jan 27, 2026

> @LordKiRon I would like to upstream this fix, but upstream Git requires a real name in the Signed-off-by line. Would you mind providing that?

Sorry, but this creates ties between my real life and public activity that I would like to avoid. Not that I am doing anything illegal or inappropriate in either area :) and I guess after some digging on the net you might even connect my nickname to my real name, but that is different from creating a real connection.
I prefer to keep my "internet life" and personal life separate.


dscho commented Jan 27, 2026

> @LordKiRon I would like to upstream this fix, but upstream Git requires a real name in the Signed-off-by line. Would you mind providing that?

> Sorry, but this creates ties between my real life and public activity that I would like to avoid. Not that I am doing anything illegal or inappropriate in either area :) and I guess after some digging on the net you might even connect my nickname to my real name, but that is different from creating a real connection. I prefer to keep my "internet life" and personal life separate.

Understood. I'll take ownership, publicly, then, documenting that you are the actual author but prefer to stay pseudonymous.

dscho added 4 commits February 6, 2026 14:42
When unpacking objects from a packfile, the object size is decoded
from a variable-length encoding. On platforms where unsigned long is
32-bit (such as Windows, even in 64-bit builds), the shift operation
overflows when decoding sizes larger than 4GB. The result is a
truncated size value, causing the unpacked object to be corrupted or
rejected.

Fix this by changing the size variable to size_t, which is 64-bit on
64-bit platforms, and ensuring the shift arithmetic occurs in 64-bit
space.

This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
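
For illustration, here is a minimal sketch of the kind of variable-length size decoding the commit describes; the function name is hypothetical and this is not the exact code touched by the PR. With the size held in an `unsigned long`, the left shift overflows once the value needs more than 32 bits; keeping it in `size_t` makes the arithmetic 64-bit on 64-bit platforms.

```c
/*
 * Sketch only: decode the variable-length size from a pack entry
 * header.  The first byte carries 4 size bits, each continuation
 * byte carries 7 more.  Using size_t avoids the 32-bit shift
 * overflow on platforms where unsigned long is 32-bit.
 */
#include <stddef.h>

static size_t decode_pack_entry_size(const unsigned char *buf, size_t *pos)
{
	unsigned char c = buf[(*pos)++];
	size_t size = c & 0x0f;    /* low 4 bits of the first byte */
	unsigned shift = 4;

	while (c & 0x80) {         /* high bit: more size bytes follow */
		c = buf[(*pos)++];
		size += (size_t)(c & 0x7f) << shift;  /* 64-bit shift */
		shift += 7;
	}
	return size;
}
```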
On Windows, zlib's `uLong` type is 32-bit even on 64-bit systems. When
processing data streams larger than 4GB, the `total_in` and `total_out`
fields in zlib's `z_stream` structure wrap around, which caused the
sanity checks in `zlib_post_call()` to trigger `BUG()` assertions.

The git_zstream wrapper now tracks its own 64-bit totals rather than
copying them from zlib. The sanity checks compare only the low bits,
using `maximum_unsigned_value_of_type(uLong)` to mask appropriately for
the platform's `uLong` size.

This is based on work by LordKiRon in git-for-windows#6076.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
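
A sketch of the masked sanity check described above, with illustrative names rather than the actual `git_zstream` fields: zlib's `z_stream.total_out` is a `uLong` (32-bit on Windows) and wraps for streams larger than 4GB, so the wrapper keeps its own 64-bit counter and compares only the bits zlib can represent.

```c
/*
 * Sketch only: compare a 64-bit wrapper-tracked total against zlib's
 * possibly-wrapped uLong counter by masking to uLong's width.
 */
#include <stdint.h>
#include <stdlib.h>
#include <zlib.h>

/* stand-in for maximum_unsigned_value_of_type(uLong) */
#define ULONG_MASK ((uint64_t)(uLong)-1)

static void check_total_out(uint64_t tracked_total_out, const z_stream *zs)
{
	/* compare only the low bits that zlib actually stores */
	if ((tracked_total_out & ULONG_MASK) != zs->total_out)
		abort(); /* corresponds to the BUG() in zlib_post_call() */
}
```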
The odb_read_stream structure uses unsigned long for the size field,
which is 32-bit on Windows even in 64-bit builds. When streaming
objects larger than 4GB, the size would be truncated to zero or an
incorrect value, resulting in empty files being written to disk.

Change the size field in odb_read_stream to size_t and introduce
unpack_object_header_sz() to return sizes via size_t pointer. Since
object_info.sizep remains unsigned long for API compatibility, use
temporary variables where the types differ, with comments noting the
truncation limitation for code paths that still use unsigned long.

This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
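
The temporary-variable pattern mentioned above might look roughly like the following sketch (hypothetical helper name): the streaming code computes the size as `size_t` and only narrows it into the legacy `unsigned long` field when the value actually fits.

```c
/*
 * Sketch only: hand a size_t value back through a legacy
 * unsigned long out-parameter, refusing to truncate silently.
 */
#include <stddef.h>
#include <limits.h>

static int fill_legacy_sizep(unsigned long *sizep, size_t real_size)
{
	if (!sizep)
		return 0;
	if (real_size > ULONG_MAX)
		return -1;  /* truncation limitation of the old API */
	*sizep = (unsigned long)real_size;
	return 0;
}
```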
The delta header decoding functions return unsigned long, which
truncates on Windows for objects larger than 4GB. Introduce size_t
variants get_delta_hdr_size_sz() and get_size_from_delta_sz() that
preserve the full 64-bit size, and use them in packed_object_info()
where the size is needed for streaming decisions.

This was originally authored by LordKiRon <https://github.com/LordKiRon>,
who preferred not to reveal their real name and therefore agreed that I
take over authorship.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
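
To make the change concrete, here is a sketch of a `size_t`-returning delta header decoder mirroring the 7-bits-per-byte encoding that `get_delta_hdr_size()` parses; the name and exact signature here are illustrative, not the ones introduced by this PR.

```c
/*
 * Sketch only: decode the variable-length size stored at the start of
 * a delta, keeping the full value in size_t.
 */
#include <stddef.h>

static size_t decode_delta_hdr_size(const unsigned char **datap,
				    const unsigned char *top)
{
	const unsigned char *data = *datap;
	size_t size = 0;
	unsigned shift = 0;
	unsigned char cmd;

	do {
		cmd = *data++;
		size |= (size_t)(cmd & 0x7f) << shift;  /* 64-bit shift */
		shift += 7;
	} while ((cmd & 0x80) && data < top);

	*datap = data;
	return size;
}
```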
dscho added 2 commits February 6, 2026 18:24
To test Git's behavior with very large pack files, we need a way to
generate such files quickly.

A naive approach using only readily-available Git commands would take
over 10 hours for a 4GB pack file, which is prohibitive.

Side-stepping Git's machinery and actual zlib compression by writing
uncompressed content with the appropriate zlib header makes things
much faster. The fastest method using this approach generates many
small, unreachable blob objects and takes about 1.5 minutes for 4GB.
However, this cannot be used because we need to test git clone, which
requires a reachable commit history.

Generating many reachable commits with small, uncompressed blobs takes
about 4 minutes for 4GB. But this approach 1) does not reproduce the
issues we want to fix (which require individual objects larger than
4GB) and 2) is comparatively slow because of the many SHA-1
calculations.

The approach taken here generates a single large blob (filled with NUL
bytes), along with the trees and commits needed to make it reachable.
This takes about 2.5 minutes for 4.5GB, which is the fastest option
that produces a valid, clonable repository with an object large enough
to trigger the bugs we want to test.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
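
As an illustration of the speed trick described above (not the test-tool code from this PR): asking zlib for compression level 0 makes it emit "stored" blocks, skipping the expensive deflate work while still producing a valid zlib stream that Git can inflate. The helper name below is hypothetical, and the caller is assumed to feed chunks smaller than `UINT_MAX` because `z_stream` counts in `uInt`.

```c
/*
 * Sketch only: wrap a buffer in an uncompressed ("stored") zlib stream.
 * Returns the number of output bytes, or -1 on error.
 */
#include <zlib.h>
#include <stddef.h>

static long emit_stored(unsigned char *out, size_t outlen,
			const unsigned char *in, size_t inlen)
{
	z_stream zs = { 0 };
	long written = -1;

	if (deflateInit(&zs, Z_NO_COMPRESSION) != Z_OK)
		return -1;
	zs.next_in = (unsigned char *)in;
	zs.avail_in = (uInt)inlen;   /* caller keeps chunks < UINT_MAX */
	zs.next_out = out;
	zs.avail_out = (uInt)outlen;
	if (deflate(&zs, Z_FINISH) == Z_STREAM_END)
		written = (long)zs.total_out;
	deflateEnd(&zs);
	return written;
}
```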
The shift overflow bug in index-pack and unpack-objects caused incorrect
object size calculation when the encoded size required more than 32 bits
of shift. This would result in corrupted or failed unpacking of objects
larger than 4GB.

Add a test that creates a pack file containing a 4GB+ blob using the
new 'test-tool synthesize pack --reachable-large' command, then clones
the repository to verify the fix works correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho changed the title Implemented minimal fix for shift > 32 issue of 4GB+ data Fix issues when cloning repositories with large blobs (>4GB) Feb 6, 2026
