Skip to content

Fix REF_DELTA chain bug in 'git index-pack' #745

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 25, 2025

Conversation

dscho
Copy link
Member

@dscho dscho commented Apr 24, 2025

Range-diff to that PR:

@dscho dscho requested a review from derrickstolee April 24, 2025 07:34
@dscho dscho self-assigned this Apr 24, 2025
@dscho dscho force-pushed the index-pack-ref-deltas-msft-git branch from 162efbb to 47af8af Compare April 24, 2025 07:42
@dscho
Copy link
Member Author

dscho commented Apr 24, 2025

Aha! There is one of those infamous non-semantic merge conflicts where upstream's constant refactoring requires a change in the backported commits:

  • 1: 5d4beb2 ! 1: bf490fe test-tool: add pack-deltas helper

    @@ t/helper/test-pack-deltas.c (new)
     +
     +	setup_git_directory();
     +
    -+	f = hashfd(the_repository->hash_algo, 1, "<stdout>");
    ++	f = hashfd(1, "<stdout>");
     +	write_pack_header(f, N);
     +
     +	/* Read each line from stdin into 'line' */
  • 2: a943044 = 2: f249ca8 t5309: create failing test for 'git index-pack'

  • 3: 27d3640 = 3: 47af8af index-pack: allow revisiting REF_DELTA chains

@dscho
Copy link
Member Author

dscho commented Apr 24, 2025

It's too bad that upstream's CI story is in such poor shape, and even worse that we inherit all of this mess. The PR build failures (9 jobs are failing) fall into these categories:

I'll just go ahead and ignore these issues from this point on.

derrickstolee
derrickstolee previously approved these changes Apr 24, 2025
Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this backport. I validated it locally on my Linux machine by building with DEVELOPER=1 and running all test scripts.

@dscho
Copy link
Member Author

dscho commented Apr 24, 2025

Unless @mjcheetham beats me to it, I will merge and publish it as a pre-release tomorrow.

gitster and others added 2 commits April 24, 2025 16:12
The ci/install-dependencies.sh script used in a very early phase of
our CI jobs downloads Perforce, Git-LFS, and JGit, used for running
the test scripts.  The test framework is prepared to properly skip
the tests that depend on these external software, but the CI script
is unnecessarily strict (due to its use of "set -e" in ci/lib.sh)
and fails the entire CI run before even starting to test the rest of
the system.

Notice a failure to download to any of these external software, but
keep going.  We need to be careful about cleaning after a failed
wget, as a later part of the script that does:

        if type jgit >/dev/null 2>&1
        then
                echo "$(tput setaf 6)JGit Version$(tput sgr0)"
                jgit version
        else
                echo >&2 "WARNING: JGit wasn't installed, see above for clues why"
        fi

will (surprise!) succeed running "type jgit", and then fail with
"jgit version", taking the whole thing down due to "set -e".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
The image pointed to by the fedora:latest tag has moved from fedora 41
to 42. The fedora 41 container images have awk installed while the
fedora 42 images do not.  That change is most likely just part of
reducing the size of the base container images.

In both AlmaLinux and Fedora (as well as other RHEL
derivatives/relatives), awk is provided by the gawk package.

On Fedora, `dnf install awk` would work, but for unintended reasons! It
uses the package filelist data to determine that /usr/bin/awk is
provided by gawk and installs gawk as a result.

On AlmaLinux (8 & 9, by my quick testing), that is not the case and
you'd need to use `dnf install gawk` or `dnf install '*bin/awk'` to get
it installed. Having said that, awk _is_ included in the current
AlmaLinux 8 and 9 images, so it isn't strictly needed.  But it's
probably better to be explicit that we need it installed, as a defense
against some future change to the AlmaLinux container removing awk.

Using the package name "gawk" is the right thing to do.

Note that even '*bin/awk' would have worked, but it is less specific.
And who knows, maybe in the far future a BSD variant of awk is offered,
too, and would then cause ambiguities. Best to avoid that.

Suggested-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the index-pack-ref-deltas-msft-git branch from 47af8af to 115467b Compare April 25, 2025 07:34
dscho and others added 7 commits April 25, 2025 09:41
Git's dependency story is improvable: When, say, JGit cannot be
downloaded, it currently fails the entire job without even running all
the tests that do not require JGit (which is the vast, vast majority of
them). It would be better to handle that gracefully by warning about the
JGit download failure and then skipping the respective tests.

This JGit download problem is far from theoretical: There is an incident
on eclipse.org that has now lasted for an extended amount of
time: https://www.eclipsestatus.io/incident/549796.

While upstream Git finally chose to improve their CI story in
b0026da (ci: skip unavailable external software, 2025-04-24), we
should heed the suggestion by the JGit maintainer, Matthias Sohn, in
https://discord.com/channels/1042895022950994071/1364872237710184520/1364885895865696479,
to switch to the far more reliable Maven Central link (according to
https://status.maven.org/uptime, there has not been any major downtime
recently that wasn't resolved within 5 minutes, a far cry from the 30h
of the current eclipse.org incident as of time of writing).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Make sure outage of third-party sites that supply P4, Git-LFS, and JGit
we use for testing would not prevent our CI jobs from running at all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When trying to demonstrate certain behavior in tests, it can be helpful
to create packfiles that have specific delta structures. 'git
pack-objects' uses various algorithms to select deltas based on their
compression rates, but that does not always demonstrate all possible
packfile shapes. This becomes especially important when wanting to test
'git index-pack' and its ability to parse certain pack shapes.

We have prior art in t/lib-pack.sh, where certain delta structures are
produced by manually writing certain opaque pack contents. However,
producing these script updates is cumbersome and difficult to do as a
contributor.

Instead, create a new test-tool, 'test-tool pack-deltas', that reads a
list of instructions for which objects to include in a packfile and how
those objects should be written in delta form.

At the moment, this only supports REF_DELTAs as those are the kinds of
deltas needed to exercise a bug in 'git index-pack'.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
This new test demonstrates some behavior where a valid packfile is being
rejected by the Git client due to the order in which it is resolving
REF_DELTAs.

The thin packfile has a REF_DELTA chain A->B->C where C is not included
in the packfile. However, the client repository contains both C and B
already. Thus, 'git index-pack' is able to resolve A before resolving B.

When resolving B, it then attempts to resolve any other REF_DELTAs that
are pointing to B as a base. This "revisits" A and complains as if there
is a cycle, but it did not actually detect a cycle.

A fix will arrive in the next change.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
As detailed in the previous changes to t5309-pack-delta-cycles.sh, the
logic within 'git index-pack' to analyze an incoming thin packfile with
REF_DELTAs is suspect. The algorithm is overly cautious around delta
cycles, and that leads in fact to failing even when there is no cycle.

This change adjusts the algorithm to no longer fail in these cases. In
fact, these cycle cases will no longer fail but more importantly the
valid cases will no longer fail, either. The resulting packfile from the
--fix-thin operation will not have cycles either since REF_DELTAs are
forbidden from the on-disk format and OFS_DELTAs are impossible to write
as a cycle.

The crux of the matter is how the algorithm works when the REF_DELTAs
point to base objects that exist in the local repository. When reading
the thin packfile, the object IDs for the delta objects are unknown so
we do not have the delta chain structure automatically. Instead, we need
to start somewhere by selecting a delta whose base is inside our current
object database.

Consider the case where the packfile has two REF_DELTA objects, A and B,
and the delta chain looks like "A depends on B" and "B depends on C" for
some third object C, where C is already in the current repository. The
algorithm _should_ start with all objects that depend on C, finding B,
and then moving on to all objects depending on B, finding A.

However, if the repository also already has object B, then the delta
chain can be analyzed in a different order. The deltas with base B can
be analyzed first, finding A, and then the deltas with base C are
analyzed, finding B. The algorithm currently continues to look for
objects that depend on B, finding A again. This fails due to A's
'real_type' member already being overwritten from OBJ_REF_DELTA to the
correct object type.

This scenario is possible in a typical 'git fetch' where the client does
not advertise B as a 'have' but requests A as a 'want' (and C is noticed
as a common object based on other 'have's). The reason this isn't
typically seen is that most Git servers use OFS_DELTAs to represent
deltas within a packfile. However, if a server uses only REF_DELTAs,
then this kind of issue can occur. There is nothing in the explicit
packfile format that states this use of inter-pack REF_DELTA is
incorrect, only that REF_DELTAs should not be used in the on-disk
representation to avoid cycles.

This die() was introduced in ab791dd (index-pack: fix race condition
with duplicate bases, 2014-08-29). Several refactors have adjusted the
error message and the surrounding logic, but this issue has existed for
a longer time as that was only a conversion from an assert().

The tests in t5309 originated in 3b910d0 (add tests for indexing
packs with delta cycles, 2013-08-23) and b2ef3d9 (test index-pack on
packs with recoverable delta cycles, 2013-08-23). These changes make
note that the current behavior of handling "resolvable" cycles is mostly
a documentation-only test, not that this behavior is the best way for
Git to handle the situation.

The fix here is somewhat complicated due to the amount of state being
adjusted by the loop within threaded_second_pass(). Instead of trying to
resume the start of the loop while adjusting the necessary context, I
chose to scan the REF_DELTAs depending on the current 'parent' and skip
any that have already been processed. This necessarily leaves us in a
state where 'child' and 'child_obj' could be left as NULL and that must
be handled later. There is also some careful handling around skipping
REF_DELTAs when there are also OFS_DELTAs depending on that parent.
There may be value in extending 'test-tool pack-deltas' to allow writing
OFS_DELTAs in order to exercise this logic across the delta types.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Junio says in https://lore.kernel.org/git/xmqq34dxuz21.fsf@gitster.g:

> I needed this to make
>
> $ SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=true make
> $ cd t && sh t5309-pack-delta-cycles.sh
>
> pass.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Let's make the command-line parsing a bit more stringent. We _could_
use `parse_options()`, but that would be overkill for a single,
non-optional argument. Besides, it would not bring any benefit, as the
parsed value needs to fit in the `uint32_t` type, and `parse_options()`
has no provision to ensure that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the index-pack-ref-deltas-msft-git branch from 115467b to 638c49a Compare April 25, 2025 09:07
@dscho dscho merged commit 0a8cb87 into vfs-2.49.0 Apr 25, 2025
119 checks passed
@dscho dscho deleted the index-pack-ref-deltas-msft-git branch April 25, 2025 11:12
dscho added a commit that referenced this pull request Jun 6, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 6, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 11, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 13, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 16, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 16, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
dscho added a commit that referenced this pull request Jun 16, 2025
This is an early version of work already under review upstream:
gitgitgadget#1906
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants