Skip to content

Permit repack command in Scalar clones #732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 31, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions builtin/repack.c
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
#include "pack-bitmap.h"
#include "refs.h"
#include "list-objects-filter-options.h"
#include "gvfs.h"

#define ALL_INTO_ONE 1
#define LOOSEN_UNREACHABLE 2
Expand Down Expand Up @@ -1177,6 +1178,7 @@ int cmd_repack(int argc,
struct tempfile *refs_snapshot = NULL;
int i, ext, ret;
int show_progress;
const char *tmp_obj_dir = NULL;

/* variables to be filled by option parsing */
int delete_redundant = 0;
Expand Down Expand Up @@ -1301,6 +1303,10 @@ int cmd_repack(int argc,
write_bitmaps = 0;
}

if (gvfs_config_is_set(GVFS_ANY_MASK) &&
!git_config_get_value("gvfs.sharedcache", &tmp_obj_dir))
warning(_("shared object cache is configured but will not be repacked"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's turn this warning into in informational message that the shared cache is repacked instead (and then do that) in a follow-up PR (and add a test to verify that it worked as intended). Unfortunately, it won't be as easy as:

-		warning(_("shared object cache is configured but will not be repacked"));
+		setenv(DB_ENVIRONMENT, tmp_obj_dir, 1);

because subsequent lines call repo_get_object_repository() (this line, this one and this one), and that function merely returns the already-initialized repo->objects->odb->path.

Other than that, the pack-objects command is spawned which also would need to know about this, and the cruft pack/delete redundant part probably also would need to be updated.

Most importantly, though, we really need to ensure that no objects are deleted that are reachable in any Scalar repository that shares that sharedcache repository, even if the objects should not be reachable in that sharedcache repository itself...


if (write_midx && write_bitmaps) {
struct strbuf path = STRBUF_INIT;

Expand Down
7 changes: 0 additions & 7 deletions builtin/worktree.c
Original file line number Diff line number Diff line change
Expand Up @@ -1451,13 +1451,6 @@ int cmd_worktree(int ac,

git_config(git_worktree_config, NULL);

/*
* git-worktree is special-cased to work in Scalar repositories
* even when they use the GVFS Protocol.
*/
if (core_gvfs & GVFS_USE_VIRTUAL_FILESYSTEM)
die("'git %s' is not supported on a GVFS repo", "worktree");

if (!prefix)
prefix = "";

Expand Down
12 changes: 8 additions & 4 deletions git.c
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
#define DELAY_PAGER_CONFIG (1<<4)
#define NO_PARSEOPT (1<<5) /* parse-options is not used */
#define BLOCK_ON_GVFS_REPO (1<<6) /* command not allowed in GVFS repos */
#define BLOCK_ON_VFS_ENABLED (1<<7) /* command not allowed when virtual file system is used */

struct cmd_struct {
const char *cmd;
Expand Down Expand Up @@ -542,6 +543,9 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv, struct
if (!help && p->option & BLOCK_ON_GVFS_REPO && gvfs_config_is_set(GVFS_BLOCK_COMMANDS))
die("'git %s' is not supported on a GVFS repo", p->cmd);

if (!help && p->option & BLOCK_ON_VFS_ENABLED && gvfs_config_is_set(GVFS_USE_VIRTUAL_FILESYSTEM))
die("'git %s' is not supported when using the virtual file system", p->cmd);

if (run_pre_command_hook(the_repository, argv))
die("pre-command hook aborted command");

Expand Down Expand Up @@ -625,7 +629,7 @@ static struct cmd_struct commands[] = {
{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
{ "for-each-repo", cmd_for_each_repo, RUN_SETUP_GENTLY },
{ "format-patch", cmd_format_patch, RUN_SETUP },
{ "fsck", cmd_fsck, RUN_SETUP | BLOCK_ON_GVFS_REPO},
{ "fsck", cmd_fsck, RUN_SETUP | BLOCK_ON_VFS_ENABLED },
{ "fsck-objects", cmd_fsck, RUN_SETUP },
{ "fsmonitor--daemon", cmd_fsmonitor__daemon, RUN_SETUP },
{ "gc", cmd_gc, RUN_SETUP },
Expand Down Expand Up @@ -668,7 +672,7 @@ static struct cmd_struct commands[] = {
{ "pack-refs", cmd_pack_refs, RUN_SETUP },
{ "patch-id", cmd_patch_id, RUN_SETUP_GENTLY | NO_PARSEOPT },
{ "pickaxe", cmd_blame, RUN_SETUP },
{ "prune", cmd_prune, RUN_SETUP | BLOCK_ON_GVFS_REPO},
{ "prune", cmd_prune, RUN_SETUP | BLOCK_ON_VFS_ENABLED },
{ "prune-packed", cmd_prune_packed, RUN_SETUP },
{ "pull", cmd_pull, RUN_SETUP | NEED_WORK_TREE },
{ "push", cmd_push, RUN_SETUP },
Expand All @@ -681,7 +685,7 @@ static struct cmd_struct commands[] = {
{ "remote", cmd_remote, RUN_SETUP },
{ "remote-ext", cmd_remote_ext, NO_PARSEOPT },
{ "remote-fd", cmd_remote_fd, NO_PARSEOPT },
{ "repack", cmd_repack, RUN_SETUP | BLOCK_ON_GVFS_REPO },
{ "repack", cmd_repack, RUN_SETUP | BLOCK_ON_VFS_ENABLED },
{ "replace", cmd_replace, RUN_SETUP },
{ "replay", cmd_replay, RUN_SETUP },
{ "rerere", cmd_rerere, RUN_SETUP },
Expand Down Expand Up @@ -722,7 +726,7 @@ static struct cmd_struct commands[] = {
{ "verify-tag", cmd_verify_tag, RUN_SETUP },
{ "version", cmd_version },
{ "whatchanged", cmd_whatchanged, RUN_SETUP },
{ "worktree", cmd_worktree, RUN_SETUP },
{ "worktree", cmd_worktree, RUN_SETUP | BLOCK_ON_VFS_ENABLED },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a much simpler way to do this!

{ "write-tree", cmd_write_tree, RUN_SETUP },
};

Expand Down
2 changes: 2 additions & 0 deletions gvfs.h
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@
#define GVFS_BLOCK_FILTERS_AND_EOL_CONVERSIONS (1 << 6)
#define GVFS_PREFETCH_DURING_FETCH (1 << 7)

#define GVFS_ANY_MASK 0xFFFFFFFF

void gvfs_load_config_value(const char *value);
int gvfs_config_is_set(int mask);

Expand Down
11 changes: 10 additions & 1 deletion t/t0402-block-command-on-gvfs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ not_with_gvfs fsck
not_with_gvfs gc
not_with_gvfs gc --auto
not_with_gvfs prune
not_with_gvfs repack
not_with_gvfs submodule status
not_with_gvfs update-index --index-version 2
not_with_gvfs update-index --skip-worktree
Expand All @@ -36,4 +35,14 @@ test_expect_success 'test gc --auto succeeds when disabled via config' '
git gc --auto
'

test_expect_success 'test repack fails with VFS bit enabled' '
test_config core.gvfs true &&
test_must_fail git repack
'

test_expect_success 'test repack succeeds with VFS bit disabled' '

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test with a Scalar clone that has a gvfs.sharedCache enabled? I'm curious as to whether git repack will operate on that object directory (my guess is not because its an alternate which Git treats as read-only).

It would be good to be up front about "git repack is no longer blocked, but it also won't do what you expect".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test with a Scalar clone that has a gvfs.sharedCache enabled? I'm curious as to whether git repack will operate on that object directory (my guess is not because its an alternate which Git treats as read-only).

We do have this in git maintenance run:

git/builtin/gc.c

Lines 1822 to 1831 in 5ad6dd6

/*
* To enable the VFS for Git/Scalar shared object cache, use
* the gvfs.sharedcache config option to redirect the
* maintenance to that location.
*/
if (!git_config_get_value("gvfs.sharedcache", &tmp_obj_dir) &&
tmp_obj_dir) {
shared_object_dir = xstrdup(tmp_obj_dir);
setenv(DB_ENVIRONMENT, shared_object_dir, 1);
}

I think this is part of what is verified in

git/t/t7900-maintenance.sh

Lines 1126 to 1155 in 5ad6dd6

test_expect_success 'cache-local-objects task success' '
test_when_finished "rm -rf repo cache" &&
mkdir -p cache/pack &&
git init repo &&
(
cd repo &&
test_commit something &&
git config set gvfs.sharedcache ../cache &&
git config set maintenance.gc.enabled false &&
git config set maintenance.cache-local-objects.enabled true &&
git config set maintenance.cache-local-objects.auto 1 &&
test_import_packfile &&
test_get_packdir_files "*.pack" "*.idx" "*.keep" "*.rev" \
>src.txt &&
test_get_loose_object_files >>src.txt &&
sed "s/.git\\/objects\\//..\\/cache\\//" src.txt >dst.txt &&
git maintenance run &&
while IFS= read -r f; do
test_path_is_missing $f || exit 1
done <src.txt &&
while IFS= read -r f; do
test_path_exists $f || exit 1
done <dst.txt
)
'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache-local-objects maintenance task migrates local loose objects and packfiles to the shared cache. Also the incremental-repack task is redirected to the cache directory, so together the maintenance tasks will repack the packs in the cache dir.

The repack built-in however will only act on the local objects. Would it be worth either:
a) reach repack to redirect to the shared object dir instead (if set, like the incremental-repack task), or
b) print the warning, perhaps recommending a run of git maintenance run --task=incremental-repack instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious as to whether git repack will operate on that object directory (my guess is not because its an alternate which Git treats as read-only).

There is an example of a similar scenario in repack already, whereby if an alternate has been configured the bitmaps are not written since "some objects are not being packed".

git/builtin/repack.c

Lines 1292 to 1302 in 5ad6dd6

if (write_bitmaps && po_args.local && has_alt_odb(the_repository)) {
/*
* When asked to do a local repack, but we have
* packfiles that are inherited from an alternate, then
* we cannot guarantee that the multi-pack-index would
* have full coverage of all objects. We thus disable
* writing bitmaps in that case.
*/
warning(_("disabling bitmap writing, as some objects are not being packed"));
write_bitmaps = 0;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. If I put myself into a user's shoes, I would much rather prefer git repack to take care of the shared object directory (and potentially migrate objects there as part of the repack) if I work on a gvfs.sharedCache-enabled repository.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this? (for option b)

diff --git a/builtin/repack.c b/builtin/repack.c
index 01db1b8f529..5a44171f8e8 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -24,6 +24,7 @@
 #include "pack-bitmap.h"
 #include "refs.h"
 #include "list-objects-filter-options.h"
+#include "gvfs.h"
 
 #define ALL_INTO_ONE 1
 #define LOOSEN_UNREACHABLE 2
@@ -1177,6 +1178,7 @@ int cmd_repack(int argc,
 	struct tempfile *refs_snapshot = NULL;
 	int i, ext, ret;
 	int show_progress;
+	const char *tmp_obj_dir = NULL;
 
 	/* variables to be filled by option parsing */
 	int delete_redundant = 0;
@@ -1301,6 +1303,10 @@ int cmd_repack(int argc,
 		write_bitmaps = 0;
 	}
 
+	if (gvfs_config_is_set(NULL) &&
+	    !git_config_get_value("gvfs.sharedcache", &tmp_obj_dir))
+		warning(_("shared object cache will not be repacked; use `git maintenance run --task=incremental-repack` instead"));
+
 	if (write_midx && write_bitmaps) {
 		struct strbuf path = STRBUF_INIT;
 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be the option with the warning. However, as a user I'd much prefer the command to not warn me about something I probably don't want, but to do what I probably want instead, i.e. the other option you suggested ;-)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think users who are bothering to run git repack themselves really want their object data to be reworked. I'm thinking specifically in the cases like git repack -adf that I use to rewrite all objects into a single packfile. We have historically avoided doing that on the shared object cache, but it should work somewhat well due to the integration with partial clone machinery. It's just going to be slow and users should expect that. We're not choosing to run it in the background because of that cost.

test_config core.gvfs 150 &&
git repack
'

test_done
Loading