git.git - The core git plumbing

Age	Commit message (Collapse)	Author	Files	Lines
3 days	Merge branch 'ps/object-read-stream'	Junio C Hamano	5	-27/+27
	The "git_istream" abstraction has been revamped to make it easier to interface with pluggable object database design. * ps/object-read-stream: streaming: drop redundant type and size pointers streaming: move into object database subsystem streaming: refactor interface to be object-database-centric streaming: move logic to read packed objects streams into backend streaming: move logic to read loose objects streams into backend streaming: make the `odb_read_stream` definition public streaming: get rid of `the_repository` streaming: rely on object sources to create object stream packfile: introduce function to read object info from a store streaming: move zlib stream into backends streaming: create structure for filtered object streams streaming: create structure for packed object streams streaming: create structure for loose object streams streaming: create structure for in-core object streams streaming: allocate stream inside the backend-specific logic streaming: explicitly pass packfile info when streaming a packed object streaming: propagate final object type via the stream streaming: drop the `open()` callback function streaming: rename `git_istream` into `odb_read_stream`
5 days	Merge branch 'lo/repo-struct-z'	Junio C Hamano	1	-2/+6
	"git repo struct" learned to take "-z" as a synonym to "--format=nul". * lo/repo-struct-z: repo: add -z as an alias for --format=nul to git-repo-structure repo: use [--format=... \| -z] instead of [-z] in git-repo-info synopsis repo: remove blank line from Documentation/git-repo.adoc
5 days	Merge branch 'kh/advise-w-git-help-in-branch'	Junio C Hamano	1	-1/+1
	A help message from "git branch" now mentions "git help" instead of "man" when suggesting to read some documentation. * kh/advise-w-git-help-in-branch: branch: advice using git-help(1) instead of man(1)
5 days	Merge branch 'js/last-modified-with-sparse-checkouts'	Junio C Hamano	1	-1/+2
	"git last-modified" used to mishandle "--" to mark the beginning of pathspec, which has been corrected. * js/last-modified-with-sparse-checkouts: last-modified: support sparse checkouts
5 days	Merge branch 'tc/last-modified-active-paths-optimization'	Junio C Hamano	1	-1/+1
	Recent optimization to "last-modified" command introduced use of uninitialized block of memory, which has been corrected. * tc/last-modified-active-paths-optimization: last-modified: fix use of uninitialized memory
10 days	Merge branch 'en/replay-doc-revision-range'	Junio C Hamano	1	-1/+1
	The use of "revision" (a connected set of commits) has been clarified in the "git replay" documentation. * en/replay-doc-revision-range: Documentation/git-replay.adoc: fix errors around revision range
10 days	Merge branch 'pw/replay-exclude-gpgsig-fix'	Junio C Hamano	1	-1/+1
	"git replay" forgot to omit the "gpgsig-sha256" extended header from the resulting commit the same way it omits "gpgsig", which has been corrected. * pw/replay-exclude-gpgsig-fix: replay: do not copy "gpgsign-sha256" header
2025-12-05	Merge branch 'rs/config-set-multi-error-message-fix'	Junio C Hamano	1	-1/+1
	The error message given by "git config set", when the variable being updated has more than one values defined, used old style "git config" syntax with an incorrect option in its hint, both of which have been corrected. * rs/config-set-multi-error-message-fix: config: fix suggestion for failed set of multi-valued option
2025-12-05	Merge branch 'rs/config-unset-opthelp-fix'	Junio C Hamano	1	-2/+2
	The option help text given by "git config unset -h" described the "--all" option to "replace", not "unset", multiple variables, which has been corrected. * rs/config-unset-opthelp-fix: config: fix short help of unset flags
2025-12-05	Merge branch 'ps/object-source-management'	Junio C Hamano	7	-9/+24
	Code refactoring around object database sources. * ps/object-source-management: odb: handle recreation of quarantine directories odb: handle changing a repository's commondir chdir-notify: add function to unregister listeners odb: handle initialization of sources in `odb_new()` http-push: stop setting up `the_repository` for each reference t/helper: stop setting up `the_repository` repeatedly builtin/index-pack: fix deferred fsck outside repos oidset: introduce `oidset_equal()` odb: move logic to disable ref updates into repo odb: refactor `odb_clear()` to `odb_free()` odb: adopt logic to close object databases setup: convert `set_git_dir()` to have file scope path: move `enter_repo()` into "setup.c"
2025-12-05	Merge branch 'cc/fast-import-strip-if-invalid'	Junio C Hamano	2	-17/+93
	"git fast-import" learns "--strip-if-invalid" option to drop invalid cryptographic signature from objects. * cc/fast-import-strip-if-invalid: fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode> commit: refactor verify_commit_buffer() fast-import: refactor finalize_commit_buffer()
2025-12-05	Merge branch 'jc/optional-path'	Junio C Hamano	3	-12/+41
	"git config get --path" segfaulted on an ":(optional)path" that does not exist, which has been corrected. * jc/optional-path: config: really treat missing optional path as not configured config: really pretend missing :(optional) value is not there config: mark otherwise unused function as file-scope static
2025-12-05	repo: add -z as an alias for --format=nul to git-repo-structure	Lucas Seiki Oshiro	1	-1/+5
	Other Git commands that have nul-terminated output, such as git-config, git-status, git-ls-files, and git-repo-info have a flag `-z` for using the null character as the record separator. Add the `-z` flag to git-repo-structure as an alias for `--format=nul`, making it consistent with the behavior of the other commands. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-05	repo: use [--format=... \| -z] instead of [-z] in git-repo-info synopsis	Lucas Seiki Oshiro	1	-1/+1
	The flag -z is only an alias for --format=null and even though --format and -z can be used together and repeated, only the last one is considered. Replace `[-z]` in the synopsis of git-repo-info by `[--format=... \| -z]`, expliciting that the use of one of those flags replace the other. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03	last-modified: support sparse checkouts	Johannes Schindelin	1	-1/+2
	In a sparse checkout, a user might want to run `last-modified` on a directory outside the worktree. And even in non-sparse checkouts, a user might need to run that command on a directory that does not exist in the worktree. These use cases should be supported via the `--` separator between revision and file arguments, which is even advertised in the documentation. This patch fixes a tiny bug that prevents that from working. This fixes https://github.com/git-for-windows/git/issues/5978 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Derrick Stolee <stolee@gmail.com> Acked-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03	branch: advice using git-help(1) instead of man(1)	Kristoffer Haugsbakk	1	-1/+1
	8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added an advice about checking git-check-ref-format(1) for the ref syntax rules. The advice uses man(1). But git(1) is a multi-platform tool and man(1) may not be available on some platforms. It might also be slightly jarring to see a suggestion for running a command which is not from the Git suite. Let’s instead use git-help(1) in order to stay inside the land of git(1). This also means that `help.format` (for `man`, `html` or other formats) will be used if set. Also change to using single quotes (') to quote the command since that is more conventional. While here let’s also update the test to use `{SQ}`, which is more readable and easier to edit. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-30	Merge branch 'ja/doc-synopsis-style'	Junio C Hamano	1	-1/+1
	Doc mark-up updates. * ja/doc-synopsis-style: doc: pull-fetch-param typofix doc: convert git push to synopsis style doc: convert git pull to synopsis style doc: convert git fetch to synopsis style
2025-11-30	Merge branch 'lo/repo-info-all'	Junio C Hamano	1	-18/+45
	"git repo info" learned "--all" option. * lo/repo-info-all: repo: add --all to git-repo-info repo: factor out field printing to dedicated function
2025-11-29	last-modified: fix use of uninitialized memory	Toon Claes	1	-1/+1
	git-last-modified(1) uses a scratch bitmap to keep track of paths that have been changed between commits. To avoid reallocating a bitmap on each call of process_parent(), the scratch bitmap is kept and reused. Although, it seems an incorrect length is passed to memset(3). `struct bitmap` uses `eword_t` to for internal storage. This type is typedef'd to uint64_t. To fully zero the memory used by the bitmap, multiply the length (saved in `struct bitmap::word_alloc`) by the size of `eword_t`. Reported-by: Anders Kaseorg <andersk@mit.edu> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-28	Documentation/git-replay.adoc: fix errors around revision range	Elijah Newren	1	-1/+1
	There was significant confusion in the git-replay manual about what constitutes a revision range. As noted in f302c1e4aa09 (revisions(7): clarify that most commands take a single revision range, 2021-05-18): Commands that are specifically designed to take two distinct ranges (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they are exceptions. Unless otherwise noted, all "git" commands that operate on a set of commits work on a single revision range. `git replay` is not an exception, but a few places in the manual were written as though it were. These appear to have come in revisions to the original series, between v3->v4 (see https://lore.kernel.org/git/CAP8UFD3bpLrVW97DH7j=V9H2GsTSAkksC9L3QujQERFk_kLnZA@mail.gmail.com/ , "More than one <revision-range> can be passed") and between v6->v7 (https://lore.kernel.org/git/20231115143327.2441397-1-christian.couder@gmail.com/, "Takes ranges of commits"), and I missed both of these revisions when reviewing. Fix them now. There was also a reference to the "Commit Limiting options below", but this page has no such section of options; strike the misleading reference. It is worth noting that we are documenting existing behavior, rather than optimal behavior. Junio has multiple times suggested introducing alternative ways to walk revisions and use them in `git replay --advance`, e.g. at * https://lore.kernel.org/git/xmqqy1mqo6kv.fsf@gitster.g/ * https://lore.kernel.org/git/xmqq8rb3is8c.fsf@gitster.g/ * https://lore.kernel.org/git/xmqqtsydj2zk.fsf@gitster.g/ (item (2)) If/when we introduce some new revision walking flag that implements one of these alternate types of revision walks, we can update the --advance option and this manual appropriately. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26	Merge branch 'pw/worktree-list-display-width-fix'	Junio C Hamano	1	-12/+29
	"git worktree list" attempts to show paths to worktrees while aligning them, but miscounted display columns for the paths when non-ASCII characters were involved, which has been corrected. * pw/worktree-list-display-width-fix: worktree list: quote paths worktree list: fix column spacing
2025-11-26	Merge branch 'ad/blame-diff-algorithm'	Junio C Hamano	1	-1/+51
	"git blame" learns "--diff-algorithm=<algo>" option. * ad/blame-diff-algorithm: blame: make diff algorithm configurable xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-26	replay: do not copy "gpgsign-sha256" header	Phillip Wood	1	-1/+1
	When "git replay" replays a commit it copies the extended headers across from the original commit. However, if the original commit was signed, we do not want to copy the header associated with the signature is it wont be valid for the new commit. The code already knows to avoid coping the "gpgsig" header but does not know to avoid copying the "gpgsig-sha256" header. Add that header to the list of exclusions to match what "git commit --amend" does. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26	fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>	Christian Couder	2	-14/+83
	Tools like `git filter-repo`[1] use `git fast-export` and `git fast-import` to rewrite repository history. When rewriting history using one such tool though, commit signatures might become invalid because the commits they sign changed due to the changes in the repository history made by the tool between the fast-export and the fast-import steps. Note that as far as signature handling goes: * Since fast-export doesn't know what changes filter-repo may make to the stream, it can't know whether the signatures will still be valid. * Since filter-repo doesn't know what history canonicalizations fast-export performed (and it performs a few), it can't know whether the signatures will still be valid. * Therefore, fast-import is the only process in the pipeline that can know whether a specified signature remains valid. Having invalid signatures in a rewritten repository could be confusing, so users rewritting history might prefer to simply discard signatures that are invalid at the fast-import step. For example a common use case is to rewrite only "recent" history. While specifying commit ranges corresponding to "recent" commits could work, users worry about getting it wrong and want to just automatically rewrite everything, expecting older commit signatures to be untouched. To let them do that, let's add a new 'strip-if-invalid' mode to the `--signed-commits=<mode>` option of `git fast-import`. It would be interesting for the `--signed-tags=<mode>` option to have this mode too, but we leave that for a future improvement. It might also be possible for `git fast-export` to have such a mode in its `--signed-commits=<mode>` and `--signed-tags=<mode>` options, but the use cases for it are much less clear, so we also leave that for possible future improvements. For now let's just die() if 'strip-if-invalid' is passed to these options where it hasn't been implemented yet. [1]: https://github.com/newren/git-filter-repo Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25	builtin/index-pack: fix deferred fsck outside repos	Patrick Steinhardt	1	-3/+18
	When asked to perform object consistency checks via the `--fsck-objects` flag we verify that each object part of the pack is valid. In general, this check can even be performed outside of a Git repository: we don't need an initialized object database as we simply read the object from the packfile directly. But there's one exception: a subset of the object checks may be deferred to a later point in time. For now, this only concerns ".gitmodules" and ".gitattributes" files: whenever we see a tree referencing these files we queue them for a deferred check. This is done because we need to do some extra checks for those files to ensure that they are well-formed, and these checks need to be done regardless of whether the corresponding blobs are part of the packfile or not. This works inside a repository, but unfortunately the logic leads to a segfault when running outside of one. This is because we eventually call `odb_read_object()`, which will crash because the object database has not been initialized. There's multiple options here: - We could in theory create a purely in-memory database with only a packfile store that contains the single packfile. We don't really have the infrastructure for this yet though, and it would end up being quite hacky. - We could refuse to perform consistency checks outside of a repository. But most of the checks work alright, so this would be a regression. - We can skip the finalizing consistency checks when running outside of a repository. This is not as invasive as skipping all checks, but it's not great to randomly skip a subset of tests, either. None of these options really feel perfect. The first one would be the obvious choice if easily possible. There's another option though: instead of skipping the final object checks, we can die if there are any queued object checks. With this change we now die exactly if and only if we would have previously segfaulted. Like this we ensure that objects that _may_ fail the consistency checks won't be silently skipped, and at the same time we give users a much better error message. Refactor the code accordingly and add a test that would have triggered the segfault. Note that we also move down the logic to add the packfile to the store. There is no point doing this any earlier than right before we execute `fsck_finish()`, and it ensures that the logic to set up and perform the consistency check is self-contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24	config: really treat missing optional path as not configured	Junio C Hamano	2	-3/+5
	These callers expect that git_config_pathname() that returns 0 is a signal that the variable they passed has a string they need to act on. But with the introduction of ":(optional)path" earlier, that is no longer the case. If the path specified by the configuration variable is missing, their variable will get a NULL in it, and they need to act on it (often, just refraining from copying it elsewhere). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24	config: really pretend missing :(optional) value is not there	Junio C Hamano	1	-9/+36
	Earlier we added support for a value spelled as ":(optional)path" for configuration variables whose values are of type "path", with the documented semantics "if the path is missing, behave as if such a variable definition is not even there." This has worked OK for code paths that reads configuration files and stores the configured value as a string, where NULL in such a string is treated as if the setting is not there, left as the default. However, there are other code paths that do not _ignore_ such NULL values and misbehave. "git config get --path" is one of them. When git_config_pathname() helper function finds that the value of the variable is an optional path and the path is missing, it leaves the destination pointer intact (which usually is left to NULL) and returns 0 to signal a success. format_config() helper however assumed that the destination pointer always gets a string, which no longer is the case, and segfaulted. Make sure that git_config_pathname() clears the destination pointer in such a case, and teach format_config() to react to the condition by returning 1 (which is different from 0 that is a normal success and negative that is an error) to its callers. Adjust the callers to react to this new return value that tells them to pretend as if they did not even see this partcular <key, value> pair. Reported-by: Han Jiang <jhcarl0814@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24	Merge branch 'jx/repo-struct-utf8width-fix'	Junio C Hamano	1	-4/+17
	The "git repo structure" subcommand tried to align its output but mixed up byte count and display column width, which has been corrected. * jx/repo-struct-utf8width-fix: builtin/repo: fix table alignment for UTF-8 characters t/unit-tests: add UTF-8 width tests for CJK chars
2025-11-24	Merge branch 'ps/object-source-loose'	Junio C Hamano	2	-6/+5
	A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-24	Merge branch 'sa/replay-atomic-ref-updates'	Junio C Hamano	1	-13/+120
	"git replay" (experimental) learned to perform ref updates itself in a transaction by default, instead of emitting where each refs should point at and leaving the actual update to another command. * sa/replay-atomic-ref-updates: replay: add replay.refAction config option replay: make atomic ref updates the default behavior replay: use die_for_incompatible_opt2() for option validation
2025-11-24	config: fix short help of unset flags	René Scharfe	1	-2/+2
	The flags --all and --value of "git config unset" don't make the command "replace" or "show" anything, they are about selecting what to unset. Change their help text accordingly. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24	config: fix suggestion for failed set of multi-valued option	René Scharfe	1	-1/+1
	The command "git config set <name> <value>" fails for an option that has multiple values. List the "git config set" flags that can be used, instead of old-style "git config" actions. Reported-by: Paul Wintz <pwintz@ucsc.edu> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23	streaming: drop redundant type and size pointers	Patrick Steinhardt	2	-7/+6
	In the preceding commits we have turned `struct odb_read_stream` into a publicly visible structure. Furthermore, this structure now contains the type and size of the object that we are about to stream. Consequently, the out-pointers that we used before to propagate the type and size of the streamed object are now somewhat redundant with the data contained in the structure itself. Drop these out-pointers and adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23	streaming: move into object database subsystem	Patrick Steinhardt	5	-5/+5
	The "streaming" terminology is somewhat generic, so it may not be immediately obvious that "streaming.{c,h}" is specific to the object database. Rectify this by moving it into the "odb/" directory so that it can be immediately attributed to the object subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23	streaming: refactor interface to be object-database-centric	Patrick Steinhardt	2	-11/+11
	Refactor the streaming interface to be centered around object databases instead of centered around the repository. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23	streaming: get rid of `the_repository`	Patrick Steinhardt	3	-4/+5
	Subsequent commits will move the backend-specific logic of object streaming into their respective subsystems. These subsystems have gotten rid of `the_repository` already, but we still use it in two locations in the streaming subsystem. Prepare for the move by fixing those two cases. Converting the logic in `open_istream_pack_non_delta()` is trivial as we already got the object database as input. But for `stream_blob_to_fd()` we have to add a new parameter to make it accessible. So, as we already have to adjust all callers anyway, rename the function to `odb_stream_blob_to_fd()` to indicate it's part of the object subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23	streaming: rename `git_istream` into `odb_read_stream`	Patrick Steinhardt	2	-3/+3
	In the following patches we are about to make the `git_istream` more generic so that it becomes fully controlled by the specific object source that wants to create it. As part of these refactorings we'll fully move the structure into the object database subsystem. Prepare for this change by renaming the structure from `git_istream` to `odb_read_stream`. This mirrors the `odb_write_stream` structure that we already have. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-21	Merge branch 'kn/maintenance-is-needed'	Junio C Hamano	1	-10/+83
	"git maintenance" command learned "is-needed" subcommand to tell if it is necessary to perform various maintenance tasks. * kn/maintenance-is-needed: maintenance: add 'is-needed' subcommand maintenance: add checking logic in `pack_refs_condition()` refs: add a `optimize_required` field to `struct ref_storage_be` reftable/stack: add function to check if optimization is required reftable/stack: return stack segments directly
2025-11-19	odb: adopt logic to close object databases	Patrick Steinhardt	3	-3/+3
	The logic to close an object database is currently contained in the packfile subsystem. That choice is somewhat relatable, as most of the logic really is to close resources associated with the packfile store itself. But we also end up handling object sources and commit graphs, which certainly is not related to packfiles. Move the function into the object database subsystem and rename it to `odb_close()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19	path: move `enter_repo()` into "setup.c"	Patrick Steinhardt	3	-3/+3
	The function `enter_repo()` is used to enter a repository at a given path. As such it sits way closer to setting up a repository than it does with handling paths, but regardless of that it's located in "path.c" instead of in "setup.c". Move the function into "setup.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19	Merge branch 'ps/object-source-loose' into ps/object-source-management	Junio C Hamano	2	-6/+5
	A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-19	Merge branch 'ps/object-source-loose' into ps/object-read-stream	Junio C Hamano	2	-6/+5
	A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-19	doc: convert git fetch to synopsis style	Jean-Noël Avila	1	-1/+1
	- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19	Merge branch 'ps/ref-peeled-tags'	Junio C Hamano	20	-197/+156
	Some ref backend storage can hold not just the object name of an annotated tag, but the object name of the object the tag points at. The code to handle this information has been streamlined. * ps/ref-peeled-tags: t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn`
2025-11-19	Merge branch 'ps/packed-git-in-object-store'	Junio C Hamano	2	-21/+20
	The list of packfiles used in a running Git process is moved from the packed_git structure into the packfile store. * ps/packed-git-in-object-store: packfile: track packs via the MRU list exclusively packfile: always add packfiles to MRU when adding a pack packfile: move list of packs into the packfile store builtin/pack-objects: simplify logic to find kept or nonlocal objects packfile: fix approximation of object counts http: refactor subsystem to use `packfile_list`s packfile: move the MRU list into the packfile store packfile: use a `strmap` to store packs by name
2025-11-18	repo: add --all to git-repo-info	Lucas Seiki Oshiro	1	-2/+27
	Add a new flag `--all` to git-repo-info for requesting values for all the available keys. By using this flag, the user can retrieve all the values instead of searching what are the desired keys for what they wants. Helped-by: Karthik Nayak <karthik.188@gmail.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18	repo: factor out field printing to dedicated function	Lucas Seiki Oshiro	1	-16/+18
	Move the field printing in git-repo-info to a new function called `print_field`, allowing it to be called by functions other than `print_fields`. Also change its use of quote_c_style() helper to output directly to the standard output stream, instead of taking a result in a strbuf and then printing it outselves. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18	worktree list: quote paths	Phillip Wood	1	-2/+8
	If a worktree path contains newlines or other control characters it messes up the output of "git worktree list". Fix this by using quote_path() to display the worktree path. The output of "git worktree list" is designed for human consumption, scripts should be using the "--porcelain" option so this change should not break them. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18	worktree list: fix column spacing	Phillip Wood	1	-12/+23
	The output of "git worktree list" displays a table containing the worktree path, HEAD OID and branch name for each worktree. The code aligns the columns by measuring the visual width of the worktree path when it is printed. Unfortunately it fails to use the visual width when calculating the width of the column so, if any of the paths contain a multibyte character, we can end up with excess padding between columns. The simplest fix would be to replace strlen() with utf8_strwidth() in measure_widths(). However that leaves us measuring the visual width twice and the byte length once. By caching the visual width and printing the padding separately to the worktree path, we only need to calculate the visual width once and do not need the byte length at all. The visual widths are stored in an arrays of structs rather than an array of ints as the next commit will add more struct members. Even if there are no multibyte characters in any of the paths we still print an extra space between the path and the object id as the field width is calculated as one plus the length of the path and we print an explicit space as well. This is fixed by not printing the extra space. The tests are updated to include multibyte characters in one of the worktree paths and to check the spacing of the columns. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17	blame: make diff algorithm configurable	Antonin Delpeuch	1	-1/+51
	The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16	fast-import: refactor finalize_commit_buffer()	Christian Couder	1	-4/+13
	In a following commit we are going to finalize commit buffers with or without signatures in order to check the signatures and possibly drop them. To do so easily and without duplication, let's refactor the current code that finalizes commit buffers into a new finalize_commit_buffer() function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16	builtin/repo: fix table alignment for UTF-8 characters	Jiang Xin	1	-4/+17
	The output table from "git repo structure" is misaligned when displaying UTF-8 characters (e.g., non-ASCII glyphs). E.g.: \| 仓库结构 \| 值 \| \| -------------- \| ---- \| \| * 引用 \| \| \| * 计数 \| 67 \| The previous implementation used simple width formatting with printf() which didn't properly handle multi-byte UTF-8 characters, causing misaligned table columns when displaying repository structure information. This change modifies the stats_table_print_structure function to use strbuf_utf8_align() instead of basic printf width specifiers. This ensures proper column alignment regardless of the character encoding of the content being displayed. Also add test cases for strbuf_utf8_align(), a function newly introduced in "builtin/repo.c". Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12	Merge branch 'tc/last-modified-active-paths-optimization'	Junio C Hamano	1	-15/+235
	"git last-modified" was optimized by narrowing the set of paths to follow as it dug deeper in the history. * tc/last-modified-active-paths-optimization: last-modified: implement faster algorithm
2025-11-10	maintenance: add 'is-needed' subcommand	Karthik Nayak	1	-1/+62
	The 'git-maintenance(1)' command provides tooling to run maintenance tasks over Git repositories. The 'run' subcommand, as the name suggests, runs the maintenance tasks. When used with the '--auto' flag, it uses heuristics to determine if the required thresholds are met for running said maintenance tasks. There is however a lack of insight into these heuristics. Meaning, the checks are linked to the execution. Add a new 'is-needed' subcommand to 'git-maintenance(1)' which allows users to simply check if it is needed to run maintenance without performing it. This subcommand can check if it is needed to run maintenance without actually running it. Ideally it should be used with the '--auto' flag, which would allow users to check if the thresholds required are met. The subcommand also supports the '--task' flag which can be used to check specific maintenance tasks. While adding the respective tests in 't/t7900-maintenance.sh', remove a duplicate of the test: 'worktree-prune task with --auto honors maintenance.worktree-prune.auto'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-10	maintenance: add checking logic in `pack_refs_condition()`	Karthik Nayak	1	-9/+21
	The 'git-maintenance(1)' command supports an '--auto' flag. Usage of the flag ensures to run maintenance tasks only if certain thresholds are met. The heuristic is defined on a task level, wherein each task defines an 'auto_condition', which states if the task should be run. The 'pack-refs' task is hard-coded to return 1 as: 1. There was never a way to check if the reference backend needs to be optimized without actually performing the optimization. 2. We can pass in the '--auto' flag to 'git-pack-refs(1)' which would optimize based on heuristics. The previous commit added a `refs_optimize_required()` function, which can be used to check if a reference backend required optimization. Use this within `pack_refs_condition()`. This allows us to add a 'git maintenance is-needed' subcommand which can notify the user if maintenance is needed without actually performing the optimization. Without this change, the reference backend would always state that optimization is needed. Since we import 'revision.h', we need to remove the definition for 'SEEN' which is duplicated in the included header. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-06	Merge branch 'cc/fast-import-export-i18n-cleanup'	Junio C Hamano	2	-179/+180
	Messages from fast-import/export are now marked for i18n. * cc/fast-import-export-i18n-cleanup: gpg-interface: mark a string for translation fast-import: mark strings for translation fast-export: mark strings for translation gpg-interface: use left shift to define GPG_VERIFY_* gpg-interface: simplify ssh fingerprint parsing
2025-11-05	Merge branch 'rz/t0450-bisect-doc-update'	Junio C Hamano	1	-8/+13
	The help text and manual page of "git bisect" command have been made consistent with each other. * rz/t0450-bisect-doc-update: bisect: update usage and docs to match each other
2025-11-05	replay: add replay.refAction config option	Siddharth Asthana	1	-4/+20
	Add a configuration variable to control the default behavior of git replay for updating references. This allows users who prefer the traditional pipeline output to set it once in their config instead of passing --ref-action=print with every command. The config variable uses string values that mirror the behavior modes: * replay.refAction = update (default): atomic ref updates * replay.refAction = print: output commands for pipeline Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Elijah Newren <newren@gmail.com> Helped-by: Christian Couder <christian.couder@gmail.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-05	replay: make atomic ref updates the default behavior	Siddharth Asthana	1	-10/+101
	The git replay command currently outputs update commands that can be piped to update-ref to achieve a rebase, e.g. git replay --onto main topic1..topic2 \| git update-ref --stdin This separation had advantages for three special cases: * it made testing easy (when state isn't modified from one step to the next, you don't need to make temporary branches or have undo commands, or try to track the changes) * it provided a natural can-it-rebase-cleanly (and what would it rebase to) capability without automatically updating refs, similar to a --dry-run * it provided a natural low-level tool for the suite of hash-object, mktree, commit-tree, mktag, merge-tree, and update-ref, allowing users to have another building block for experimentation and making new tools However, it should be noted that all three of these are somewhat special cases; users, whether on the client or server side, would almost certainly find it more ergonomic to simply have the updating of refs be the default. For server-side operations in particular, the pipeline architecture creates process coordination overhead. Server implementations that need to perform rebases atomically must maintain additional code to: 1. Spawn and manage a pipeline between git-replay and git-update-ref 2. Coordinate stdout/stderr streams across the pipe boundary 3. Handle partial failure states if the pipeline breaks mid-execution 4. Parse and validate the update-ref command output Change the default behavior to update refs directly, and atomically (at least to the extent supported by the refs backend in use). This eliminates the process coordination overhead for the common case. For users needing the traditional pipeline workflow, add a new --ref-action=<mode> option that preserves the original behavior: git replay --ref-action=print --onto main topic1..topic2 \| git update-ref --stdin The mode can be: * update (default): Update refs directly using an atomic transaction * print: Output update-ref commands for pipeline use Test suite changes: All existing tests that expected command output now use --ref-action=print to preserve their original behavior. This keeps the tests valid while allowing them to verify that the pipeline workflow still works correctly. New tests were added to verify: - Default atomic behavior (no output, refs updated directly) - Bare repository support (server-side use case) - Equivalence between traditional pipeline and atomic updates - Real atomicity using a lock file to verify all-or-nothing guarantee - Test isolation using test_when_finished to clean up state - Reflog messages include replay mode and target A following commit will add a replay.refAction configuration option for users who prefer the traditional pipeline output as their default behavior. Helped-by: Elijah Newren <newren@gmail.com> Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Christian Couder <christian.couder@gmail.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-05	replay: use die_for_incompatible_opt2() for option validation	Siddharth Asthana	1	-3/+3
	In preparation for adding the --ref-action option, convert option validation to use die_for_incompatible_opt2(). This helper provides standardized error messages for mutually exclusive options. The following commit introduces --ref-action which will be incompatible with certain other options. Using die_for_incompatible_opt2() now means that commit can cleanly add its validation using the same pattern, keeping the validation logic consistent and maintainable. This also aligns git-replay's option handling with how other Git commands manage option conflicts, using the established die_for_incompatible_opt*() helper family. Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04	Merge branch 'jt/repo-structure'	Junio C Hamano	1	-3/+377
	"git repo structure", a new command. * jt/repo-structure: builtin/repo: add progress meter for structure stats builtin/repo: add keyvalue and nul format for structure stats builtin/repo: add object counts in structure output builtin/repo: introduce structure subcommand ref-filter: export ref_kind_from_refname() ref-filter: allow NULL filter pattern builtin/repo: rename repo_info() to cmd_repo_info()
2025-11-04	Merge branch 'ps/ref-peeled-tags' into kn/maintenance-is-needed	Junio C Hamano	20	-194/+527
	* ps/ref-peeled-tags: (23 commits) t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn` builtin/repo: add progress meter for structure stats builtin/repo: add keyvalue and nul format for structure stats builtin/repo: add object counts in structure output builtin/repo: introduce structure subcommand ...
2025-11-04	builtin/show-ref: convert to use `reference_get_peeled_oid()`	Patrick Steinhardt	1	-13/+19
	The git-show-ref(1) command has multiple different modes: - It knows to show all references matching a pattern. - It knows to list all references that are an exact match to whatever the user has provided. - It knows to check for reference existence. The first two commands use mostly the same infrastructure to print the references via `show_one()`. But while the former mode uses a proper iterator and thus has a `struct reference` available in its context, the latter calls `refs_read_ref()` and thus doesn't. Consequently, we cannot easily use `reference_get_peeled_oid()` to print the peeled value. Adapt the code so that we manually construct a `struct reference` when verifying refs. We wouldn't ever have the peeled value available anyway as we're not using an iterator here, so we can simply plug in the values we _do_ have. With this change we now have a `struct reference` available at both callsites of `show_one()` and can thus pass it, which allows us to use `reference_get_peeled_oid()` instead of `peel_iterated_oid()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04	ref-filter: propagate peeled object ID	Patrick Steinhardt	3	-3/+3
	When queueing a reference in the "ref-filter" subsystem we end up creating a new ref array item that contains the reference's info. One bit of info that we always discard though is the peeled object ID, and because of that we are forced to use `peel_iterated_oid()`. Refactor the code to propagate the peeled object ID via the ref array, if available. This allows us to manually peel tags without having to go through the object database. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04	refs: expose peeled object ID via the iterator	Patrick Steinhardt	3	-5/+6
	Both the "files" and "reftable" backend are able to store peeled values for tags in the respective formats. This allows for a more efficient lookup of the target object of such a tag without having to manually peel via the object database. The infrastructure to access these peeled object IDs is somewhat funky though. When iterating through objects, we store a pointer reference to the current iterator in a global variable. The callbacks invoked by that iterator are then expected to call `peel_iterated_oid()`, which checks whether the globally-stored iterator's current reference refers to the one handed into that function. If so, we ask the iterator to peel the object, otherwise we manually peel the object via the object database. Depending on global state like this is somewhat weird and also quite fragile. Introduce a new `struct reference::peeled_oid` field that can be populated by the reference backends. This field can be accessed via a new function `reference_get_peeled_oid()` that either uses that value, if set, or alternatively peels via the ODB. With this change we don't have to rely on global state anymore, but make the peeled object ID available to the callback functions directly. Adjust trivial callers that already have a `struct reference` available. Remaining callers will be adjusted in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04	refs: introduce wrapper struct for `each_ref_fn`	Patrick Steinhardt	17	-182/+134
	The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-03	object-file: refactor writing objects via a stream	Patrick Steinhardt	1	-4/+3
	We have two different ways to write an object into the database: - We either provide the full buffer and write the object all at once. - Or we provide an input stream that has a `read()` function so that we can chunk the object. The latter is especially used for large objects, where it may be too expensive to hold the complete object in memory all at once. While we already have `odb_write_object()` at the ODB-layer, we don't have an equivalent for streaming an object. Introduce a new function `odb_write_object_stream()` to address this gap so that callers don't have to be aware of the inner workings of how to stream an object to disk with a specific object source. Rename `stream_loose_object()` to `odb_source_loose_write_stream()` to clarify its scope. This matches our modern best practices around how to name functions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-03	object-file: rename `has_loose_object()`	Patrick Steinhardt	1	-2/+2
	Rename `has_loose_object()` to `odb_source_loose_has_object()` so that it becomes clear that this is tied to a specific loose object source. This matches our modern naming schema for functions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-03	last-modified: implement faster algorithm	Toon Claes	1	-15/+235
	The current implementation of git-last-modified(1) works by doing a revision walk, and inspecting the diff at each level of that walk to annotate entries remaining in the hashmap of paths. In other words, if the diff at some level touches a path which has not yet been associated with a commit, then that commit becomes associated with the path. While a perfectly reasonable implementation, it can perform poorly in either one of two scenarios: 1. There are many entries of interest, in which case there is simply a lot of work to do. 2. Or, there are (even a few) entries which have not been updated in a long time, and so we must walk through a lot of history in order to find a commit that touches that path. This patch rewrites the last-modified implementation that addresses the second point. The idea behind the algorithm is to propagate a set of 'active' paths (a path is 'active' if it does not yet belong to a commit) up to parents and do a truncated revision walk. The walk is truncated because it does not produce a revision for every change in the original pathspec, but rather only for active paths. More specifically, consider a priority queue of commits sorted by generation number. First, enqueue the set of boundary commits with all paths in the original spec marked as interesting. Then, while the queue is not empty, do the following: 1. Pop an element, say, 'c', off of the queue, making sure that 'c' isn't reachable by anything in the '--not' set. 2. For each parent 'p' (with index 'parent_i') of 'c', do the following: a. Compute the diff between 'c' and 'p'. b. Pass any active paths that are TREESAME from 'c' to 'p'. c. If 'p' has any active paths, push it onto the queue. 3. Any path that remains active on 'c' is associated to that commit. This ends up being equivalent to doing something like 'git log -1 -- $path' for each path simultaneously. But, it allows us to go much faster than the original implementation by limiting the number of diffs we compute, since we can avoid parts of history that would have been considered by the revision walk in the original implementation, but are known to be uninteresting to us because we have already marked all paths in that area to be inactive. To avoid computing many first-parent diffs, add another trick on top of this and check if all paths active in 'c' are DEFINITELY NOT in c's Bloom filter. Since the commit-graph only stores first-parent diffs in the Bloom filters, we can only apply this trick to first-parent diffs. Comparing the performance of this new algorithm shows about a 2.5x improvement on git.git: Benchmark 1: master no bloom Time (mean ± σ): 2.868 s ± 0.023 s [User: 2.811 s, System: 0.051 s] Range (min … max): 2.847 s … 2.926 s 10 runs Benchmark 2: master with bloom Time (mean ± σ): 949.9 ms ± 15.2 ms [User: 907.6 ms, System: 39.5 ms] Range (min … max): 933.3 ms … 971.2 ms 10 runs Benchmark 3: HEAD no bloom Time (mean ± σ): 782.0 ms ± 6.3 ms [User: 740.7 ms, System: 39.2 ms] Range (min … max): 776.4 ms … 798.2 ms 10 runs Benchmark 4: HEAD with bloom Time (mean ± σ): 307.1 ms ± 1.7 ms [User: 276.4 ms, System: 29.9 ms] Range (min … max): 303.7 ms … 309.5 ms 10 runs Summary HEAD with bloom ran 2.55 ± 0.02 times faster than HEAD no bloom 3.09 ± 0.05 times faster than master with bloom 9.34 ± 0.09 times faster than master no bloom In short, the existing implementation is comparably fast with Bloom filters as the new implementation is without Bloom filters. So, most repositories should get a dramatic speed-up by just deploying this (even without computing Bloom filters), and all repositories should get faster still when computing Bloom filters. When comparing a more extreme example of `git last-modified -- COPYING t`, the difference is even 5 times better: Benchmark 1: master Time (mean ± σ): 4.372 s ± 0.057 s [User: 4.286 s, System: 0.062 s] Range (min … max): 4.308 s … 4.509 s 10 runs Benchmark 2: HEAD Time (mean ± σ): 826.3 ms ± 22.3 ms [User: 784.1 ms, System: 39.2 ms] Range (min … max): 810.6 ms … 881.2 ms 10 runs Summary HEAD ran 5.29 ± 0.16 times faster than master As an added benefit, results are more consistent now. For example implementation in 'master' gives: $ git log --max-count=1 --format=%H -- pkt-line.h 15df15fe07ef66b51302bb77e393f3c5502629de $ git last-modified -- pkt-line.h 15df15fe07ef66b51302bb77e393f3c5502629de pkt-line.h $ git last-modified \| grep pkt-line.h 5b49c1af03e600c286f63d9d9c9fb01403230b9f pkt-line.h With the changes in this patch the results of git-last-modified(1) always match those of `git log --max-count=1`. One thing to note though, the results might be outputted in a different order than before. This is not considerd to be an issue because nowhere is documented the order is guaranteed. Based-on-patches-by: Derrick Stolee <stolee@gmail.com> Based-on-patches-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Toon Claes <toon@iotcl.com> Acked-by: Taylor Blau <me@ttaylorr.com> [jc: tweaked use of xcalloc() to unbreak coccicheck] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-03	Merge branch 'ps/maintenance-geometric'	Junio C Hamano	1	-55/+258
	"git maintenance" command learns the "geometric" strategy where it avoids doing maintenance tasks that rebuilds everything from scratch. * ps/maintenance-geometric: t7900: fix a flaky test due to git-repack always regenerating MIDX builtin/maintenance: introduce "geometric" strategy builtin/maintenance: make "gc" strategy accessible builtin/maintenance: extend "maintenance.strategy" to manual maintenance builtin/maintenance: run maintenance tasks depending on type builtin/maintenance: improve readability of strategies builtin/maintenance: don't silently ignore invalid strategy builtin/maintenance: make the geometric factor configurable builtin/maintenance: introduce "geometric-repack" task builtin/gc: make `too_many_loose_objects()` reusable without GC config builtin/gc: remove global `repack` variable
2025-10-30	Merge branch 'rz/bisect-help-unknown'	Junio C Hamano	1	-1/+5
	"git bisect" command did not react correctly to "git bisect help" and "git bisect unknown", which has been corrected. * rz/bisect-help-unknown: bisect: fix handling of `help` and invalid subcommands
2025-10-30	Merge branch 'ps/remove-packfile-store-get-packs'	Junio C Hamano	8	-52/+31
	Two slightly different ways to get at "all the packfiles" in API has been cleaned up. * ps/remove-packfile-store-get-packs: packfile: rename `packfile_store_get_all_packs()` packfile: introduce macro to iterate through packs packfile: drop `packfile_store_get_packs()` builtin/grep: simplify how we preload packs builtin/gc: convert to use `packfile_store_get_all_packs()` object-name: convert to use `packfile_store_get_all_packs()`
2025-10-30	Merge branch 'ey/commit-graph-changed-paths-config'	Junio C Hamano	1	-0/+2
	A new configuration variable commitGraph.changedPaths allows to turn "--changed-paths" on by default for "git commit-graph". * ey/commit-graph-changed-paths-config: commit-graph: add new config for changed-paths & recommend it in scalar
2025-10-30	packfile: track packs via the MRU list exclusively	Patrick Steinhardt	1	-2/+2
	We track packfiles via two different lists: - `struct packfile_store::packs` is a list that sorts local packs first. In addition, these packs are sorted so that younger packs are sorted towards the front. - `struct packfile_store::mru` is a list that sorts packs so that most-recently used packs are at the front. The reasoning behind the ordering in the `packs` list is that younger objects stored in the local object store tend to be accessed more frequently, and that is certainly true for some cases. But there are going to be lots of cases where that isn't true. Especially when traversing history it is likely that one needs to access many older objects, and due to our housekeeping it is very likely that almost all of those older objects will be contained in one large pack that is oldest. So whether or not the ordering makes sense really depends on the use case at hand. A flexible approach like our MRU list addresses that need, as it will sort packs towards the front that are accessed all the time. Intuitively, this approach is thus able to satisfy more use cases more efficiently. This reasoning casts some doubt on whether or not it really makes sense to track packs via two different lists. It causes confusion, and it is not clear whether there are use cases where the `packs` list really is such an obvious choice. Merge these two lists into one most-recently-used list. Note that there is one important edge case: `for_each_packed_object()` uses the MRU list to iterate through packs, and then it lists each object in those packs. This would have the effect that we now sort the current pack towards the front, thus modifying the list of packfiles we are iterating over, with the consequence that we'll see an infinite loop. This edge case is worked around by introducing a new field that allows us to skip updating the MRU. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30	packfile: move list of packs into the packfile store	Patrick Steinhardt	1	-2/+2
	Move the list of packs into the packfile store. This follows the same logic as in a previous commit, where we moved the most-recently-used list of packs, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30	builtin/pack-objects: simplify logic to find kept or nonlocal objects	Patrick Steinhardt	1	-14/+14
	The function `has_sha1_pack_kept_or_nonlocal()` takes an object ID and then searches through packed objects to figure out whether the object exists in a kept or non-local pack. As a performance optimization we remember the packfile that contains a given object ID so that the next call to the function first checks that same packfile again. The way this is written is rather hard to follow though, as the caching mechanism is intertwined with the loop that iterates through the packs. Consequently, we need to do some gymnastics to re-start the iteration if the cached pack does not contain the objects. Refactor this so that we check the cached packfile at the beginning. We don't have to re-verify whether the packfile meets the properties as we have already verified those when storing the pack in `last_found` in the first place. So all we need to do is to use `find_pack_entry_one()` to check whether the pack contains the object ID, and to skip the cached pack in the loop so that we don't search it twice. Furthermore, stop using the `(void *)1` sentinel value and instead use a simple `NULL` pointer to indicate that we don't have a last-found pack yet. This refactoring significantly simplifies the logic and makes it much easier to follow. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30	packfile: move the MRU list into the packfile store	Patrick Steinhardt	1	-5/+4
	Packfiles have two lists associated to them: - A list that keeps track of packfiles in the order that they were added to a packfile store. - A list that keeps track of packfiles in most-recently-used order so that packfiles that are more likely to contain a specific object are ordered towards the front. Both of these lists are hosted by `struct packed_git` itself, So to identify all packfiles in a repository you simply need to grab the first packfile and then iterate the `->next` pointers or the MRU list. This pattern has the problem that all packfiles are part of the same list, regardless of whether or not they belong to the same object source. With the upcoming pluggable object database effort this needs to change: packfiles should be contained by a single object source, and reading an object from any such packfile should use that source to look up the object. Consequently, we need to break up the global lists of packfiles into per-object-source lists. A first step towards this goal is to move those lists out of `struct packed_git` and into the packfile store. While the packfile store is currently sitting on the `struct object_database` level, the intent is to push it down one level into the `struct odb_source` in a subsequent patch series. Introduce a new `struct packfile_list` that is used to manage lists of packfiles and use it to store the list of most-recently-used packfiles in `struct packfile_store`. For now, the new list type is only used in a single spot, but we'll expand its usage in subsequent patches. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30	fast-import: mark strings for translation	Christian Couder	1	-140/+140
	Some error or warning messages in "builtin/fast-import.c" are marked for translation, but many are not. To be more consistent and provide a better experience to people using a translated version, let's mark all the remaining error or warning messages for translation. While at it, let's make the following small changes: - replace "GIT" or "git" in a few error messages to just "Git", - replace "Expected from command, got %s" to "expected 'from' command, got '%s'", which makes it clearer that "from" is a command and should not be translated, - downcase error and warning messages that start with an uppercase, - fix test cases in "t9300-fast-import.sh" that broke because an error or warning message was downcased, - split error and warning messages that are too long, - adjust the indentation of some arguments of the error functions. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30	fast-export: mark strings for translation	Christian Couder	1	-39/+40
	Some error or warning messages in "builtin/fast-export.c" are marked for translation, but many are not. To be more consistent and provide a better experience to people using a translated version, let's mark all the remaining error or warning messages for translation. While at it: - improve how some arguments to some error functions are indented, - remove "Error:" at the start of an error message, - downcase error and warning messages that start with an uppercase. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-29	Merge branch 'tb/incremental-midx-part-3.1'	Junio C Hamano	1	-1266/+94
	Clean-up "git repack" machinery to prepare for incremental update of midx files. * tb/incremental-midx-part-3.1: (49 commits) builtin/repack.c: clean up unused `#include`s repack: move `write_cruft_pack()` out of the builtin repack: move `write_filtered_pack()` out of the builtin repack: move `pack_kept_objects` to `struct pack_objects_args` repack: move `finish_pack_objects_cmd()` out of the builtin builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()` repack: extract `write_pack_opts_is_local()` repack: move `find_pack_prefix()` out of the builtin builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()` builtin/repack.c: introduce `struct write_pack_opts` repack: 'write_midx_included_packs' API from the builtin builtin/repack.c: inline packs within `write_midx_included_packs()` builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs` builtin/repack.c: inline `remove_redundant_bitmaps()` builtin/repack.c: reorder `remove_redundant_bitmaps()` repack: keep track of MIDX pack names using existing_packs builtin/repack.c: use a string_list for 'midx_pack_names' builtin/repack.c: extract opts struct for 'write_midx_included_packs()' builtin/repack.c: remove ref snapshotting from builtin repack: remove pack_geometry API from the builtin ...
2025-10-28	bisect: update usage and docs to match each other	Ruoyu Zhong	1	-8/+13
	Update the usage string of `git bisect` and documentation to match each other. While at it, also: 1. Move the synopsis of `git bisect` subcommands to the synopsis section, so that the test `t0450-txt-doc-vs-help.sh` can pass. 2. Document the `git bisect next` subcommand, which exists in the code but is missing from the documentation. See also: [1]. [1]: https://lore.kernel.org/git/3DA38465-7636-4EEF-B074-53E4628F5355@gmail.com/ Suggested-by: Ben Knoble <ben.knoble@gmail.com> Signed-off-by: Ruoyu Zhong <zhongruoyu@outlook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-28	Merge branch 'cc/fast-import-strip-signed-tags'	Junio C Hamano	2	-4/+46
	"git fast-import" is taught to handle signed tags, just like it recently learned to handle signed commits, in different ways. * cc/fast-import-strip-signed-tags: fast-import: add '--signed-tags=<mode>' option fast-export: handle all kinds of tag signatures t9350: properly count annotated tags lib-gpg: allow tests with GPGSM or GPGSSH prereq first doc: git-tag: stop focusing on GPG signed tags
2025-10-28	Merge branch 'ds/sparse-checkout-clean'	Junio C Hamano	1	-56/+160
	"git sparse-checkout" subcommand learned a new "clean" action to prune otherwise unused working-tree files that are outside the areas of interest. * ds/sparse-checkout-clean: sparse-index: improve advice message instructions t: expand tests around sparse merges and clean sparse-index: point users to new 'clean' action sparse-checkout: add --verbose option to 'clean' dir: add generic "walk all files" helper sparse-checkout: match some 'clean' behavior sparse-checkout: add basics of 'clean' command sparse-checkout: remove use of the_repository
2025-10-28	Merge branch 'ps/remove-packfile-store-get-packs' into ↵	Junio C Hamano	9	-1318/+125
	ps/packed-git-in-object-store * ps/remove-packfile-store-get-packs: (55 commits) packfile: rename `packfile_store_get_all_packs()` packfile: introduce macro to iterate through packs packfile: drop `packfile_store_get_packs()` builtin/grep: simplify how we preload packs builtin/gc: convert to use `packfile_store_get_all_packs()` object-name: convert to use `packfile_store_get_all_packs()` builtin/repack.c: clean up unused `#include`s repack: move `write_cruft_pack()` out of the builtin repack: move `write_filtered_pack()` out of the builtin repack: move `pack_kept_objects` to `struct pack_objects_args` repack: move `finish_pack_objects_cmd()` out of the builtin builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()` repack: extract `write_pack_opts_is_local()` repack: move `find_pack_prefix()` out of the builtin builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()` builtin/repack.c: introduce `struct write_pack_opts` repack: 'write_midx_included_packs' API from the builtin builtin/repack.c: inline packs within `write_midx_included_packs()` builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs` builtin/repack.c: inline `remove_redundant_bitmaps()` ...
2025-10-24	builtin/maintenance: introduce "geometric" strategy	Patrick Steinhardt	1	-0/+31
	We have two different repacking strategies in Git: - The "gc" strategy uses git-gc(1). - The "incremental" strategy uses multi-pack indices and `git multi-pack-index repack` to merge together smaller packfiles as determined by a specific batch size. The former strategy is our old and trusted default, whereas the latter has historically been used for our scheduled maintenance. But both strategies have their shortcomings: - The "gc" strategy performs regular all-into-one repacks. Furthermore it is rather inflexible, as it is not easily possible for a user to enable or disable specific subtasks. - The "incremental" strategy is not a full replacement for the "gc" strategy as it doesn't know to prune stale data. So today, we don't have a strategy that is well-suited for large repos while being a full replacement for the "gc" strategy. Introduce a new "geometric" strategy that aims to fill this gap. This strategy invokes all the usual cleanup tasks that git-gc(1) does like pruning reflogs and rerere caches as well as stale worktrees. But where it differs from both the "gc" and "incremental" strategy is that it uses our geometric repacking infrastructure exposed by git-repack(1) to repack packfiles. The advantage of geometric repacking is that we only need to perform an all-into-one repack when the object count in a repo has grown significantly. One downside of this strategy is that pruning of unreferenced objects is not going to happen regularly anymore. Every geometric repack knows to soak up all loose objects regardless of their reachability, and merging two or more packs doesn't consider reachability, either. Consequently, the number of unreachable objects will grow over time. This is remedied by doing an all-into-one repack instead of a geometric repack whenever we determine that the geometric repack would end up merging all packfiles anyway. This all-into-one repack then performs our usual reachability checks and writes unreachable objects into a cruft pack. As cruft packs won't ever be merged during geometric repacks we can thus phase out these objects over time. Of course, this still means that we retain unreachable objects for far longer than with the "gc" strategy. But the maintenance strategy is intended especially for large repositories, where the basic assumption is that the set of unreachable objects will be significantly dwarfed by the number of reachable objects. If this assumption is ever proven to be too disadvantageous we could for example introduce a time-based strategy: if the largest packfile has not been touched for longer than $T, we perform an all-into-one repack. But for now, such a mechanism is deferred into the future as it is not clear yet whether it is needed in the first place. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: make "gc" strategy accessible	Patrick Steinhardt	1	-3/+6
	While the user can pick the "incremental" maintenance strategy, it is not possible to explicitly use the "gc" strategy. This has two downsides: - It is impossible to use the default "gc" strategy for a specific repository when the strategy was globally set to a different strategy. - It is not possible to use git-gc(1) for scheduled maintenance. Address these issues by making making the "gc" strategy configurable. Furthermore, extend the strategy so that git-gc(1) runs for both manual and scheduled maintenance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: extend "maintenance.strategy" to manual maintenance	Patrick Steinhardt	1	-5/+20
	The "maintenance.strategy" configuration allows users to configure how Git is supposed to perform repository maintenance. The idea is that we provide a set of high-level strategies that may be useful in different contexts, like for example when handling a large monorepo. Furthermore, the strategy can be tweaked by the user by overriding specific tasks. In its current form though, the strategy only applies to scheduled maintenance. This creates something of a gap, as scheduled and manual maintenance will now use _different_ strategies as the latter would continue to use git-gc(1) by default. This makes the strategies way less useful than they could be on the one hand. But even more importantly, the two different strategies might clash with one another, where one of the strategies performs maintenance in such a way that it discards benefits from the other strategy. So ideally, it should be possible to pick one strategy that then applies globally to all the different ways that we perform maintenance. This doesn't necessarily mean that the strategy always does the _same_ thing for every maintenance type. But it means that the strategy can configure the different types to work in tandem with each other. Change the meaning of "maintenance.strategy" accordingly so that the strategy is applied to both types, manual and scheduled. As preceding commits have introduced logic to run maintenance tasks depending on this type we can tweak strategies so that they perform those tasks depending on the context. Note that this raises the question of backwards compatibility: when the user has configured the "incremental" strategy we would have ignored that strategy beforehand. Instead, repository maintenance would have continued to use git-gc(1) by default. But luckily, we can match that behaviour by: - Keeping all current tasks of the incremental strategy as `MAINTENANCE_TYPE_SCHEDULED`. This ensures that those tasks will not run during manual maintenance. - Configuring the "gc" task so that it is invoked during manual maintenance. Like this, the user shouldn't observe any difference in behaviour. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: run maintenance tasks depending on type	Patrick Steinhardt	1	-9/+19
	We basically have three different ways to execute repository maintenance: 1. Manual maintenance via `git maintenance run`. 2. Automatic maintenance via `git maintenance run --auto`. 3. Scheduled maintenance via `git maintenance run --schedule=`. At the moment, maintenance strategies only have an effect for the last type of maintenance. This is about to change in subsequent commits, but to do so we need to be able to skip some tasks depending on how exactly maintenance was invoked. Introduce a new maintenance type that discern between manual (1 & 2) and scheduled (3) maintenance. Convert the `enabled` field into a bitset so that it becomes possible to specifiy which tasks exactly should run in a specific context. The types picked for existing strategies match the status quo: - The default strategy is only ever executed as part of a manual maintenance run. It is not possible to use it for scheduled maintenance. - The incremental strategy is only ever executed as part of a scheduled maintenance run. It is not possible to use it for manual maintenance. The strategies will be tweaked in subsequent commits to make use of this new infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: improve readability of strategies	Patrick Steinhardt	1	-11/+25
	Our maintenance strategies are essentially a large array of structures, where each of the tasks can be enabled and scheduled individually. With the current layout though all the configuration sits on the same nesting layer, which makes it a bit hard to discern which initialized fields belong to what task. Improve readability of the individual tasks by using nested designated initializers instead. Suggested-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: don't silently ignore invalid strategy	Patrick Steinhardt	1	-6/+11
	When parsing maintenance strategies we completely ignore the user-configured value in case it is unknown to us. This makes it basically undiscoverable to the user that scheduled maintenance is devolving into a no-op. Change this to instead die when seeing an unknown maintenance strategy. While at it, pull out the parsing logic into a separate function so that we can reuse it in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: make the geometric factor configurable	Patrick Steinhardt	1	-1/+8
	The geometric repacking task uses a factor of two for its geometric sequence, meaning that each next pack must contain at least twice as many objects as the next-smaller one. In some cases it may be helpful to configure this factor though to reduce the number of packfile merges even further, e.g. in very big repositories. But while git-repack(1) itself supports doing this, the maintenance task does not give us a way to tune it. Introduce a new "maintenance.geometric-repack.splitFactor" configuration to plug this gap. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/maintenance: introduce "geometric-repack" task	Patrick Steinhardt	1	-0/+102
	Introduce a new "geometric-repack" task. This task uses our geometric repack infrastructure as provided by git-repack(1) itself, which is a strategy that especially hosting providers tend to use to amortize the costs of repacking objects. There is one issue though with geometric repacks, namely that they unconditionally pack all loose objects, regardless of whether or not they are reachable. This is done because it means that we can completely skip the reachability step, which significantly speeds up the operation. But it has the big downside that we are unable to expire objects over time. To address this issue we thus use a split strategy in this new task: whenever a geometric repack would merge together all packs, we instead do an all-into-one repack. By default, these all-into-one repacks have cruft packs enabled, so unreachable objects would now be written into their own pack. Consequently, they won't be soaked up during geometric repacking anymore and can be expired with the next full repack, assuming that their expiry date has surpassed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/gc: make `too_many_loose_objects()` reusable without GC config	Patrick Steinhardt	1	-4/+4
	To decide whether or not a repository needs to be repacked we estimate the number of loose objects. If the number exceeds a certain threshold we perform the repack, otherwise we don't. This is done via `too_many_loose_objects()`, which takes as parameter the `struct gc_config`. This configuration is only used to determine the threshold. In a subsequent commit we'll add another caller of this function that wants to pass a different limit than the one stored in that structure. Refactor the function accordingly so that we only take the limit as parameter instead of the whole structure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24	builtin/gc: remove global `repack` variable	Patrick Steinhardt	1	-29/+45
	The global `repack` variable is used to store all command line arguments that we eventually want to pass to git-repack(1). It is being appended to from multiple different functions, which makes it hard to follow the logic. Besides being hard to follow, it also makes it unnecessarily hard to reuse this infrastructure in new code. Refactor the code so that we store this variable on the stack and pass a pointer to it around as needed. This is done so that we can reuse `add_repack_all_options()` in a subsequent commit. The refactoring itself is straight-forward. One function that deserves attention though is `need_to_gc()`: this function determines whether or not we need to execute garbage collection for `git gc --auto`, but also for `git maintenance run --auto`. But besides figuring out whether we have to perform GC, the function also sets up the `repack` arguments. For `git gc --auto` it's trivial to adapt, as we already have the on-stack variable at our fingertips. But for the maintenance condition it's less obvious what to do. As it turns out, we can just use another temporary variable there that we then immediately discard. If we need to perform GC we execute a child git-gc(1) process to repack objects for us, and that process will have to recompute the arguments anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-22	Merge branch 'bc/sha1-256-interop-01'	Junio C Hamano	1	-1/+10
	The beginning of SHA1-SHA256 interoperability work. * bc/sha1-256-interop-01: t1010: use BROKEN_OBJECTS prerequisite t: allow specifying compatibility hash fsck: consider gpgsig headers expected in tags rev-parse: allow printing compatibility hash docs: add documentation for loose objects docs: improve ambiguous areas of pack format documentation docs: reflect actual double signature for tags docs: update offset order for pack index v3 docs: update pack index v3 format
2025-10-22	bisect: fix handling of `help` and invalid subcommands	Ruoyu Zhong	1	-1/+5
	As documented in git-bisect(1), `git bisect help` should display usage information. However, since the migration of `git bisect` to a full builtin command in 73fce29427 (Turn `git bisect` into a full built-in, 2022-11-10), this behavior was broken. Running `git bisect help` would, instead of showing usage, either fail silently if already in a bisect session, or otherwise trigger an interactive autostart prompt asking "Do you want me to do it for you [Y/n]?". Similarly, since df63421be9 (bisect--helper: handle states directly, 2022-11-10), running invalid subcommands like `git bisect foobar` also led to the same behavior. This occurred because `help` and other unrecognized subcommands were being unconditionally passed to `bisect_state`, which then called `bisect_autostart`, triggering the interactive prompt. Fix this by: 1. Adding explicit handling for the `help` subcommand to show usage; 2. Validating that unrecognized commands are actually valid state commands before calling `bisect_state`; 3. Showing an error with usage for truly invalid commands. This ensures that `git bisect help` displays the usage as documented, and invalid commands fail cleanly without entering interactive mode. Alternate terms are still handled correctly through `check_and_set_terms`. Signed-off-by: Ruoyu Zhong <zhongruoyu@outlook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-22	commit-graph: add new config for changed-paths & recommend it in scalar	Emily Yang	1	-0/+2
	The changed-path Bloom filters feature has proven stable and reliable over several years of use, delivering significant performance improvement for file history computation in large monorepos. Currently a user can opt-in to writing the changed-path Bloom filters using the "--changed-paths" option to "git commit-graph write". The filters will be persisted until the user drops the filters using the "--no-changed-paths" option. For this functionality, refer to 0087a87ba8 (commit-graph: persist existence of changed-paths, 2020-07-01). Large monorepos using Git's background maintenance to build and update commit-graph files could use an easy switch to enable this feature without a foreground computation. In this commit, we're proposing a new config option "commitGraph.changedPaths": * If "true", "git commit-graph write" will write Bloom filters, equivalent to passing "--changed-paths"; * If "false" or "unset", Bloom filters will be written during "git commit-graph write" only if the filters already exist in the current commit-graph file. This matches the default behaviour of "git commit-graph write" without any "--[no-]changed-paths" option. Note "false" can disable a previous "true" config value but doesn't imply "--no-changed-paths". This config will always respect the precedence of command line option "--[no-]changed-paths". We also set this new config as optional recommended config in scalar to turn on this feature for large repos. Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Emily Yang <emilyyang.git@gmail.com> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-22	Merge branch 'jt/repo-structure' into ps/ref-peeled-tags	Junio C Hamano	1	-3/+377
	* jt/repo-structure: builtin/repo: add progress meter for structure stats builtin/repo: add keyvalue and nul format for structure stats builtin/repo: add object counts in structure output builtin/repo: introduce structure subcommand ref-filter: export ref_kind_from_refname() ref-filter: allow NULL filter pattern builtin/repo: rename repo_info() to cmd_repo_info()
2025-10-22	Merge branch 'tb/incremental-midx-part-3.1' into ps/ref-peeled-tags	Junio C Hamano	1	-1266/+94
	* tb/incremental-midx-part-3.1: (49 commits) builtin/repack.c: clean up unused `#include`s repack: move `write_cruft_pack()` out of the builtin repack: move `write_filtered_pack()` out of the builtin repack: move `pack_kept_objects` to `struct pack_objects_args` repack: move `finish_pack_objects_cmd()` out of the builtin builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()` repack: extract `write_pack_opts_is_local()` repack: move `find_pack_prefix()` out of the builtin builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()` builtin/repack.c: introduce `struct write_pack_opts` repack: 'write_midx_included_packs' API from the builtin builtin/repack.c: inline packs within `write_midx_included_packs()` builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs` builtin/repack.c: inline `remove_redundant_bitmaps()` builtin/repack.c: reorder `remove_redundant_bitmaps()` repack: keep track of MIDX pack names using existing_packs builtin/repack.c: use a string_list for 'midx_pack_names' builtin/repack.c: extract opts struct for 'write_midx_included_packs()' builtin/repack.c: remove ref snapshotting from builtin repack: remove pack_geometry API from the builtin ...
2025-10-21	builtin/repo: add progress meter for structure stats	Justin Tobler	1	-6/+40
	When using the structure subcommand for git-repo(1), evaluating a repository may take some time depending on its shape. Add a progress meter to provide feedback to the user about what is happening. The progress meter is enabled by default when the command is executed from a tty. It can also be explicitly enabled/disabled via the --[no-]progress option. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21	builtin/repo: add keyvalue and nul format for structure stats	Justin Tobler	1	-4/+51
	All repository structure stats are outputted in a human-friendly table form. This format is not suitable for machine parsing. Add a --format option that supports three output modes: `table`, `keyvalue`, and `nul`. The `table` mode is the default format and prints the same table output as before. With the `keyvalue` mode, each line of output contains a key-value pair of a repository stat. The '=' character is used to delimit between keys and values. The `nul` mode is similar to `keyvalue`, but key-values are delimited by a NUL character instead of a newline. Also, instead of a '=' character to delimit between keys and values, a newline character is used. This allows stat values to support special characters without having to cquote them. These two new modes provides output that is more machine-friendly. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21	builtin/repo: add object counts in structure output	Justin Tobler	1	-6/+99
	The amount of objects in a repository can provide insight regarding its shape. To surface this information, use the path-walk API to count the number of reachable objects in the repository by object type. All regular references are used to determine the reachable set of objects. The object counts are appended to the same table containing the reference information. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21	builtin/repo: introduce structure subcommand	Justin Tobler	1	-0/+200
	The structure of a repository's history can have huge impacts on the performance and health of the repository itself. Currently, Git lacks a means to surface repository metrics regarding its structure/shape via a single command. Acquiring this information requires users to be familiar with the relevant data points and the various Git commands required to surface them. To fill this gap, supplemental tools such as git-sizer(1) have been developed. To allow users to more readily identify repository structure related information, introduce the "structure" subcommand in git-repo(1). The goal of this subcommand is to eventually provide similar functionality to git-sizer(1), but natively in Git. The initial version of this command only iterates through all references in the repository and tracks the count of branches, tags, remote refs, and other reference types. The corresponding information is displayed in a human-friendly table formatted in a very similar manner to git-sizer(1). The width of each table column is adjusted automatically to satisfy the requirements of the widest row contained. Subsequent commits will surface additional relevant data points to output and also provide other more machine-friendly output formats. Based-on-patch-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21	builtin/repo: rename repo_info() to cmd_repo_info()	Justin Tobler	1	-3/+3
	Subcommand functions are often prefixed with `cmd_` to denote that they are an entrypoint. Rename repo_info() to cmd_repo_info() accordingly. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21	Merge branch 'tb/incremental-midx-part-3.1' into ps/maintenance-geometric	Junio C Hamano	12	-1302/+150
	* tb/incremental-midx-part-3.1: (64 commits) builtin/repack.c: clean up unused `#include`s repack: move `write_cruft_pack()` out of the builtin repack: move `write_filtered_pack()` out of the builtin repack: move `pack_kept_objects` to `struct pack_objects_args` repack: move `finish_pack_objects_cmd()` out of the builtin builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()` repack: extract `write_pack_opts_is_local()` repack: move `find_pack_prefix()` out of the builtin builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()` builtin/repack.c: introduce `struct write_pack_opts` repack: 'write_midx_included_packs' API from the builtin builtin/repack.c: inline packs within `write_midx_included_packs()` builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs` builtin/repack.c: inline `remove_redundant_bitmaps()` builtin/repack.c: reorder `remove_redundant_bitmaps()` repack: keep track of MIDX pack names using existing_packs builtin/repack.c: use a string_list for 'midx_pack_names' builtin/repack.c: extract opts struct for 'write_midx_included_packs()' builtin/repack.c: remove ref snapshotting from builtin repack: remove pack_geometry API from the builtin ...
2025-10-20	Merge branch 'tb/cat-file-objectmode-update'	Junio C Hamano	1	-1/+1
	Code clean-up. * tb/cat-file-objectmode-update: builtin/cat-file.c: simplify calling `report_object_status()`
2025-10-16	packfile: rename `packfile_store_get_all_packs()`	Patrick Steinhardt	2	-4/+4
	In a preceding commit we have removed `packfile_store_get_packs()`. With this function removed it's somewhat useless to still have the "all" infix in `packfile_store_get_all_packs()`. Rename the latter to drop that infix. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	packfile: introduce macro to iterate through packs	Patrick Steinhardt	6	-47/+26
	We have a bunch of different sites that want to iterate through all packs of a given `struct packfile_store`. This pattern is somewhat verbose and repetitive, which makes it somewhat cumbersome. Introduce a new macro `repo_for_each_pack()` that removes some of the boilerplate. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/grep: simplify how we preload packs	Patrick Steinhardt	1	-1/+1
	When using multiple threads in git-grep(1) we eagerly preload both the gitmodules file as well as the packfiles so that the threads won't race with one another to initialize these data structures. For packfiles, this is done by calling `packfile_store_get_packs()`, which first loads our packfiles and then returns a pointer to the first such packfile. This pointer is ignored though, as all we really care about is that `packfile_store_prepare()` was called. Historically, that function was file-local to "packfile.c", but that changed with 4188332569 (packfile: move `get_multi_pack_index()` into "midx.c", 2025-09-02). We can thus simplify the code by calling that function directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/gc: convert to use `packfile_store_get_all_packs()`	Patrick Steinhardt	1	-1/+1
	When running maintenance tasks via git-maintenance(1) we have a couple of auto-conditions that check whether or not a specific task should be running. One such check is for incremental repacks, which essentially use `git multi-pack-index repack` to repack a set of smaller packfiles into one larger packfile. The auto-condition for this task checks how many packfiles there are that aren't indexed by any multi-pack index. If there is a sufficient number then we execute the above command to combine those into a single pack and add that pack to the MIDX. As we don't care about MIDX'd packs we use `packfile_store_get_packs()`, which knows to not load any packs that are indexed by a MIDX. But as explained in the preceding commit, we want to get rid of that function. We already handle packfiles that have a MIDX by the very nature of this function, as we explicitly count non-MIDX'd packs. As such, we can trivially switch over to use `packfile_store_get_all_packs()` instead. Do so. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	Merge branch 'tb/incremental-midx-part-3.1' into ↵	Junio C Hamano	12	-1302/+150
	ps/remove-packfile-store-get-packs * tb/incremental-midx-part-3.1: (64 commits) builtin/repack.c: clean up unused `#include`s repack: move `write_cruft_pack()` out of the builtin repack: move `write_filtered_pack()` out of the builtin repack: move `pack_kept_objects` to `struct pack_objects_args` repack: move `finish_pack_objects_cmd()` out of the builtin builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()` repack: extract `write_pack_opts_is_local()` repack: move `find_pack_prefix()` out of the builtin builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()` builtin/repack.c: introduce `struct write_pack_opts` repack: 'write_midx_included_packs' API from the builtin builtin/repack.c: inline packs within `write_midx_included_packs()` builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs` builtin/repack.c: inline `remove_redundant_bitmaps()` builtin/repack.c: reorder `remove_redundant_bitmaps()` repack: keep track of MIDX pack names using existing_packs builtin/repack.c: use a string_list for 'midx_pack_names' builtin/repack.c: extract opts struct for 'write_midx_included_packs()' builtin/repack.c: remove ref snapshotting from builtin repack: remove pack_geometry API from the builtin ...
2025-10-16	builtin/repack.c: clean up unused `#include`s	Taylor Blau	1	-9/+0
	Over the past several dozen commits, we have moved a large amount of functionality out of the repack builtin and into other files like repack.c, repack-cruft.c, repack-filtered.c, repack-midx.c, and repack-promisor.c. These files specify the minimal set of `#include`s that they need to compile successfully, but we did not change the set of `#include`s in the repack builtin itself. Now that the code movement is complete, let's clean up that set of `#include`s and trim down the builtin to include the minimal amount of external headers necessary to compile. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move `write_cruft_pack()` out of the builtin	Taylor Blau	1	-94/+0
	In an identical fashion as the previous commit, move the function `write_cruft_pack()` into its own compilation unit, and make the function visible through the repack.h API. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move `write_filtered_pack()` out of the builtin	Taylor Blau	1	-46/+0
	In a similar fashion as in previous commits, move the function `write_filtered_pack()` out of the builtin and into its own compilation unit. This function is now part of the repack.h API, but implemented in its own "repack-filtered.c" unit as it is a separate component from other kinds of repacking operations. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move `pack_kept_objects` to `struct pack_objects_args`	Taylor Blau	1	-13/+7
	The "pack_kept_objects" variable is defined as static to the repack builtin, but is inherently related to the pack-objects arguments that the builtin uses when generating new packs. Move that field into the "struct pack_objects_args", and shuffle around where we append the corresponding command-line option when preparing a pack-objects process. Specifically: - `write_cruft_pack()` always wants to pass "--honor-pack-keep", so explicitly set the `pack_kept_objects` field to "0" when initializing the `write_pack_opts` struct before calling `write_cruft_pack()`. - `write_filtered_pack()` no longer needs to handle writing the command-line option "--honor-pack-keep" when preparing a pack-objects process, since its call to `prepare_pack_objects()` will have already taken care of that. `write_filtered_pack()` also reads the `pack_kept_objects` field to determine whether to write the existing kept packs with a leading "^" character, so update that to read through the `po_args` pointer instead. - `cmd_repack()` also no longer has to write the "--honor-pack-keep" flag explicitly, since this is also handled via its call to `prepare_pack_objects()`. Since there is a default value for "pack_kept_objects" that relies on whether or not we are writing a bitmap (and not writing a MIDX), extract a default initializer for `struct pack_objects_args` that keeps this conditional default behavior. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move `finish_pack_objects_cmd()` out of the builtin	Taylor Blau	1	-33/+0
	In a similar spirit as the previous commit(s), now that the function `finish_pack_objects_cmd()` has no explicit dependencies within the repack builtin, let's extract it. This prepares us to extract the remaining two functions within the repack builtin that explicitly write packfiles, which are `write_cruft_pack()` and `write_filtered_pack()`, which will be done in the future commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`	Taylor Blau	1	-12/+20
	To prepare to move the `finish_pack_objects_cmd()` function out of the builtin and into the repack.h API, there are a couple of things we need to do first: - First, let's take advantage of `write_pack_opts_is_local()` function introduced in the previous commit instead of passing "local" explicitly. - Let's also avoid referring to the static 'packtmp' field within builtin/repack.c by instead accessing it through the write_pack_opts argument. There are three callers which need to adjust themselves in order to account for this change. The callers which reside in write_cruft_pack() and write_filtered_pack() both already have an "opts" in scope, so they can pass it through transparently. The other call (at the bottom of `cmd_repack()`) needs to initialize its own write_pack_opts to pass the necessary fields over to the direct call to `finish_pack_objects_cmd()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: extract `write_pack_opts_is_local()`	Taylor Blau	1	-4/+2
	Similar to the previous commit, the functions `write_cruft_pack()` and `write_filtered_pack()` both compute a "local" variable via the exact same mechanism: const char *scratch; int local = skip_prefix(opts->destination, opts->packdir, &scratch); Not only does this cause us to repeat the same pair of lines, it also introduces an unnecessary "scratch" variable that is common between both functions. Instead of repeating ourselves, let's extract that functionality into a new function in the repack.h API called "write_pack_opts_is_local()". That function takes a pointer to a "struct write_pack_opts" (which has as fields both "destination" and "packdir"), and can encapsulate the dangling "scratch" field. Extract that function and make it visible within the repack.h API, and use it within both `write_cruft_pack()` and `write_filtered_pack()`. While we're at it, match our modern conventions by returning a "bool" instead of "int", and use `starts_with()` instead of `skip_prefix()` to avoid storing the dummy "scratch" variable. The remaining duplication (that is, that both `write_cruft_pack()` and `write_filtered_pack()` still both call `write_pack_opts_is_local()`) will be addressed in the following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move `find_pack_prefix()` out of the builtin	Taylor Blau	1	-16/+4
	Both callers within the repack builtin which call functions that take a 'write_pack_opts' structure have the following pattern: struct write_pack_opts opts = { .packdir = packdir, .packtmp = packtmp, .pack_prefix = find_pack_prefix(packdir, packtmp), /* ... / }; int ret = write_some_kind_of_pack(&opts, / ... */); , but both "packdir" and "packtmp" are fields within the write_pack_opts struct itself! Instead of also computing the pack_prefix ahead of time, let's have the callees compute it themselves by moving `find_pack_prefix()` out of the repack builtin, and have it take a write_pack_opts pointer instead of the "packdir" and "packtmp" fields directly. This avoids the callers having to do some prep work that is common between the two of them, but also avoids the potential pitfall of accidentally writing: .pack_prefix = find_pack_prefix(packtmp, packdir), (which is well-typed) when the caller meant to instead write: .pack_prefix = find_pack_prefix(packdir, packtmp), Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`	Taylor Blau	1	-13/+14
	Similar to the changes made in the previous commit to `write_filtered_pack()`, teach `write_cruft_pack()` to take a `write_pack_opts` struct and use that where possible. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: introduce `struct write_pack_opts`	Taylor Blau	1	-14/+16
	There are various functions within the 'repack' builtin which are responsible for writing different kinds of packs. They include: - `static int write_filtered_pack(...)` - `static int write_cruft_pack(...)` as well as the function `finish_pack_objects_cmd()`, which is responsible for finalizing a new pack write, and recording the checksum of its contents in the 'names' list. Both of these `write_` functions have a few things in common. They both take a pointer to the 'pack_objects_args' struct, as well as a pair of character pointers for `destination` and `pack_prefix`. Instead of repeating those arguments for each function, let's extract an options struct called "write_pack_opts" which has these three parameters as member fields. While we're at it, add fields for "packdir," and "packtmp", both of which are static variables within the builtin, and need to be read from within these two functions. This will shorten the list of parameters that callers have to provide to `write_filtered_pack()`, avoid ambiguity when passing multiple variables of the same type, and provide a unified interface for the two functions mentioned earlier. (Note that "pack_prefix" can be derived on the fly as a function of "packdir" and "packtmp", making it unnecessary to store "pack_prefix" explicitly. This commit ignores that potential cleanup in the name of doing as few things as possible, but a later commit will make that change.) Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: 'write_midx_included_packs' API from the builtin	Taylor Blau	1	-305/+0
	Now that we have sufficiently cleaned up the write_midx_included_packs() function, we can move it (along with the struct repack_write_midx_opts) out of the builtin, and into the repack.h header. Since this function (and the static ones that it depends on) are MIDX-specific details of the repacking process, move them to the repack-midx.c compilation unit instead of the general repack.c one. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: inline packs within `write_midx_included_packs()`	Taylor Blau	1	-9/+8
	To write a MIDX at the end of a repack operation, 'git repack' presently computes the set of packs to write into the MIDX, before invoking `write_midx_included_packs()` with a `string_list` containing those packs. The logic for computing which packs are supposed to appear in the resulting MIDX is within `midx_included_packs()`, where it is aware of details like which cruft pack(s) were written/combined, if/how we did a geometric repack, etc. Computing this list ourselves before providing it to the sole function to make use of that list `write_midx_included_packs()` is somewhat awkward. In the future, repack will learn how to write incremental MIDXs, which will use a very different pack selection routine. Instead of doing something like: struct string_list included_packs = STRING_LIST_INIT_DUP; if (incremental) { midx_incremental_included_packs(&included_packs, ...): write_midx_incremental_included_packs(&included_packs, ...); } else { midx_included_packs(&included_packs, ...): write_midx_included_packs(&included_packs, ...); } in the future, let's have each function that writes a MIDX be responsible for itself computing the list of included packs. Inline the declaration and initialization of `included_packs` into the `write_midx_included_packs()` function itself, and repeat that pattern in the future when we introduce new ways to write MIDXs. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`	Taylor Blau	1	-5/+8
	Instead of passing individual parameters (in this case, "existing", "names", and "geometry") to `midx_included_packs()`, pass a pointer to a `repack_write_midx_opts` structure instead. Besides reducing the number of parameters necessary to call the `midx_included_packs` function, this refactoring sets us up nicely to inline the call to `midx_included_packs()` into `write_midx_included_packs()`, thus making the caller (in this case, `cmd_repack()`) oblivious to the set of packs being written into the MIDX. In order to do this, `repack_write_midx_opts` has to keep track of the set of existing packs, so add an additional field to point to that set. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: inline `remove_redundant_bitmaps()`	Taylor Blau	1	-7/+8
	After writing a new MIDX, the repack command removes any bitmaps belonging to packs which were written into the MIDX. This is currently done in a separate function outside of `write_midx_included_packs()`, which forces the caller to keep track of the set of packs written into the MIDX. Prepare to no longer require the caller to keep track of such information by inlining the clean-up into `write_midx_included_packs()`. Future commits will make the caller oblivious to the set of packs included in the MIDX altogether. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: reorder `remove_redundant_bitmaps()`	Taylor Blau	1	-29/+29
	The next commit will inline the call to `remove_redundant_bitmaps()` into `write_midx_included_packs()`. Reorder these two functions to avoid a forward declaration to `remove_redundant_bitmaps()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: keep track of MIDX pack names using existing_packs	Taylor Blau	1	-22/+4
	Instead of storing the list of MIDX pack names separately, let's inline it into the existing_packs struct, further reducing the number of parameters we have to pass around. This amounts to adding a new string_list to the existing_packs struct, and populating it via `existing_packs_collect()`. This is fairly straightforward to do, since we are already looping over all packs, all we need to do is: if (p->multi_pack_index) string_list_append(&existing->midx_packs, pack_basename(p)); Note, however, that this check must come before other conditions where we discard and do not keep track of a pack, including the condition "if (!p->pack_local)" immediately below. This is because the existing routine which collects MIDX pack names does so blindly, and does not discard, for example, non-local packs. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: use a string_list for 'midx_pack_names'	Taylor Blau	1	-23/+17
	When writing a new MIDX, repack must determine whether or not there are any packs in the MIDX it is replacing (if one exists) that are not somehow represented in the new MIDX (e.g., either by preserving the pack verbatim, or rolling it up as part of a geometric repack, etc.). In order to do this, it keeps track of a list of pack names from the MIDX present in the repository at the start of the repack operation. Since we manipulate and close the object store, we cannot rely on the repository's in-core representation of the MIDX, since this is subject to change and/or go away. When this behavior was introduced in 5ee86c273b (repack: exclude cruft pack(s) from the MIDX where possible, 2025-06-23), we maintained an array of character pointers instead of using a convenience API, such as string-list.h. Store the list of MIDX pack names in a string_list, thereby reducing the number of parameters we have to pass to `midx_has_unknown_packs()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: extract opts struct for 'write_midx_included_packs()'	Taylor Blau	1	-18/+34
	The function 'write_midx_included_packs()', which is responsible for writing a new MIDX with a given set of included packs, currently takes a list of six arguments. In order to extract this function out of the builtin, we have to pass in a few additional parameters, like 'midx_must_contain_cruft' and 'packdir', which are currently declared as static variables within the builtin/repack.c compilation unit. Instead of adding additional parameters to `write_midx_included_packs()` extract out an "opts" struct that names these parameters, and pass a pointer to that, making it less cumbersome to add additional parameters. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: remove ref snapshotting from builtin	Taylor Blau	1	-68/+0
	When writing a MIDX, 'git repack' takes a snapshot of the repository's references and writes the result out to a file, which it then passes to 'git multi-pack-index write' via the '--refs-snapshot'. This is done in order to make bitmap selections with respect to what we are packing, thus avoiding a race where an incoming reference update causes us to try and write a bitmap for a commit not present in the MIDX. Extract this functionality out into a new repack-midx.c compilation unit, and expose the necessary functions via the repack.h API. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: remove pack_geometry API from the builtin	Taylor Blau	1	-235/+0
	Now that the pack_geometry API is fully factored and isolated from the rest of the builtin, declare it within repack.h and move its implementation to "repack-geometry.c" as a separate component. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass 'packdir' to `pack_geometry_remove_redundant()`	Taylor Blau	1	-2/+3
	For similar reasons as the preceding commit, pass the "packdir" variable directly to `pack_geometry_remove_redundant()` as a parameter to the function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass 'pack_kept_objects' to `pack_geometry_init()`	Taylor Blau	1	-2/+4
	Prepare to move pack_geometry-related APIs to their own compilation unit by passing in the static "pack_kept_objects" variable directly as a parameter to the 'pack_geometry_init()' function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: rename various pack_geometry functions	Taylor Blau	1	-26/+26
	Rename functions which work with 'struct pack_geometry' to begin with "pack_geometry_". While we're at it, change `free_pack_geometry()` to instead be named `pack_geometry_release()` to match our conventions, and make clear that that function frees the contents of the struct, not the memory allocated to hold the struct itself. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: remove "repack_promisor_objects()" from the builtin	Taylor Blau	1	-95/+0
	Now that we have properly factored the portion of the builtin which is responsible for repacking promisor objects, we can move that function (and associated dependencies) out of the builtin entirely. Similar to previous extractions, this function is declared in repack.h, but implemented in a separate repack-promisor.c file. This is done to separate promisor-specific repacking functionality from generic repack utilities (like "existing_packs", and "generated_pack" APIs). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass "packtmp" to `repack_promisor_objects()`	Taylor Blau	1	-2/+3
	In a similar spirit as previous commit(s), pass the "packtmp" variable to "repack_promisor_objects()" as an explicit parameter of the function, preparing us to move this function in a following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: remove 'generated_pack' API from the builtin	Taylor Blau	1	-83/+0
	Now that we have factored the "generated_pack" API, we can move it to repack.ch, further slimming down builtin/repack.c. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: provide pack locations to `generated_pack_install()`	Taylor Blau	1	-2/+4
	Repeat what was done in the preceding commit for the `generated_pack_install()` function, which needs both "packdir" and "packtmp". (As an aside, it is somewhat unfortunate that the final three parameters to this function are all "const char *", making errors like passing "packdir" and "packtmp" in the wrong order easy. We could define a new structure here, but that may be too heavy-handed.) Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass "packtmp" to `generated_pack_populate()`	Taylor Blau	1	-3/+4
	In a similar spirit as previous commits, this function needs to know the temporary pack prefix, which it currently accesses through the static "packtmp" variable within builtin/repack.c. Pass it explicitly as a function parameter to facilitate moving this function out of builtin/repack.c entirely. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: factor out "generated_pack_install"	Taylor Blau	1	-30/+35
	Once all new packs are known to exist, 'repack' installs their contents from their temporary location into their permanent one. This is a semi-involved procedure for each pack, since for each extension (e.g., ".idx", ".pack", ".mtimes", and so on) we have to either: - adjust the filemode of the temporary file before renaming it into place, or - die() if we are missing a non-optional extension, or - unlink() any existing file for extensions that we did not generate (e.g., if a non-cruft pack we generated was identical to, say, a cruft pack which existed at the beginning of the process, we have to remove the ".mtimes" file). Extract this procedure into its own function, and call it "generated_pack_install"(). This will set us up for pulling this function out of the builtin entirely and making it part of the repack.h API, which will be done in a future commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: rename "struct generated_pack_data"	Taylor Blau	1	-16/+16
	The name "generated_pack_data" is somewhat redundant, since the contents of the struct is the data associated with the generated pack. Rename the structure to just "generated_pack", resulting in less awkward function names, like "generated_pack_has_ext()" which is preferable to "generated_pack_data_has_ext()". Rename a few related functions to align with the convention that functions to do with a struct "S" should be prefixed with "S_". Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: remove 'existing_packs' API from the builtin	Taylor Blau	1	-173/+0
	The repack builtin defines an API for keeping track of which packs were found in the repository at the beginning of the repack operation. This is used to classify what state a pack was in (kept, non-kept, or cruft), and is also used to mark which packs to delete (or keep) at the end of a repack operation. Now that the prerequisite refactoring is complete, this API is isolated enough that it can be moved out to repack.[ch] and removed from the builtin entirely. As a result, some of its functions become static within repack.c, cleaning up the visible API. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid unnecessary numeric casts in existing_packs	Taylor Blau	1	-2/+2
	There are a couple of spots that cause warnings within the existing_packs API without DISABLE_SIGN_COMPARE_WARNINGS under DEVELOPER=1 mode. In both cases, we have int values that are being compared against size_t ones. Neither of these two cases are incorrect, and the cast is completely OK in practice. But both are unnecessary, since: - in existing_packs_mark_for_deletion_1(), 'hexsz' should be defined as a size_t anyway, since algop->hexsz is. - in existing_packs_collect(), 'i' should be defined as a size_t since it is counting up to the value of a string_list's 'nr' field. (This patch is a little bit of noise, but I would rather see us squelch these warnings ahead of moving the existing_packs API into a separate compilation unit to avoid having to define DISABLE_SIGN_COMPARE_WARNINGS in repack.c.) Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass "packdir" when removing packs	Taylor Blau	1	-5/+9
	builtin/repack.c defines a static "packdir" to instruct pack-objects on where to write any new packfiles. This is also the directory scanned when removing any packfiles which were made redundant by the latest repack. Prepare to move the "existing_packs_remove_redundant" function to its own compilation unit by passing in this information as a parameter to that function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: remove 'remove_redundant_pack' from the builtin	Taylor Blau	1	-16/+2
	Extract "remove_redundant_pack()" as generic repack-related functionality by moving its implementation to the repack.[ch] compilation unit. This is a prerequisite to moving the "existing_packs" API, which is one of the callers of this function. (The remaining caller in the pack geometry code will eventually move to its own compilation unit as well, and will likewise rely on this function.) While moving it over, prefix the function name with "repack_" to indicate that it belongs to the repack-subsystem. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: rename many 'struct existing_packs' functions	Taylor Blau	1	-32/+34
	Rename many of the 'struct existing_packs'-related functions according to the convention introduced in and described by 541204aabe (Documentation: document naming schema for structs and their functions, 2024-07-30). Note that some functions which operate over an individual entry in the list of existing packs are prefixed with "existing_pack_" instead of the plural form. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: remove 'prepare_pack_objects' from the builtin	Taylor Blau	1	-34/+0
	Now that the 'prepare_pack_objects' function no longer refers to external, static variables, move it out to repack.h as generic functionality. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: move 'delta_base_offset' to 'struct pack_objects_args'	Taylor Blau	1	-5/+6
	The static variable 'delta_base_offset' determines whether or not we pass the "--delta-base-offset" command-line argument when spawning pack-objects as a child process. Its introduction dates back to when repack was rewritten in C, all the way back in a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15). 'struct pack_objects_args' was introduced much later on in 4571324b99 (builtin/repack.c: allow configuring cruft pack generation, 2022-05-20), but did not move the 'delta_base_offset' variable. Since the 'delta_base_offset' is a property of an individual pack-objects command, re-introduce that variable as a member of 'struct pack_objects_args', which will enable further code movement in the subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: pass both pack_objects args to repack_config	Taylor Blau	1	-2/+13
	A subsequent commit will remove 'delta_base_offset' as a static variable within builtin/repack.c, and reintroduce it as a member of the 'struct pack_objects_args'. As a result, the repack_config callback will need to have both the cruft- and non-cruft 'struct pack_objects_args's in scope. Introduce a new 'struct repack_config_ctx' to allow the callee to provide both pointers to the callback. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	repack: introduce new compilation unit	Taylor Blau	1	-24/+1
	Over the years, builtin/repack.c has turned into a grab-bag of functionality powering the 'git repack' builtin. Among its many capabilities, it: - can build and spawn 'git pack-objects' commands, which in turn generate new packs - has infrastructure to manage the set of existing packs in a repository - has infrastructure to split a sequence of packs into a geometric progression based on object size - can manage both generating and combining cruft packs together - can write new MIDXs to name a few. As a result, this builtin has accumulated a lot of code, making adding new functionality difficult. In the future, 'repack' will learn how to manage a chain of incremental MIDXs, adding yet more functionality into the builtin. As a prerequisite step, let's first move some of the functionality in the builtin into its own repack.[ch]. This will be done over the course of many steps, since there are many individual components, some of which will end up in other, yet-to-exist compilation units of their own. Some of the code movement here is also non-trivial, so performing it in individual steps will make it easier to verify. Let's start by migrating 'struct pack_objects_args' (and the related corresponding pack_objects_args_release() function) into repack.h, and teach both the Makefile and Meson how to build the new compilation unit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid using `hash_to_hex()` in pack geometry	Taylor Blau	1	-1/+3
	In previous commits, we started passing either repository or git_hash_algo pointers around to various spots within builtin/repack.c to reduce our dependency on the_repository in the hope of undef'ing USE_THE_REPOSITORY_VARIABLE. This commit takes us as far as we can (easily) go in that direction by removing the only use of a convenience function that only exists when USE_THE_REPOSITORY_VARIABLE is defined. Unfortunately, the only other such function is "is_bare_repository()", which is less than straightforward to convert into, say, "repo_is_bare()", the latter of the two accepting a repository pointer. Punt on that for now, and declare this commit as the stopping point for our efforts in the direction of undef'ing USE_THE_REPOSITORY_VARIABLE. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_hash_algo" in `finish_pack_objects_cmd()`	Taylor Blau	1	-5/+8
	In a similar spirit as previous commits, avoid referring directly to "the_hash_algo" in builtin/repack.c::finish_pack_objects_cmd() and instead accept one as a parameter to the function. Since this function has a number of callers throughout the builtin, the diff is a little noisier than previous commits. However, each hunk is limited to passing the hash_algo parameter from a repository pointer that is already in scope. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack: avoid "the_hash_algo" in `repack_promisor_objects()`	Taylor Blau	1	-1/+1
	In a similar spirit as the previous commits, avoid referring directly to "the_hash_algo" within builtin/repack.c::repack_promisor_objects(). Since there is already a repository pointer in scope, use its hash_algo value instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_hash_algo" in `write_oid()`	Taylor Blau	1	-3/+12
	In a similar spirit as the previous commit, avoid referring directly to "the_hash_algo" within builtin/repack.c::write_oid(). Unlike the previous commit, we are within a callback function, so must introduce a new struct to pass additional data through its "data" pointer. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_hash_algo" when deleting packs	Taylor Blau	1	-4/+6
	The "mark_packs_for_deletion_1" function uses "the_hash_algo->hexsz" to isolate a pack's checksum before deleting it to avoid deleting a newly written pack having the same checksum (that is, some generated pack wound up identical to an existing pack). Avoid this by passing down a "struct git_hash_algo" pointer, and refer to the hash algorithm through it instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_repository" when repacking promisor objects	Taylor Blau	1	-3/+4
	Pass a "struct repository" pointer to the 'repack_promisor_objects()' function to avoid using "the_repository". Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_repository" when removing packs	Taylor Blau	1	-8/+10
	The 'remove_redundant_pack()' function uses "the_repository" to obtain, and optionally remove, the repository's MIDX. Instead of relying on "the_repository", pass around a "struct repository *" parameter through its callers, and use that instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_repository" when taking a ref snapshot	Taylor Blau	1	-7/+9
	Avoid using "the_repository" in various MIDX-related ref snapshotting functions. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_repository" in existing packs API	Taylor Blau	1	-3/+5
	There are a number of spots within builtin/repack.c which refer to "the_repository", and either make use of the "existing packs" API or otherwise have a 'struct existing_packs *' in scope. Add a "repo" member to "struct existing_packs" and use that instead of "the_repository" in such locations. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-16	builtin/repack.c: avoid "the_repository" in `cmd_repack()`	Taylor Blau	1	-15/+16
	Reduce builtin/repack.c's reliance on `the_repository` by using the currently-UNUSED "repo" parameter within cmd_repack(). The following commits will continue to reduce the usage of the_repository in other places within builtin/repack.c. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-15	Merge branch 'ps/odb-clean-stale-wrappers' into maint-2.51	Junio C Hamano	1	-3/+3
	Code clean-up. * ps/odb-clean-stale-wrappers: odb: drop deprecated wrapper functions
2025-10-15	Merge branch 'kn/refs-files-case-insensitive' into maint-2.51	Junio C Hamano	1	-3/+18
	Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-10-15	Merge branch 'jk/add-i-color' into maint-2.51	Junio C Hamano	1	-1/+4
	Some among "git add -p" and friends ignored color.diff and/or color.ui configuration variables, which is an old regression, which has been corrected. * jk/add-i-color: contrib/diff-highlight: mention interactive.diffFilter add-interactive: manually fall back color config to color.ui add-interactive: respect color.diff for diff coloring stash: pass --no-color to diff plumbing child processes
2025-10-15	Merge branch 'jc/diff-no-index-in-subdir' into maint-2.51	Junio C Hamano	1	-0/+15
	"git diff --no-index" run inside a subdirectory under control of a Git repository operated at the top of the working tree and stripped the prefix from the output, and oddballs like "-" (stdin) did not work correctly because of it. Correct the set-up by undoing what the set-up sequence did to cwd and prefix. * jc/diff-no-index-in-subdir: diff: --no-index should ignore the worktree
2025-10-15	Merge branch 'ps/reflog-migrate-fixes' into maint-2.51	Junio C Hamano	1	-19/+84
	"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-10-14	Merge branch 'kh/format-patch-range-diff-notes'	Junio C Hamano	2	-12/+11
	"git format-patch --range-diff=... --notes=..." did not drive the underlying range-diff with correct --notes parameter, ending up comparing with different set of notes from its main patch output you would get from "git format-patch --notes=..." for a singleton patch. * kh/format-patch-range-diff-notes: format-patch: handle range-diff on notes correctly for single patches revision: add rdiff_log_arg to rev_info range-diff: rename other_arg to log_arg
2025-10-13	builtin/cat-file.c: simplify calling `report_object_status()`	Taylor Blau	1	-1/+1
	In b0b910e052 (cat-file.c: add batch handling for submodules, 2025-06-02), we began handling submodule entries specially when batching cat-file like so: $ echo :sha1collisiondetection \| git.compile cat-file --batch-check 855827c583bc30645ba427885caa40c5b81764d2 submodule Commit b0b910e052 notes that submodules are handled differently than non-existent objects, which print "<given-name> <type>", since there is (a) no object to resolve the OID of in the first place, and as commit b0b910e052 notes, (b) for submodules in particular, it is useful to know what commit it points at without having to spawn another Git process. That commit does so by calling report_object_status() and passing in "oid_to_hex(&data->oid)" for the "obj_name" parameter. This is unnecessary, however, since report_object_status() will do the same automatically if given a NULL "obj_name" argument. That behavior dates back to 6a951937ae (cat-file: add --batch-all-objects option, 2015-06-22), so rely on that instead of having the caller open-code that part of report_object_status(). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-13	fast-import: add '--signed-tags=<mode>' option	Christian Couder	1	-0/+43
	Recently, eaaddf5791 (fast-import: add '--signed-commits=<mode>' option, 2025-09-17) added support for controlling how signed commits are handled by `git fast-import`, but there is no option yet to decide about signed tags. To remediate that, let's add a '--signed-tags=<mode>' option to `git fast-import` too. With this, both `git fast-export` and `git fast-import` have both a '--signed-tags=<mode>' and a '--signed-commits=<mode>' supporting the same <mode>s. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-13	fast-export: handle all kinds of tag signatures	Christian Couder	1	-4/+3
	Currently the handle_tag() function in "builtin/fast-export.c" searches only for "\n-----BEGIN PGP SIGNATURE-----\n" in the tag message to find a tag signature. This doesn't handle all kinds of OpenPGP signatures as some can start with "-----BEGIN PGP MESSAGE-----" too, and this doesn't handle SSH and X.509 signatures either as they use "-----BEGIN SSH SIGNATURE-----" and "-----BEGIN SIGNED MESSAGE-----" respectively. To handle all these kinds of tag signatures supported by Git, let's use the parse_signed_buffer() function to properly find signatures in tag messages. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-09	rev-parse: allow printing compatibility hash	brian m. carlson	1	-1/+10
	Right now, we have a way to print the storage hash, the input hash, and the output hash, but we lack a way to print the compatibility hash. Add a new type to --show-object-format, compat, which prints this value. If no compatibility hash exists, simply print a newline. This is important to allow users to use multiple options at once while still getting unambiguous output. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-08	Merge branch 'ml/reflog-write-committer-info-fix'	Junio C Hamano	1	-0/+2
	"git reflog write" did not honor the configured user.name/email which has been corrected. * ml/reflog-write-committer-info-fix: builtin/reflog: respect user config in "write" subcommand
2025-10-07	Merge branch 'ps/odb-clean-stale-wrappers'	Junio C Hamano	1	-3/+3
	Code clean-up. * ps/odb-clean-stale-wrappers: odb: drop deprecated wrapper functions
2025-10-07	Merge branch 'ps/packfile-store'	Junio C Hamano	12	-43/+63
	Code clean-up around the in-core list of all the pack files and object database(s). * ps/packfile-store: packfile: refactor `get_packed_git_mru()` to work on packfile store packfile: refactor `get_all_packs()` to work on packfile store packfile: refactor `get_packed_git()` to work on packfile store packfile: move `get_multi_pack_index()` into "midx.c" packfile: introduce function to load and add packfiles packfile: refactor `install_packed_git()` to work on packfile store packfile: split up responsibilities of `reprepare_packed_git()` packfile: refactor `prepare_packed_git()` to work on packfile store packfile: reorder functions to avoid function declaration odb: move kept cache into `struct packfile_store` odb: move MRU list of packfiles into `struct packfile_store` odb: move packfile map into `struct packfile_store` odb: move initialization bit into `struct packfile_store` odb: move list of packfiles into `struct packfile_store` packfile: introduce a new `struct packfile_store`
2025-10-02	Merge branch 'kh/you-still-use-whatchanged-fix'	Junio C Hamano	2	-2/+8
	The "do you still use it?" message given by a command that is deeply deprecated and allow us to suggest alternatives has been updated. * kh/you-still-use-whatchanged-fix: BreakingChanges: remove claim about whatchanged reports whatchanged: remove not-even-shorter clause whatchanged: hint about git-log(1) and aliasing you-still-use-that??: help the user help themselves t0014: test shadowing of aliases for a sample of builtins git: allow alias-shadowing deprecated builtins git: move seen-alias bookkeeping into handle_alias(...) git: add `deprecated` category to --list-cmds Makefile: don’t add whatchanged after it has been removed
2025-10-02	Merge branch 'ps/config-get-color-fixes'	Junio C Hamano	1	-5/+15
	The use of "git config get" command to learn how ANSI color sequence is for a particular type, e.g., "git config get --type=color --default=reset no.such.thing", isn't very ergonomic. * ps/config-get-color-fixes: builtin/config: do not spawn pager when printing color codes builtin/config: special-case retrieving colors without a key builtin/config: do not die in `get_color()` t1300: small style fixups t1300: write test expectations in the test's body
2025-10-02	Merge branch 'cc/fast-import-strip-signed-commits'	Junio C Hamano	2	-24/+58
	"git fast-import" learned that "--signed-commits=<how>" option that corresponds to that of "git fast-export". * cc/fast-import-strip-signed-commits: fast-import: add '--signed-commits=<mode>' option gpg-interface: refactor 'enum sign_mode' parsing
2025-10-02	Merge branch 'ms/refs-optimize'	Junio C Hamano	2	-49/+22
	"git refs optimize" is added for not very well explained reason despite it does the same thing as "git pack-refs"... * ms/refs-optimize: t: add test for git refs optimize subcommand t0601: refactor tests to be shareable builtin/refs: add optimize subcommand doc: pack-refs: factor out common options builtin/pack-refs: factor out core logic into a shared library builtin/pack-refs: convert to use the generic refs_optimize() API reftable-backend: implement 'optimize' action files-backend: implement 'optimize' action refs: add a generic 'optimize' API
2025-10-02	Merge branch 'jt/odb-transaction'	Junio C Hamano	3	-17/+24
	The work to build on the bulk-checkin infrastructure to create many objects at once in a transaction and to abstract it into the generic object layer continues. * jt/odb-transaction: odb: add transaction interface object-file: update naming from bulk-checkin object-file: relocate ODB transaction code bulk-checkin: drop flush_odb_transaction() builtin/update-index: end ODB transaction when --verbose is specified bulk-checkin: remove ODB transaction nesting
2025-10-01	builtin/reflog: respect user config in "write" subcommand	Michael Lohmann	1	-0/+2
	The reflog write recognizes only GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL environment variables, but forgot to honor the user.name and user.email configuration variables, due to lack of repo_config() call to grab these values from the configuration files. The test suite sets these variables, so this behavior was unnoticed. Ensure that the reflog write also uses the values of user.name and user.email if set in the Git configuration. Co-authored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Michael Lohmann <git@lohmann.sh> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-29	Merge branch 'tc/last-modified-recursive-fix'	Junio C Hamano	1	-0/+1
	"git last-modified" operating in non-recursive mode used to trigger a BUG(), which has been corrected. * tc/last-modified-recursive-fix: last-modified: fix bug when some paths remain unhandled
2025-09-29	Merge branch 'kn/refs-files-case-insensitive'	Junio C Hamano	1	-3/+18
	Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-09-29	Merge branch 'jk/color-variable-fixes'	Junio C Hamano	10	-23/+25
	Some places in the code confused a variable that is not a boolean to enable color but is an enum that records what the user requested to do about color. A couple of bugs of this sort have been fixed, while the code has been cleaned up to prevent similar bugs in the future. * jk/color-variable-fixes: config: store want_color() result in a separate bool add-interactive: retain colorbool values longer color: return bool from want_color() color: use git_colorbool enum type to store colorbools pretty: use format_commit_context.auto_color as colorbool diff: stop passing ecbdata->use_color as boolean diff: pass o->use_color directly to fill_metainfo() diff: don't use diff_options.use_color as a strict bool diff: simplify color_moved check when flushing grep: don't treat grep_opt.color as a strict bool color: return enum from git_config_colorbool() color: use GIT_COLOR_* instead of numeric constants
2025-09-29	Merge branch 'dk/stash-apply-index'	Junio C Hamano	1	-6/+11
	The stash.index configuration variable can be set to make "git stash pop/apply" pretend that it was invoked with "--index". * dk/stash-apply-index: stash: honor stash.index in apply, pop modes stash: refactor private config globals t3905: remove unneeded blank line t3903: reduce dependencies on previous tests
2025-09-29	Merge branch 'jk/setup-revisions-freefix'	Junio C Hamano	5	-16/+10
	There are double frees and leaks around setup_revisions() API used in "git stash show", which has been fixed, and setup_revisions() API gained a wrapper to make it more ergonomic when using it with strvec-manged argc/argv pairs. * jk/setup-revisions-freefix: revision: retain argv NULL invariant in setup_revisions() treewide: pass strvecs around for setup_revisions_from_strvec() treewide: use setup_revisions_from_strvec() when we have a strvec revision: add wrapper to setup_revisions() from a strvec revision: manage memory ownership of argv in setup_revisions() stash: tell setup_revisions() to free our allocated strings
2025-09-29	Merge branch 'ps/packfile-store' into tb/incremental-midx-part-3.1	Junio C Hamano	12	-43/+63
	* ps/packfile-store: packfile: refactor `get_packed_git_mru()` to work on packfile store packfile: refactor `get_all_packs()` to work on packfile store packfile: refactor `get_packed_git()` to work on packfile store packfile: move `get_multi_pack_index()` into "midx.c" packfile: introduce function to load and add packfiles packfile: refactor `install_packed_git()` to work on packfile store packfile: split up responsibilities of `reprepare_packed_git()` packfile: refactor `prepare_packed_git()` to work on packfile store packfile: reorder functions to avoid function declaration odb: move kept cache into `struct packfile_store` odb: move MRU list of packfiles into `struct packfile_store` odb: move packfile map into `struct packfile_store` odb: move initialization bit into `struct packfile_store` odb: move list of packfiles into `struct packfile_store` packfile: introduce a new `struct packfile_store`
2025-09-25	revision: add rdiff_log_arg to rev_info	Kristoffer Haugsbakk	1	-4/+3
	git-format-patch(1) supports Git notes by showing them beneath the patch/commit message, similar to git-log(1). The command also supports showing those same notes ref names in the range diff output. Note the same ref names; any Git notes options or configuration variables need to be handed off to the range-diff machinery. This works correctly in the case when the range diff is on the cover letter. But it does not work correctly when the output is a single patch with an embedded range diff. Concretely, git-format-patch(1) needs to pass `--[no-]notes` options on to the range-diff subprocess in `range-diff.c`. This is handled in `builtin/log.c` by the local variable `log_arg` in the case of mul- tiple commits, but not in the single commit case where there is no cover letter and the range diff is embedded in the patch output; the range diff is then made in `log-tree.c`, whither `log_arg` has not been propagated. This means that the range-diff subprocess reverts to its default behavior, which is to act like git-log(1) w.r.t. notes. We need to fix this. But first lay the groundwork by converting `log_arg` to a struct member; next we can simply use that member in `log-tree.c` without having to thread it from `builtin/log.c`. No functional changes. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25	range-diff: rename other_arg to log_arg	Kristoffer Haugsbakk	2	-12/+12
	Rename `other_arg` to `log_arg` in `range_diff_options` and related places. “Other argument” comes from bd361918 (range-diff: pass through --notes to `git log`, 2019-11-20) which introduced Git notes handling to git-range-diff(1) by passing that option on to git-log(1). And that kind of name might be fine in a local context. However, it was initially spread among multiple files, and is now[1] part of the `range_diff_options` struct. It is, prima facie, difficult to guess what “other” means, especially when just looking at the struct. But with a little reading we find out that it is used for `--[no-]notes` and `--diff-merges`, which are both passed on to git-log(1). We should just rename it to reflect this role; `log_arg` suggests, along with the `strvec` type, that it is used to pass extra arguments to git-log(1). † 1: since f1ce6c19 (range-diff: combine all options in a single data structure, 2021-02-05) Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: refactor `get_packed_git_mru()` to work on packfile store	Patrick Steinhardt	1	-2/+2
	The `get_packed_git_mru()` function prepares the packfile store and then returns its packfiles in most-recently-used order. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: refactor `get_all_packs()` to work on packfile store	Patrick Steinhardt	8	-25/+49
	The `get_all_packs()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: refactor `get_packed_git()` to work on packfile store	Patrick Steinhardt	2	-2/+2
	The `get_packed_git()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: introduce function to load and add packfiles	Patrick Steinhardt	2	-9/+5
	We have a recurring pattern where we essentially perform an upsert of a packfile in case it isn't yet known by the packfile store. The logic to do so is non-trivial as we have to reconstruct the packfile's key, check the map of packfiles, then create the new packfile and finally add it to the store. Introduce a new function that does this dance for us. Refactor callsites to use it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: refactor `install_packed_git()` to work on packfile store	Patrick Steinhardt	2	-2/+2
	The `install_packed_git()` functions adds a packfile to a specific object store. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24	packfile: split up responsibilities of `reprepare_packed_git()`	Patrick Steinhardt	4	-5/+5
	In `reprepare_packed_git()` we perform a couple of operations: - We reload alternate object directories. - We clear the loose object cache. - We reprepare packfiles. While the logic is hosted in "packfile.c", it clearly reaches into other subsystems that aren't related to packfiles. Split up the responsibility and introduce `odb_reprepare()` which now becomes responsible for repreparing the whole object database. The existing `reprepare_packed_git()` function is refactored accordingly and only cares about reloading the packfile store now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-23	Merge branch 'rs/get-oid-with-flags-cleanup'	Junio C Hamano	3	-19/+9
	Code clean-up. * rs/get-oid-with-flags-cleanup: use repo_get_oid_with_flags()
2025-09-23	Merge branch 'jk/add-i-color'	Junio C Hamano	1	-1/+4
	Some among "git add -p" and friends ignored color.diff and/or color.ui configuration variables, which is an old regression, which has been corrected. * jk/add-i-color: contrib/diff-highlight: mention interactive.diffFilter add-interactive: manually fall back color config to color.ui add-interactive: respect color.diff for diff coloring stash: pass --no-color to diff plumbing child processes
2025-09-22	treewide: pass strvecs around for setup_revisions_from_strvec()	Jeff King	2	-5/+4
	The previous commit converted callers of setup_revisions() with a strvec to use the safer and easier _from_strvec() variant. Let's now convert spots that don't directly have a strvec, but receive an argc/argv pair that eventually comes from one. We'll instead pass the strvec down to the point where we call setup_revisions(). That makes these functions slightly less flexible if they were to grow other callers that don't use strvecs, but this rigidity is buying us some safety. It is only safe to pass the free_removed_argv_elements option to setup_revisions() if we know the elements of argv/argc are allocated on the heap. That isn't communicated in the type system when we are passed the bare elements. But if we get a strvec, we know that the elements are allocated strings. And at any rate, each of these modified functions has only a single caller (that has a strvec), so the loss of flexibility is unlikely to ever matter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22	treewide: use setup_revisions_from_strvec() when we have a strvec	Jeff King	1	-1/+2
	The previous commit introduced a wrapper to make using setup_revisions() with a strvec easier and safer. It converted spots that were already doing most of what the wrapper did. Let's now convert spots where we were not setting up the free_removed_argv_elements flag. As discussed in the previous commit, this probably isn't fixing any bugs or leaks (since these sites wouldn't trigger the re-shuffling of argv that causes them). This is mostly future-proofing us against setup_revisions() becoming more aggressive about its re-shuffling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22	revision: add wrapper to setup_revisions() from a strvec	Jeff King	2	-11/+4
	The setup_revisions() function was designed to take the argc/argv pair from the operating system. But we sometimes construct our own argv using a strvec and pass that in. There are a few gotchas that callers need to deal with here: 1. You should always pass the free_removed_argv_elements option via setup_revision_opt. Otherwise, entries may be leaked if setup_revisions() re-shuffles options. 2. After setup_revisions() returns, the strvec state is odd. We get a reduced argc from setup_revisions() telling us how many unknown options were left in place. Entries after that in argv may be retained, or may be NULL (depending on how the reshuffling happened). But the strvec's "nr" field still represents the original value, and some of the entries it thinks it is still storing may be NULL. Callers must be careful with how they access it. Some callers deal with (1), but not all. In practice they are OK because they do not pass any options that would cause setup_revisions() to re-shuffle (namely unknown options which may be relayed from the user, and the use of the "--" separator). But it's probably a good idea to consistently pass this option anyway to future-proof ourselves against the details of setup_revisions() changing. No callers address (2), though I don't think there any visible bugs. Most of them simply call strvec_clear() and never otherwise look at the result. And in fact, if they naively set foo.nr to the argc returned by setup_revisions(), that would cause leaks! Because setup_revisions() does not free consumed options[1], we have to leave the "nr" field of the strvec at its original value to find and free them during strvec_clear(). So I don't think there are any bugs to fix here, but we can make things safer and simpler for callers. Let's introduce a helper function that sets the free_removed_argv_elements automatically and shrinks the strvec to represent the retained options afterwards (taking care to free the now-obsolete entries). We'll start by converting all of the call-sites which use the free_removed_argv_elements option. There should be no behavior change for them, except that their "shrunken" entries are cleaned up immediately, rather than waiting for a strvec_clear() call. [1] Arguably setup_revisions() should be doing this step for us if we told it to free removed options, but there are many existing callers which will be broken if it did. Introducing this helper is a possible first step towards that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22	stash: tell setup_revisions() to free our allocated strings	Jeff King	1	-1/+2
	In "git stash show", we do a first pass of parsing our command line options by splitting them into revision args and stash args. These are stored in strvecs, and we pass the revision args to setup_revisions(). But setup_revisions() may modify the argv we pass it, causing us to leak some of the entries. In particular, if it sees a "--" string, that will be dropped from argv. This is the same as other cases addressed by f92dbdbc6a (revisions API: don't leak memory on argv elements that need free()-ing, 2022-08-02), and we should fix it the same way: by passing the free_removed_argv_elements option to setup_revisions(). The added test here is run only with SANITIZE=leak, without checking its output, because the behavior of stash with "--" is a little odd: 1. Running "git stash show" will show --stat output. But running "git stash show --" will show --patch. 2. I'd expect a non-option after "--" to be treated as a pathspec, so: git stash show -p 1 -- foo would look treat "1" as a stash (a synonym for stash@{1}) and restrict the resulting diff to "foo". But it doesn't. We split the revision/stash args without any regard to "--". So in the example above both "1" and "foo" are stashes. Which is an error, but also: git stash show -- foo treats "foo" as a stash, not a pathspec. These are both oddities that we may want to address (or may not, if we want to retain historical quirks). But they are well outside the scope of this patch. So for now we'll just let the tests confirm we aren't leaking without otherwise expecting any behavior. If we later address either of those points and end up with another test that covers "stash show --", we can drop this leak-only test. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22	builtin/config: do not spawn pager when printing color codes	Patrick Steinhardt	1	-1/+2
	With `git config get --type=color` the user asks us to parse a specific configuration key and turn the value into an ANSI color escape sequence. The printed string can then for example be used as part of shell scripts to reuse the same colors as Git. Right now though we set up the auto-pager, which means that the string may be written to the pager instead of directly to the terminal. This behaviour is problematic for two reasons: - Color codes are meant for direct terminal output; writing them into a pager does not seem like a sensible thing to do without additional text. - It is inconsistent with `git config --get-color`, which never uses a pager, despite the fact that we claim `git config get --type=color` to be a drop-in replacement in git-config(1). Fix this by disabling the pager when outputting color sequences. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>