aboutsummaryrefslogtreecommitdiffstats
path: root/builtin
AgeCommit message (Collapse)AuthorFilesLines
2025-09-04repo: add the field objects.formatLucas Seiki Oshiro1-0/+7
The flag `--show-object-format` from git-rev-parse is used for retrieving the object storage format. This way, it is used for querying repository metadata, fitting in the purpose of git-repo-info. Add a new field `objects.format` to the git-repo-info subcommand containing that information. Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-04repo: add the flag -z as an alias for --format=nulLucas Seiki Oshiro1-12/+26
Other Git commands that have nul-terminated output (e.g. git-config, git-status, git-ls-files) have a flag `-z` for using the null character as the record separator. Add the `-z` flag to git-repo-info as an alias for `--format=nul`, making it consistent with the behavior of the other commands. Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-02describe: use oidset in finish_depth_computation()René Scharfe1-12/+22
Depth computation can end early if all remaining commits are flagged. The current code determines if that's the case by checking all queue items each time it dequeues a flagged commit. This can cause quadratic complexity. We could simply count the flagged items in the queue and then update that number as we add and remove items. That would provide a general speedup, but leave one case where we have to scan the whole queue: When we flag a previously seen, but unflagged commit. It could be on the queue and then we'd have to decrease our count. We could dedicate an object flag to track queue membership, but that would leave less for candidate tags, affecting the results. So use a hash table, specifically an oidset of commit hashes, to track that. This avoids quadratic behaviour in all cases and provides a nice performance boost over the previous commit, 08bb69d70f (describe: use prio_queue_replace(), 2025-08-03): Benchmark 1: ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 855.3 ms ± 1.3 ms [User: 790.8 ms, System: 49.9 ms] Range (min … max): 853.7 ms … 857.8 ms 10 runs Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 610.8 ms ± 1.7 ms [User: 546.9 ms, System: 49.3 ms] Range (min … max): 608.9 ms … 613.3 ms 10 runs Summary ./git describe $(git rev-list v2.41.0..v2.47.0) ran 1.40 ± 0.00 times faster than ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0) Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-02builtin/refs: add 'exists' subcommandMeet Soni1-0/+48
As part of the ongoing effort to consolidate reference handling, introduce a new `exists` subcommand. This command provides the same functionality and exit-code behavior as `git show-ref --exists`, serving as its modern replacement. The logic for `show-ref --exists` is minimal. Rather than creating a shared helper function which would be overkill for ~20 lines of code, its implementation is intentionally duplicated here. This contrasts with `git refs list`, where sharing the larger implementation of `for-each-ref` was necessary. Documentation for the new subcommand is also added to the `git-refs(1)` man page. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-02Merge branch 'ps/object-store-midx-dedup-info' into ps/packfile-storeJunio C Hamano4-13/+31
* ps/object-store-midx-dedup-info: midx: compute paths via their source midx: stop duplicating info redundant with its owning source midx: write multi-pack indices via their source midx: load multi-pack indices via their source midx: drop redundant `struct repository` parameter odb: simplify calling `link_alt_odb_entry()` odb: return newly created in-memory sources odb: consistently use "dir" to refer to alternate's directory odb: allow `odb_find_source()` to fail odb: store locality in object database sources
2025-08-29range-diff: add configurable memory limit for cost matrixPaulo Casaretto2-0/+22
When comparing large commit ranges (e.g., 250,000+ commits), range-diff attempts to allocate an n×n cost matrix that can exhaust available memory. For example, with 256,784 commits (n = 513,568), the matrix would require approximately 256GB of memory (513,568² × 4 bytes), causing either immediate segmentation faults due to integer overflow or system hangs. Add a memory limit check in get_correspondences() before allocating the cost matrix. This check uses the total size in bytes (n² × sizeof(int)) and compares it against a configurable maximum, preventing both excessive memory usage and integer overflow issues. The limit is configurable via a new --max-memory option that accepts human-readable sizes (e.g., "1G", "500M"). The default is 4GB for 64 bit systems and 2GB for 32 bit systems. This allows comparing ranges of approximately 32,000 (16,000) commits - generous for real-world use cases while preventing impractical operations. When the limit is exceeded, range-diff now displays a clear error message showing both the requested memory size and the maximum allowed, formatted in human-readable units for better user experience. Example usage: git range-diff --max-memory=1G branch1...branch2 git range-diff --max-memory=500M base..topic1 base..topic2 This approach was chosen over alternatives: - Pre-counting commits: Would require spawning additional git processes and reading all commits twice - Limiting by commit count: Less precise than actual memory usage - Streaming approach: Would require significant refactoring of the current algorithm This issue was previously discussed in: https://lore.kernel.org/git/RFC-cover-v2-0.5-00000000000-20211210T122901Z-avarab@gmail.com/ Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-29Merge branch 'jk/describe-blob'Junio C Hamano1-16/+25
"git describe <blob>" misbehaves and/or crashes in some corner cases, which has been taught to exit with failure gracefully. * jk/describe-blob: describe: pass commit to describe_commit() describe: handle blob traversal with no commits describe: catch unborn branch in describe_blob() describe: error if blob not found describe: pass oid struct by const pointer
2025-08-28last-modified: use Bloom filters when availableToon Claes1-2/+46
Our 'git last-modified' performs a revision walk, and computes a diff at each point in the walk to figure out whether a given revision changed any of the paths it considers interesting. When changed-path Bloom filters are available, we can avoid computing many such diffs. Before computing a diff, we first check if any of the remaining paths of interest were possibly changed at a given commit by consulting its Bloom filter. If any of them are, we are resigned to compute the diff. If none of those queries returned "maybe", we know that the given commit doesn't contain any changed paths which are interesting to us. So, we can avoid computing it in this case. Comparing the perf test results on git.git: Test HEAD~ HEAD ------------------------------------------------------------------------------------ 8020.1: top-level last-modified 4.49(4.34+0.11) 2.22(2.05+0.09) -50.6% 8020.2: top-level recursive last-modified 5.64(5.45+0.11) 5.62(5.30+0.11) -0.4% 8020.3: subdir last-modified 0.11(0.06+0.04) 0.07(0.03+0.04) -36.4% Based-on-patch-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28last-modified: new subcommand to show when files were last modifiedToon Claes1-0/+281
Similar to git-blame(1), introduce a new subcommand git-last-modified(1). This command shows the most recent modification to paths in a tree. It does so by expanding the tree at a given commit, taking note of the current state of each path, and then walking backwards through history looking for commits where each path changed into its final commit ID. Based-on-patch-by: Jeff King <peff@peff.net> Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28ls-files: conditionally leave index sparseDerrick Stolee1-3/+10
When running 'git ls-files' with a pathspec, the index entries get filtered according to that pathspec before iterating over them in show_files(). In 78087097b8 (ls-files: add --sparse option, 2021-12-22), this iteration was prefixed with a check for the '--sparse' option which allows the command to output directory entries; this created a pre-loop call to ensure_full_index(). However, when a user runs 'git ls-files' where the pathspec matches directories that are recursively matched in the sparse-checkout, there are not any sparse directories that match the pathspec so they would not be written to the output. The expansion in this case is just a performance drop for no behavior difference. Replace this global check to expand the index with a check inside the loop for a matched sparse directory. If we see one, then expand the index and continue from the current location. This is safe since the previous entries in the index did not have any sparse directories and thus would remain stable in this expansion. A test in t1092 confirms that this changes the behavior. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26config: warn on core.commentString=autoPhillip Wood4-0/+16
As support for this setting was deprecated in the last commit print a warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set. Avoid bombarding the user with warnings by only printing it (a) when running commands that call "git commit" and (b) only once per command. Some scaffolding is added to repo_read_config() to allow it to detect deprecated config settings and warn about them. As both "core.commentChar" and "core.commentString" set the comment character we record which one of them is used and tailor the warning message appropriately. Note the odd combination of die_message() followed by die(NULL) is to allow the next commit to insert a call to advise() in the middle. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26breaking-changes: deprecate support for core.commentString=autoPhillip Wood1-0/+4
When "core.commentString" is set to "auto" then "git commit" will automatically select the comment character ensuring that it is not the first character on any of the lines in the commit message. This was introduced by commit 84c9dc2c5a2 (commit: allow core.commentChar=auto for character auto selection, 2014-05-17). The motivation seems to be to avoid commenting out lines from the existing message when amending a commit that was created with a message from a file. Unfortunately this feature does not work with: * commit message templates that contain comments. * prepare-commit-msg hooks that introduce comments. * "git commit --cleanup=strip --edit -F <file>" which means that it is incompatible with - the "fixup" and "squash" commands of "git rebase -i" as the comments added by those commands are then treated as part of the commit message. - the conflict comments added to the commit message by "git cherry-pick", "git rebase" etc. as these comments are then treated as part of the commit message. It is also ignored by "git notes" when amending a note. The issues with comments coming from a template, hook or file are a consequence of the design of this feature and are therefore hard to fix. As the costs of this feature outweigh the benefits, deprecate it and remove it in Git 3.0. If someone comes up with some patches that fix all the issues in a maintainable way then I'd be happy to see this change reverted. The next commits will add a warning and some advice for users on how they can update their config settings. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-25Merge branch 'lo/repo-info'Junio C Hamano1-0/+150
A new subcommand "git repo" gives users a way to grab various repository characteristics. * lo/repo-info: repo: add the --format flag repo: add the field layout.shallow repo: add the field layout.bare repo: add the field references.format repo: declare the repo command
2025-08-25Merge branch 'ps/commit-graph-wo-globals'Junio C Hamano3-6/+7
Remove dependency on the_repository and other globals from the commit-graph code, and other changes unrelated to de-globaling. * ps/commit-graph-wo-globals: commit-graph: stop passing in redundant repository commit-graph: stop using `the_repository` commit-graph: stop using `the_hash_algo` commit-graph: refactor `parse_commit_graph()` to take a repository commit-graph: store the hash algorithm instead of its length commit-graph: stop using `the_hash_algo` via macros
2025-08-25Merge branch 'dk/help-all'Junio C Hamano1-1/+2
"git cmd --help-all" now works outside repositories. * dk/help-all: builtin: also setup gently for --help-all parse-options: refactor flags for usage_with_options_internal
2025-08-25bulk-checkin: remove global transaction stateJustin Tobler3-7/+10
Object database transactions in the bulk-checkin subsystem rely on global state to track transaction status. Stop relying on global state and instead store the transaction in the `struct object_database`. Functions that operate on transactions are updated to now wire transaction state. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-22Merge branch 'ac/deglobal-fmt-merge-log-config'Junio C Hamano2-4/+5
Code clean-up. * ac/deglobal-fmt-merge-log-config: builtin/fmt-merge-msg: stop depending on 'the_repository' environment: remove the global variable 'merge_log_config'
2025-08-22Merge branch 'jc/diff-no-index-in-subdir'Junio C Hamano1-0/+15
"git diff --no-index" run inside a subdirectory under control of a Git repository operated at the top of the working tree and stripped the prefix from the output, and oddballs like "-" (stdin) did not work correctly because of it. Correct the set-up by undoing what the set-up sequence did to cwd and prefix. * jc/diff-no-index-in-subdir: diff: --no-index should ignore the worktree
2025-08-22Merge branch 'ms/refs-list'Junio C Hamano2-17/+33
The "list" subcommand of "git refs" acts as a front-end for "git for-each-ref". * ms/refs-list: t: add test for git refs list subcommand t6300: refactor tests to be shareable builtin/refs: add list subcommand builtin/for-each-ref: factor out core logic into a helper builtin/for-each-ref: align usage string with the man page doc: factor out common option
2025-08-21Merge branch 'jc/strbuf-split'Junio C Hamano3-63/+64
Arrays of strbuf is often a wrong data structure to use, and strbuf_split*() family of functions that create them often have better alternatives. Update several code paths and replace strbuf_split*(). * jc/strbuf-split: trace2: do not use strbuf_split*() trace2: trim_trailing_newline followed by trim is a no-op sub-process: do not use strbuf_split*() environment: do not use strbuf_split*() config: do not use strbuf_split() notes: do not use strbuf_split*() merge-tree: do not use strbuf_split*() clean: do not use strbuf_split*() [part 2] clean: do not pass the whole structure when it is not necessary clean: do not use strbuf_split*() [part 1] clean: do not pass strbuf by value wt-status: avoid strbuf_split*()
2025-08-21Merge branch 'jc/string-list-split'Junio C Hamano3-3/+3
string_list_split*() family of functions have been extended to simplify common use cases. * jc/string-list-split: string-list: split-then-remove-empty can be done while splitting string-list: optionally omit empty string pieces in string_list_split*() diff: simplify parsing of diff.colormovedws string-list: optionally trim string pieces split by string_list_split*() string-list: unify string_list_split* functions string-list: align string_list_split() with its _in_place() counterpart string-list: report programming error with BUG
2025-08-21Merge branch 'rs/describe-with-prio-queue'Junio C Hamano1-24/+63
"git describe" has been optimized by using better data structure. * rs/describe-with-prio-queue: describe: use prio_queue_replace() describe: use prio_queue
2025-08-21Merge branch 'ps/remote-rename-fix'Junio C Hamano4-136/+227
"git remote rename origin upstream" failed to move origin/HEAD to upstream/HEAD when origin/HEAD is unborn and performed other renames extremely inefficiently, which has been corrected. * ps/remote-rename-fix: builtin/remote: only iterate through refs that are to be renamed builtin/remote: rework how remote refs get renamed builtin/remote: determine whether refs need renaming early on builtin/remote: fix sign comparison warnings refs: simplify logic when migrating reflog entries refs: pass refname when invoking reflog entry callback
2025-08-21Merge branch 'ps/reflog-migrate-fixes'Junio C Hamano1-19/+84
"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-20Merge branch 'lo/repo-info' into lo/repo-info-step-2Junio C Hamano1-0/+150
* lo/repo-info: repo: add the --format flag repo: add the field layout.shallow repo: add the field layout.bare repo: add the field references.format repo: declare the repo command
2025-08-20describe: pass commit to describe_commit()Jeff King1-11/+9
There's a call in describe_commit() to lookup_commit_reference(), but we don't check the return value. If it returns NULL, we'll segfault as we immediately dereference the result. In practice this can never happen, since all callers pass an oid which came from a "struct commit" already. So we can make this more obvious by just taking that commit struct in the first place. Reported-by: Cheng <prophecheng@stu.pku.edu.cn> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20describe: handle blob traversal with no commitsJeff King1-2/+4
When describing a blob, we traverse from HEAD, remembering each commit we saw, and then checking each blob to report the containing commit. But if we haven't seen any commits at all, we'll segfault (we store the "current" commit as an oid initialized to the null oid, causing lookup_commit_reference() to return NULL). This shouldn't be able to happen normally. We always start our traversal at HEAD, which must be a commit (a property which is enforced by the refs code). But you can trigger the segfault like this: blob=$(echo foo | git hash-object -w --stdin) echo $blob >.git/HEAD git describe $blob We can instead catch this case and return an empty result, which hits the usual "we didn't find $blob while traversing HEAD" error. This is a minor lie in that we did "find" the blob. And this even hints at a bigger problem in this code: what if the traversal pointed to the blob as _not_ part of a commit at all, but we had previously filled in the recorded "current commit"? One could imagine this happening due to a tag pointing directly to the blob in question. But that can't happen, because we only traverse from HEAD, never from any other refs. And the intent of the blob-describing code is to find blobs within commits. So I think this matches the original intent as closely as we can (and again, this segfault cannot be triggered without corrupting your repository!). The test here does not use the formula above, which works only for the files backend (and not reftables). Instead we use another loophole to create the bogus state using only Git commands. See the comment in the test for details. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18describe: catch unborn branch in describe_blob()Jeff King1-1/+7
When describing a blob, we search for it by traversing from HEAD. We do this by feeding the name HEAD to setup_revisions(). But if we are on an unborn branch, this will fail with a confusing message: $ git describe $blob fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' It is OK for this to be an error (we cannot find $blob in an empty traversal, so we'd eventually complain about that). But the error message could be more helpful. Let's resolve HEAD ourselves and pass the resolved object id to setup_revisions(). If resolving fails, then we can print a more useful message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18describe: error if blob not foundJeff King1-0/+3
If describe_blob() does not find the blob in question, it returns an empty strbuf, and we print an empty line. This differs from describe_commit(), which always either returns an answer or calls die() itself. As the blob function was bolted onto the command afterwards, I think its behavior is not intentional, and it is just a bug that it does not report an error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18describe: pass oid struct by const pointerJeff King1-4/+4
We pass a "struct object_id" to describe_blob() by value. This isn't wrong, as an oid is composed only of copy-able values. But it's unusual; typically we pass structs by const pointer, including object_ids. Let's do so. It similarly makes sense for us to hold that pointer in the callback data (rather than yet another copy of the oid). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17repo: add the --format flagLucas Seiki Oshiro1-6/+40
Add the --format flag to git-repo-info. By using this flag, the users can choose the format for obtaining the data they requested. Given that this command can be used for generating input for other applications and for being read by end users, it requires at least two formats: one for being read by humans and other for being read by machines. Some other Git commands also have two output formats, notably git-config which was the inspiration for the two formats that were chosen here: - keyvalue, where the retrieved data is printed one per line, using = for delimiting the key and the value. This is the default format, targeted for end users. - nul, where the retrieved data is separated by NUL characters, using the newline character for delimiting the key and the value. This format is targeted for being read by machines. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Justin Tobler <jltobler@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17repo: add the field layout.shallowLucas Seiki Oshiro1-0/+9
This commit is part of the series that introduces the new subcommand git-repo-info. The flag `--is-shallow-repository` from git-rev-parse is used for retrieving whether the repository is shallow. This way, it is used for querying repository metadata, fitting in the purpose of git-repo-info. Then, add a new field `layout.shallow` to the git-repo-info subcommand containing that information. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Justin Tobler <jltobler@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17repo: add the field layout.bareLucas Seiki Oshiro1-0/+10
This commit is part of the series that introduces the new subcommand git-repo-info. The flag --is-bare-repository from git-rev-parse is used for retrieving whether the current repository is bare. This way, it is used for querying repository metadata, fitting in the purpose of git-repo-info. Then, add a new field layout.bare to the git-repo-info subcommand containing that information. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Justin Tobler <jltobler@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17repo: add the field references.formatLucas Seiki Oshiro1-2/+72
This commit is part of the series that introduces the new subcommand git-repo-info. The flag `--show-ref-format` from git-rev-parse is used for retrieving the reference format (i.e. `files` or `reftable`). This way, it is used for querying repository metadata, fitting in the purpose of git-repo-info. Add a new field `references.format` to the repo-info subcommand containing that information. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Justin Tobler <jltobler@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17repo: declare the repo commandLucas Seiki Oshiro1-0/+27
Currently, `git rev-parse` covers a wide range of functionality not directly related to parsing revisions, as its name suggests. Over time, many features like parsing datestrings, options, paths, and others were added to it because there wasn't a more appropriate command to place them. Create a new Git command called `repo`. `git repo` will be the main command for obtaining the information about a repository (such as metadata and metrics). Also declare a subcommand for `repo` called `info`. `git repo info` will bring the functionality of retrieving repository-related information currently returned by `rev-parse`. Add the required documentation and build changes to enable usage of this subcommand. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Justin Tobler <jltobler@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15commit-graph: stop passing in redundant repositoryPatrick Steinhardt1-3/+3
Many of the commit-graph related functions take in both a repository and the object database source (directly or via `struct commit_graph`) for which we are supposed to load such a commit-graph. In the best case this information is simply redundant as the source already contains a reference to its owning object database, which in turn has a reference to its repository. In the worst case this information could even mismatch when passing in a source that doesn't belong to the same repository. Refactor the code so that we only pass in the object database source in those cases. There is one exception though, namely `load_commit_graph_chain_fd_st()`, which is responsible for loading a commit-graph chain. It is expected that parts of the commit-graph chain aren't located in the same object source as the chain file itself, but in a different one. Consequently, this function doesn't work on the source level but on the database level instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15commit-graph: stop using `the_repository`Patrick Steinhardt2-2/+2
There's still a bunch of uses of `the_repository` in "commit-graph.c", which we want to stop using due to it being a global variable. Refactor the code to stop using `the_repository` in favor of the repository provided via the calling context. This allows us to drop the `USE_THE_REPOSITORY_VARIABLE` macro. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-15commit-graph: stop using `the_hash_algo`Patrick Steinhardt1-1/+2
Stop using `the_hash_algo` as it implicitly relies on `the_repository`. Instead, we either use the hash algo provided via the context or, if there is no such hash algo, we use `the_repository` explicitly. Such uses will be removed in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11Merge branch 'rs/merge-compact-summary'Junio C Hamano1-1/+1
Hotfix. * rs/merge-compact-summary: merge: don't document non-existing --compact-summary argument
2025-08-11Merge branch 'rs/for-each-ref-start-after-marker-fix'Junio C Hamano1-1/+1
Hotfix. * rs/for-each-ref-start-after-marker-fix: for-each-ref: call --start-after argument "marker"
2025-08-11midx: stop duplicating info redundant with its owning sourcePatrick Steinhardt1-2/+3
Multi-pack indices store some information that is redundant with their owning source: - The locality bit that tracks whether the source is the primary object source or an alternate. - The object directory path the multi-pack index is located in. - The pointer to the owning parent directory. All of this information is already contained in `struct odb_source`. So now that we always have that struct available when loading a multi-pack index we have it readily accessible. Drop the redundant information and instead store a pointer to the object source. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: write multi-pack indices via their sourcePatrick Steinhardt2-9/+12
Similar to the preceding commit, refactor the writing side of multi-pack indices so that we pass in the object database source where the index should be written to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: load multi-pack indices via their sourcePatrick Steinhardt1-2/+16
To load a multi-pack index the caller is expected to pass both the repository and the object directory where the multi-pack index is located. While this works, this layout has a couple of downsides: - We need to pass in information reduntant with the owning source, namely its object directory and whether the source is local or not. - We don't have access to the source when loading the multi-pack index. If we had that access, we could store a pointer to the owning source in the MIDX and thus deduplicate some information. - Multi-pack indices are inherently specific to the object source and its format. With the goal of pluggable object backends in mind we will eventually want the backends to own the logic of reading and writing multi-pack indices. Making the logic work on top of object sources is a step into that direction. Refactor loading of multi-pack indices accordingly. This surfaces one small problem though: git-multi-pack-index(1) and our MIDX test helper both know to read and write multi-pack-indices located in a different object directory. This issue is addressed by adding the user-provided object directory as an in-memory alternate. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11midx: drop redundant `struct repository` parameterPatrick Steinhardt1-1/+1
There are a couple of functions that take both a `struct repository` and a `struct multi_pack_index`. This provides redundant information though without much benefit given that the multi-pack index already has a pointer to its owning repository. Drop the `struct repository` parameter from such functions. While at it, reorder the list of parameters of `fill_midx_entry()` so that the MIDX comes first to better align with our coding guidelines. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11odb: allow `odb_find_source()` to failPatrick Steinhardt1-2/+2
When trying to locate a source for an unknown object directory we will die right away. In subsequent patches we will add new callsites though that want to handle this situation gracefully instead. Refactor the function to return a `NULL` pointer if the source could not be found and adapt the callsites to die instead. Introduce a new wrapper `odb_find_source_or_die()` that continues to die in case the source could not be found. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11builtin/fmt-merge-msg: stop depending on 'the_repository'Ayush Chandekar1-3/+2
Refactor builtin/fmt-merge-msg.c to remove the dependancy on the global 'the_repository'. Remove the 'UNUSED' macro from the 'struct repository' parameter and replace 'git_config()' with 'repo_config()' so that configuration is read from the passed repository. Also, add a test to make sure that "git fmt-merge-msg -h" can be called outside a repository. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11environment: remove the global variable 'merge_log_config'Ayush Chandekar2-2/+4
The global variable 'merge_log_config', set via the "merge.log" or "merge.summary" settings, is only used in 'cmd_fmt_merge_msg()' and 'cmd_merge()' to adjust the 'shortlog_len' variable. Remove 'merge_log_config' globally and localize it in 'cmd_fmt_merge_msg()' and 'cmd_merge()'. Set its value by passing it in 'fmt_merge_msg_config()' by passing its pointer to the function via the callback parameter. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09diff: --no-index should ignore the worktreeJunio C Hamano1-0/+15
The act of giving "--no-index" tells Git to pretend that the current directory is not under control of any Git index or repository, so even when you happen to be in a Git controlled working tree, where in that working tree should not matter. But the start-up sequence tries to discover the top of the working tree and chdir(2)'s there, even before Git passes control to the subcommand being run. When diff_no_index() starts running, it starts at a wrong (from the end-user's point of view who thinks "git diff --no-index" is merely a better version of GNU diff) directory, and the original directory the user started the command is at "prefix". Because the paths given from argv[] have already been adjusted to account for this path shuffling by prepending the prefix, and showing the resulting path by stripping the prefix, the effect of these nonsense operations (nonsense in the context of "--no-index", that is) is usually not observable. Except for special cases like "-", where it is not preprocessed by prepending the prefix. Instead of papering over by adding more special cases only to cater to the no-index codepath in the generic code, drive the diff machinery more faithfully to what is going on. If the user started "git diff --no-index" in directory X/Y/Z in a working tree controlled by Git, and the start up sequence of Git chdir(2)'ed up to directory X and left Y/Z in the prefix, revert the effect of the start up sequence by chdir'ing back to Y/Z and emptying the prefix. Reported-by: Gregoire Geis <opensource@gregoirege.is> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09merge: don't document non-existing --compact-summary argumentRené Scharfe1-1/+1
3a54f5bd5d (merge/pull: add the "--compact-summary" option, 2025-06-12) added the option --compact-summary to both merge and pull. It takes no no argument, but for merge it got an argument help string. Remove it, since it is unnecessary. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-09for-each-ref: call --start-after argument "marker"René Scharfe1-1/+1
dabecb9db2 (for-each-ref: introduce a '--start-after' option, 2025-07-15) added the option --start-after and referred to its argument as "marker" in documentation and usage string, but not in the option's short help. Use "marker" there as well for consistency and brevity. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-08builtin: also setup gently for --help-allD. Ben Knoble1-1/+2
Git experts often check the help summary of a command to make sure they spell options right when suggesting advice to colleagues. Further, they might check hidden options when responding to queries about deprecated options like git-rebase(1)'s "preserve merges" option. But some commands don't support "--help-all" outside of a git directory. Running (for example) git rebase --help-all outside a directory fails in "setup_git_directory", erroring with the localized form of fatal: not a git repository (or any of the parent directories): .git Like 99caeed05d (Let 'git <command> -h' show usage without a git dir, 2009-11-09), we want to show the "--help-all" output even without a git dir. Make "--help-all" where we expect "-h" to mean "setup_git_directory_gently", and interpose early in the natural place ("show_usage_with_options_if_asked"). Do the same for usage callers with show_usage_if_asked. The exception is merge-recursive, whose help block doesn't use newer APIs. Best-viewed-with: --ignore-space-change Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-07Merge branch 'dl/squelch-maybe-uninitialized'Junio C Hamano1-4/+7
Squelch false-positive compiler warning. * dl/squelch-maybe-uninitialized: t/unit-tests/clar: fix -Wmaybe-uninitialized with -Og remote: bail early from set_head() if missing remote name
2025-08-07Merge branch 'jk/revert-squelch-compiler-warning'Junio C Hamano1-1/+1
Squelch false-positive compiler warning. * jk/revert-squelch-compiler-warning: revert: initialize const value
2025-08-06builtin/remote: only iterate through refs that are to be renamedPatrick Steinhardt1-9/+4
When renaming a remote we also need to rename all references accordingly. But while we only need to rename references that are contained in the "refs/remotes/$OLDNAME/" namespace, we end up using `refs_for_each_rawref()` that iterates through _all_ references. We know to exit early in the callback in case we see an irrelevant reference, but ultimately this is still a waste of compute as we knowingly iterate through references that we won't ever care about. Improve this by using `refs_for_each_rawref_in()`, which knows to only iterate through (potentially broken) references in a given prefix. The following benchmark renames a remote with a single reference in a repository that has 100k unrelated references. This shows a sizeable improvement with the "files" backend: Benchmark 1: rename remote (refformat = files, revision = HEAD~) Time (mean ± σ): 42.6 ms ± 0.9 ms [User: 29.1 ms, System: 8.4 ms] Range (min … max): 40.1 ms … 43.3 ms 10 runs Benchmark 2: rename remote (refformat = files, revision = HEAD) Time (mean ± σ): 31.7 ms ± 4.0 ms [User: 19.6 ms, System: 6.9 ms] Range (min … max): 27.1 ms … 36.0 ms 10 runs Summary rename remote (refformat = files, revision = HEAD) ran 1.35 ± 0.17 times faster than rename remote (refformat = files, revision = HEAD~) The "reftable" backend shows roughly the same absolute improvement, but given that it's already significantly faster than the "files" backend this translates to a much larger relative improvement: Benchmark 1: rename remote (refformat = reftable, revision = HEAD~) Time (mean ± σ): 18.2 ms ± 0.5 ms [User: 12.7 ms, System: 3.0 ms] Range (min … max): 17.3 ms … 21.4 ms 110 runs Benchmark 2: rename remote (refformat = reftable, revision = HEAD) Time (mean ± σ): 8.8 ms ± 0.5 ms [User: 3.8 ms, System: 2.9 ms] Range (min … max): 7.5 ms … 9.9 ms 167 runs Summary rename remote (refformat = reftable, revision = HEAD) ran 2.07 ± 0.12 times faster than rename remote (refformat = reftable, revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06builtin/remote: rework how remote refs get renamedPatrick Steinhardt1-99/+197
It was recently reported [1] that renaming a remote that has dangling symrefs is broken. This issue can be trivially reproduced: $ git init repo Initialized empty Git repository in /tmp/repo/.git/ $ cd repo/ $ git remote add origin /dev/null $ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master $ git remote rename origin renamed $ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master $ git symbolic-ref refs/remotes/renamed/HEAD fatal: ref refs/remotes/renamed/HEAD is not a symbolic ref As one can see, the "HEAD" reference did not get renamed but stays in the same place. There are two issues here: - We use `refs_resolve_ref_unsafe()` to resolve references, but we don't pass the `RESOLVE_REF_NO_RECURSE` flag. Consequently, if the reference does not resolve, the function will fail and we thus ignore this branch. - We use `refs_for_each_ref()` to iterate through the old remote's references, but that function ignores broken references. Both of these issues are easy to fix. But having a closer look at the logic that renames remote references surfaces that it leaves a lot to be desired overall. The problem is that we're using O(|refs| + |symrefs| * 2) many reference transactions to perform the renames. We first delete all symrefs, then individually rename every direct reference and finally we recreate the symrefs. On the one hand this isn't even remotely an atomic operation, so if we hit any error we'll already have deleted all references. But more importantly it is also extremely inefficient. The number of transactions for symrefs doesn't really bother us too much, as there should generally only be a single symref anyway ("HEAD"). But the renames are very expensive: - For the "reftable" backend we perform auto-compaction after every single rename, which does add up. - For the "files" backend we potentially have to rewrite the "packed-refs" file on every single rename in case they are packed. The consequence here is quadratic runtime performance. Renaming a 100k references takes hours to complete. Refactor the code to use a single transaction to perform all the reference updates atomically, which speeds up the transaction quite significantly: Benchmark 1: rename remote (refformat = files, revision = HEAD~) Time (mean ± σ): 238.770 s ± 13.857 s [User: 91.473 s, System: 143.793 s] Range (min … max): 204.863 s … 247.699 s 10 runs Benchmark 2: rename remote (refformat = files, revision = HEAD) Time (mean ± σ): 2.103 s ± 0.036 s [User: 0.360 s, System: 1.313 s] Range (min … max): 2.011 s … 2.141 s 10 runs Summary rename remote (refformat = files, revision = HEAD) ran 113.53 ± 6.87 times faster than rename remote (refformat = files, revision = HEAD~) For the "reftable" backend we see a significant speedup, as well, but given that we don't have quadratic runtime behaviour there it's way less extreme: Benchmark 1: rename remote (refformat = reftable, revision = HEAD~) Time (mean ± σ): 8.604 s ± 0.539 s [User: 4.985 s, System: 2.368 s] Range (min … max): 7.880 s … 9.556 s 10 runs Benchmark 2: rename remote (refformat = reftable, revision = HEAD) Time (mean ± σ): 1.177 s ± 0.103 s [User: 0.446 s, System: 0.270 s] Range (min … max): 1.023 s … 1.410 s 10 runs Summary rename remote (refformat = reftable, revision = HEAD) ran 7.31 ± 0.79 times faster than rename remote (refformat = reftable, revision = HEAD~) There is one issue though with using atomic transactions: when nesting a remote into itself it can happen that renamed references conflict with the old referencse. For example, when we have a reference "refs/remotes/origin/foo" and we rename "origin" to "origin/foo", then we'll end up with an F/D conflict when we try to create the renamed reference "refs/remotes/origin/foo/foo". This situation is overall quite unlikely to happen: people tend to not use nested remotes, and if they do they must at the same time also have a conflicting refname. But the end result would be that the old remote references stay intact whereas all the other parts of the repository have been adjusted for the new remote name. Address this by queueing and preparing the reference update before we touch any other part of the repository. Like this we can make sure that the reference update will go through before rewriting the configuration. Otherwise, if the transaction fails to prepare we can gracefully abort the whole operation without any changes having been performed in the repository yet. Furthermore, we can detect the conflict and print some helpful advice for how the user can resolve this situation. So overall, the tradeoff is that: - Reference transactions are now all-or-nothing. This is a significant improvement over the previous state where we may have ended up with partially-renamed references. - Rewriting references is now significantly faster. - We only rewrite the configuration in case we know that all references can be updated. - But we may refuse to rename a remote in case references conflict. Overall this seems like an acceptable tradeoff. While at it, fix the handling of symbolic/broken references by using `refs_for_each_rawref()`. Add tests that cover both this reported issue and tests that exercise nesting of remotes. One thing to note: with this change we cannot provide a proper progress monitor anymore as we queue the references into the transactions as we iterate through them. Consequently, as we don't know yet how many refs there are in total, we cannot report how many percent of the operation is done anymore. But that's a small price to pay considering that you now shouldn't need the progress monitor in most situations at all anymore. [1]: <CANrWfmQWa=RJnm7d3C7ogRX6Tth2eeuGwvwrNmzS2gr+eP0OpA@mail.gmail.com> Reported-by: Han Jiang <jhcarl0814@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06builtin/remote: determine whether refs need renaming early onPatrick Steinhardt1-4/+8
When renaming a remote we may have to also rename remote refs in case the refspec changes. Pull out this computation into a separate loop. While that seems nonsensical right now, it'll help us in a subsequent commit where we will prepare the reference transaction before we rewrite the configuration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06builtin/remote: fix sign comparison warningsPatrick Steinhardt1-31/+23
Fix -Wsign-comparison warnings. All of the warnings we have are about mismatches in signedness for loop counters. These are trivially fixable by using the correct integer type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs: pass refname when invoking reflog entry callbackPatrick Steinhardt3-8/+10
With `refs_for_each_reflog_ent()` callers can iterate through all the reflog entries for a given reference. The callback that is being invoked for each such entry does not receive the name of the reference that we are currently iterating through. This isn't really a limiting factor, as callers can simply pass the name via the callback data. But this layout sometimes does make for a bit of an awkward calling pattern. One example: when iterating through all reflogs, and for each reflog we iterate through all refnames, we have to do some extra book keeping to track which reference name we are currently yielding reflog entries for. Change the signature of the callback function so that the reference name of the reflog gets passed through to it. Adapt callers accordingly and start using the new parameter in trivial cases. The next commit will refactor the reference migration logic to make use of this parameter so that we can simplify its logic a bit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06Merge branch 'ps/reflog-migrate-fixes' into ps/remote-rename-fixJunio C Hamano1-19/+84
* ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-06builtin/reflog: implement subcommand to write new entriesPatrick Steinhardt1-0/+65
While we provide a couple of subcommands in git-reflog(1) to remove reflog entries, we don't provide any to write new entries. Obviously this is not an operation that really would be needed for many use cases out there, or otherwise people would have complained that such a command does not exist yet. But the introduction of the "reftable" backend changes the picture a bit, as it is now basically impossible to manually append a reflog entry if one wanted to do so due to the binary format. Plug this gap by introducing a simple "write" subcommand. For now, all this command does is to append a single new reflog entry with the given object IDs and message to the reflog. More specifically, it is not yet possible to: - Write multiple reflog entries at once. - Insert reflog entries at arbitrary indices. - Specify the date of the reflog entry. - Insert reflog entries that refer to nonexistent objects. If required, those features can be added at a future point in time. For now though, the new command aims to fulfill the most basic use cases while being as strict as possible when it comes to verifying parameters. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06builtin/reflog: improve grouping of subcommandsPatrick Steinhardt1-19/+19
The way subcommands of git-reflog(1) are laid out does not make any immediate sense. Reorder them such that read-only subcommands precede writing commands for a bit more structure. Furthermore, move the "expire" subcommand last. This prepares for a subsequent change where we are about to introduce a new "write" command to append reflog entries. Like this, the writing subcommands are ordered such that those affecting a single reflog come before those spanning across all reflogs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05Merge branch 'kj/renamed-submodule'Junio C Hamano1-6/+44
The case where a new submodule takes a path where used to be a completely different subproject is now dealt a bit better than before. * kj/renamed-submodule: fixup! submodule: skip redundant active entries when pattern covers path fixup! submodule: prevent overwriting .gitmodules on path reuse submodule: skip redundant active entries when pattern covers path submodule: prevent overwriting .gitmodules on path reuse
2025-08-05Merge branch 'ps/object-file-wo-the-repository'Junio C Hamano17-43/+64
Reduce implicit assumption and dependence on the_repository in the object-file subsystem. * ps/object-file-wo-the-repository: object-file: get rid of `the_repository` in index-related functions object-file: get rid of `the_repository` in `force_object_loose()` object-file: get rid of `the_repository` in `read_loose_object()` object-file: get rid of `the_repository` in loose object iterators object-file: remove declaration for `for_each_file_in_obj_subdir()` object-file: inline `for_each_loose_file_in_objdir_buf()` object-file: get rid of `the_repository` when writing objects odb: introduce `odb_write_object()` loose: write loose objects map via their source object-file: get rid of `the_repository` in `finalize_object_file()` object-file: get rid of `the_repository` in `loose_object_info()` object-file: get rid of `the_repository` when freshening objects object-file: inline `check_and_freshen()` functions object-file: get rid of `the_repository` in `has_loose_object()` object-file: stop using `the_hash_algo` object-file: fix -Wsign-compare warnings
2025-08-05builtin/refs: add list subcommandMeet Soni1-0/+14
Git's reference management is distributed across multiple commands. As part of an ongoing effort to consolidate and modernize reference handling, introduce a `list` subcommand under the `git refs` umbrella as a replacement for `git for-each-ref`. Implement `cmd_refs_list` by having it call the `for_each_ref_core()` helper function. This helper was factored out of the original `cmd_for_each_ref` in a preceding commit, allowing both commands to share the same core logic as independent peers. Add documentation for the new command. The man page leverages the shared options file, created in a previous commit, by using the AsciiDoc `include::` macro to ensure consistency with git-for-each-ref(1). Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05builtin/for-each-ref: factor out core logic into a helperMeet Soni1-22/+19
The implementation of `git for-each-ref` is monolithic within `cmd_for_each_ref()`, making it impossible to share its logic with other commands. To enable code reuse for the upcoming `git refs list` subcommand, refactor the core logic into a shared helper function. Introduce a new `for-each-ref.h` header to define the public interface for this shared logic. It contains the declaration for a new helper function, `for_each_ref_core()`, and a macro for the common usage options. Move the option parsing, filtering, and formatting logic from `cmd_for_each_ref()` into a new helper function named `for_each_ref_core()`. This helper is made generic by accepting the command's usage string as a parameter. The original `cmd_for_each_ref()` is simplified to a thin wrapper that is only responsible for defining its specific usage array and calling the shared helper. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05builtin/for-each-ref: align usage string with the man pageMeet Soni1-5/+10
Usage string for `git for-each-ref` was out of sync with its official documentation. The test `t0450-txt-doc-vs-help.sh` was marked as broken due to this. Update the usage string to match the documentation. This allows the test to pass, so remove the corresponding 'known breakage' marker from the test file. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05remote: bail early from set_head() if missing remote nameJeff King1-4/+7
In "git remote set-head", we can take varying numbers of arguments depending on whether we saw the "-d" or "-a" options. But the first argument is always the remote name. The current code is somewhat awkward in that it conditionally handles the remote name up-front like this: if (argc) remote = ...from argv[0]... and then only later decides to bail if we do not have the right number of arguments for the options we saw. This makes it hard to figure out if "remote" is always set when it needs to be. Both for humans, but also for compilers; with -Og, gcc complains that "remote" can be accessed without being initialized (although this is not true, as we'd always die with a usage message in that case). Let's instead enforce the presence of the remote argument up front, which fixes the compiler warning and is easier to understand. It does mean duplicating the code to print a usage message, but it's a single line. Noticed-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Tested-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-04Merge branch 'lm/add-p-context'Junio C Hamano5-21/+120
"git add/etc -p" now honor the diff.context configuration variable, and also they learn to honor the -U<n> command-line option. * lm/add-p-context: add-patch: add diff.context command line overrides add-patch: respect diff.context configuration t: use test_config in t4055 t: use test_grep in t3701 and t4055
2025-08-04Merge branch 'ps/config-wo-the-repository'Junio C Hamano93-258/+298
The config API had a set of convenience wrapper functions that implicitly use the_repository instance; they have been removed and inlined at the calling sites. * ps/config-wo-the-repository: (21 commits) config: fix sign comparison warnings config: move Git config parsing into "environment.c" config: remove unused `the_repository` wrappers config: drop `git_config_set_multivar()` wrapper config: drop `git_config_get_multivar_gently()` wrapper config: drop `git_config_set_multivar_in_file_gently()` wrapper config: drop `git_config_set_in_file_gently()` wrapper config: drop `git_config_set()` wrapper config: drop `git_config_set_gently()` wrapper config: drop `git_config_set_in_file()` wrapper config: drop `git_config_get_bool()` wrapper config: drop `git_config_get_ulong()` wrapper config: drop `git_config_get_int()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string_multi()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get()` wrapper config: drop `git_config_clear()` wrapper ...
2025-08-04Merge branch 'kn/for-each-ref-skip-updates'Junio C Hamano1-1/+1
Code clean-up. * kn/for-each-ref-skip-updates: ref-filter: use REF_ITERATOR_SEEK_SET_PREFIX instead of '1' t6302: add test combining '--start-after' with '--exclude' for-each-ref: reword the documentation for '--start-after' for-each-ref: fix documentation argument ordering ref-cache: use 'size_t' instead of int for length
2025-08-04Merge branch 'hy/blame-simplify-get-commit-info'Junio C Hamano1-11/+4
Code simplification. * hy/blame-simplify-get-commit-info: blame: remove parameter detailed in get_commit_info()
2025-08-04revert: initialize const valueJeff King1-1/+1
When building with clang-22 and DEVELOPER=1 mode, this warning causes us to fail compilation: builtin/revert.c:114:13: error: default initialization of an object of type 'const char' leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe] 114 | const char sentinel_value; | ^ The compiler is right that this code is a bit funny. We declare a const value without an initializer. It cannot be assigned to because of the const, but without an initializer it has no predictable value. So as a variable it can never have any useful function, and if we tried to look at it, we'd get undefined behavior. But it does have a function. We never use its value, but rather use its address as a sentinel value for some other variables: const char *gpg_sign = &sentinel_value; ...maybe set gpg_sign via parse_options... if (gpg_sign != &sentinel_value) ...we got a non-default value... Normally we'd use NULL as a sentinel value for a pointer, but it doesn't work here because we also want to detect --no-gpg-sign, which is marked by setting the pointer to NULL. We need a separate "this was not touched" value, which is what this sentinel variable gives us. So the code is correct as-is, but the sentinel variable itself is funny enough that it's understandable for a compiler warning to flag it. Let's try to appease the compiler. There are a few possible options: 1. Instead of a variable, we could just construct an artificial sentinel address like "1", "-1", etc. I think these technically fall afoul of the C standard (even if we do not access them, even constructing invalid pointers is not always allowed). But it's also something we do elsewhere, and even happens in some standard interfaces (e.g., mmap()'s MMAP_FAILED value). It does involve some annoying casts, though. 2. We can mark it as static. That gives it a definite value, but perhaps makes people wonder if the static-ness is important, when it's not. 3. We can just give it a value to shut the compiler up, even though nobody cares about that value. I went with (3) here as the smallest and most obvious change. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-03Merge branch 'ow/rebase-verify-insn-fmt-before-initializing-state'Junio C Hamano1-21/+21
"git rebase -i" with bogus rebase.instructionFormat configuration failed to produce the todo file after recording the state files, leading to confused "git status"; this has been corrected. * ow/rebase-verify-insn-fmt-before-initializing-state: rebase: write script before initializing state
2025-08-03Merge branch 'ps/object-store-midx'Junio C Hamano2-8/+12
Redefine where the multi-pack-index sits in the object subsystem, which recently was restructured to allow multiple backends that support a single object source that belongs to one repository. A midx does span mulitple "object sources". * ps/object-store-midx: midx: remove now-unused linked list of multi-pack indices packfile: stop using linked MIDX list in `get_all_packs()` packfile: stop using linked MIDX list in `find_pack_entry()` packfile: refactor `get_multi_pack_index()` to work on sources midx: stop using linked list when closing MIDX packfile: refactor `prepare_packed_git_one()` to work on sources midx: start tracking per object database source
2025-08-03Merge branch 'kn/for-each-ref-skip'Junio C Hamano1-0/+8
"git for-each-ref" learns "--start-after" option to help applications that want to page its output. * kn/for-each-ref-skip: ref-cache: set prefix_state when seeking for-each-ref: introduce a '--start-after' option ref-filter: remove unnecessary else clause refs: selectively set prefix in the seek functions ref-cache: remove unused function 'find_ref_entry()' refs: expose `ref_iterator` via 'refs.h'
2025-08-03describe: use prio_queue_replace()René Scharfe1-16/+52
Optimize the sequence get+put to peek+replace to avoid one unnecessary heap rebalance. Do that by tracking partial get operations in a prio_queue wrapper, struct lazy_queue, and using wrapper functions that turn get into peek and put into replace as needed. This is simpler than tracking the state explicitly in the calling code. We get a nice speedup on top of the previous patch's conversion to prio_queue: Benchmark 1: ./git_2.50.1 describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 1.559 s ± 0.002 s [User: 1.493 s, System: 0.051 s] Range (min … max): 1.556 s … 1.563 s 10 runs Benchmark 2: ./git_describe_pq describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 1.204 s ± 0.001 s [User: 1.138 s, System: 0.051 s] Range (min … max): 1.202 s … 1.205 s 10 runs Benchmark 3: ./git describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 850.9 ms ± 1.6 ms [User: 786.6 ms, System: 49.8 ms] Range (min … max): 849.1 ms … 854.1 ms 10 runs Summary ./git describe $(git rev-list v2.41.0..v2.47.0) ran 1.41 ± 0.00 times faster than ./git_describe_pq describe $(git rev-list v2.41.0..v2.47.0) 1.83 ± 0.00 times faster than ./git_2.50.1 describe $(git rev-list v2.41.0..v2.47.0) Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-03describe: use prio_queueRené Scharfe1-24/+27
Replace the use a list-based priority queue whose order is maintained by commit_list_insert_by_date() with a prio_queue. This avoids quadratic worst-case complexity. And in the somewhat contrived example of describing the 4751 commits from v2.41.0 to v2.47.0 in one go (to get a sizable chunk of describe work with minimal ref loading overhead) it's significantly faster: Benchmark 1: ./git_2.50.1 describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 1.558 s ± 0.002 s [User: 1.492 s, System: 0.051 s] Range (min … max): 1.557 s … 1.562 s 10 runs Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0) Time (mean ± σ): 1.209 s ± 0.006 s [User: 1.143 s, System: 0.051 s] Range (min … max): 1.201 s … 1.219 s 10 runs Summary ./git describe $(git rev-list v2.41.0..v2.47.0) ran 1.29 ± 0.01 times faster than ./git_2.50.1 describe $(git rev-list v2.41.0..v2.47.0) Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02notes: do not use strbuf_split*()Junio C Hamano1-11/+12
When reading copy instructions from the standard input, the program reads a line, splits it into tokens at whitespace, and trims each of the tokens before using. We no longer need to use strbuf just to be able to trim, as string_list_split*() family now can trim while splitting a string. Retire the use of strbuf_split() from this code path. Note that this loop is a bit sloppy in that it ensures at least there are two tokens on each line, but ignores if there are extra tokens on the line. Tightening it is outside the scope of this series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02merge-tree: do not use strbuf_split*()Junio C Hamano1-14/+16
When reading merge instructions from the standard input, the program reads from the standard input, splits the line into tokens at whitespace, and trims each of them before using. We no longer need to use strbuf just for trimming, as string_list_split*() family can trim while splitting a string. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02clean: do not use strbuf_split*() [part 2]Junio C Hamano1-9/+11
builtin/clean.c:filter_by_patterns_cmd() interactively reads a line that has exclude patterns from the user and splits the line into a list of patterns. It uses the strbuf_split() so that each split piece can then trimmed. There is no need to use strbuf anymore, thanks to the recent enhancement to string_list_split*() family that allows us to trim the pieces split into a string_list. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02clean: do not pass the whole structure when it is not necessaryJunio C Hamano1-3/+3
The callee parse_choice() only needs to access a NUL-terminated string; instead of insisting to take a pointer to a strbuf, just take a pointer to a character array. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02clean: do not use strbuf_split*() [part 1]Junio C Hamano1-27/+23
builtin/clean.c:parse_choice() is fed a single line of input, which is space or comma separated list of tokens, and a list of menu items. It parses the tokens into number ranges (e.g. 1-3 that means the first three items) or string prefix (e.g. 's' to choose the menu item "(s)elect") that specify the elements in the menu item list, and tells the caller which ones are chosen. For parsing the input string, it uses strbuf_split() to split it into bunch of strbufs. Instead use string_list_split_in_place(), for a few reasons. * strbuf_split() is a bad API function to use, that yields an array of strbuf that is a bad data structure to use in general. * string_list_split_in_place() allows you to split with "comma or space"; the current code has to preprocess the input string to replace comma with space because strbuf_split() does not allow this. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02clean: do not pass strbuf by valueJunio C Hamano1-5/+5
When you pass a structure by value, the callee can modify the contents of the structure that was passed in without having to worry about changing the structure the caller has. Passing structure by value sometimes (but not very often) can be a valid way to give callee a temporary variable it can freely modify. But not a structure with members that are pointers, like a strbuf. builtin/clean.c:list_and_choose() reads a line interactively from the user, and passes the line (in a strbuf) to parse_choice() by value, which then munges by replacing ',' with ' ' (to accept both comma and space separated list of choices). But because the strbuf passed by value still shares the underlying character array buf[], this ends up munging the caller's strbuf contents. This is a catastrophe waiting to happen. If the callee causes the strbuf to be reallocated, the buf[] the caller has will become dangling, and when the caller does strbuf_release(), it would result in double-free. Stop calling the function with misleading call-by-value with strbuf. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-02string-list: align string_list_split() with its _in_place() counterpartJunio C Hamano3-3/+3
The string_list_split_in_place() function was updated by 52acddf3 (string-list: multi-delimiter `string_list_split_in_place()`, 2023-04-24) to take more than one delimiter characters, hoping that we can later use it to replace our uses of strtok(). We however did not make a matching change to the string_list_split() function, which is very similar. Before giving both functions more features in future commits, allow string_list_split() to also take more than one delimiter characters to make them closer to each other. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-01Merge branch 'ly/pull-autostash'Junio C Hamano1-3/+17
"git pull" learned to pay attention to pull.autostash configuration variable, which overrides rebase/merge.autostash. * ly/pull-autostash: pull: add pull.autoStash config option
2025-08-01Merge branch 'jk/unleak-reflog-expire-entry'Junio C Hamano2-0/+4
Leakfix. * jk/unleak-reflog-expire-entry: reflog: close leak of reflog expire entry
2025-08-01Merge branch 'jc/do-not-scan-argv-without-parsing'Junio C Hamano1-9/+13
Update a hard-to-read in-code NEEDSWORK comment. * jc/do-not-scan-argv-without-parsing: rev-list: update a NEEDSWORK comment
2025-08-01Merge branch 'jk/revision-no-early-output'Junio C Hamano1-129/+0
Remove unsupported, unused, and unsupportable old option from "git log". * jk/revision-no-early-output: revision: drop early output option
2025-08-01Merge branch 'jc/rev-list-info-cleanup'Junio C Hamano1-0/+8
Move structure definition from unrelated header file to where it belongs. * jc/rev-list-info-cleanup: rev-list: make "struct rev_list_info" static to the only user
2025-07-31Merge branch 'ps/config-wo-the-repository' into ↵Junio C Hamano93-258/+298
pw/3.0-commentchar-auto-deprecation * ps/config-wo-the-repository: (21 commits) config: fix sign comparison warnings config: move Git config parsing into "environment.c" config: remove unused `the_repository` wrappers config: drop `git_config_set_multivar()` wrapper config: drop `git_config_get_multivar_gently()` wrapper config: drop `git_config_set_multivar_in_file_gently()` wrapper config: drop `git_config_set_in_file_gently()` wrapper config: drop `git_config_set()` wrapper config: drop `git_config_set_gently()` wrapper config: drop `git_config_set_in_file()` wrapper config: drop `git_config_get_bool()` wrapper config: drop `git_config_get_ulong()` wrapper config: drop `git_config_get_int()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string()` wrapper config: drop `git_config_get_string_multi()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get_value()` wrapper config: drop `git_config_get()` wrapper config: drop `git_config_clear()` wrapper ...
2025-07-29Merge branch 'ps/object-store-midx' into ps/object-store-midx-dedup-infoJunio C Hamano2-8/+12
* ps/object-store-midx: midx: remove now-unused linked list of multi-pack indices packfile: stop using linked MIDX list in `get_all_packs()` packfile: stop using linked MIDX list in `find_pack_entry()` packfile: refactor `get_multi_pack_index()` to work on sources midx: stop using linked list when closing MIDX packfile: refactor `prepare_packed_git_one()` to work on sources midx: start tracking per object database source
2025-07-29add-patch: add diff.context command line overridesLeon Michalak5-21/+120
This patch compliments the previous commit, where builtins that use add-patch infrastructure now respect diff.context and diff.interHunkContext file configurations. In particular, this patch helps users who don't want to set persistent context configurations or just want a way to override them on a one-time basis, by allowing the relevant builtins to accept corresponding command line options that override the file configurations. This mimics commands such as diff and log, which allow for both context file configuration and command line overrides. Signed-off-by: Leon Michalak <leonmichalak6@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-28blame: remove parameter detailed in get_commit_info()Han Young1-11/+4
The get_commit_info() function accepts a parameter that can be used to stop the commit parsing early. However, none of the callers use this feature, and testing proved that the performance gain of stopping parsing early is negligible and unmeasurable. Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-28for-each-ref: reword the documentation for '--start-after'Karthik Nayak1-1/+1
The documentation for '--start-after' states that the flag cannot be used with general pattern matching. This is a bit vague, since there is no clear understanding about what 'general' means here. Rewrite the sentence to be more specific. While here, fix a typo in the 'OPT_STRING'. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-28Merge branch 'ac/auto-comment-char-fix'Junio C Hamano1-1/+5
"git commit" that concludes a conflicted merge failed to notice and remove existing comment added automatically (like "# Conflicts:") when the core.commentstring is set to 'auto'. * ac/auto-comment-char-fix: config: set comment_line_str to "#" when core.commentChar=auto commit: avoid scanning trailing comments when 'core.commentChar' is "auto"
2025-07-24fixup! submodule: skip redundant active entries when pattern covers pathJunio C Hamano1-3/+2
2025-07-24fixup! submodule: prevent overwriting .gitmodules on path reuseJunio C Hamano1-3/+2
2025-07-24submodule: skip redundant active entries when pattern covers pathK Jayatheerth1-6/+19
configure_added_submodule always writes an explicit submodule.<name>.active entry, even when the new path is already matched by submodule.active patterns. This leads to unnecessary and cluttered configuration. change the logic to centralize wildmatch-based pattern lookup, in configure_added_submodule. Wrap the active-entry write in a conditional that only fires when that helper reports no existing pattern covers the submodule’s path. Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24submodule: prevent overwriting .gitmodules on path reuseK Jayatheerth1-0/+27
Adding a submodule at a path that previously hosted another submodule (e.g., 'child') reuses the submodule name derived from the path. If the original submodule was only moved (e.g., to 'child_old') and not renamed, this silently overwrites its configuration in .gitmodules. This behavior loses user configuration and causes confusion when the original submodule is expected to remain intact. It assumes that the path-derived name is always safe to reuse, even though the name might still be in use elsewhere in the repository. Teach module_add() to check if the computed submodule name already exists in the repository's submodule config, and if so, refuse the operation unless the user explicitly renames the submodule or uses the --force option, which will automatically generate a unique name by appending a number (e.g., child1). Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23Merge branch 'cc/fast-import-export-signature-names'Junio C Hamano2-36/+139
Clean up the way how signature on commit objects are exported to and imported from fast-import stream. * cc/fast-import-export-signature-names: fast-(import|export): improve on commit signature output format
2025-07-23config: move Git config parsing into "environment.c"Patrick Steinhardt40-0/+40
In "config.c" we host both the business logic to read and write config files as well as the logic to parse specific Git-related variables. On the one hand this is mixing concerns, but even more importantly it means that we cannot easily remove the dependency on `the_repository` in our config parsing logic. Move the logic into "environment.c". This file is a grab bag of all kinds of global state already, so it is quite a good fit. Furthermore, it also hosts most of the global variables that we're parsing the config values into, making this an even better fit. Note that there is one hidden change: in `parse_fsync_components()` we use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But as -Wsign-compare warnings are enabled in this file this causes a compiler warning. The issue is fixed by using a `size_t` instead. This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE` declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_multivar()` wrapperPatrick Steinhardt3-13/+13
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_multivar()`. All callsites are adjusted so that they use `repo_config_set_multivar(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_multivar_gently()` wrapperPatrick Steinhardt2-6/+6
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_multivar_gently()`. All callsites are adjusted so that they use `repo_config_get_multivar_gently(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_multivar_in_file_gently()` wrapperPatrick Steinhardt3-25/+25
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_multivar_in_file_gently()`. All callsites are adjusted so that they use `repo_config_set_multivar_in_file_gently(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_in_file_gently()` wrapperPatrick Steinhardt3-10/+10
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_in_file_gently()`. All callsites are adjusted so that they use `repo_config_set_in_file_gently(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set()` wrapperPatrick Steinhardt4-15/+15
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set()`. All callsites are adjusted so that they use `repo_config_set(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_gently()` wrapperPatrick Steinhardt3-12/+12
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_gently()`. All callsites are adjusted so that they use `repo_config_set_gently(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_set_in_file()` wrapperPatrick Steinhardt1-5/+5
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_set_in_file()`. All callsites are adjusted so that they use `repo_config_set_in_file(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_bool()` wrapperPatrick Steinhardt7-12/+12
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_bool()`. All callsites are adjusted so that they use `repo_config_get_bool(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_ulong()` wrapperPatrick Steinhardt2-7/+7
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_ulong()`. All callsites are adjusted so that they use `repo_config_get_ulong(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_int()` wrapperPatrick Steinhardt4-19/+19
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_int()`. All callsites are adjusted so that they use `repo_config_get_int(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt4-6/+6
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt4-10/+10
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string_multi()` wrapperPatrick Steinhardt2-3/+3
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string_multi()`. All callsites are adjusted so that they use `repo_config_get_string_multi(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_value()` wrapperPatrick Steinhardt2-5/+5
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_value()`. All callsites are adjusted so that they use `repo_config_get_value(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get()` wrapperPatrick Steinhardt2-4/+4
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get()`. All callsites are adjusted so that they use `repo_config_get(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt82-106/+106
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-22reflog: close leak of reflog expire entryJacob Keller2-0/+4
find_cfg_ent() allocates a struct reflog_expire_entry_option via FLEX_ALLOC_MEM and inserts it into a linked list in the reflog_expire_options structure. The entries in this list are never freed, resulting in a leak in cmd_reflog_expire and the gc reflog expire maintenance task: Direct leak of 39 byte(s) in 1 object(s) allocated from: #0 0x7ff975ee6883 in calloc (/lib64/libasan.so.8+0xe6883) #1 0x0000010edada in xcalloc ../wrapper.c:154 #2 0x000000df0898 in find_cfg_ent ../reflog.c:28 #3 0x000000df0898 in reflog_expire_config ../reflog.c:70 #4 0x00000095c451 in configset_iter ../config.c:2116 #5 0x0000006d29e7 in git_config ../config.h:724 #6 0x0000006d29e7 in cmd_reflog_expire ../builtin/reflog.c:205 #7 0x0000006d504c in cmd_reflog ../builtin/reflog.c:419 #8 0x0000007e4054 in run_builtin ../git.c:480 #9 0x0000007e4054 in handle_builtin ../git.c:746 #10 0x0000007e8a35 in run_argv ../git.c:813 #11 0x0000007e8a35 in cmd_main ../git.c:953 #12 0x000000441e8f in main ../common-main.c:9 #13 0x7ff9754115f4 in __libc_start_call_main (/lib64/libc.so.6+0x35f4) #14 0x7ff9754116a7 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x36a7) #15 0x000000444184 in _start (/home/jekeller/libexec/git-core/git+0x444184) Close this leak by adding a reflog_clear_expire_config() function which iterates the linked list and frees its elements. Call it upon exit of cmd_reflog_expire() and reflog_expire_condition(). Add a basic test which covers this leak. While at it, cover the functionality from commit commit 3cb22b8efe (Per-ref reflog expiry configuration, 2008-06-15). We've had this support for years, but lacked any tests. Co-developed-by: Jeff King <peff@peff.net> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-22rev-list: update a NEEDSWORK commentJunio C Hamano1-9/+13
The comment is poorly phrased and it in't clear what it wanted to say. Strongly discourage this broken pattern to be copied and pasted to other code paths. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-21rev-list: make "struct rev_list_info" static to the only userJunio C Hamano1-0/+8
The structure has nothing to do with what "git bisect" does; as nobody other than "git rev-list" implementation uses it, move it as a private data type to builtin/rev-list.c Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-21pull: add pull.autoStash config optionLidong Yan1-3/+17
Git uses `rebase.autostash` or `merge.autostash` to determine whether a dirty worktree is allowed during pull. However, this behavior is not clearly documented, making it difficult for users to discover how to enable autostash, or causing them to unknowingly enable it. Add new config option `pull.autostash` along with its documentation and test cases. `pull.autostash` provides the same functionality as `rebase.autostash` and `merge.autostash`, but overrides them when set. If `pull.autostash` is not set, it falls back to `rebase.autostash` or `merge.autostash`, depending on the value of `pull.rebase`. Signed-off-by: Lidong Yan <yldhome2d2@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-21revision: drop early output optionJeff King1-129/+0
We added the --early-output feature long ago in cdcefbc971 (Add "--early-output" log flag for interactive GUI use, 2007-11-03). The idea was that GUIs could use it to progressively render a history view, showing something quick-and-inaccurate at first and then enhancing it later. But we never documented it, and it appears never to have been used, even by the projects which initially expressed interest. There was an RFC patch for gitk to use it: http://public-inbox.org/git/18221.2285.259487.655684@cargo.ozlabs.ibm.com/ but it was never merged. Likewise QGit had a patch in: https://lore.kernel.org/git/e5bfff550711040225ne67c907r2023b1354c35f35@mail.gmail.com/ but it was never fully merged (to this day, QGit has a commented-out line to add "--early-output" to the "log" invocation). Searching for other mentions on the web or forges like github.com turns up nothing. Meanwhile, the feature has been broken off and on over the years without anybody noticing (and naturally, there are no tests, either). From 2011 to 2017 the option didn't even turn on via "--early-output"; this was fixed in e35b6ac56f (revision.h: turn rev_info.early_output back into an unsigned int, 2017-06-10). It worked for a while then, but it does not interact well at all with commit-graphs (which are turned on by default these days). The main logic to count early commits is triggered by limit_list(), which we traditionally invoked when showing output in topo-order (and --early-output always enables --topo-order). But that changed in f0d9cc4196 (revision.c: begin refactoring --topo-order logic, 2018-11-01). Now when we have generation numbers, we skip limit_list() entirely, and the early-output code shows no commits, and just the final header "Final output: 1 done". Which is syntactically OK, but semantically wrong: that message should give the total number of commits we're about to show. So let's drop the feature. It is extra code that is untested and undocumented, and makes working on the revision machinery more brittle. Given the history above, it seems unlikely that anybody is using it (or has used it), and we can drop it without the usual deprecation period. A gentler option might be to "soft" drop it: keep accepting the option, have it imply --topo-order as it does now, print "Final output: 1 done", and then do our regular traversal. That would keep any hypothetical caller working. But it doesn't seem worth the hassle to me. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-21Merge branch 'jk/remote-avoid-overlapping-names'Junio C Hamano1-0/+17
"git remote" now detects remote names that overlap with each other (e.g., remote nickname "outer" and "outer/inner" are used at the same time), as it will lead to overlapping remote-tracking branches. * jk/remote-avoid-overlapping-names: remote: detect collisions in remote names
2025-07-21Merge branch 'tb/midx-avoid-cruft-packs'Junio C Hamano2-87/+289
"pack-objects" has been taught to avoid pointing into objects in cruft packs from midx. * tb/midx-avoid-cruft-packs: repack: exclude cruft pack(s) from the MIDX where possible pack-objects: introduce '--stdin-packs=follow' pack-objects: swap 'show_{object,commit}_pack_hint' pack-objects: fix typo in 'show_object_pack_hint()' pack-objects: perform name-hash traversal for unpacked objects pack-objects: declare 'rev_info' for '--stdin-packs' earlier pack-objects: factor out handling '--stdin-packs' pack-objects: limit scope in 'add_object_entry_from_pack()' pack-objects: use standard option incompatibility functions
2025-07-21Merge branch 'bc/use-sha256-by-default-in-3.0'Junio C Hamano9-9/+9
Prepare to flip the default hash function to SHA-256. * bc/use-sha256-by-default-in-3.0: Enable SHA-256 by default in breaking changes mode help: add a build option for default hash t5300: choose the built-in hash outside of a repo t4042: choose the built-in hash outside of a repo t1007: choose the built-in hash outside of a repo t: default to compile-time default hash if not set setup: use the default algorithm to initialize repo format Use legacy hash for legacy formats builtin: use default hash when outside a repository hash: add a constant for the legacy hash algorithm hash: add a constant for the default hash algorithm
2025-07-17Merge branch 'bc/use-sha256-by-default-in-3.0' into ps/config-wo-the-repositoryJunio C Hamano9-9/+9
* bc/use-sha256-by-default-in-3.0: Enable SHA-256 by default in breaking changes mode help: add a build option for default hash t5300: choose the built-in hash outside of a repo t4042: choose the built-in hash outside of a repo t1007: choose the built-in hash outside of a repo t: default to compile-time default hash if not set setup: use the default algorithm to initialize repo format Use legacy hash for legacy formats builtin: use default hash when outside a repository hash: add a constant for the legacy hash algorithm hash: add a constant for the default hash algorithm
2025-07-16object-file: get rid of `the_repository` in `force_object_loose()`Patrick Steinhardt1-1/+2
The function `force_object_loose()` forces an object to become a loose object in case it only exists in its packed form. To do so it implicitly relies on `the_repository`. Refactor the function by passing a `struct odb_source` as parameter. While the check whether any such loose object exists already acts on the whole object database, writing the loose object happens in one specific source. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in `read_loose_object()`Patrick Steinhardt1-1/+1
The function `read_loose_object()` takes a path to an object file and tries to parse it. As such, the function does not depend on any specific object database but instead acts as an ODB-independent way to read a specific file. As such, all it needs as input is a repository so that we can derive repo settings and the hash algorithm. That repository isn't passed in as a parameter though, as we implicitly depend on the global `the_repository`. Refactor the function so that we pass in the repository as a parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in loose object iteratorsPatrick Steinhardt6-18/+17
The iterators for loose objects still rely on `the_repository`. Refactor them: - `for_each_loose_file_in_objdir()` is refactored so that the caller is now expected to pass an `odb_source` as parameter instead of the path to that source. Furthermore, it is renamed accordingly to `for_each_loose_file_in_source()`. - `for_each_loose_object()` is refactored to take in an object database now and calls the above function in a loop. This allows us to get rid of the global dependency. Adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` when writing objectsPatrick Steinhardt1-1/+2
The logic that writes loose objects still relies on `the_repository` to decide where exactly the object shall be written to. Refactor it so that the logic instead operates on a `struct odb_source` so that we can get rid of this global dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16odb: introduce `odb_write_object()`Patrick Steinhardt9-16/+19
We do not have a backend-agnostic way to write objects into an object database. While there is `write_object_file()`, this function is rather specific to the loose object format. Introduce `odb_write_object()` to plug this gap. For now, this function is a simple wrapper around `write_object_file()` and doesn't even use the passed-in object database yet. This will change in subsequent commits, where `write_object_file()` is converted so that it works on top of an `odb_source`. `odb_write_object()` will then become responsible for deciding which source an object shall be written to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in `finalize_object_file()`Patrick Steinhardt3-4/+4
We implicitly depend on `the_repository` when moving an object file into place in `finalize_object_file()`. Get rid of this global dependency by passing in a repository. Note that one might be pressed to inject an object database instead of a repository. But the function doesn't really care about the ODB at all. All it does is to move a file into place while checking whether there is any collision. As such, the functionality it provides is independent of the object database and only needs the repository as parameter so that it can adjust permissions of the file we are about to finalize. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16object-file: get rid of `the_repository` in `has_loose_object()`Patrick Steinhardt1-4/+20
We implicitly depend on `the_repository` in `has_loose_object()`. Refactor the function to accept an `odb_source` as input that should be checked for such a loose object. This refactoring changes semantics of the function to not check the whole object database for such a loose object anymore, but instead we now only check that single source. Existing callers thus need to loop through all sources manually now. While this change may seem illogical at first, whether or not an object exists in a specific format should be answered by the source using that format. As such, we can eventually convert this into a generic function `odb_source_has_object()` that simply checks whether a given object exists in an object source. And as we will know about the format that any given source uses it allows us to derive whether the object exists in a given format. This change also makes `has_loose_object_nonlocal()` obsolete. The only caller of this function is adapted so that it skips the primary object source. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16Merge branch 'rs/parse-options-precision'Junio C Hamano4-0/+9
Define .precision to more canned parse-options type to avoid bugs coming from using a variable with a wrong type to capture the parsed values. * rs/parse-options-precision: parse-options: add precision handling for OPTION_COUNTUP parse-options: add precision handling for OPTION_BITOP parse-options: add precision handling for OPTION_NEGBIT parse-options: add precision handling for OPTION_BIT parse-options: add precision handling for OPTION_SET_INT parse-options: add precision handling for PARSE_OPT_CMDMODE parse-options: require PARSE_OPT_NOARG for OPTION_BITOP
2025-07-16Merge branch 'ph/fetch-prune-optim'Junio C Hamano2-15/+9
"git fetch --prune" used to be O(n^2) expensive when there are many refs, which has been corrected. * ph/fetch-prune-optim: clean up interface for refs_warn_dangling_symrefs refs: remove old refs_warn_dangling_symref fetch-prune: optimize dangling-ref reporting
2025-07-16commit: avoid scanning trailing comments when 'core.commentChar' is "auto"Ayush Chandekar1-1/+5
When core.commentChar is set to "auto", Git selects a comment character by scanning the commit message contents and avoiding any character already present in the message. If the message still contains old conflict comments (starting with a comment character), Git assumes that character is in use and chooses a different one. As a result, those existing comment lines are no longer recognized as comments and end up being included in the final commit message. To avoid this, skip scanning the trailing comment block when selecting the comment character. This allows Git to safely reuse the original character when appropriate, keeping the commit message clean and free of leftover conflict information. Background: The "auto" value for core.commentchar was introduced in the commit 84c9dc2c5a (commit: allow core.commentChar=auto for character auto selection, 2014-05-17) but did not exhibit this issue at that time. The bug was introduced in commit a6c2654f83 (rebase -m: fix --signoff with conflicts, 2024-04-18) where Git started writing conflict comments to the file at 'rebase_path_message()'. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'ps/object-store'Junio C Hamano40-263/+272
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-15packfile: refactor `get_multi_pack_index()` to work on sourcesPatrick Steinhardt2-8/+12
The function `get_multi_pack_index()` loads multi-pack indices via `prepare_packed_git()` and then returns the linked list of multi-pack indices that is stored in `struct object_database`. That list is in the process of being removed though in favor of storing the MIDX as part of the object database source it belongs to. Refactor `get_multi_pack_index()` so that it returns the multi-pack index for a single object source. Callers are now expected to call this function for each source they are interested in. This requires them to iterate through alternates, so we have to prepare alternate object sources before doing so. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'tb/midx-avoid-cruft-packs' into ps/object-store-midxJunio C Hamano2-87/+289
* tb/midx-avoid-cruft-packs: repack: exclude cruft pack(s) from the MIDX where possible pack-objects: introduce '--stdin-packs=follow' pack-objects: swap 'show_{object,commit}_pack_hint' pack-objects: fix typo in 'show_object_pack_hint()' pack-objects: perform name-hash traversal for unpacked objects pack-objects: declare 'rev_info' for '--stdin-packs' earlier pack-objects: factor out handling '--stdin-packs' pack-objects: limit scope in 'add_object_entry_from_pack()' pack-objects: use standard option incompatibility functions
2025-07-15for-each-ref: introduce a '--start-after' optionKarthik Nayak1-0/+8
The `git-for-each-ref(1)` command is used to iterate over references present in a repository. In large repositories with millions of references, it would be optimal to paginate this output such that we can start iteration from a given reference. This would avoid having to iterate over all references from the beginning each time when paginating through results. The previous commit added 'seek' functionality to the reference backends. Utilize this and expose a '--start-after' option in 'git-for-each-ref(1)'. When used, the reference iteration seeks to the lexicographically next reference and iterates from there onward. This enables efficient pagination workflows, where the calling script can remember the last provided reference and use that as the starting point for the next set of references: git for-each-ref --count=100 git for-each-ref --count=100 --start-after=refs/heads/branch-100 git for-each-ref --count=100 --start-after=refs/heads/branch-200 Since the reference iterators only allow seeking to a specified marker via the `ref_iterator_seek()`, we introduce a helper function `start_ref_iterator_after()`, which seeks to next reference by simply adding (char) 1 to the marker. We must note that pagination always continues from the provided marker, as such any concurrent reference updates lexicographically behind the marker will not be output. Document the same. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-14Merge branch 'kh/doc-config-subcommands'Junio C Hamano1-6/+6
Documentation updates. * kh/doc-config-subcommands: config: mention --url in the synopsis config: use --value instead of value-pattern config: document --[no-]value config: use --value=<pattern> consistently config: document --[no-]show-names
2025-07-14Merge branch 'ac/prune-wo-the-repository'Junio C Hamano3-17/+14
Some code paths in the "git prune" used to ignore passed in repository object and used the_repository singleton instance instead, which has been corrected. * ac/prune-wo-the-repository: builtin/prune: stop depending on 'the_repository' repository: move 'repository_format_precious_objects' to repo scope
2025-07-14Merge branch 'cb/total-ram-bsd-fix'Junio C Hamano1-3/+10
Use of sysctl() system call to learn the total RAM size used on BSDs has been corrected. * cb/total-ram-bsd-fix: builtin/gc: correct total_ram calculation with HAVE_BSD_SYSCTL
2025-07-09Merge branch 'ps/object-store' into ps/object-file-wo-the-repositoryJunio C Hamano40-264/+272
* ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-09fast-(import|export): improve on commit signature output formatChristian Couder2-36/+139
A recent commit, d9cb0e6ff8 (fast-export, fast-import: add support for signed-commits, 2025-03-10), added support for signed commits to fast-export and fast-import. When a signed commit is processed, fast-export can output either "gpgsig sha1" or "gpgsig sha256" depending on whether the signed commit uses the SHA-1 or SHA-256 Git object format. However, this implementation has a number of limitations: - the output format was not properly described in the documentation, - the output format is not very informative as it doesn't even say if the signature is an OpenPGP, an SSH, or an X509 signature, - the implementation doesn't support having both one signature on the SHA-1 object and one on the SHA-256 object. Let's improve on these limitations by improving fast-export and fast-import so that: - all the signatures are exported, - at most one signature on the SHA-1 object and one on the SHA-256 are imported, - if there is more than one signature on the SHA-1 object or on the SHA-256 object, fast-import emits a warning for each additional signature, - the output format is "gpgsig <git-hash-algo> <signature-format>", where <git-hash-algo> is the Git object format as before, and <signature-format> is the signature type ("openpgp", "x509", "ssh" or "unknown"), - the output is properly documented. About the output format: - <git-hash-algo> allows to know which representation of the commit was signed (the SHA-1 or the SHA-256 version) which helps with both signature verification and interoperability between repos with different hash functions, - <signature-format> helps tools that process the fast-export stream, so they don't have to parse the ASCII armor to identify the signature type. It could be even better to be able to import more than one signature on the SHA-1 object and on the SHA-256 object, but other parts of Git don't handle that well for now, so this is left for future improvements. Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09parse-options: add precision handling for OPTION_NEGBITRené Scharfe1-0/+1
Similar to 09705696f7 (parse-options: introduce precision handling for `OPTION_INTEGER`, 2025-04-17) support value variables of different sizes for OPTION_NEGBIT. Do that by requiring their "precision" to be set, casting their "value" pointer accordingly and checking whether the value fits. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09parse-options: add precision handling for OPTION_BITRené Scharfe1-0/+1
Similar to 09705696f7 (parse-options: introduce precision handling for `OPTION_INTEGER`, 2025-04-17) support value variables of different sizes for OPTION_BIT. Do that by requiring their "precision" to be set, casting their "value" pointer accordingly and checking whether the value fits. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09parse-options: add precision handling for OPTION_SET_INTRené Scharfe1-0/+6
Similar to 09705696f7 (parse-options: introduce precision handling for `OPTION_INTEGER`, 2025-04-17) support value variables of different sizes for OPTION_SET_INT. Do that by requiring their "precision" to be set, casting their "value" pointer accordingly and checking whether the value fits. Factor out the casting code from the part of do_get_value() that handles OPTION_INTEGER to avoid code duplication. We're going to use it in the next patches as well. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09parse-options: add precision handling for PARSE_OPT_CMDMODERené Scharfe1-0/+1
Build on 09705696f7 (parse-options: introduce precision handling for `OPTION_INTEGER`, 2025-04-17) to support value variables of different sizes for PARSE_OPT_CMDMODE options. Do that by requiring their "precision" to be set and casting their "value" pointer accordingly. Call the function that does the raw casting do_get_int_value() to reserve the name get_int_value() for a more friendly wrapper we're going to introduce in one of the next patches. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-09Merge branch 'ps/object-store' into ps/object-store-midxJunio C Hamano40-264/+272
* ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-08remote: detect collisions in remote namesJeff King1-0/+17
When two remotes collide in the destinations of their fetch refspecs, the results can be confusing. For example, in this silly example: git config remote.one.url [...] git config remote.one.fetch +refs/heads/*:refs/remotes/collide/* git config remote.two.url [...] git config remote.two.fetch +refs/heads/*:refs/remotes/collide/* git fetch --all we may try to write to the same ref twice (once for each remote we're fetching). There's also a more subtle version of this. If you have remotes "outer/inner" and "outer", then the ref "inner/branch" on the second remote will conflict with just "branch" on the former (they both want to write to "refs/remotes/outer/inner/branch"). We probably don't want to forbid this kind of overlap completely. While the results can be confusing, there are legitimate reasons to have multiple refs write into the same namespace (e.g., if one is a "backup" of the other that is rarely fetched from). But it may be worth limiting the porcelain "git remote" command to avoid this confusion. The example above cannot be done with "git remote", because it always[1] matches the refspecs to the remote name, and you can only have one instance of each remote name. But you can still trigger the more subtle variant like this: git remote add outer [...] git remote add outer/inner [...] So let's detect that kind of name collision (in both directions) and forbid it. You can still do whatever you like by manipulating the config directly, but this should prevent the most obvious foot-gun. [1] Almost always. With the --mirror option, the resulting refspec will just write into "refs/*"; the remote name does not appear in the ref namespace at all. Our new "names must not overlap" rule is not necessary for that case, but it seems reasonable to enforce it consistently. We already require all remote names to be valid in the ref namespace, even though we won't ever use them in that context for --mirror remotes. Likewise, our new rule doesn't help with overlap here. Any two mirror remotes will always overlap (in fact, any mirror remote along with any other single one, since refs/remotes/ is a subset of the mirrored refs). I'm not sure this is worth worrying about, but if it is, we'd want an additional rule like "mirror remotes must be the only remote". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano3-94/+156
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-07Merge branch 'jk/fix-leak-send-pack'Junio C Hamano1-3/+6
Leakfix. * jk/fix-leak-send-pack: send-pack: clean-up even when taking an early exit send-pack: clean up extra_have oid array
2025-07-07Merge branch 'jk/submodule-remote-lookup-cleanup'Junio C Hamano2-66/+42
Updating submodules from the upstream did not work well when submodule's HEAD is detached, which has been improved. * jk/submodule-remote-lookup-cleanup: submodule: look up remotes by URL first submodule: move get_default_remote_submodule() submodule--helper: improve logic for fallback remote name remote: remove the_repository from some functions dir: move starts_with_dot(_dot)_slash to dir.h remote: fix tear down of struct remote remote: remove branch->merge_name and fix branch_release()
2025-07-07builtin/gc: correct total_ram calculation with HAVE_BSD_SYSCTLCarlo Marcelo Arenas Belón1-3/+10
The calls to sysctl() assume a 64-bit memory size for the variable holding the value, but the actual size depends on the key name and platform, at least for HW_PHYSMEM. Detect any mismatched reads, and retry with a shorter variable when needed. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07builtin/prune: stop depending on 'the_repository'Ayush Chandekar1-15/+12
Refactor builtin/prune.c to remove the dependency on the global 'the_repository'. Replace all the occurrences of 'the_repository' with repo and thus remove the definition '#define USE_THE_REPOSITORY_VARIABLE'. Also, add a test to make sure that 'git prune -h' can be called when the repository is `NULL`. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07repository: move 'repository_format_precious_objects' to repo scopeAyush Chandekar3-3/+3
The 'extensions.preciousObjects' setting when set true, prevents operations that might drop objects from the object storage. This setting is populated in the global variable 'repository_format_precious_objects'. Move this global variable to repo scope by adding it to 'struct repository and also refactor all the occurences accordingly. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01clean up interface for refs_warn_dangling_symrefsPhil Hord2-8/+2
The refs_warn_dangling_symrefs interface is a bit fragile as it passes in printf-formatting strings with expectations about the number of arguments. This patch series made it worse by adding a 2nd positional argument. But there are only two call sites, and they both use almost identical display options. Make this safer by moving the format strings into the function that uses them to make it easier to see when the arguments don't match. Pass a prefix string and a dry_run flag so the decision logic can be handled where needed. Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01fetch-prune: optimize dangling-ref reportingPhil Hord2-12/+12
When pruning during `git fetch` we check each pruned ref against the ref_store one at a time to decide whether to report it as dangling. This causes every local ref to be scanned for each ref being pruned. If there are N refs in the repo and M refs being pruned, this code is O(M*N). However, `git remote prune` uses a very similar function that is only O(N*log(M)). Remove the wasteful ref scanning for each pruned ref and use the faster version already available in refs_warn_dangling_symrefs. Change the message to include the original refname since the message is no longer printed immediately after the line that did just print the refname. In a repo with 126,000 refs, where I was pruning 28,000 refs, this code made about 3.6 billion calls to strcmp and consumed 410 seconds of CPU. (Invariably in that time, my remote would timeout and the fetch would fail anyway.) After this change, the same operation completes in under a second. Signed-off-by: Phil Hord <phil.hord@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01Use legacy hash for legacy formatsbrian m. carlson1-1/+1
We have a large variety of data formats and protocols where no hash algorithm was defined and the default was assumed to always be SHA-1. Instead of explicitly stating SHA-1, let's use the constant to represent the legacy hash algorithm (which is still SHA-1) so that it's clear for documentary purposes that it's a legacy fallback option and not an intentional choice to use SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01builtin: use default hash when outside a repositorybrian m. carlson8-8/+8
We have some commands that can operate inside or outside a repository. If we're operating outside a repository, we clearly cannot use the repository's hash algorithm as a default since it doesn't exist, so instead, let's pick the default instead of specifically SHA-1. Right now this results in no functional change since the default is SHA-1, but that may change in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `read_object_with_reference()`Patrick Steinhardt4-21/+15
Rename `read_object_with_reference()` to `odb_read_object_peeled()` to match other functions related to the object database and our modern coding guidelines. Furthermore though, the old name didn't really describe very well what this function actually does, which is to walk down any commit and tag objects until an object of the required type has been found. This is generally referred to as "peeling", so the new name should be way more descriptive. No compatibility wrapper is introduced as the function is not used a lot throughout our codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt11-26/+27
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt14-63/+59
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt19-69/+79
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling submodule sourcesPatrick Steinhardt1-1/+2
The "--recursive" flag for git-grep(1) allows users to grep for a string across submodule boundaries. To make this work we add each submodule's object sources to our own object database so that the objects can be accessed directly. The infrastructure for this depends on a global string list of submodule paths. The caller is expected to call `add_submodule_odb_by_path()` for each source and the object database will then eventually register all submodule sources via `do_oid_object_info_extended()` in case it isn't able to look up a specific object. This reliance on global state is of course suboptimal with regards to our libification efforts. Refactor the logic so that the list of submodule sources is instead tracked in the object database itself. This allows us to lose the condition of `r == the_repository` before registering submodule sources as we only ever add submodule sources to `the_repository` anyway. As such, behaviour before and after this refactoring should always be the same. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `for_each()` functionsPatrick Steinhardt3-3/+5
There are a couple of iterator-style functions that execute a callback for each instance of a given set, all of which currently depend on `the_repository`. Refactor them to instead take an object database as parameter so that we can get rid of this dependency. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` when handling alternatesPatrick Steinhardt4-9/+12
The functions to manage alternates all depend on `the_repository`. Refactor them to accept an object database as a parameter and adjust all callers. The functions are renamed accordingly. Note that right now the situation is still somewhat weird because we end up using the object store path provided by the object store's repository anyway. Consequently, we could have instead passed in a pointer to the repository instead of passing in the pointer to the object store. This will be addressed in subsequent commits though, where we will start to use the path owned by the object store itself. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `odb_mkstemp()`Patrick Steinhardt2-2/+3
Get rid of our dependency on `the_repository` in `odb_mkstemp()` by passing in the object database as a parameter and adjusting all callers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `assert_oid_type()`Patrick Steinhardt1-1/+1
Get rid of our dependency on `the_repository` in `assert_oid_type()` by passing in the object database as a parameter and adjusting all callers. Rename the function to `odb_assert_oid_type()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `find_odb()`Patrick Steinhardt1-2/+2
Get rid of our dependency on `the_repository` in `find_odb()` by passing in the object database in which we want to search for the source and adjusting all callers. Rename the function to `odb_find_source()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt38-38/+38
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename `object_directory` to `odb_source`Patrick Steinhardt8-32/+32
The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location. While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example. As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more. Let's preempt this by renaming the structure. There have been a couple alternatives that were discussed: - `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept. - `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate". Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology. In the future, this change allows us to easily introduce for example a `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01send-pack: clean-up even when taking an early exitJunio C Hamano1-3/+5
Previous commit has plugged one leak in the normal code path, but there is an early exit that leaves without releasing any resources acquired in the function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01config: mention --url in the synopsisKristoffer Haugsbakk1-1/+1
4e513890008 (builtin/config: introduce "get" subcommand, 2024-05-06) introduced `get` and `--url` but didn’t add `--url` to the synopsis. Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01config: use --value=<pattern> consistentlyKristoffer Haugsbakk1-6/+6
This option was introduced in a series of commits from fe3ccc7aab (Merge branch 'ps/config-subcommands', 2024-05-15). But two styles were used for the value provided to the option: 1. Synopsis: `--value=<value>` 2. Deprecated Modes: `--value=<pattern>` (2) is also used in the synopsis on the command. Use (2) consistently throughout since it’s a pattern in the general case (`value` sounds more generic). Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-30Merge branch 'jc/merge-compact-summary'Junio C Hamano2-4/+65
"git merge/pull" has been taught the "--compact-summary" option to use the compact-summary format, intead of diffstat, when showing the summary of the incoming changes. * jc/merge-compact-summary: merge/pull: extend merge.stat configuration variable to cover --compact-summary merge/pull: add the "--compact-summary" option
2025-06-30Merge branch 'bc/stash-export-import'Junio C Hamano1-11/+449
An interchange format for stash entries is defined, and subcommand of "git stash" to import/export has been added. * bc/stash-export-import: builtin/stash: provide a way to import stashes from a ref builtin/stash: provide a way to export stashes to a ref builtin/stash: factor out revision parsing into a function object-name: make get_oid quietly return an error
2025-06-27send-pack: clean up extra_have oid arrayJacob Keller1-0/+1
Commit c8009635785e ("fetch-pack, send-pack: clean up shallow oid array", 2024-09-25) cleaned up the shallow oid array in cmd_send_pack, but didn't clean up extra_have, which is still leaked at program exit. I suspect the particular tests in t5539 don't trigger any additions to the extra_have array, which explains why the tests can pass leak free despite this gap. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25Merge branch 'ps/maintenance-ref-lock'Junio C Hamano5-183/+249
"git maintenance" lacked the care "git gc" had to avoid holding onto the repository lock for too long during packing refs, which has been remedied. * ps/maintenance-ref-lock: builtin/maintenance: fix locking race when handling "gc" task builtin/gc: avoid global state in `gc_before_repack()` usage: allow dying without writing an error message builtin/maintenance: fix locking race with refs and reflogs tasks builtin/maintenance: split into foreground and background tasks builtin/maintenance: fix typedef for function pointers builtin/maintenance: extract function to run tasks builtin/maintenance: stop modifying global array of tasks builtin/maintenance: mark "--task=" and "--schedule=" as incompatible builtin/maintenance: centralize configuration of explicit tasks builtin/gc: drop redundant local variable builtin/gc: use designated field initializers for maintenance tasks
2025-06-25Merge branch 'jc/you-still-use-whatchanged'Junio C Hamano2-8/+21
"git whatchanged" that is longer to type than "git log --raw" which is its modern rough equivalent has outlived its usefulness more than 10 years ago. Plan to deprecate and remove it. * jc/you-still-use-whatchanged: whatschanged: list it in BreakingChanges document whatchanged: remove when built with WITH_BREAKING_CHANGES whatchanged: require --i-still-use-this tests: prepare for a world without whatchanged doc: prepare for a world without whatchanged you-still-use-that??: help deprecating commands for removal
2025-06-25receive-pack: handle reference deletions separatelyKarthik Nayak1-34/+68
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19) we updated the 'git-receive-pack(1)' command to use batched reference updates. One edge case which was missed during this implementation was when a user pushes multiple branches such as: delete refs/heads/branch/conflict create refs/heads/branch Before using batched updates, the references would be applied sequentially and hence no conflicts would arise. With batched updates, while the first update applies, the second fails due to D/F conflict. A similar issue was present in 'git-fetch(1)' and was fixed by separating out reference pruning into a separate transaction in the commit 'fetch: use batched reference updates'. Apply a similar mechanism for 'git-receive-pack(1)' and separate out reference deletions into its own batch. This means 'git-receive-pack(1)' will now use up to two transactions, whereas before using batched updates it would use _at least_ two transactions. So using batched updates is still the better option. Add a test to validate this behavior. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-24Merge branch 'kj/stash-onbranch-submodule-fix'Junio C Hamano1-2/+8
"git stash" recorded a wrong branch name when submodules are present in the current checkout, which has been corrected. * kj/stash-onbranch-submodule-fix: stash: fix incorrect branch name in stash message
2025-06-24Merge branch 'pw/stash-p-pathspec-fixes'Junio C Hamano1-3/+7
"git stash -p <pathspec>" improvements. * pw/stash-p-pathspec-fixes: stash: allow "git stash [<options>] --patch <pathspec>" to assume push stash: allow "git stash -p <pathspec>" to assume push again
2025-06-23submodule: look up remotes by URL firstJacob Keller1-1/+25
The get_default_remote_submodule() function performs a lookup to find the appropriate remote to use within a submodule. The function first checks to see if it can find the remote for the current branch. If this fails, it then checks to see if there is exactly one remote. It will use this, before finally falling back to "origin" as the default. If a user happens to rename their default remote from origin, either manually or by setting something like clone.defaultRemoteName, this fallback will not work. In such cases, the submodule logic will try to use a non-existent remote. This usually manifests as a failure to trigger the submodule update. The parent project already knows and stores the submodule URL in either .gitmodules or its .git/config. Add a new repo_remote_from_url() helper which will iterate over all the remotes in a repository and return the first remote which has a matching URL. Refactor get_default_remote_submodule to find the submodule and get its URL. If a valid URL exists, first try to obtain a remote using the new repo_remote_from_url(). Fall back to the repo_default_remote() otherwise. The fallback logic is kept in case for some reason the user has manually changed the URL within the submodule. Additionally, we still try to use a remote rather than directly passing the URL in the fetch_in_submodule() logic. This ensures that an update will properly update the remote refs within the submodule as expected, rather than just fetching into FETCH_HEAD. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23submodule: move get_default_remote_submodule()Jacob Keller1-16/+16
A future refactor got get_default_remote_submodule() is going to depend on resolve_relative_url(). That function depends on get_default_remote(). Move get_default_remote_submodule() after resolve_relative_url() first to make the additional functionality easier to review. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23submodule--helper: improve logic for fallback remote nameJacob Keller1-41/+5
The repo_get_default_remote() function in submodule--helper currently tries to figure out the proper remote name to use for a submodule based on a few factors. First, it tries to find the remote for the currently checked out branch. This works if the submodule is configured to checkout to a branch instead of a detached HEAD state. In the detached HEAD state, the code calls back to using "origin", on the assumption that this is the default remote name. Some users may change this, such as by setting clone.defaultRemoteName, or by changing the remote name manually within the submodule repository. As a first step to improving this situation, refactor to reuse the logic from remotes_remote_for_branch(). This function uses the remote from the branch if it has one. If it doesn't then it checks to see if there is exactly one remote. It uses this remote first before attempting to fall back to "origin". To allow using this helper function, introduce a repo_default_remote() helper to remote.c which takes a repository structure. This helper will load the remote configuration and get the "HEAD" branch. Then it will call remotes_remote_for_branch to find the default remote. Replace calls of repo_get_default_remote() with the calls to this new function. To maintain consistency with the existing callers, continue copying the returned string with xstrdup. This isn't a perfect solution for users who change remote names, but it should help in cases where the remote name is changed but users haven't added any additional remotes. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23dir: move starts_with_dot(_dot)_slash to dir.hJacob Keller1-12/+0
Both submodule--helper.c and submodule-config.c have an implementation of starts_with_dot_slash and starts_with_dot_dot_slash. The dir.h header has starts_with_dot(_dot)_slash_native, which sets PATH_MATCH_NATIVE. Move the helpers to dir.h as static inlines. I thought about renaming them to postfix with _platform but that felt too long and ugly. On the other hand it might be slightly confusing with _native. This simplifies a submodule refactor which wants to use the helpers earlier in the submodule--helper.c file. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23remote: remove branch->merge_name and fix branch_release()Jacob Keller1-1/+1
The branch structure has both branch->merge_name and branch->merge for tracking the merge information. The former is allocated by add_merge() and stores the names read from the configuration file. The latter is allocated by set_merge() which is called by branch_get() when an external caller requests a branch. This leads to the confusing situation where branch->merge_nr tracks both the size of branch->merge (once its allocated) and branch->merge_name. The branch_release() function incorrectly assumes that branch->merge is always set when branch->merge_nr is non-zero, and can potentially crash if read_config() is called without branch_get() being called on every branch. In addition, branch_release() fails to free some of the memory associated with the structure including: * Failure to free the refspec_item containers in branch->merge[i] * Failure to free the strings in branch->merge_name[i] * Failure to free the branch->merge_name parent array. The set_merge() function sets branch->merge_nr to 0 when there is no valid remote_name, to avoid external callers seeing a non-zero merge_nr but a NULL merge array. This results in failure to release most of the merge data as well. These issues could be fixed directly, and indeed I initially proposed such a change at [1] in the past. While this works, there was some confusion during review because of the inconsistencies. Instead, its time to clean up the situation properly. Remove branch->merge_name entirely. Instead, allocate branch->merge earlier within add_merge() instead of within set_merge(). Instead of having set_merge() copy from merge_name[i] to merge[i]->src, just have add_merge() directly initialize merge[i]->src. Modify the add_merge() to call xstrdup() itself, instead of having the caller of add_merge() do so. This makes it more obvious which code owns the memory. Update all callers which use branch->merge_name[i] to use branch->merge[i]->src instead. Add a merge_clear() function which properly releases all of the merge-related memory, and which sets branch->merge_nr to zero. Use this both in branch_release() and in set_merge(), fixing the leak when set_merge() finds no valid remote_name. Add a set_merge variable to the branch structure, which indicates whether set_merge() has been called. This replaces the previous use of a NULL check against the branch->merge array. With these changes, the merge array is always allocated when merge_nr is non-zero. This use of refspec_item to store the names should be safe. External callers should be using branch_get() to obtain a pointer to the branch, which will call set_merge(), and the callers internal to remote.c already handle the partially initialized refpsec_item structure safely. This end result is cleaner, and avoids duplicating the merge names twice. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Link: [1] https://lore.kernel.org/git/20250617-jk-submodule-helper-use-url-v2-1-04cbb003177d@gmail.com/ Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23repack: exclude cruft pack(s) from the MIDX where possibleTaylor Blau1-20/+167
In ddee3703b3 (builtin/repack.c: add cruft packs to MIDX during geometric repack, 2022-05-20), repack began adding cruft pack(s) to the MIDX with '--write-midx' to ensure that the resulting MIDX was always closed under reachability in order to generate reachability bitmaps. While the previous patch added the '--stdin-packs=follow' option to pack-objects, it is not yet on by default. Given that, suppose you have a once-unreachable object packed in a cruft pack, which later becomes reachable from one or more objects in a geometrically repacked pack. That once-unreachable object *won't* appear in the new pack, since the cruft pack was not specified as included or excluded when the geometrically repacked pack was created with 'pack-objects --stdin-packs' (*not* '--stdin-packs=follow', which is not on). If that new pack is included in a MIDX without the cruft pack, then trying to generate bitmaps for that MIDX may fail. This happens when the bitmap selection process picks one or more commits which reach the once-unreachable objects. To mitigate this failure mode, commit ddee3703b3 ensures that the MIDX will be closed under reachability by including cruft pack(s). If cruft pack(s) were not included, we would fail to generate a MIDX bitmap. But ddee3703b3 alludes to the fact that this is sub-optimal by saying [...] it's desirable to avoid including cruft packs in the MIDX because it causes the MIDX to store a bunch of objects which are likely to get thrown away. , which is true, but hides an even larger problem. If repositories rarely prune their unreachable objects and/or have many of them, the MIDX must keep track of a large number of objects which bloats the MIDX and slows down object lookup. This is doubly unfortunate because the vast majority of objects in cruft pack(s) are unlikely to be read. But any object lookups that go through the MIDX must binary search over them anyway, slowing down object lookups using the MIDX. This patch causes geometrically-repacked packs to contain a copy of any once-unreachable object(s) with 'git pack-objects --stdin-packs=follow', allowing us to avoid including any cruft packs in the MIDX. This is because a sequence of geometrically-repacked packs that were all generated with '--stdin-packs=follow' are guaranteed to have their union be closed under reachability. Note that you cannot guarantee that a collection of packs is closed under reachability if not all of them were generated with "following" as above. One tell-tale sign that not all geometrically-repacked packs in the MIDX were generated with "following" is to see if there is a pack in the existing MIDX that is not going to be somehow represented (either verbatim or as part of a geometric rollup) in the new MIDX. If there is, then starting to generate packs with "following" during geometric repacking won't work, since it's open to the same race as described above. But if you're starting from scratch (e.g., building the first MIDX after an all-into-one '--cruft' repack), then you can guarantee that the union of subsequently generated packs from geometric repacking *is* closed under reachability. (One exception here is when "starting from scratch" results in a noop repack, e.g., because the non-cruft pack(s) in a repository already form a geometric progression. Since we can't tell whether or not those were generated with '--stdin-packs=follow', they may depend on once-unreachable objects, so we have to include the cruft pack in the MIDX in this case.) Detect when this is the case and avoid including cruft packs in the MIDX where possible. The existing behavior remains the default, and the new behavior is available with the config 'repack.midxMustIncludeCruft' set to 'false'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: introduce '--stdin-packs=follow'Taylor Blau1-22/+64
When invoked with '--stdin-packs', pack-objects will generate a pack which contains the objects found in the "included" packs, less any objects from "excluded" packs. Packs that exist in the repository but weren't specified as either included or excluded are in practice treated like the latter, at least in the sense that pack-objects won't include objects from those packs. This behavior forces us to include any cruft pack(s) in a repository's multi-pack index for the reasons described in ddee3703b3 (builtin/repack.c: add cruft packs to MIDX during geometric repack, 2022-05-20). The full details are in ddee3703b3, but the gist is if you have a once-unreachable object in a cruft pack which later becomes reachable via one or more commits in a pack generated with '--stdin-packs', you *have* to include that object in the MIDX via the copy in the cruft pack, otherwise we cannot generate reachability bitmaps for any commits which reach that object. Note that the traversal here is best-effort, similar to the existing traversal which provides name-hash hints. This means that the object traversal may hand us back a blob that does not actually exist. We *won't* see missing trees/commits with 'ignore_missing_links' because: - missing commit parents are discarded at the commit traversal stage by revision.c::process_parents() - missing tag objects are discarded by revision.c::handle_commit() - missing tree objects are discarded by the list-objects code in list-objects.c::process_tree() But we have to handle potentially-missing blobs specially by making a separate check to ensure they exist in the repository. Failing to do so would mean that we'd add an object to the packing list which doesn't actually exist, rendering us unable to write out the pack. This prepares us for new repacking behavior which will "resurrect" objects found in cruft or otherwise unspecified packs when generating new packs. In the context of geometric repacking, this may be used to maintain a sequence of geometrically-repacked packs, the union of which is closed under reachability, even in the case described earlier. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: swap 'show_{object,commit}_pack_hint'Taylor Blau1-6/+6
show_commit_pack_hint() has heretofore been a noop, so its position within its compilation unit only needs to appear before its first use. But the following commit will sometimes have `show_commit_pack_hint()` call `show_object_pack_hint()`, so reorder the former to appear after the latter to minimize the code movement in that patch. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: fix typo in 'show_object_pack_hint()'Taylor Blau1-1/+1
Noticed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: perform name-hash traversal for unpacked objectsTaylor Blau1-8/+12
With '--unpacked', pack-objects adds loose objects (which don't appear in any of the excluded packs from '--stdin-packs') to the output pack without considering them as reachability tips for the name-hash traversal. This was an oversight in the original implementation of '--stdin-packs', since the code which enumerates and adds loose objects to the output pack (`add_unreachable_loose_objects()`) did not have access to the 'rev_info' struct found in `read_packs_list_from_stdin()`. Excluding unpacked objects from that traversal doesn't affect the correctness of the resulting pack, but it does make it harder to discover good deltas for loose objects. Now that the 'rev_info' struct is declared outside of `read_packs_list_from_stdin()`, we can pass it to `add_objects_in_unpacked_packs()` and add any loose objects as tips to the above-mentioned traversal, in theory producing slightly tighter packs as a result. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: declare 'rev_info' for '--stdin-packs' earlierTaylor Blau1-33/+34
Once 'read_packs_list_from_stdin()' has called for_each_object_in_pack() on each of the input packs, we do a reachability traversal to discover names for any objects we picked up so we can generate name hash values and hopefully get higher quality deltas as a result. A future commit will change the purpose of this reachability traversal to find and pack objects which are reachable from commits in the input packs, but are packed in an unknown (not included nor excluded) pack. Extract the code which initializes and performs the reachability traversal to take place in the caller, not the callee, which prepares us to share this code for the '--unpacked' case (see the function add_unreachable_loose_objects() for more details). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: factor out handling '--stdin-packs'Taylor Blau1-6/+12
At the bottom of cmd_pack_objects() we check which mode the command is running in (e.g., generating a cruft pack, handling '--stdin-packs', using the internal rev-list, etc.) and handle the mode appropriately. The '--stdin-packs' case is handled inline (dating back to its introduction in 339bce27f4 (builtin/pack-objects.c: add '--stdin-packs' option, 2021-02-22)) since it is relatively short. Extract the body of "if (stdin_packs)" into its own function to prepare for the implementation to become lengthier in a following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: limit scope in 'add_object_entry_from_pack()'Taylor Blau1-1/+1
In add_object_entry_from_pack() we declare 'revs' (given to us through the miscellaneous context argument) earlier in the "if (p)" conditional than is necessary. Move it down as far as it can go to reduce its scope. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-23pack-objects: use standard option incompatibility functionsTaylor Blau1-9/+11
pack-objects has a handful of explicit checks for pairs of command-line options which are mutually incompatible. Many of these pre-date a699367bb8 (i18n: factorize more 'incompatible options' messages, 2022-01-31). Convert the explicit checks into die_for_incompatible_opt2() calls, which simplifies the implementation and standardizes pack-objects' output when given incompatible options (e.g., --stdin-packs with --filter gives different output than --keep-unreachable with --unpack-unreachable). There is one minor piece of test fallout in t5331 that expects the old format, which has been corrected. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-18Merge branch 'ly/submodule-update-failure-leakfix'Junio C Hamano1-1/+3
A memory leak on an error code path has been plugged. * ly/submodule-update-failure-leakfix: builtin/submodule--helper: fix leak when remote_submodule_branch() failed
2025-06-18Merge branch 'ly/commit-buffer-reencode-leakfix'Junio C Hamano2-1/+3
Leakfix. * ly/commit-buffer-reencode-leakfix: repo_logmsg_reencode: fix memory leak when use repo_logmsg_reencode ()