aboutsummaryrefslogtreecommitdiffstats
path: root/t
AgeCommit message (Collapse)AuthorFilesLines
2024-12-28reftable: fix allocation count on realloc errorRené Scharfe1-0/+26
When realloc(3) fails, it returns NULL and keeps the original allocation intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and the allocation count variable in that case, simultaneously leaking the original allocation and misrepresenting the number of storable items. parse_names() avoids the leak by keeping the original pointer if reallocation fails, but still increase the allocation count in such a case as if it succeeded. That's OK, because the error handling code just frees everything and doesn't look at names_cap anymore. reftable_buf_add() does the same, but here it is a problem as it leaves the reftable_buf in a broken state, with ->alloc being roughly twice as big as the actually allocated memory, allowing out-of-bounds writes in subsequent calls. Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts in sync and still signal failures to callers while avoiding code duplication in callers. Make it an expression that evaluates to 0 if no reallocation is needed or it succeeded and 1 on failure while keeping the original pointer and allocation counter values. Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables for now. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28reftable: avoid leaks on realloc errorRené Scharfe1-0/+30
When realloc(3) fails, it returns NULL and keeps the original allocation intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and the allocation count variable in that case, simultaneously leaking the original allocation and misrepresenting the number of storable items. parse_names() and reftable_buf_add() avoid leaking by restoring the original pointer value on failure, but all other callers seem to be OK with losing the old allocation. Add a new variant of the macro, REFTABLE_ALLOC_GROW_OR_NULL, which plugs the leak and zeros the allocation counter. Use it for those callers. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'kn/reflog-migration'Junio C Hamano1-22/+51
"git refs migrate" learned to also migrate the reflog data across backends. * kn/reflog-migration: refs: mark invalid refname message for translation refs: add support for migrating reflogs refs: allow multiple reflog entries for the same refname refs: introduce the `ref_transaction_update_reflog` function refs: add `committer_info` to `ref_transaction_add_update()` refs: extract out refname verification in transactions refs/files: add count field to ref_lock refs: add `index` field to `struct ref_udpate` refs: include committer info in `ref_update` struct
2024-12-23Merge branch 'ps/ci-meson'Junio C Hamano8-55/+127
The meson-build procedure is integrated into CI to catch and prevent bitrotting. * ps/ci-meson: ci: wire up Meson builds t: introduce compatibility options to clar-based tests t: fix out-of-tree tests for some git-p4 tests Makefile: detect missing Meson tests meson: detect missing tests at configure time t/unit-tests: rename clar-based unit tests to have a common prefix Makefile: drop -DSUPPRESS_ANNOTATED_LEAKS ci/lib: support custom output directories when creating test artifacts
2024-12-23Merge branch 'js/range-diff-diff-merges'Junio C Hamano1-0/+16
"git range-diff" learned to optionally show and compare merge commits in the ranges being compared, with the --diff-merges option. * js/range-diff-diff-merges: range-diff: introduce the convenience option `--remerge-diff` range-diff: optionally include merge commits' diffs in the analysis
2024-12-23Merge branch 'as/show-index-uninitialized-hash'Junio C Hamano1-0/+18
Regression fix for 'show-index' when run outside of a repository. * as/show-index-uninitialized-hash: t5300: add test for 'show-index --object-format' show-index: fix uninitialized hash function
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano27-32/+43
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-23Merge branch 'kn/reftable-writer-log-write-verify'Junio C Hamano2-4/+51
Reftable backend adds check for upper limit of log's update_index. * kn/reftable-writer-log-write-verify: reftable/writer: ensure valid range for log's update_index
2024-12-19Merge branch 'tc/bundle-with-tag-remove-workaround'Junio C Hamano1-0/+44
"git bundle create" with an annotated tag on the positive end of the revision range had a workaround code for older limitation in the revision walker, which has become unnecessary. * tc/bundle-with-tag-remove-workaround: bundle: remove unneeded code
2024-12-19Merge branch 'js/log-remerge-keep-ancestry'Junio C Hamano1-0/+7
"git log -p --remerge-diff --reverse" was completely broken. * js/log-remerge-keep-ancestry: log: --remerge-diff needs to keep around commit parents
2024-12-19Merge branch 'bf/fetch-set-head-config'Junio C Hamano2-0/+149
"git fetch" honors "remote.<remote>.followRemoteHEAD" settings to tweak the remote-tracking HEAD in "refs/remotes/<remote>/HEAD". * bf/fetch-set-head-config: remote set-head: set followRemoteHEAD to "warn" if "always" fetch set_head: add warn-if-not-$branch option fetch set_head: move warn advice into advise_if_enabled fetch: add configuration for set_head behaviour
2024-12-19Merge branch 'jc/set-head-symref-fix'Junio C Hamano1-0/+17
"git fetch" from a configured remote learned to update a missing remote-tracking HEAD but it asked the remote about their HEAD even when it did not need to, which has been corrected. Incidentally, this also corrects "git fetch --tags $URL" which was broken by the new feature in an unspecified way. * jc/set-head-symref-fix: fetch: do not ask for HEAD unnecessarily
2024-12-19Merge branch 'bf/set-head-symref'Junio C Hamano11-17/+221
When "git fetch $remote" notices that refs/remotes/$remote/HEAD is missing and discovers what branch the other side points with its HEAD, refs/remotes/$remote/HEAD is updated to point to it. * bf/set-head-symref: fetch set_head: handle mirrored bare repositories fetch: set remote/HEAD if it does not exist refs: add create_only option to refs_update_symref_extended refs: add TRANSACTION_CREATE_EXISTS error remote set-head: better output for --auto remote set-head: refactor for readability refs: atomically record overwritten ref in update_symref refs: standardize output of refs_read_symbolic_ref t/t5505-remote: test failure of set-head t/t5505-remote: set default branch to main
2024-12-16refs: add support for migrating reflogsKarthik Nayak1-22/+51
The `git refs migrate` command was introduced in 25a0023f28 (builtin/refs: new command to migrate ref storage formats, 2024-06-06) to support migrating from one reference backend to another. One limitation of the command was that it didn't support migrating repositories which contained reflogs. A previous commit, added support for adding reflog updates in ref transactions. Using the added functionality bake in reflog support for `git refs migrate`. To ensure that the order of the reflogs is maintained during the migration, we add the index for each reflog update as we iterate over the reflogs from the old reference backend. This is to ensure that the order is maintained in the new backend. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16range-diff: optionally include merge commits' diffs in the analysisJohannes Schindelin1-0/+16
The `git log` command already offers support for including diffs for merges, via the `--diff-merges=<format>` option. Let's add corresponding support for `git range-diff`, too. This makes it more convenient to spot differences between commit ranges that contain merges. This is especially true in scenarios with non-trivial merges, i.e. merges introducing changes other than, or in addition to, what merge ORT would have produced. Merging a topic branch that changes a function signature into a branch that added a caller of that function, for example, would require the merge commit itself to adjust that caller to the modified signature. In my code reviews, I found the `--diff-merges=remerge` option particularly useful. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-16Merge branch 'js/log-remerge-keep-ancestry' into js/range-diff-diff-mergesJunio C Hamano1-0/+7
* js/log-remerge-keep-ancestry: log: --remerge-diff needs to keep around commit parents
2024-12-15Merge branch 'ps/build'Junio C Hamano5-13/+1229
Build procedure update plus introduction of Meson based builds. * ps/build: (24 commits) Introduce support for the Meson build system Documentation: add comparison of build systems t: allow overriding build dir t: better support for out-of-tree builds Documentation: extract script to generate a list of mergetools Documentation: teach "cmd-list.perl" about out-of-tree builds Documentation: allow sourcing generated includes from separate dir Makefile: simplify building of templates Makefile: write absolute program path into bin-wrappers Makefile: allow "bin-wrappers/" directory to exist Makefile: refactor generators to be PWD-independent Makefile: extract script to generate gitweb.js Makefile: extract script to generate gitweb.cgi Makefile: extract script to massage Python scripts Makefile: extract script to massage Shell scripts Makefile: use "generate-perl.sh" to massage Perl library Makefile: extract script to massage Perl scripts Makefile: consistently use PERL_PATH Makefile: generate doc versions via GIT-VERSION-GEN Makefile: generate "git.rc" via GIT-VERSION-GEN ...
2024-12-15Merge branch 'ps/commit-with-message-syntax-fix'Junio C Hamano1-0/+15
The syntax ":/<text>" to name the latest commit with the matching text was broken with a recent change, which has been corrected. * ps/commit-with-message-syntax-fix: object-name: fix reversed ordering with ":/<text>" revisions
2024-12-15Merge branch 'rj/strvec-splice-fix'Junio C Hamano1-0/+10
Correct strvec_splice() that misbehaved when the strvec is empty. * rj/strvec-splice-fix: strvec: `strvec_splice()` to a statically initialized vector
2024-12-15Merge branch 'bf/explicit-config-set-in-advice-messages'Junio C Hamano15-23/+23
The advice messages now tell the newer 'git config set' command to set the advice.token configuration variable to squelch a message. * bf/explicit-config-set-in-advice-messages: advice: suggest using subcommand "git config set"
2024-12-15Merge branch 'jc/forbid-head-as-tagname'Junio C Hamano2-3/+15
"git tag" has been taught to refuse to create refs/tags/HEAD as such a tag will be confusing in the context of UI provided by the Git Porcelain commands. * jc/forbid-head-as-tagname: tag: "git tag" refuses to use HEAD as a tagname t5604: do not expect that HEAD can be a valid tagname refs: drop strbuf_ prefix from helpers refs: move ref name helpers around
2024-12-15Merge branch 'jk/describe-perf'Junio C Hamano2-3/+51
"git describe" optimization. * jk/describe-perf: describe: split "found all tags" and max_candidates logic describe: stop traversing when we run out of names describe: stop digging for max_candidates+1 t/perf: add tests for git-describe t6120: demonstrate weakness in disjoint-root handling
2024-12-15Merge branch 'kn/reftable-writer-log-write-verify' into kn/reflog-migrationJunio C Hamano2-4/+51
* kn/reftable-writer-log-write-verify: reftable/writer: ensure valid range for log's update_index
2024-12-13Merge branch 'kn/midx-wo-the-repository'Junio C Hamano1-4/+4
Yet another "pass the repository through the callchain" topic. * kn/midx-wo-the-repository: midx: inline the `MIDX_MIN_SIZE` definition midx: pass down `hash_algo` to functions using global variables midx: pass `repository` to `load_multi_pack_index` midx: cleanup internal usage of `the_repository` and `the_hash_algo` midx-write: pass down repository to `write_midx_file[_only]` write-midx: add repository field to `write_midx_context` midx-write: use `revs->repo` inside `read_refs_snapshot` midx-write: pass down repository to static functions packfile.c: remove unnecessary prepare_packed_git() call midx: add repository to `multi_pack_index` struct config: make `packed_git_(limit|window_size)` non-global variables config: make `delta_base_cache_limit` a non-global variable packfile: pass down repository to `for_each_packed_object` packfile: pass down repository to `has_object[_kept]_pack` packfile: pass down repository to `odb_pack_name` packfile: pass `repository` to static function in the file packfile: use `repository` from `packed_git` directly packfile: add repository to struct `packed_git`
2024-12-13Merge branch 'cw/worktree-extension'Junio C Hamano8-46/+155
Introduce a new repository extension to prevent older Git versions from mis-interpreting worktrees created with relative paths. * cw/worktree-extension: worktree: refactor `repair_worktree_after_gitdir_move()` worktree: add relative cli/config options to `repair` command worktree: add relative cli/config options to `move` command worktree: add relative cli/config options to `add` command worktree: add `write_worktree_linking_files()` function worktree: refactor infer_backlink return worktree: add `relativeWorktrees` extension setup: correctly reinitialize repository version
2024-12-13Merge branch 'en/fast-import-verify-path'Junio C Hamano2-4/+111
"git fast-import" learned to reject paths with ".." and "." as their components to avoid creating invalid tree objects. * en/fast-import-verify-path: t9300: test verification of renamed paths fast-import: disallow more path components fast-import: disallow "." and ".." path components
2024-12-13Merge branch 'jt/bundle-fsck'Junio C Hamano1-0/+7
"git bundle --unbundle" and "git clone" running on a bundle file both learned to trigger fsck over the new objects with configurable fck check levels. * jt/bundle-fsck: transport: propagate fsck configuration during bundle fetch fetch-pack: split out fsck config parsing bundle: support fsck message configuration bundle: add bundle verification options type
2024-12-13log: --remerge-diff needs to keep around commit parentsJohannes Schindelin1-0/+7
To show a remerge diff, the merge needs to be recreated. For that to work, the merge base(s) need to be found, which means that the commits' parents have to be traversed until common ancestors are found (if any). However, one optimization that hails all the way back to cb115748ec0d (Some more memory leak avoidance, 2006-06-17) is to release the commit's list of parents immediately after showing it _and to set that parent list to `NULL`_. This can break the merge base computation. This problem is most obvious when traversing the commits in reverse: In that instance, if a parent of a merge commit has been shown as part of the `git log` command, by the time the merge commit's diff needs to be computed, that parent commit's list of parent commits will have been set to `NULL` and as a result no merge base will be found (even if one should be found). Traversing commits in reverse is far from the only circumstance in which this problem occurs, though. There are many avenues to traversing at least one commit in the revision walk that will later be part of a merge base computation, for example when not even walking any revisions in `git show <merge1> <merge2>` where `<merge1>` is part of the commit graph between the parents of `<merge2>`. Another way to force a scenario where a commit is traversed before it has to be traversed again as part of a merge base computation is to start with two revisions (where the first one is reachable from the second but not in a first-parent ancestry) and show the commit log with `--topo-order` and `--first-parent`. Let's fix this by special-casing the `remerge_diff` mode, similar to what we did with reflogs in f35650dff6a4 (log: do not free parents when walking reflog, 2017-07-07). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13t: introduce compatibility options to clar-based testsPatrick Steinhardt1-1/+18
Our unit tests that don't yet use the clar unit testing framework ignore any option that they do not understand. It is thus fine to just pass test options we set up globally to those unit tests as they are simply ignored. This makes our life easier because we don't have to special case those options with Meson, where test options are set up globally via `meson test --test-args=`. But our clar-based unit testing framework is way stricter here and will fail in case it is passed an unknown option. Stub out these options with no-ops to make our life a bit easier. Note that this also requires us to remove the `-x` short option for `--exclude`. This is because `-x` has another meaning in our integration tests, as it enables shell tracing. I doubt there are a lot of people out there using it as we only got a small hand full of clar tests in the first place. So better change it now so that we can in the long run improve compatibility between the two different test drivers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13t: fix out-of-tree tests for some git-p4 testsPatrick Steinhardt2-50/+50
Both t9835 and t9836 exercise git-p4, but one exercises Python 2 whereas the other one uses Python 3. These tests do not exercise "git p4", but instead they use "git p4.py". This calls the unbuilt version of "git-p4.py" that still has the "#!/usr/bin/env python" shebang, which allows the test to modify which Python version comes first in $PATH, making it possible to force a Python version. But "git-p4.py" is not in our PATH during out-of-tree builds, and thus we cannot locate "git-p4.py". The tests thus break with CMake and Meson. Fix this by instead manually setting up script wrappers that invoke the respective Python interpreter directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13Makefile: detect missing Meson testsPatrick Steinhardt1-1/+17
In the preceding commit, we have introduced consistency checks to Meson to detect any discrepancies with missing or extraneous tests in its build instructions. These checks only get executed in Meson though, so any users of our Makefiles wouldn't be alerted of the fact that they have to modify the Meson build instructions in case they add or remove any tests. Add a comparable test target to our Makefile to plug this gap. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13meson: detect missing tests at configure timePatrick Steinhardt1-0/+36
It is quite easy for the list of integration tests to go out-of-sync without anybody noticing. Introduce a new configure-time check that verifies that all tests are wired up properly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-13t/unit-tests: rename clar-based unit tests to have a common prefixPatrick Steinhardt4-3/+6
All of the code files for unit tests using the self-grown unit testing framework have a "t-" prefix to their name. This makes it easy to identify them and use globbing in our Makefile and in other places. On the other hand though, our clar-based unit tests have no prefix at all and thus cannot easily be discerned from other files in the unit test directory. Introduce a new "u-" prefix for clar-based unit tests. This prefix will be used in a subsequent commit to easily identify such tests. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12bundle: remove unneeded codeToon Claes1-0/+44
The changes in commit c06793a4ed (allow git-bundle to create bottomless bundle, 2007-08-08) ensure annotated tags are properly preserved when creating a bundle using a revision range operation. At the time the range notation would peel the ends to their corresponding commit, meaning ref v2.0 would point to the v2.0^0 commit. So the above workaround was introduced. This code looks up the ref before it's written to the bundle, and if the ref doesn't point to the object we expect (for tags this would be a tag object), we skip the ref from the bundle. Instead, when the ref is a tag that's the positive end of the range (e.g. v2.0 from the range "v1.0..v2.0"), then that ref is written to the bundle instead. Later, in 895c5ba3c1 (revision: do not peel tags used in range notation, 2013-09-19), the behavior of parsing ranges was changed and the problem was fixed at the cause. But the workaround in bundle.c was not reverted. Now it seems this workaround can cause a race condition. git-bundle(1) uses setup_revisions() to parse the input into `struct rev_info`. Later, in write_bundle_refs(), it uses this info to write refs to the bundle. As mentioned at this point each ref is looked up again and checked whether it points to the object we expect. If not, the ref is not written to the bundle. But, when creating a bundle in a heavy traffic repository (a repo with many references, and frequent ref updates) it's possible a branch ref was updated between setup_revisions() and write_bundle_refs() and thus the extra check causes the ref to be skipped. The workaround was originally added to deal with tags, but the code path also gets hit by non-tag refs, causing this race condition. Because it's no longer needed, remove it and fix the possible race condition. Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-12Merge branch 'ps/build' into ps/ci-mesonJunio C Hamano5-13/+1229
* ps/build: (24 commits) Introduce support for the Meson build system Documentation: add comparison of build systems t: allow overriding build dir t: better support for out-of-tree builds Documentation: extract script to generate a list of mergetools Documentation: teach "cmd-list.perl" about out-of-tree builds Documentation: allow sourcing generated includes from separate dir Makefile: simplify building of templates Makefile: write absolute program path into bin-wrappers Makefile: allow "bin-wrappers/" directory to exist Makefile: refactor generators to be PWD-independent Makefile: extract script to generate gitweb.js Makefile: extract script to generate gitweb.cgi Makefile: extract script to massage Python scripts Makefile: extract script to massage Shell scripts Makefile: use "generate-perl.sh" to massage Perl library Makefile: extract script to massage Perl scripts Makefile: consistently use PERL_PATH Makefile: generate doc versions via GIT-VERSION-GEN Makefile: generate "git.rc" via GIT-VERSION-GEN ...
2024-12-12Merge branch 'cw/worktree-extension' into ps/ci-mesonJunio C Hamano8-46/+155
* cw/worktree-extension: worktree: refactor `repair_worktree_after_gitdir_move()` worktree: add relative cli/config options to `repair` command worktree: add relative cli/config options to `move` command worktree: add relative cli/config options to `add` command worktree: add `write_worktree_linking_files()` function worktree: refactor infer_backlink return worktree: add `relativeWorktrees` extension setup: correctly reinitialize repository version
2024-12-10Merge branch 'ps/reftable-iterator-reuse'Junio C Hamano1-0/+73
Optimize reading random references out of the reftable backend by allowing reuse of iterator objects. * ps/reftable-iterator-reuse: refs/reftable: reuse iterators when reading refs reftable/merged: drain priority queue on reseek reftable/stack: add mechanism to notify callers on reload refs/reftable: refactor reflog expiry to use reftable backend refs/reftable: refactor reading symbolic refs to use reftable backend refs/reftable: read references via `struct reftable_backend` refs/reftable: figure out hash via `reftable_stack` reftable/stack: add accessor for the hash ID refs/reftable: handle reloading stacks in the reftable backend refs/reftable: encapsulate reftable stack
2024-12-10Merge branch 'ps/reftable-detach'Junio C Hamano10-107/+115
Isolates the reftable subsystem from the rest of Git's codebase by using fewer pieces of Git's infrastructure. * ps/reftable-detach: reftable/system: provide thin wrapper for lockfile subsystem reftable/stack: drop only use of `get_locked_file_path()` reftable/system: provide thin wrapper for tempfile subsystem reftable/stack: stop using `fsync_component()` directly reftable/system: stop depending on "hash.h" reftable: explicitly handle hash format IDs reftable/system: move "dir.h" to its only user
2024-12-10Merge branch 'bc/allow-upload-pack-from-other-people'Junio C Hamano2-3/+10
Loosen overly strict ownership check introduced in the recent past, to keep the promise "cloning a suspicious repository is a safe first step to inspect it". * bc/allow-upload-pack-from-other-people: Allow cloning from repositories owned by another user
2024-12-10Merge branch 'pb/mergetool-errors'Junio C Hamano1-0/+8
End-user experience of "git mergetool" when the command errors out has been improved. * pb/mergetool-errors: git-difftool--helper.sh: exit upon initialize_merge_tool errors git-mergetool--lib.sh: add error message for unknown tool variant git-mergetool--lib.sh: add error message if 'setup_user_tool' fails git-mergetool--lib.sh: use TOOL_MODE when erroring about unknown tool completion: complete '--tool-help' in 'git mergetool'
2024-12-10strvec: `strvec_splice()` to a statically initialized vectorRubén Justo1-0/+10
We use a singleton empty array to initialize a `struct strvec`; similar to the empty string singleton we use to initialize a `struct strbuf`. Note that an empty strvec instance (with zero elements) does not necessarily need to be an instance initialized with the singleton. Let's refer to strvec instances initialized with the singleton as "empty-singleton" instances. As a side note, this is the current `strvec_pop()`: void strvec_pop(struct strvec *array) { if (!array->nr) return; free((char *)array->v[array->nr - 1]); array->v[array->nr - 1] = NULL; array->nr--; } So, with `strvec_pop()` an instance can become empty but it does not going to be the an "empty-singleton". This "empty-singleton" circumstance requires us to be careful when adding elements to instances. Specifically, when adding the first element: when we detach the strvec instance from the singleton and set the internal pointer in the instance to NULL. After this point we apply `realloc()` on the pointer. We do this in `strvec_push_nodup()`, for example. The recently introduced `strvec_splice()` API is expected to be normally used with non-empty strvec's. However, it can also end up being used with "empty-singleton" strvec's: struct strvec arr = STRVEC_INIT; int a = 0, b = 0; ... no modification to arr, a or b ... const char *rep[] = { "foo" }; strvec_splice(&arr, a, b, rep, ARRAY_SIZE(rep)); So, we'll try to add elements to an "empty-singleton" strvec instance. Avoid misapplying `realloc()` to the singleton in `strvec_splice()` by adding a special case for strvec's initialized with the singleton. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-08object-name: fix reversed ordering with ":/<text>" revisionsPatrick Steinhardt1-0/+15
Recently it was reported [1] that "look for the youngest commit reachable from any ref with log message that match the given pattern" syntax (i.e. ':/<text>') started to return results in reverse recency order. This regression was introduced in Git v2.47.0 and is caused by a memory leak fix done in 57fb139b5e (object-name: fix leaking commit list items, 2024-08-01). The intent of the identified commit is to stop modifying the commit list provided by the caller such that the caller can properly free all commit list items, including those that the called function might potentially remove from the list. This was done by creating a copy of the passed-in commit list and modifying this copy instead of the caller-provided list. We already knew to create such a copy beforehand with the `backup` list, which was used to clear the `ONELINE_SEEN` commit mark after we were done. So the refactoring simply renamed that list to `copy` and started to operate on that list instead. There is a gotcha though: the backup list, and thus now also the copied list, is always being prepended to, so the resulting list is in reverse order! The end result is that we pop commits from the wrong end of the commit list, returning commits in reverse recency order. Fix the bug by appending to the list instead. [1]: <CAKOEJdcPYn3O01p29rVa+xv=Qr504FQyKJeSB-Moze04ViCGGg@mail.gmail.com> Reported-by: Aarni Koskela <aarni@valohai.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07fetch: do not ask for HEAD unnecessarilyJunio C Hamano1-0/+17
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22), git-fetch learned to opportunistically set $REMOTE/HEAD when fetching by always asking for remote HEAD, in the hope that it will help setting refs/remotes/<name>/HEAD if missing. But it is not needed to always ask for remote HEAD. When we are fetching from a remote, for which we have remote-tracking branches, we do need to know about HEAD. But if we are doing one-shot fetch, e.g., $ git fetch --tags https://github.com/git/git we do not even know what sub-hierarchy of refs/remotes/<remote>/ we need to adjust the remote HEAD for. There is no need to ask for HEAD in such a case. Incidentally, because the unconditional request to list "HEAD" affected the number of ref-prefixes requested in the ls-remote request, this affected how the requests for tags are added to the same ls-remote request, breaking "git fetch --tags $URL" performed against a URL that is not configured as a remote. Reported-by: Josh Steadmon <steadmon@google.com> [jc: tests are also borrowed from Josh's patch] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07reftable/writer: ensure valid range for log's update_indexKarthik Nayak2-4/+51
Each reftable addition has an associated update_index. While writing refs, the update_index is verified to be within the range of the reftable writer, i.e. `writer.min_update_index <= ref.update_index` and `writer.max_update_index => ref.update_index`. The corresponding check for reflogs in `reftable_writer_add_log` is however missing. Add a similar check, but only check for the upper limit. This is because reflogs are treated a bit differently than refs. Each reflog entry in reftable has an associated update_index and we also allow expiring entries in the middle, which is done by simply writing a new reflog entry with the same update_index. This means, writing reflog entries with update_index lesser than the writer's update_index is an expected scenario. Add a new unit test to check for the limits and fix some of the existing tests, which were setting arbitrary values for the update_index by ensuring they stay within the now checked limits. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07Introduce support for the Meson build systemPatrick Steinhardt2-0/+1205
Introduce support for the Meson build system, a "modern" meta build system that supports many different platforms, including Linux, macOS, Windows and BSDs. Meson supports different backends, including Ninja, Xcode and Microsoft Visual Studio. Several common IDEs provide an integration with it. The biggest contender compared to Meson is probably CMake as outlined in our "Documentation/technical/build-systems.txt" file. Based on my own personal experience from working with both build systems extensively I strongly favor Meson over CMake. In my opinion, it feels significantly easier to use with a syntax that feels more like a "real" programming language. The second big reason is that Meson supports Rust natively, which may prove to be important given that the project may pick up Rust as another language eventually. Using Meson is rather straight-forward. An example: ``` # Meson uses out-of-tree builds. You can set up multiple build # directories, how you name them is completely up to you. $ mkdir build $ cd build $ meson setup .. -Dprefix=/tmp/git-installation # Build the project. This also provides several other targets like e.g. `install` or `test`. $ ninja # Meson has been wired up to support execution of our test suites. # Both our unit tests and our integration tests are supported. # Running `meson test` without any arguments will execute all tests, # but the syntax supports globbing to select only some tests. $ meson test 't-*' # Execute single test interactively to allow for debugging. $ meson test 't0000-*' --interactive --test-args=-ix ``` The build instructions have been successfully tested on the following systems, tests are passing: - Apple macOS 10.15. - FreeBSD 14.1. - NixOS 24.11. - OpenBSD 7.6. - Ubuntu 24.04. - Windows 10 with Cygwin. - Windows 10 with MinGW64, except for t9700, which is also broken with our Makefile. - Windows 10 with Visual Studio 2022 toolchain, using the Native Tools Command Prompt with `meson setup --vsenv`. Tests pass, except for t9700. - Windows 10 with Visual Studio 2022 solution, using the Native Tools Command Prompt with `meson setup --backend vs2022`. Tests pass, except for t9700. - Windows 10 with VS Code, using the Meson plug-in. It is expected that there will still be rough edges in the current version. If this patch lands the expectation is that it will coexist with our other build systems for a while. Like this, distributions can slowly migrate over to Meson and report any findings they have to us such that we can continue to iterate. A potential cutoff date for other build systems may be Git 3.0. Some notes: - The installed distribution is structured somewhat differently than how it used to be the case. All of our binaries are installed into `$libexec/git-core`, while all binaries part of `$bindir` are now symbolic links pointing to the former. This rule is consistent in itself and thus easier to reason about. - We do not install dashed binaries into `$libexec/git-core` anymore, so there won't e.g. be a symlink for git-add(1). These are not required by modern Git and there isn't really much of a use case for those anymore. By not installing those symlinks we thus start the deprecation of this layout. - We're targeting Meson 1.3.0, which has been released relatively recently November 2023. The only feature we use from that version is `fs.relative_to()`, which we could replace if necessary. If so, we could start to target Meson 1.0.0 and newer, released in December 2022. - The whole build instructions count around 3300 lines, half of which is listing all of our code and test files. Our Makefiles are around 5000 lines, autoconf adds another 1300 lines. CMake in comparison has only 1200 linescode, but it avoids listing individual files and does not wire up auto-configuration as extensively as the Meson instructions do. - We bundle a set of subproject wrappers for curl, expat, openssl, pcre2 and zlib. This allows developers to build Git without these dependencies preinstalled, and Meson will fetch and build them automatically. This is especially helpful on Windows. Helped-by: Eli Schwartz <eschwartz@gentoo.org> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07t: allow overriding build dirPatrick Steinhardt1-1/+2
Our "test-lib.sh" assumes that our build directory is the parent directory of "t/". While true when using our Makefile, it's not when using build systems that support out-of-tree builds. In commit ee9e66e4e7 (cmake: avoid editing t/test-lib.sh, 2022-10-18), we have introduce support for overriding the GIT_BUILD_DIR by creating the file "$GIT_BUILD_DIR/GIT-BUILD-DIR" with its contents pointing to the location of the build directory. The intent was to stop modifying "t/test-lib.sh" with the CMake build systems while allowing out-of-tree builds. But "$GIT_BUILD_DIR" is somewhat misleadingly named, as it in fact points to the _source_ directory. So while that commit solved part of the problem for out-of-tree builds, CMake still has to write files into the source tree. Solve the second part of the problem, namely not having to write any data into the source directory at all, by also supporting an environment variable that allows us to point to a different build directory. This allows us to perform properly self-contained out-of-tree builds. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07t: better support for out-of-tree buildsPatrick Steinhardt3-6/+6
Our in-tree builds used by the Makefile use various different build directories scattered around different locations. The paths to those build directories have to be propagated to our tests such that they can find the contained files. This is done via a mixture of hardcoded paths in our test library and injected variables in our bin-wrappers or "GIT-BUILD-OPTIONS". The latter two mechanisms are preferable over using hardcoded paths. For one, we have all paths which are subject to change stored in a small set of central files instead of having the knowledge of build paths in many files. And second, it allows build systems which build files elsewhere to adapt those paths based on their own needs. This is especially nice in the context of build systems that use out-of-tree builds like CMake or Meson. Remove hardcoded knowledge of build paths from our test library and move it into our bin-wrappers and "GIT-BUILD-OPTIONS". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-07Makefile: use common template for GIT-BUILD-OPTIONSPatrick Steinhardt1-6/+16
The "GIT-BUILD-OPTIONS" file is generated by our build systems to propagate built-in features and paths to our tests. The generation is done ad-hoc, where both our Makefile and the CMake build instructions simply echo a bunch of strings into the file. This makes it very hard to figure out what variables are expected to exist and what format they have, and the written variables can easily get out of sync between build systems. Introduce a new "GIT-BUILD-OPTIONS.in" template to address this issue. This has multiple advantages: - It demonstrates which built options exist in the first place. - It can serve as a spot to document the build options. - Some build systems complain when not all variables could be substituted, alerting us of mismatches. Others don't, but if we forgot to substitute such variables we now have a bogus string that will likely cause our tests to fail, if they have any meaning in the first place. Backfill values that we didn't yet set in our CMake build instructions. While at it, remove the `SUPPORTS_SIMPLE_IPC` variable that we only set up in CMake as it isn't used anywhere. This change requires us to adapt the setup of TEST_OUTPUT_DIRECTORY in "test-lib.sh" such that it does not get overwritten after sourcing when it has been set up via the environment. This is the only instance I could find where we rely on ordering on variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06t/helper: don't depend on implicit wraparoundPatrick Steinhardt2-7/+2
In our test helpers we have two cases where we assign -1 to an `unsigned long`. The intent is to essentially mean "unbounded output", which is achieved via implicit wraparound of the value. This pattern causes warnings with -Wsign-compare though. Adapt it and instead use `ULONG_MAX` explicitly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: trivial conversions to fix `-Wsign-compare` warningsPatrick Steinhardt11-44/+17
We have a bunch of loops which iterate up to an unsigned boundary using a signed index, which generates warnigs because we compare a signed and unsigned value in the loop condition. Address these sites for trivial cases and enable `-Wsign-compare` warnings for these code units. This patch only adapts those code units where we can drop the `DISABLE_SIGN_COMPARE_WARNINGS` macro in the same step. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06diff.h: fix index used to loop through unsigned integerPatrick Steinhardt1-1/+0
The `struct diff_flags` structure is essentially an array of flags, all of which have the same type. We can thus use `sizeof()` to iterate through all of the flags, which we do in `diff_flags_or()`. But while the statement returns an unsigned integer, we used a signed integer to iterate through the flags, which generates a warning. Fix this by using `size_t` for the index instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt28-0/+44
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06describe: split "found all tags" and max_candidates logicJeff King1-0/+10
Commit a30154187a (describe: stop traversing when we run out of names, 2024-10-31) taught git-describe to automatically reduce the max_candidates setting to match the total number of possible names. This lets us break out of the traversal rather than fruitlessly searching for more candidates when there are no more to be found. However, setting max_candidates to 0 (e.g., if the repo has no tags) overlaps with the --exact-match option, which explicitly uses the same value. And this causes a regression with --always, which is ignored in exact-match mode. We used to get this in a repo with no tags: $ git describe --always HEAD b2f0a7f and now we get: $ git describe --always HEAD fatal: no tag exactly matches 'b2f0a7f47f5f2aebe1e7fceff19a57de20a78c06' The reason is that we bail early in describe_commit() when max_candidates is set to 0. This logic goes all the way back to 2c33f75754 (Teach git-describe --exact-match to avoid expensive tag searches, 2008-02-24). We should obviously fix this regression, but there are two paths, depending on what you think: $ git describe --always --exact-match and $ git describe --always --candidates=0 should do. Since the "--always" option was added, it has always been ignored in --exact-match (or --candidates=0) mode. I.e., we treat --exact-match as a true exact match of a tag, and never fall back to using --always, even if it was requested. If we think that's a bug (or at least a misfeature), then the right solution is to fix it by removing the early bail-out from 2c33f75754, letting the noop algorithm run and then hitting the --always fallback output. And then our regression naturally goes away, because it follows the same path. If we think that the current "--exact-match --always" behavior is the right thing, then we have to differentiate the case where we automatically reduced max_candidates to 0 from the case where the user asked for it specifically. That's possible to do with a flag, but we can also just reimplement the logic from a30154187a to explicitly break out of the traversal when we run out of candidates (rather than relying on the existing max_candidates check). My gut feeling is along the lines of option 1 (it's a bug, and people would be happy for "--exact-match --always" to give the fallback rather than ignoring "--always"). But the documentation can be interpreted in the other direction, and we've certainly lived with the existing behavior for many years. So it's possible that changing it now is the wrong thing. So this patch fixes the regression by taking the second option, retaining the "--exact-match" behavior as-is. There are two new tests. The first shows that the regression is fixed (we don't even need a new repo without tags; a restrictive --match is enough to create the situation that there are no candidate names). The second test confirms that the "--exact-match --always" behavior remains unchanged and continues to die when there is no tag pointing at the specified commit. It's possible we may reconsider this in the future, but this shows that the approach described above is implemented faithfully. We can also run the perf tests in p6100 to see that we've retained the speedup that a30154187a was going for: Test HEAD^ HEAD -------------------------------------------------------------------------------------- 6100.2: describe HEAD 0.72(0.64+0.07) 0.72(0.66+0.06) +0.0% 6100.3: describe HEAD with one max candidate 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 6100.4: describe HEAD with one tag 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0% Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06Merge branch 'kh/sequencer-comment-char'Junio C Hamano3-0/+48
The sequencer failed to honor core.commentString in some places. * kh/sequencer-comment-char: sequencer: comment commit messages properly sequencer: comment `--reference` subject line properly sequencer: comment checked-out branch properly
2024-12-06advice: suggest using subcommand "git config set"Bence Ferdinandy15-23/+23
The advice message currently suggests using "git config advice..." to disable advice messages, but since 00bbdde141 (builtin/config: introduce "set" subcommand, 2024-05-06) we have the "set" subcommand for config. Since using the subcommand is more in-line with the modern interface, any advice should be promoting its usage. Change the disable advice message to use the subcommand instead. Change all uses of "git config advice" in the tests to use the subcommand. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06remote set-head: set followRemoteHEAD to "warn" if "always"Bence Ferdinandy1-0/+11
When running "remote set-head" manually it is unlikely, that the user would actually like to have "fetch" always update the remote/HEAD. On the contrary, it is more likely, that the user would expect remote/HEAD to stay the way they manually set it, and just forgot about having "followRemoteHEAD" set to "always". When "followRemoteHEAD" is set to "always" make running "remote set-head" change the config to "warn". Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06fetch set_head: add warn-if-not-$branch optionBence Ferdinandy1-0/+38
Currently if we want to have a remote/HEAD locally that is different from the one on the remote, but we still want to get a warning if remote changes HEAD, our only option is to have an indiscriminate warning with "follow_remote_head" set to "warn". Add a new option "warn-if-not-$branch", where $branch is a branch name we do not wish to get a warning about. If the remote HEAD is $branch do not warn, otherwise, behave as "warn". E.g. let's assume, that our remote origin has HEAD set to "master", but locally we have "git remote set-head origin seen". Setting 'remote.origin.followRemoteHEAD = "warn"' will always print a warning, even though the remote has not changed HEAD from "master". Setting 'remote.origin.followRemoteHEAD = "warn-if-not-master" will squelch the warning message, unless the remote changes HEAD from "master". Note, that should the remote change HEAD to "seen" (which we have locally), there will still be no warning. Improve the advice message in report_set_head to also include silencing the warning message with "warn-if-not-$branch". Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06fetch set_head: move warn advice into advise_if_enabledBence Ferdinandy1-2/+0
Advice about what to do when getting a warning is typed out explicitly twice and is printed as regular output. The output is also tested for. Extract the advice message into a single place and use a wrapper function, so if later the advice is made more chatty the signature only needs to be changed in once place. Remove the testing for the advice output in the tests. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04midx: pass `repository` to `load_multi_pack_index`Karthik Nayak1-4/+4
The `load_multi_pack_index` function in midx uses `the_repository` variable to access the `repository` struct. Modify the function and its callee's to send the `repository` field. This moves usage of `the_repository` to the `test-read-midx.c` file. While that is not optimal, it is okay, since the upcoming commits will slowly move the usage of `the_repository` up the layers and remove it eventually. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04Merge branch 'bc/drop-ancient-libcurl-and-perl'Junio C Hamano4-4/+4
Drop support for older libcURL and Perl. * bc/drop-ancient-libcurl-and-perl: gitweb: make use of s///r Require Perl 5.26.0 INSTALL: document requirement for libcurl 7.61.0 git-curl-compat: remove check for curl 7.56.0 git-curl-compat: remove check for curl 7.53.0 git-curl-compat: remove check for curl 7.52.0 git-curl-compat: remove check for curl 7.44.0 git-curl-compat: remove check for curl 7.43.0 git-curl-compat: remove check for curl 7.39.0 git-curl-compat: remove check for curl 7.34.0 git-curl-compat: remove check for curl 7.25.0 git-curl-compat: remove check for curl 7.21.5
2024-12-04Merge branch 'kn/pass-repo-to-builtin-sub-sub-commands'Junio C Hamano1-3/+5
Built-in Git subcommands are supplied the repository object to work with; they learned to do the same when they invoke sub-subcommands. * kn/pass-repo-to-builtin-sub-sub-commands: builtin: pass repository to sub commands
2024-12-04Merge branch 'ps/bisect-double-free-fix'Junio C Hamano1-0/+5
Work around Coverity warning that would not trigger in practice. * ps/bisect-double-free-fix: bisect: address Coverity warning about potential double free
2024-12-04Merge branch 'tb/use-test-file-size-more'Junio C Hamano2-2/+2
Use the right helper program to measure file size in performance tests. * tb/use-test-file-size-more: t/perf: use 'test_file_size' in more places
2024-12-04Merge branch 'sj/ref-contents-check'Junio C Hamano1-34/+542
"git fsck" learned to issue warnings on "curiously formatted" ref contents that have always been taken valid but something Git wouldn't have written itself (e.g., missing terminating end-of-line after the full object name). * sj/ref-contents-check: ref: add symlink ref content check for files backend ref: check whether the target of the symref is a ref ref: add basic symref content check for files backend ref: add more strict checks for regular refs ref: port git-fsck(1) regular refs check for files backend ref: support multiple worktrees check for refs ref: initialize ref name outside of check functions ref: check the full refname instead of basename ref: initialize "fsck_ref_report" with zero
2024-12-04Merge branch 'ps/ref-backend-migration-optim'Junio C Hamano2-2/+2
The migration procedure between two ref backends has been optimized. * ps/ref-backend-migration-optim: reftable: rename scratch buffer refs: adapt `initial_transaction` flag to be unsigned reftable/block: optimize allocations by using scratch buffer reftable/block: rename `block_writer::buf` variable reftable/writer: optimize allocations by using a scratch buffer refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` refs: skip collision checks in initial transactions refs: use "initial" transaction semantics to migrate refs refs/files: support symbolic and root refs in initial transaction refs: introduce "initial" transaction flag refs/files: move logic to commit initial transaction refs: allow passing flags when setting up a transaction
2024-12-04Merge branch 'ps/leakfixes-part-10'Junio C Hamano928-1041/+94
Leakfixes. * ps/leakfixes-part-10: (27 commits) t: remove TEST_PASSES_SANITIZE_LEAK annotations test-lib: unconditionally enable leak checking t: remove unneeded !SANITIZE_LEAK prerequisites t: mark some tests as leak free t5601: work around leak sanitizer issue git-compat-util: drop now-unused `UNLEAK()` macro global: drop `UNLEAK()` annotation t/helper: fix leaking commit graph in "read-graph" subcommand builtin/branch: fix leaking sorting options builtin/init-db: fix leaking directory paths builtin/help: fix leaks in `check_git_cmd()` help: fix leaking return value from `help_unknown_cmd()` help: fix leaking `struct cmdnames` help: refactor to not use globals for reading config builtin/sparse-checkout: fix leaking sanitized patterns split-index: fix memory leak in `move_cache_to_base_index()` git: refactor builtin handling to use a `struct strvec` git: refactor alias handling to use a `struct strvec` strvec: introduce new `strvec_splice()` function line-log: fix leak when rewriting commit parents ...
2024-12-04Merge branch 'ps/gc-stale-lock-warning'Junio C Hamano1-0/+13
Give a bit of advice/hint message when "git maintenance" stops finding a lock file left by another instance that still is potentially running. * ps/gc-stale-lock-warning: t7900: fix host-dependent behaviour when testing git-maintenance(1) builtin/gc: provide hint when maintenance hits a stale schedule lock
2024-12-04t9300: test verification of renamed pathsJeff King1-2/+9
Commit da91a90c2f (fast-import: disallow more path components, 2024-11-30) added two separate verify_path() calls (one for added/modified files, and one for renames/copies). But our tests only exercise the first one. Let's protect ourselves against regressions by tweaking one of the tests to rename into the bad path. There are adjacent tests that will stay as additions, so now both calls are covered. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03tag: "git tag" refuses to use HEAD as a tagnameJunio C Hamano1-0/+12
Even though the plumbing level allows you to create refs/tags/HEAD and refs/heads/HEAD, doing so makes it confusing within the context of the UI Git Porcelain commands provides. Just like we prevent a branch from getting called "HEAD" at the Porcelain layer (i.e. "git branch" command), teach "git tag" to refuse to create a tag "HEAD". With a few new tests, we make sure that - "git tag HEAD" and "git tag -a HEAD" are rejected - "git update-ref refs/tags/HEAD" is still allowed (this is a deliberate design decision to allow others to create their own UI on top of Git infrastructure that may be different from our UI). - "git tag -d HEAD" can remove refs/tags/HEAD to recover from an mistake. Helped-by: Jeff King <peff@peff.net> Helped-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-03t5604: do not expect that HEAD can be a valid tagnameJunio C Hamano1-3/+3
09116a1c (refs: loosen over-strict "format" check, 2011-11-16) introduced a test piece (originally in t5700) that expects to be able to create a tag named "HEAD" and then a local clone using the repository as its own reference works correctly. Later, another test piece started using this tag starting at acede2eb (t5700: document a failure of alternates to affect fetch, 2012-02-11). But the breakage 09116a1c fixed was not specific to the tagname HEAD. It would have failed exactly the same way if the tag used were foo instead of HEAD. Before forbidding "git tag" from creating "refs/tags/HEAD", update these tests to use 'foo', not 'HEAD', as the name of the test tag. Note that the test piece that uses the tag learned the value of the tag in unnecessarily inefficient and convoluted way with for-each-ref. Just use "rev-parse" instead. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02fast-import: disallow more path componentsElijah Newren2-5/+85
Instead of just disallowing '.' and '..', make use of verify_path() to ensure that fast-import will disallow anything we wouldn't allow into the index, such as anything under .git/, .gitmodules as a symlink, or a dos drive prefix on Windows. Since a few fast-export and fast-import tests that tried to stress-test the correct handling of quoting relied on filenames that fail is_valid_win32_path(), such as spaces or periods at the end of filenames or backslashes within the filename, turn off core.protectNTFS for those tests to ensure they keep passing. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02fetch: add configuration for set_head behaviourBence Ferdinandy1-0/+102
In the current implementation, if refs/remotes/$remote/HEAD does not exist, running fetch will create it, but if it does exist it will not do anything, which is a somewhat safe and minimal approach. Unfortunately, for users who wish to NOT have refs/remotes/$remote/HEAD set for any reason (e.g. so that `git rev-parse origin` doesn't accidentally point them somewhere they do not want to), there is no way to remove this behaviour. On the other side of the spectrum, users may want fetch to automatically update HEAD or at least give them a warning if something changed on the remote. Introduce a new setting, remote.$remote.followRemoteHEAD with four options: - "never": do not ever do anything, not even create - "create": the current behaviour, now the default behaviour - "warn": print a message if remote and local HEAD is different - "always": silently update HEAD on every change Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: refactor `repair_worktree_after_gitdir_move()`Caleb White1-4/+18
This refactors `repair_worktree_after_gitdir_move()` to use the new `write_worktree_linking_files` function. It also preserves the relativity of the linking files; e.g., if an existing worktree used absolute paths then the repaired paths will be absolute (and visa-versa). `repair_worktree_after_gitdir_move()` is used to repair both sets of worktree linking files if the `.git` directory is moved during a re-initialization using `git init`. This also adds a test case for reinitializing a repository that has relative worktrees. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: add relative cli/config options to `repair` commandCaleb White1-0/+39
This teaches the `worktree repair` command to respect the `--[no-]relative-paths` CLI option and `worktree.useRelativePaths` config setting. If an existing worktree with an absolute path is repaired with `--relative-paths`, the links will be replaced with relative paths, even if the original path was correct. This allows a user to covert existing worktrees between absolute/relative as desired. To simplify things, both linking files are written when one of the files needs to be repaired. In some cases, this fixes the other file before it is checked, in other cases this results in a correct file being written with the same contents. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: add relative cli/config options to `move` commandCaleb White1-0/+25
This teaches the `worktree move` command to respect the `--[no-]relative-paths` CLI option and `worktree.useRelativePaths` config setting. If an existing worktree is moved with `--relative-paths` the new path will be relative (and visa-versa). Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: add relative cli/config options to `add` commandCaleb White4-40/+70
This introduces the `--[no-]relative-paths` CLI option and `worktree.useRelativePaths` configuration setting to the `worktree add` command. When enabled these options allow worktrees to be linked using relative paths, enhancing portability across environments where absolute paths may differ (e.g., containerized setups, shared network drives). Git still creates absolute paths by default, but these options allow users to opt-in to relative paths if desired. The t2408 test file is removed and more comprehensive tests are written for the various worktree operations in their own files. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02setup: correctly reinitialize repository versionCaleb White1-3/+3
When reinitializing a repository, Git does not account for extensions other than `objectformat` and `refstorage` when determining the repository version. This can lead to a repository being downgraded to version 0 if extensions are set, causing Git future operations to fail. This patch teaches Git to check if other extensions are defined in the config to ensure that the repository version is set correctly. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-28transport: propagate fsck configuration during bundle fetchJustin Tobler1-0/+7
When fetching directly from a bundle, fsck message severity configuration is not propagated to the underlying git-index-pack(1). It is only capable of enabling or disabling fsck checks entirely. This does not align with the fsck behavior for fetches through git-fetch-pack(1). Use the fsck config parsing from fetch-pack to populate fsck message severity configuration and wire it through to `unbundle()` to enable the same fsck verification as done through fetch-pack. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27Merge branch 'bf/set-head-symref' into bf/fetch-set-head-configJunio C Hamano11-17/+221
* bf/set-head-symref: fetch set_head: handle mirrored bare repositories fetch: set remote/HEAD if it does not exist refs: add create_only option to refs_update_symref_extended refs: add TRANSACTION_CREATE_EXISTS error remote set-head: better output for --auto remote set-head: refactor for readability refs: atomically record overwritten ref in update_symref refs: standardize output of refs_read_symbolic_ref t/t5505-remote: test failure of set-head t/t5505-remote: set default branch to main
2024-11-27Merge branch 'kn/ref-transaction-hook-with-reflog'Junio C Hamano1-2/+0
The ref-transaction hook triggered for reflog updates, which has been corrected. * kn/ref-transaction-hook-with-reflog: refs: don't invoke reference-transaction hook for reflogs
2024-11-27Merge branch 'jt/index-pack-allow-promisor-only-while-fetching'Junio C Hamano1-3/+1
We now ensure "index-pack" is used with the "--promisor" option only during a "git fetch". * jt/index-pack-allow-promisor-only-while-fetching: index-pack: teach --promisor to forbid pack name
2024-11-27Merge branch 'en/fast-import-avoid-self-replace'Junio C Hamano1-0/+28
"git fast-import" can be tricked into a replace ref that maps an object to itself, which is a useless thing to do. * en/fast-import-avoid-self-replace: fast-import: avoid making replace refs point to themselves
2024-11-27Merge branch 'ps/clar-build-improvement'Junio C Hamano2-50/+63
Fix for clar unit tests to support CMake build. * ps/clar-build-improvement: Makefile: let clar header targets depend on their scripts cmake: use verbatim arguments when invoking clar commands cmake: use SH_EXE to execute clar scripts t/unit-tests: convert "clar-generate.awk" into a shell script
2024-11-26reftable/merged: drain priority queue on reseekPatrick Steinhardt1-0/+73
In 5bf96e0c39 (reftable/generic: move seeking of records into the iterator, 2024-05-13) we have refactored the reftable codebase such that iterators can be initialized once and then re-seeked multiple times. This feature is used by 1869525066 (refs/reftable: wire up support for exclude patterns, 2024-09-16) in order to skip records based on exclude patterns provided by the caller. The logic to re-seek the merged iterator is insufficient though because we don't drain the priority queue on a re-seek. This means that the queue may contain stale entries and thus reading the next record in the queue will return the wrong entry. While this is an obvious bug, it is harmless in the context of above exclude patterns: - If the queue contained stale entries that match the pattern then the caller would already know to filter out such refs. This is because our codebase is prepared to handle backends that don't have a way to efficiently implement exclude patterns. - If the queue contained stale entries that don't match the pattern we'd eventually filter out any duplicates. This is because the reftable code discards items with the same ref name and sorts any remaining entries properly. So things happen to work in this context regardless of the bug, and there is no other use case yet where we re-seek iterators. We're about to introduce a caching mechanism though where iterators are reused by the reftable backend, and that will expose the bug. Fix the issue by draining the priority queue when seeking and add a testcase that surfaces the issue. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26builtin: pass repository to sub commandsKarthik Nayak1-3/+5
In 9b1cb5070f (builtin: add a repository parameter for builtin functions, 2024-09-13) the repository was passed down to all builtin commands. This allowed the repository to be passed down to lower layers without depending on the global `the_repository` variable. Continue this work by also passing down the repository parameter from the command to sub-commands. This will help pass down the repository to other subsystems and cleanup usage of global variables like 'the_repository' and 'the_hash_algo'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26fast-import: disallow "." and ".." path componentsElijah Newren1-0/+20
If a user specified e.g. M 100644 :1 ../some-file then fast-import previously would happily create a git history where there is a tree in the top-level directory named "..", and with a file inside that directory named "some-file". The top-level ".." directory causes problems. While git checkout will die with errors and fsck will report hasDotdot problems, the user is going to have problems trying to remove the problematic file. Simply avoid creating this bad history in the first place. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26bisect: address Coverity warning about potential double freePatrick Steinhardt1-0/+5
Coverity has started to warn about a potential double-free in `find_bisection()`. This warning is triggered because we may modify the list head of the passed-in `commit_list` in case it is an UNINTERESTING commit, but still call `free_commit_list()` on the original variable that points to the now-freed head in case where `do_find_bisection()` returns a `NULL` pointer. As far as I can see, this double free cannot happen in practice, as `do_find_bisection()` only returns a `NULL` pointer when it was passed a `NULL` input. So in order to trigger the double free we would have to call `find_bisection()` with a commit list that only consists of UNINTERESTING commits, but I have not been able to construct a case where that happens. Drop the `else` branch entirely as it seems to be a no-op anyway. Another option might be to instead call `free_commit_list()` on `list`, which is the modified version of `commit_list` and thus wouldn't cause a double free. But as mentioned, I couldn't come up with any case where a passed-in non-NULL list becomes empty, so this shouldn't be necessary. And if it ever does become necessary we'd notice anyway via the leak sanitizer. Interestingly enough we did not have a single test exercising this branch: all tests pass just fine even when replacing it with a call to `BUG()`. Add a test that exercises it. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26Merge branch 'ps/leakfixes-part-10' into ps/bisect-double-free-fixJunio C Hamano928-1041/+94
* ps/leakfixes-part-10: (27 commits) t: remove TEST_PASSES_SANITIZE_LEAK annotations test-lib: unconditionally enable leak checking t: remove unneeded !SANITIZE_LEAK prerequisites t: mark some tests as leak free t5601: work around leak sanitizer issue git-compat-util: drop now-unused `UNLEAK()` macro global: drop `UNLEAK()` annotation t/helper: fix leaking commit graph in "read-graph" subcommand builtin/branch: fix leaking sorting options builtin/init-db: fix leaking directory paths builtin/help: fix leaks in `check_git_cmd()` help: fix leaking return value from `help_unknown_cmd()` help: fix leaking `struct cmdnames` help: refactor to not use globals for reading config builtin/sparse-checkout: fix leaking sanitized patterns split-index: fix memory leak in `move_cache_to_base_index()` git: refactor builtin handling to use a `struct strvec` git: refactor alias handling to use a `struct strvec` strvec: introduce new `strvec_splice()` function line-log: fix leak when rewriting commit parents ...
2024-11-26sequencer: comment commit messages properlyKristoffer Haugsbakk1-0/+15
The rebase todo editor has commands like `fixup -c` which affects the commit messages of the rebased commits.[1] For example: pick hash1 <msg> fixup hash2 <msg> fixup -c hash3 <msg> This says that hash2 and hash3 should be squashed into hash1 and that hash3’s commit message should be used for the resulting commit. So the user is presented with an editor where the two first commit messages are commented out and the third is not. However this does not work if `core.commentChar`/`core.commentString` is in use since the comment char is hardcoded (#) in this `sequencer.c` function. As a result the first commit message will not be commented out. † 1: See 9e3cebd97cb (rebase -i: add fixup [-C | -c] command, 2021-01-29) Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Co-authored-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reported-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26sequencer: comment `--reference` subject line properlyKristoffer Haugsbakk1-0/+14
`git revert --reference <commit>` leaves behind a comment in the first line:[1] # *** SAY WHY WE ARE REVERTING ON THE TITLE LINE *** Meaning that the commit will just consist of the next line if the user exits the editor directly: This reverts commit <--format=reference commit> But the comment char here is hardcoded (#). Which means that the comment line will inadvertently be included in the commit message if `core.commentChar`/`core.commentString` is in use. † 1: See 43966ab3156 (revert: optionally refer to commit in the "reference" format, 2022-05-26) Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26sequencer: comment checked-out branch properlyKristoffer Haugsbakk1-0/+19
`git rebase --update-ref` does not insert commands for dependent/sub- branches which are checked out.[1] Instead it leaves a comment about that fact. The comment char is hardcoded (#). In turn the comment line gets interpreted as an invalid command when `core.commentChar`/ `core.commentString` is in use. † 1: See 900b50c242 (rebase: add --update-refs option, 2022-07-19) Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25t7900: fix host-dependent behaviour when testing git-maintenance(1)Patrick Steinhardt1-2/+7
We have recently added a new test to t7900 that exercises whether git-maintenance(1) fails as expected when the "schedule.lock" file exists. The test depends on whether or not the host has the required executables present to schedule maintenance tasks in the first place, like systemd or launchctl -- if not, the test fails with an unrelated error before even checking for the lock file. This fails for example in our CI systems, where macOS images do not have launchctl available. Fix this issue by creating a stub systemctl(1) binary and using the systemd scheduler. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25Merge branch 'ak/typofixes' into maint-2.47Junio C Hamano9-10/+10
Typofixes. * ak/typofixes: t: fix typos t/helper: fix a typo t/perf: fix typos t/unit-tests: fix typos contrib: fix typos compat: fix typos
2024-11-25git-mergetool--lib.sh: add error message if 'setup_user_tool' failsPhilippe Blain1-0/+8
In git-mergetool--lib.sh::setup_tool, we check if the given tool is a known builtin tool, a known variant, or a user-defined tool by calling setup_user_tool, and we return with the exit code from setup_user_tool if it was called. setup_user_tool checks if {diff,merge}tool.$tool.cmd is set and quietly returns with an error if not. This leads to the following invocation quietly failing: git mergetool --tool=unknown which is not very user-friendly. Adjust setup_tool to output an error message before returning if setup_user_tool returned with an error. Note that we do not check the result of the second call to setup_user_tool in setup_tool, as this call is only meant to allow users to redefine 'cmd' for a builtin tool; it is not an error if they have not done so. Note that this behaviour of quietly failing is a regression dating back to de8dafbada (mergetool: break setup_tool out into separate initialization function, 2021-02-09), as before this commit an unknown mergetool would be diagnosed in get_merge_tool_path when called from run_merge_tool. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25fetch set_head: handle mirrored bare repositoriesBence Ferdinandy1-0/+10
When adding a remote to bare repository with "git remote add --mirror", running fetch will fail to update HEAD to the remote's HEAD, since it does not know how to handle bare repositories. On the other hand HEAD already has content, since "git init --bare" has already set HEAD to whatever is the default branch set for the user. Unless this - by chance - is the same as the remote's HEAD, HEAD will be pointing to a bad symref. Teach set_head to handle bare repositories, by overwriting HEAD so it mirrors the remote's HEAD. Note, that in this case overriding the local HEAD reference is necessary, since HEAD will exist before fetch can be run, but this should not be an issue, since the whole purpose of --mirror is to be an exact mirror of the remote, so following any changes to HEAD makes sense. Also note, that although "git remote set-head" also fails when trying to update the remote's locally tracked HEAD in a mirrored bare repository, the usage of the command does not make much sense after this patch: fetch will update the remote HEAD correctly, and setting it manually to something else is antithetical to the concept of mirroring. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25fetch: set remote/HEAD if it does not existBence Ferdinandy11-17/+135
When cloning a repository remote/HEAD is created, but when the user creates a repository with git init, and later adds a remote, remote/HEAD is only created if the user explicitly runs a variant of "remote set-head". Attempt to set remote/HEAD during fetch, if the user does not have it already set. Silently ignore any errors. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25remote set-head: better output for --autoBence Ferdinandy1-1/+62
Currently, set-head --auto will print a message saying "remote/HEAD set to branch", which implies something was changed. Change the output of --auto, so the output actually reflects what was done: a) set a previously unset HEAD, b) change HEAD because remote changed or c) no updates. As edge cases, if HEAD is changed from a previous symbolic reference that was not a remote branch, explicitly call attention to this fact, and also notify the user if the previous reference was not a symbolic reference. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25t/t5505-remote: test failure of set-headBence Ferdinandy1-0/+12
The test coverage was missing a test for the failure branch of remote set-head auto's output. Add the missing text and while we are at it, correct a small grammatical mistake in the error's output ("setup" is the noun, "set up" is the verb). Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25t/t5505-remote: set default branch to mainBence Ferdinandy1-0/+3
Consider the bare repository called "mirror" in the test. Running `git remote add --mirror -f origin ../one` will not change HEAD, consequently if init.defaultBranch is not the same as what HEAD in the remote ("one"), HEAD in "mirror" will be pointing to a non-existent reference. Hence if "mirror" is used as a remote by yet another repository, ls-remote will not show HEAD. On the other hand, if init.defaultBranch happens to match HEAD in "one", then ls-remote will show HEAD. Since the "ci/run-build-and-tests.sh" script globally exports GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main for some (but not all) jobs, there may be a drift in some tests between how the test repositories are set up in the CI and during local testing, if the test itself uses "master" as default instead of "main". In particular, this happens in t5505-remote.sh. This issue does not manifest currently, as the test does not do any remote HEAD manipulation where this would come up, but should such things be added, a locally passing test would break the CI and vice-versa. Set GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main in t5505-remote to be consistent with the CI. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22Merge branch 'tb/multi-pack-reuse-dupfix'Junio C Hamano1-0/+22
Object reuse code based on multi-pack-index sent an unwanted copy of object. * tb/multi-pack-reuse-dupfix: pack-objects: only perform verbatim reuse on the preferred pack t5332-multi-pack-reuse.sh: demonstrate duplicate packing failure
2024-11-22Merge branch 'sm/difftool'Junio C Hamano1-0/+4
Use of some uninitialized variables in "git difftool" has been corrected. * sm/difftool: builtin/difftool: intialize some hashmap variables
2024-11-22Merge branch 'jk/fetch-prefetch-double-free-fix'Junio C Hamano1-0/+4
Double-free fix. * jk/fetch-prefetch-double-free-fix: refspec: store raw refspecs inside refspec_item refspec: drop separate raw_nr count fetch: adjust refspec->raw_nr when filtering prefetch refspecs
2024-11-22Merge branch 'jk/test-malloc-debug-check'Junio C Hamano1-47/+50
Avoid build/test breakage on a system without working malloc debug support dynamic library. * jk/test-malloc-debug-check: test-lib: move malloc-debug setup after $PATH setup test-lib: check malloc debug LD_PRELOAD before using
2024-11-22t/perf: use 'test_file_size' in more placesTaylor Blau2-2/+2
The perf test suite prefers to use test_file_size over 'wc -c' when inside of a test_size block. One advantage is that accidentally writign "wc -c file" (instead of "wc -c <file") does not inadvertently break the tests (since the former will include the filename in the output of wc). Both of the two uses of test_size use "wc -c", but let's convert those to the more conventional test_file_size helper instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: remove TEST_PASSES_SANITIZE_LEAK annotationsPatrick Steinhardt928-928/+0
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21test-lib: unconditionally enable leak checkingPatrick Steinhardt3-96/+1
Over the last two releases we have plugged a couple hundred of memory leaks exposed by the Git test suite. With the preceding commits we have finally fixed the last leak exposed by our test suite, which means that we are now basically leak free wherever we have branch coverage. From hereon, the Git test suite should ideally stay free of memory leaks. Most importantly, any test suite that is being added should automatically be subject to the leak checker, and if that test does not pass it is a strong signal that the added code introduced new memory leaks and should not be accepted without further changes. Drop the infrastructure around TEST_PASSES_SANITIZE_LEAK to reflect this new requirement. Like this, all test suites will be subject to the leak checker by default. This is being intentionally strict, but we still have an escape hatch: the SANITIZE_LEAK prerequisite. There is one known case in t5601 where the leak sanitizer itself is buggy, so adding this prereq in such cases is acceptable. Another acceptable situation is when a newly added test uncovers preexisting memory leaks: when fixing that memory leak would be sufficiently complicated it is fine to annotate and document the leak accordingly. But in any case, the burden is now on the patch author to explain why exactly they have to add the SANITIZE_LEAK prerequisite. The TEST_PASSES_SANITIZE_LEAK annotations will be dropped in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: remove unneeded !SANITIZE_LEAK prerequisitesPatrick Steinhardt3-11/+11
We have a couple of !SANITIZE_LEAK prerequisites for tests that used to fail due to memory leaks. These have all been fixed by now, so let's drop the prerequisite. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: mark some tests as leak freePatrick Steinhardt2-0/+2
Both t5558 and t5601 are leak-free starting with 6dab49b9fb (bundle-uri: plug leak in unbundle_from_file(), 2024-10-10). Mark them accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t5601: work around leak sanitizer issuePatrick Steinhardt1-11/+15
When running t5601 with the leak checker enabled we can see a hang in our CI systems. This hang seems to be system-specific, as I cannot reproduce it on my own machine. As it turns out, the issue is in those testcases that exercise cloning of `~repo`-style paths. All of the testcases that hang eventually end up interpreting "repo" as the username and will call getpwnam(3p) with that username. That should of course be fine, and getpwnam(3p) should just return an error. But instead, the leak sanitizer seems to be recursing while handling a call to `free()` in the NSS modules: #0 0x00007ffff7fd98d5 in _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:720 #1 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916 #2 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55 #3 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27 #4 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127 #5 __lsan::lsan_free (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220 ... #261505 0x00007ffff7fd99f2 in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50 #261506 _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:822 #261507 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916 #261508 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55 #261509 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27 #261510 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127 #261511 __lsan::lsan_free (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220 #261512 0x00007ffff793da25 in module_load (module=0x515000000280) at ./nss/nss_module.c:188 #261513 0x00007ffff793dee5 in __nss_module_load (module=0x515000000280) at ./nss/nss_module.c:302 #261514 __nss_module_get_function (module=0x515000000280, name=name@entry=0x7ffff79b9128 "getpwnam_r") at ./nss/nss_module.c:328 #261515 0x00007ffff793e741 in __GI___nss_lookup_function (fct_name=<optimized out>, ni=<optimized out>) at ./nss/nsswitch.c:137 #261516 __GI___nss_next2 (ni=ni@entry=0x7fffffffa458, fct_name=fct_name@entry=0x7ffff79b9128 "getpwnam_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7fffffffa460, status=status@entry=0, all_values=all_values@entry=0) at ./nss/nsswitch.c:120 #261517 0x00007ffff794c6a7 in __getpwnam_r (name=name@entry=0x501000000060 "repo", resbuf=resbuf@entry=0x7ffff79fb320 <resbuf>, buffer=<optimized out>, buflen=buflen@entry=1024, result=result@entry=0x7fffffffa4b0) at ../nss/getXXbyYY_r.c:343 #261518 0x00007ffff794c4d8 in getpwnam (name=0x501000000060 "repo") at ../nss/getXXbyYY.c:140 #261519 0x00005555557e37ff in getpw_str (username=0x5020000001a1 "repo", len=4) at path.c:613 #261520 0x00005555557e3937 in interpolate_path (path=0x5020000001a0 "~repo", real_home=0) at path.c:654 #261521 0x00005555557e3aea in enter_repo (path=0x501000000040 "~repo", strict=0) at path.c:718 #261522 0x000055555568f0ba in cmd_upload_pack (argc=1, argv=0x502000000100, prefix=0x0, repo=0x0) at builtin/upload-pack.c:57 #261523 0x0000555555575ba8 in run_builtin (p=0x555555a20c98 <commands+3192>, argc=2, argv=0x502000000100, repo=0x555555a53b20 <the_repo>) at git.c:481 #261524 0x0000555555576067 in handle_builtin (args=0x7fffffffaab0) at git.c:742 #261525 0x000055555557678d in cmd_main (argc=2, argv=0x7fffffffac58) at git.c:912 #261526 0x00005555556963cd in main (argc=2, argv=0x7fffffffac58) at common-main.c:64 Note that this stack is more than 260000 function calls deep. Run under the debugger this will eventually segfault, but in our CI systems it seems like this just hangs forever. I assume that this is a bug either in the leak sanitizer or in glibc, as I cannot reproduce it on my machine. In any case, let's work around the bug for now by marking those tests with the "!SANITIZE_LEAK" prereq. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t/helper: fix leaking commit graph in "read-graph" subcommandPatrick Steinhardt1-2/+1
We're leaking the commit-graph in the "test-helper read-graph" subcommand, but as the leak is annotated with `UNLEAK()` the leak sanitizer doesn't complain. Fix the leak by calling `free_commit_graph()`. Besides getting rid of the `UNLEAK()` annotation, it also increases code coverage because we properly release resources as Git would do it, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21split-index: fix memory leak in `move_cache_to_base_index()`Patrick Steinhardt1-0/+1
In `move_cache_to_base_index()` we move the index cache of the main index into the split index, which is used when writing a shared index. But we don't release the old split index base in case we already had a split index before this operation, which can thus leak memory. Plug the leak by releasing the previous base. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21git: refactor builtin handling to use a `struct strvec`Patrick Steinhardt1-1/+1
Similar as with the preceding commit, `handle_builtin()` does not properly track lifetimes of the `argv` array and its strings. As it may end up modifying the array this can lead to memory leaks in case it contains allocated strings. Refactor the function to use a `struct strvec` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21git: refactor alias handling to use a `struct strvec`Patrick Steinhardt1-0/+1
In `handle_alias()` we use both `argcp` and `argv` as in-out parameters. Callers mostly pass through the static array from `main()`, but once we handle an alias we replace it with an allocated array that may contain some allocated strings. Callers do not handle this scenario at all and thus leak memory. We could in theory handle the lifetime of `argv` in a hacky fashion by letting callers free it in case they see that an alias was handled. But while that would likely work, we still wouldn't be able to easily handle the lifetime of strings referenced by `argv`. Refactor the code to instead use a `struct strvec`, which effectively removes the need for us to manually track lifetimes. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21strvec: introduce new `strvec_splice()` functionPatrick Steinhardt1-0/+65
Introduce a new `strvec_splice()` function that can replace a range of strings in the vector with another array of strings. This function will be used in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21line-log: fix leak when rewriting commit parentsPatrick Steinhardt1-0/+1
In `process_ranges_merge_commit()` we try to figure out which of the parents can be blamed for the given line changes. When we figure out that none of the files in the line-log have changed we assign the complete blame to that commit and rewrite the parents of the current commit to only use that single parent. This is done via `commit_list_append()`, which is misleadingly _not_ appending to the list of parents. Instead, we overwrite the parents with the blamed parent. This makes us lose track of the old pointers, creating a memory leak. Fix this issue by freeing the parents before we overwrite them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21bisect: fix various cases where we leak commit list itemsPatrick Steinhardt1-0/+1
There are various cases where we leak commit list items because we evict items from the list, but don't free them. Plug those. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21builtin/blame: fix leaking blame entries with `--incremental`Patrick Steinhardt1-0/+2
When passing `--incremental` to git-blame(1) we exit early by jumping to the `cleanup` label. But some of the cleanups we perform are handled between the `goto` and its label, and thus we leak the data. Move the cleanups after the `cleanup` label. While at it, move the logic to free the scoreboard's `final_buf` into `cleanup_scoreboard()` and drop its `const` declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: add symlink ref content check for files backendshejialuo1-0/+141
Besides the textual symref, we also allow symbolic links as the symref. So, we should also provide the consistency check as what we have done for textual symref. And also we consider deprecating writing the symbolic links. We first need to access whether symbolic links still be used. So, add a new fsck message "symlinkRef(INFO)" to tell the user be aware of this information. We have already introduced "files_fsck_symref_target". We should reuse this function to handle the symrefs which use legacy symbolic links. We should not check the trailing garbage for symbolic refs. Add a new parameter "symbolic_link" to disable some checks which should only be executed for textual symrefs. And we need to also generate the "referent" parameter for reusing "files_fsck_symref_target" by the following steps: 1. Use "strbuf_add_real_path" to resolve the symlink and get the absolute path "ref_content" which the symlink ref points to. 2. Generate the absolute path "abs_gitdir" of "gitdir" and combine "ref_content" and "abs_gitdir" to extract the relative path "relative_referent_path". 3. If "ref_content" is outside of "gitdir", we just set "referent" with "ref_content". Instead, we set "referent" with "relative_referent_path". Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: check whether the target of the symref is a refshejialuo1-0/+29
Ideally, we want to the users use "git symbolic-ref" to create symrefs instead of writing raw contents into the filesystem. However, "git symbolic-ref" is strict with the refname but not strict with the referent. For example, we can make the "referent" located at the "$(gitdir)/logs/aaa" and manually write the content into this where we can still successfully parse this symref by using "git rev-parse". $ git init repo && cd repo && git commit --allow-empty -mx $ git symbolic-ref refs/heads/test logs/aaa $ echo $(git rev-parse HEAD) > .git/logs/aaa $ git rev-parse test We may need to add some restrictions for "referent" parameter when using "git symbolic-ref" to create symrefs because ideally all the nonpseudo-refs should be located under the "refs" directory and we may tighten this in the future. In order to tell the user we may tighten the above situation, create a new fsck message "symrefTargetIsNotARef" to notify the user that this may become an error in the future. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: add basic symref content check for files backendshejialuo1-0/+111
We have code that checks regular ref contents, but we do not yet check the contents of symbolic refs. By using "parse_loose_ref_content" for symbolic refs, we will get the information of the "referent". We do not need to check the "referent" by opening the file. This is because if "referent" exists in the file system, we will eventually check its correctness by inspecting every file in the "refs" directory. If the "referent" does not exist in the filesystem, this is OK as it is seen as the dangling symref. So we just need to check the "referent" string content. A regular ref could be accepted as a textual symref if it begins with "ref:", followed by zero or more whitespaces, followed by the full refname, followed only by whitespace characters. However, we always write a single SP after "ref:" and a single LF after the refname. It may seem that we should report a fsck error message when the "referent" does not apply above rules and we should not be so aggressive because third-party reimplementations of Git may have taken advantage of the looser syntax. Put it more specific, we accept the following contents: 1. "ref: refs/heads/master " 2. "ref: refs/heads/master \n \n" 3. "ref: refs/heads/master\n\n" When introducing the regular ref content checks, we created two fsck infos "refMissingNewline" and "trailingRefContent" which exactly represents above situations. So we will reuse these two fsck messages to write checks to info the user about these situations. But we do not allow any other trailing garbage. The followings are bad symref contents which will be reported as fsck error by "git-fsck(1)". 1. "ref: refs/heads/master garbage\n" 2. "ref: refs/heads/master \n\n\n garbage " And we introduce a new "badReferentName(ERROR)" fsck message to report above errors by using "is_root_ref" and "check_refname_format" to check the "referent". Since both "is_root_ref" and "check_refname_format" don't work with whitespaces, we use the trimmed version of "referent" with these functions. In order to add checks, we will do the following things: 1. Record the untrimmed length "orig_len" and untrimmed last byte "orig_last_byte". 2. Use "strbuf_rtrim" to trim the whitespaces or newlines to make sure "is_root_ref" and "check_refname_format" won't be failed by them. 3. Use "orig_len" and "orig_last_byte" to check whether the "referent" misses '\n' at the end or it has trailing whitespaces or newlines. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: add more strict checks for regular refsshejialuo1-2/+55
We have already used "parse_loose_ref_contents" function to check whether the ref content is valid in files backend. However, by using "parse_loose_ref_contents", we allow the ref's content to end with garbage or without a newline. Even though we never create such loose refs ourselves, we have accepted such loose refs. So, it is entirely possible that some third-party tools may rely on such loose refs being valid. We should not report an error fsck message at current. We should notify the users about such "curiously formatted" loose refs so that adequate care is taken before we decide to tighten the rules in the future. And it's not suitable either to report a warn fsck message to the user. We don't yet want the "--strict" flag that controls this bit to end up generating errors for such weirdly-formatted reference contents, as we first want to assess whether this retroactive tightening will cause issues for any tools out there. It may cause compatibility issues which may break the repository. So, we add the following two fsck infos to represent the situation where the ref content ends without newline or has trailing garbages: 1. refMissingNewline(INFO): A loose ref that does not end with newline(LF). 2. trailingRefContent(INFO): A loose ref has trailing content. It might appear that we can't provide the user with any warnings by using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert FSCK_INFO to FSCK_WARN and we can still warn the user about these situations when using "git refs verify" without introducing compatibility issues. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: port git-fsck(1) regular refs check for files backendshejialuo1-0/+105
"git-fsck(1)" implicitly checks the ref content by passing the callback "fsck_handle_ref" to the "refs.c::refs_for_each_rawref". Then, it will check whether the ref content (eventually "oid") is valid. If not, it will report the following error to the user. error: refs/heads/main: invalid sha1 pointer 0000... And it will also report above errors when there are dangling symrefs in the repository wrongly. This does not align with the behavior of the "git symbolic-ref" command which allows users to create dangling symrefs. As we have already introduced the "git refs verify" command, we'd better check the ref content explicitly in the "git refs verify" command thus later we could remove these checks in "git-fsck(1)" and launch a subprocess to call "git refs verify" in "git-fsck(1)" to make the "git-fsck(1)" more clean. Following what "git-fsck(1)" does, add a similar check to "git refs verify". Then add a new fsck error message "badRefContent(ERROR)" to represent that a ref has an invalid content. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: support multiple worktrees check for refsshejialuo1-0/+51
We have already set up the infrastructure to check the consistency for refs, but we do not support multiple worktrees. However, "git-fsck(1)" will check the refs of worktrees. As we decide to get feature parity with "git-fsck(1)", we need to set up support for multiple worktrees. Because each worktree has its own specific refs, instead of just showing the users "refs/worktree/foo", we need to display the full name such as "worktrees/<id>/refs/worktree/foo". So we should know the id of the worktree to get the full name. Add a new parameter "struct worktree *" for "refs-internal.h::fsck_fn". Then change the related functions to follow this new interface. The "packed-refs" only exists in the main worktree, so we should only check "packed-refs" in the main worktree. Use "is_main_worktree" method to skip checking "packed-refs" in "packed_fsck" function. Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add "worktree/<id>/" prefix when we are not in the main worktree. Last, add a new test to check the refname when there are multiple worktrees to exercise the code. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21ref: check the full refname instead of basenameshejialuo1-37/+55
In "files-backend.c::files_fsck_refs_name", we validate the refname format by using "check_refname_format" to check the basename of the iterator with "REFNAME_ALLOW_ONELEVEL" flag. However, this is a bad implementation. Although we doesn't allow a single "@" in ".git" directory, we do allow "refs/heads/@". So, we will report an error wrongly when there is a "refs/heads/@" ref by using one level refname "@". Because we just check one level refname, we either cannot check the other parts of the full refname. And we will ignore the following errors: "refs/heads/ new-feature/test" "refs/heads/~new-feature/test" In order to fix the above problem, enhance "files_fsck_refs_name" to use the full name for "check_refname_format". Then, replace the tests which are related to "@" and add tests to exercise the above situations using for loop to avoid repetition. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21refs: skip collision checks in initial transactionsPatrick Steinhardt1-1/+1
Reference transactions use `refs_verify_refname_available()` to check for colliding references. This check consists of two parts: - Checks for whether multiple ref updates in the same transaction conflict with each other. - Checks for whether existing refs conflict with any refs part of the transaction. While we generally cannot avoid the first check, the second check is superfluous in cases where the transaction is an initial one in an otherwise empty ref store. The check results in multiple ref reads as well as the creation of a ref iterator for every ref we're checking, which adds up quite fast when performing the check for many refs. Introduce a new flag that allows us to skip this check and wire it up in such that the backends pass it when running an initial transaction. This leads to significant speedups when migrating ref storage backends. From "files" to "reftable": Benchmark 1: migrate files:reftable (refcount = 100000, revision = HEAD~) Time (mean ± σ): 472.4 ms ± 6.7 ms [User: 175.9 ms, System: 285.2 ms] Range (min … max): 463.5 ms … 483.2 ms 10 runs Benchmark 2: migrate files:reftable (refcount = 100000, revision = HEAD) Time (mean ± σ): 86.1 ms ± 1.9 ms [User: 67.9 ms, System: 16.0 ms] Range (min … max): 82.9 ms … 90.9 ms 29 runs Summary migrate files:reftable (refcount = 100000, revision = HEAD) ran 5.48 ± 0.15 times faster than migrate files:reftable (refcount = 100000, revision = HEAD~) And from "reftable" to "files": Benchmark 1: migrate reftable:files (refcount = 100000, revision = HEAD~) Time (mean ± σ): 452.7 ms ± 3.4 ms [User: 209.9 ms, System: 235.4 ms] Range (min … max): 445.9 ms … 457.5 ms 10 runs Benchmark 2: migrate reftable:files (refcount = 100000, revision = HEAD) Time (mean ± σ): 95.2 ms ± 2.2 ms [User: 73.6 ms, System: 20.6 ms] Range (min … max): 91.7 ms … 100.8 ms 28 runs Summary migrate reftable:files (refcount = 100000, revision = HEAD) ran 4.76 ± 0.11 times faster than migrate reftable:files (refcount = 100000, revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21refs: use "initial" transaction semantics to migrate refsPatrick Steinhardt1-1/+1
Until now, we couldn't use "initial" transaction semantics to migrate refs because the "files" backend only supported writing regular refs via the initial transaction because it simply mapped the transaction to a "packed-refs" transaction. But with the preceding commit, the "files" backend has learned to also write symbolic and root refs in the initial transaction by creating a second transaction for all refs that need to be written as loose refs. Adapt the code to migrate refs to commit the transaction as an initial transaction. This results in a signiticant speedup when migrating many refs: Benchmark 1: migrate reftable:files (refcount = 100000, revision = HEAD~) Time (mean ± σ): 3.247 s ± 0.034 s [User: 0.485 s, System: 2.722 s] Range (min … max): 3.216 s … 3.309 s 10 runs Benchmark 2: migrate reftable:files (refcount = 100000, revision = HEAD) Time (mean ± σ): 453.6 ms ± 1.9 ms [User: 214.6 ms, System: 230.5 ms] Range (min … max): 451.5 ms … 456.4 ms 10 runs Summary migrate reftable:files (refcount = 100000, revision = HEAD) ran 7.16 ± 0.08 times faster than migrate reftable:files (refcount = 100000, revision = HEAD~) As the reftable backend doesn't (yet) special-case initial transactions there is no comparable speedup for that backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20Merge branch 'jt/repack-local-promisor'Junio C Hamano3-8/+38
"git gc" discards any objects that are outside promisor packs that are referred to by an object in a promisor pack, and we do not refetch them from the promisor at runtime, resulting an unusable repository. Work it around by including these objects in the referring promisor pack at the receiving end of the fetch. * jt/repack-local-promisor: index-pack: repack local links into promisor packs t5300: move --window clamp test next to unclamped t0410: use from-scratch server t0410: make test description clearer
2024-11-20Merge branch 'db/submodule-fetch-with-remote-name-fix' into maint-2.47Junio C Hamano1-0/+20
A "git fetch" from the superproject going down to a submodule used a wrong remote when the default remote names are set differently between them. * db/submodule-fetch-with-remote-name-fix: submodule: correct remote name with fetch
2024-11-20Merge branch 'ps/cache-tree-w-broken-index-entry' into maint-2.47Junio C Hamano1-8/+11
Fail gracefully instead of crashing when attempting to write the contents of a corrupt in-core index as a tree object. * ps/cache-tree-w-broken-index-entry: unpack-trees: detect mismatching number of cache-tree/index entries cache-tree: detect mismatching number of index entries cache-tree: refactor verification to return error codes
2024-11-20Merge branch 'ps/maintenance-start-crash-fix' into maint-2.47Junio C Hamano1-0/+16
"git maintenance start" crashed due to an uninitialized variable reference, which has been corrected. * ps/maintenance-start-crash-fix: builtin/gc: fix crash when running `git maintenance start`
2024-11-20Merge branch 'ds/line-log-asan-fix' into maint-2.47Junio C Hamano1-0/+28
Use after free and double freeing at the end in "git log -L... -p" had been identified and fixed. * ds/line-log-asan-fix: line-log: protect inner strbuf from free
2024-11-20index-pack: teach --promisor to forbid pack nameJonathan Tan1-3/+1
Currently, - Running "index-pack --promisor" outside a repo segfaults. - It may be confusing to a user that running "index-pack --promisor" within a repo may make changes to the repo's object DB, especially since the packs indexed by the index-pack invocation may not even be related to the repo. As discussed in [1] and [2], teaching --promisor to forbid a packfile name solves both these problems. This combination of arguments requires a repo (since we are writing the resulting .pack and .idx to it) and it is clear that the files are related to the repo. Currently, Git uses "index-pack --promisor" only when fetching into a repo, so it could be argued that we should teach "index-pack" a new argument (say, "--fetching-mode") instead of tying --promisor to a generic argument like the packfile name. However, this --promisor feature could conceivably be used whenever we have a packfile that is known to come from the promisor remote (whether obtained through Git's fetch protocol or through other means) so not using a new argument seems reasonable - one could envision a user-made script obtaining a packfile and then running "index-pack --promisor --stdin", for example. In fact, it might be possible to relax the restriction further (say, by also allowing --promisor when indexing a packfile that is in the object DB), but relaxing the restriction is backwards-compatible so we can revisit that later. One thing to watch out for is the possibility of a future Git feature that indexes a pack in the context of a repo, but does not necessarily write the resulting pack to it (and does not necessarily desire to make any changes to the object DB). One such feature would be fetch quarantine, which might need the repo context in order to detect hash collisions, but would also need to ensure that the object DB is undisturbed in case the fetch fails for whatever reason, even if the reason occurs only after the indexing is complete. It may not be obvious to the implementer of such a feature that "index-pack" could sometimes write packs other than the indexed pack to the object DB, but there are already other ways that "fetch" could write to the object DB (in particular, packfile URIs and bundle URIs), so hopefully the implementation of this future feature would already include a test that the object DB be undisturbed. This change requires the change to t5300 by 1f52cdfacb (index-pack: document and test the --promisor option, 2022-03-09) to be undone. (--promisor is already tested indirectly, so we don't need the explicit test here any more.) [1] https://lore.kernel.org/git/20241114005652.GC1140565@coredump.intra.peff.net/ [2] https://lore.kernel.org/git/20241119185345.GB15723@coredump.intra.peff.net/ Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20builtin/gc: provide hint when maintenance hits a stale schedule lockPatrick Steinhardt1-0/+8
When running scheduled maintenance via `git maintenance start`, we acquire a lockfile to ensure that no other scheduled maintenance task is running in the repository concurrently. If so, we do provide an error to the user hinting that another process seems to be running in this repo. There are two important cases why such a lockfile may exist: - An actual git-maintenance(1) process is still running in this repository. - An earlier process may have crashed or was interrupted part way through and has left a stale lockfile behind. In c95547a394 (builtin/gc: fix crash when running `git maintenance start`, 2024-10-10), we have fixed an issue where git-maintenance(1) would crash with the "start" subcommand, and the underlying bug causes the second scenario to trigger quite often now. Most users don't know how to get out of that situation again though. Ideally, we'd be removing the stale lock for our users automatically. But in the context of repository maintenance this is rather risky, as it can easily run for hours or even days. So finding a clear point where we know that the old process has exited is basically impossible. We have the same issue in other subsystems, e.g. when locking refs. Our lockfile interfaces thus provide the `unable_to_lock_message()` function for exactly this purpose: it provides a nice hint to the user that explains what is going on and how to get out of that situation again by manually removing the file. Adapt git-maintenance(1) to print a similar hint. While we could use the above function, we can provide a bit more context as we know exactly what kind of process would create the lockfile. Reported-by: Miguel Rincon Barahona <mrincon@gitlab.com> Reported-by: Kev Kloss <kkloss@gitlab.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19Merge branch 'ps/reftable-detach' into ps/reftable-iterator-reuseJunio C Hamano10-107/+115
* ps/reftable-detach: reftable/system: provide thin wrapper for lockfile subsystem reftable/stack: drop only use of `get_locked_file_path()` reftable/system: provide thin wrapper for tempfile subsystem reftable/stack: stop using `fsync_component()` directly reftable/system: stop depending on "hash.h" reftable: explicitly handle hash format IDs reftable/system: move "dir.h" to its only user
2024-11-19reftable/system: provide thin wrapper for lockfile subsystemPatrick Steinhardt5-0/+6
We use the lockfile subsystem to write lockfiles for "tables.list". As with the tempfile subsystem, the lockfile subsystem also hooks into our infrastructure to prune stale locks via atexit(3p) or signal handlers. Furthermore, the lockfile subsystem also handles locking timeouts, which do add quite a bit of logic. Having to reimplement that in the context of Git wouldn't make a whole lot of sense, and it is quite likely that downstream users of the reftable library may have a better idea for how exactly to implement timeouts. So again, provide a thin wrapper for the lockfile subsystem instead such that the compatibility shim is fully self-contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/system: stop depending on "hash.h"Patrick Steinhardt10-107/+108
We include "hash.h" in "reftable/system.h" such that we can use hash format IDs as well as the raw size of SHA1 and SHA256. As we are in the process of converting the reftable library to become standalone we of course cannot rely on those constants anymore. Introduce a new `enum reftable_hash` to replace internal uses of the hash format IDs and new constants that replace internal uses of the hash size. Adapt the reftable backend to set up the correct hash function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/system: move "dir.h" to its only userPatrick Steinhardt1-0/+1
We still include "dir.h" in "reftable/system.h" even though it is not used by anything but by a single unit test. Move it over into that unit test so that we don't accidentally use any functionality provided by it in the reftable codebase. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19fast-import: avoid making replace refs point to themselvesElijah Newren1-0/+28
If someone replaces a commit with a modified version, then builds on that commit, and then later decides to rewrite history in a format like git fast-export --all | CMD_TO_TWEAK_THE_STREAM | git fast-import and CMD_TO_TWEAK_THE_STREAM undoes the modifications that the replacement did, then at the end you'd get a replace ref that points to itself. For example: $ git show-ref | grep replace fb92ebc654641b310e7d0360d0a5a49316fd7264 refs/replace/fb92ebc654641b310e7d0360d0a5a49316fd7264 Git commands which pay attention to replace refs will die with an error when a self-referencing replace ref is present: $ git log fatal: replace depth too high for object fb92ebc654641b310e7d0360d0a5a49316fd7264 Avoid such problems by deleting replace refs that will simply end up pointing to themselves at the end of our writing. Unless users specify --quiet, warn them when we delete such a replace ref. Two notes about this patch: * We are not ignoring the problematic update of the replace ref (turning it into a no-op), we are replacing the update with a delete. The logic here is that if the repository had a value for the replace ref before fast-import was run, and the replace ref was explicitly named in the fast-import stream, we don't want the replace ref to be left with a pre-fast-import value. * While loops with more than one element (e.g. refs/replace/A points to B, and refs/replace/B points to A) are possible, they seem much less plausible. It is pretty easy to create a sequence of git-filter-repo commands that will trigger a self-referencing replace ref, but I do not know how to trigger a scenario with a cycle length greater than 1. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-18t/unit-tests: convert "clar-generate.awk" into a shell scriptPatrick Steinhardt2-50/+63
Convert "clar-generate.awk" into a shell script that invokes awk(1). This allows us to avoid the shell redirect in the build system, which may otherwise be a problem with build systems on platforms that use a different shell. While at it, wrap the overly long lines in the CMake build instructions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15Allow cloning from repositories owned by another userbrian m. carlson2-3/+10
Historically, Git has allowed users to clone from an untrusted repository, and we have documented that this is safe to do so: `upload-pack` tries to avoid any dangerous configuration options or hooks from the repository it's serving, making it safe to clone an untrusted directory and run commands on the resulting clone. However, this was broken by f4aa8c8bb1 ("fetch/clone: detect dubious ownership of local repositories", 2024-04-10) in an attempt to make things more secure. That change resulted in a variety of problems when cloning locally and over SSH, but it did not change the stated security boundary. Because the security boundary has not changed, it is safe to adjust part of the code that patch introduced. To do that and restore the previous functionality, adjust enter_repo to take two flags instead of one. The two bits are - ENTER_REPO_STRICT: callers that require exact paths (as opposed to allowing known suffixes like ".git", ".git/.git" to be omitted) can set this bit. Corresponds to the "strict" parameter that the flags word replaces. - ENTER_REPO_ANY_OWNER_OK: callers that are willing to run without ownership check can set this bit. The former is --strict-paths option of "git daemon". The latter is set only by upload-pack, which honors the claimed security boundary. Note that local clones across ownership boundaries require --no-local so that upload-pack is used. Document this fact in the manual page and provide an example. This patch was based on one written by Junio C Hamano. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15pack-objects: only perform verbatim reuse on the preferred packTaylor Blau1-1/+1
When reusing objects from source pack(s), write_reused_pack_verbatim() is responsible for reusing objects whole eword_t's at a time. It works by taking the longest continuous run of objects from the beginning of each source pack that the caller wants, and reuses the entirety of that section from each pack. This is based on the assumption that we don't have any gaps within the region. This assumption relieves us from having to patch any OFS_DELTAs, since we know that there aren't any gaps between any delta and its base in that region. To illustrate why this assumption is necessary, suppose we have some pack P, which has objects X, Y, and Z. If the MIDX's copy of Y was selected from a pack other than P, then the bit corresponding to object Y will appear earlier in the bitmap than the bits corresponding to X and Z. If pack-objects already has or will use the copy of Y from the pack it was selected from in the MIDX, then it is an error to reuse all objects between X and Z in the source pack. Doing so will cause us to reuse Y from a different pack than the one which represents Y in the MIDX, causing us to either: - include the object twice, assuming that the caller wants Y in the pack, or - include the object once, resulting in us packing more objects than necessary. This regression comes from ca0fd69e37 (pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse, 2023-12-14), which incorrectly assumed that there would be no gaps in reusable regions of non-preferred packs. Instead, we can only safely perform the whole-word reuse optimization on the preferred pack, where we know with certainty that no gaps exist in that region of the bitmap. We can still reuse objects from non-preferred packs, but we have to inspect them individually in write_reused_pack() to ensure that any gaps that may exist are accounted for. This allows us to simplify the implementation of write_reused_pack_verbatim() back to almost its pre-multi-pack reuse form, since we can now assume that the beginning of the pack appears at the beginning of the bitmap, meaning that we don't have to account for any bits up to the first word boundary (like we had to special case in ca0fd69e37). The only significant changes from the pre-ca0fd69e37 implementation are: - that we can no longer inspect words up to the end of reuse_packfile_bitmap->word_alloc, since we only want to look at words whose bits all correspond to objects in the given packfile, and - that we return early when given a reuse_packfile which is not preferred, making the call a noop. In the future, it might be possible to restore this optimization if we could guarantee that some reuse packs don't contain any gaps by construction (similar to the "disjoint packs" idea in very early versions of multi-pack reuse). Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15t5332-multi-pack-reuse.sh: demonstrate duplicate packing failureTaylor Blau1-0/+22
In the multi-pack reuse code, there are two paths for reusing the on-disk representation of an object, handled by: - builtin/pack-objects.c::write_reused_pack_one() - builtin/pack-objects.c::write_reused_pack_verbatim() The former is responsible for copying the bytes for a single object out of an existing source pack. The latter does the same but for a region of objects aligned at eword_t boundaries. Demonstrate a bug whereby write_reused_pack_verbatim() can be tricked into writing out objects from some source pack, even when those objects were selected from a different source pack in the MIDX bitmap. When the caller wants at least one of the objects in that region, pack-objects will write the same object twice as a result of this bug. In the other case where the caller doesn't want any of the objects in the region of interest, we will write out objects that weren't requested. Demonstrate this bug by creating two packs, where the preferred one of those packs contains a single object which also appears in the main (non-preferred) pack. A separate bug[^1] prevents us from triggering the main bug when the duplicated object is the last one in the main pack, but any earlier object will suffice. We could fix that separate bug, but the following commit will simplify write_reused_pack_verbatim() and only call it on the preferred pack, so doing so would have little point. [^1]: Because write_reused_pack_verbatim() only reuses bits in the range off_t pack_start_off = pack_pos_to_offset(reuse_packfile->p, 0); off_t pack_end_off = pack_pos_to_offset(reuse_packfile->p, pos - reuse_packfile->bitmap_pos); written += pos - reuse_packfile->bitmap_pos; /* We're recording one chunk, not one object. */ record_reused_object(pack_start_off, pack_start_off - (hashfile_total(out) - pack_start)); , or in other words excluding the object beginning at position 'pos - reuse_packfile->bitmap_pos' in the source pack. But since reuse_packfile->bitmap_pos is '1' in the non-preferred pack (accounting for the single-object pack which is preferred), we don't actually copy the bytes from the last object. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15refs: don't invoke reference-transaction hook for reflogsKarthik Nayak1-2/+0
The reference-transaction hook is invoked whenever there is a reference update being performed. For each state of the transaction, we iterate over the updates present and pass this information to the hook. The `ref_update` structure is used to hold these updates within a `transaction`. We use the same structure for holding reflog updates too. Which means that the reference transaction hook is also obtaining information about a reflog update. This is a bug, since: - The hook is designed to work with reference updates and reflogs updates are different. - The hook doesn't have the required information to distinguish reference updates from reflog updates. This is particularly evident when the default branch (pointed by HEAD) is updated, we see that the hook also receives information about HEAD being changed. In reality, we only add a reflog update for HEAD, while HEAD's values remains the same. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-14test-lib: move malloc-debug setup after $PATH setupJeff King1-50/+50
Originally, the conditional definition of the setup/teardown functions for malloc checking could be run at any time, because they depended only on command-line options and the system getconf function. But since 02d900361c (test-lib: check malloc debug LD_PRELOAD before using, 2024-11-11), we probe the system by running "git version". Since this code runs before we've set $PATH to point to the version of Git we intend to test, we actually run the system version of git. This mostly works, since what we really care about is whether the LD_PRELOAD works, and it should work the same with any program. But there are some corner cases: 1. You might not have a system git at all, in which case the preload will appear to fail, even though it could work with the actual built version of git. 2. Your system git could be linked in a different way. For example, if it was built statically, then it will ignore LD_PRELOAD entirely, and we might assume that the preload works, even though it might not when used with a dynamic build. We could give a more complete path to the version of Git we intend to test, but features like GIT_TEST_INSTALLED make that not entirely trivial. So instead, let's just bump the setup until after we've set up the $PATH. There's no need for us to do it early, as long as it is done before the first test runs. Reported-by: Toon Claes <toon@iotcl.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-13Merge branch 'ps/mingw-rename'Junio C Hamano1-3/+5
The MinGW compatibility layer has been taught to support POSIX semantics for atomic renames when other process(es) have a file opened at the destination path. * ps/mingw-rename: compat/mingw: support POSIX semantics for atomic renames compat/mingw: allow deletion of most opened files compat/mingw: share file handles created via `CreateFileW()`
2024-11-13Merge branch 'jt/commit-graph-missing'Junio C Hamano1-2/+2
A regression where commit objects missing from a commit-graph can cause an infinite loop when doing a fetch in a partial clone has been fixed. * jt/commit-graph-missing: fetch-pack: die if in commit graph but not obj db Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
2024-11-13Merge branch 'en/shallow-exclude-takes-a-ref-fix'Junio C Hamano1-0/+7
The "--shallow-exclude=<ref>" option to various history transfer commands takes a ref, not an arbitrary revision. * en/shallow-exclude-takes-a-ref-fix: doc: correct misleading descriptions for --shallow-exclude upload-pack: fix ambiguous error message
2024-11-13Merge branch 'ak/t1016-style'Junio C Hamano1-132/+130
Test modernization. * ak/t1016-style: t1016: clean up style
2024-11-13Merge branch 'ps/leakfixes-part-9'Junio C Hamano22-2/+31
More leakfixes. * ps/leakfixes-part-9: (22 commits) list-objects-filter-options: work around reported leak on error builtin/merge: release output buffer after performing merge dir: fix leak when parsing "status.showUntrackedFiles" t/helper: fix leaking buffer in "dump-untracked-cache" t/helper: stop re-initialization of `the_repository` sparse-index: correctly free EWAH contents dir: release untracked cache data combine-diff: fix leaking lost lines builtin/tag: fix leaking key ID on failure to sign transport-helper: fix leaking import/export marks builtin/commit: fix leaking cleanup config trailer: fix leaking strbufs when formatting trailers trailer: fix leaking trailer values builtin/commit: fix leaking change data contents upload-pack: fix leaking URI protocols pretty: clear signature check diff-lib: fix leaking diffopts in `do_diff_cache()` revision: fix leaking bloom filters builtin/grep: fix leak with `--max-count=0` grep: fix leak in `grep_splice_or()` ...
2024-11-13builtin/difftool: intialize some hashmap variablesSimon Marchi1-0/+4
When running a dir-diff command that produces no diff, variables `wt_modified` and `tmp_modified` are used while uninitialized, causing: $ /home/smarchi/src/git/git-difftool --dir-diff master free(): invalid pointer [1] 334004 IOT instruction (core dumped) /home/smarchi/src/git/git-difftool --dir-diff master $ valgrind --track-origins=yes /home/smarchi/src/git/git-difftool --dir-diff master ... Invalid free() / delete / delete[] / realloc() at 0x48478EF: free (vg_replace_malloc.c:989) by 0x422CAC: hashmap_clear_ (hashmap.c:208) by 0x283830: run_dir_diff (difftool.c:667) by 0x284103: cmd_difftool (difftool.c:801) by 0x238E0F: run_builtin (git.c:484) by 0x2392B9: handle_builtin (git.c:750) by 0x2399BC: cmd_main (git.c:921) by 0x356FEF: main (common-main.c:64) Address 0x1ffefff180 is on thread 1's stack in frame #2, created by run_dir_diff (difftool.c:358) ... If taking any `goto finish` path before these variables are initialized, `hashmap_clear_and_free()` operates on uninitialized data, sometimes causing a crash. This regression was introduced in 7f795a1715 (builtin/difftool: plug several trivial memory leaks, 2024-09-26). Fix it by initializing those variables with the `HASHMAP_INIT` macro. Add a test comparing the main branch to itself, resulting in no diff. Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12fetch: adjust refspec->raw_nr when filtering prefetch refspecsJeff King1-0/+4
In filter_prefetch_refspecs(), we may remove one or more refspecs if they point into refs/tags/. When we do, we remove the item from the refspec->items array, shifting subsequent items down, and then decrement the refspec->nr count. We also remove the item from the refspec->raw array, but fail to decrement refspec->raw_nr. This leaves us with a count that is too high, and anybody looking at the "raw" array will erroneously see either: 1. The removed entry, if there were no subsequent items to shift down. 2. A duplicate of the final entry, as everything is shifted down but there was nothing to overwrite the final item. The obvious culprit to run into this is calling refspec_clear(), which will try to free the removed entry (case 1) or double-free the final entry (case 2). But even though the bug has existed since the function was added in 2e03115d0c (fetch: add --prefetch option, 2021-04-16), we did not trigger it in the test suite. The --prefetch option is normally only used with configured refspecs, and we never bother to call refspec_clear() on those (they are stored as part of a struct remote, which is held in a global variable). But you could trigger case 2 manually like: git fetch --prefetch . refs/tags/foo refs/tags/bar Ironically you couldn't trigger case 1, because the code accidentally leaked the string in the raw array, and the two bugs (the leak and the double-free) cancelled out. But when we fixed the leak in ea4780307c (fetch: free "raw" string when shrinking refspec, 2024-09-24), it became possible to trigger that, too, with a single item: git fetch --prefetch . refs/tags/foo We can fix both cases by just correctly decrementing "raw_nr" when we shrink the array. Even though we don't expect people to use --prefetch with command-line refspecs, we'll add a test to make sure it behaves well (like the test just before it, we're just confirming that the filtered prefetch succeeds at all). Reported-by: Eric Mills <ermills@epic.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12index-pack: repack local links into promisor packsJonathan Tan1-0/+30
Teach index-pack to, when processing the objects in a pack with --promisor specified on the CLI, repack local objects (and the local objects that they refer to, recursively) referenced by these objects into promisor packs. This prevents the situation in which, when fetching from a promisor remote, we end up with promisor objects (newly fetched) referring to non-promisor objects (locally created prior to the fetch). This situation may arise if the client had previously pushed objects to the remote, for example. One issue that arises in this situation is that, if the non-promisor objects become inaccessible except through promisor objects (for example, if the branch pointing to them has moved to point to the promisor object that refers to them), then GC will garbage collect them. There are other ways to solve this, but the simplest seems to be to enforce the invariant that we don't have promisor objects referring to non-promisor objects. This repacking is done from index-pack to minimize the performance impact. During a fetch, the only time most objects are fully inflated in memory is when their object ID is computed, so we also scan the objects (to see which objects they refer to) during this time. Also to minimize the performance impact, an object is calculated to be local if it's a loose object or present in a non-promisor pack. (If it's also in a promisor pack or referred to by an object in a promisor pack, it is technically already a promisor object. But a misidentification of a promisor object as a non-promisor object is relatively benign here - we will thus repack that promisor object into a promisor pack, duplicating it in the object store, but there is no correctness issue, just an issue of inefficiency.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12test-lib: check malloc debug LD_PRELOAD before usingJeff King1-2/+5
This fixes test failures across the suite on glibc platforms that don't have libc_malloc_debug.so.0. We added support for glibc's malloc checking routines long ago in a731fa916e (Add MALLOC_CHECK_ and MALLOC_PERTURB_ libc env to the test suite for detecting heap corruption, 2012-09-14). Back then we didn't need to do any checks to see if the platform supported it. We were just setting some environment variables which would either enable it or not. That changed in 131b94a10a (test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34, 2022-03-04). Now that glibc split this out into libc_malloc_debug.so, we have to add it to LD_PRELOAD. We only do that when we detect glibc, but it's possible to have glibc but not the malloc debug library. In that case LD_PRELOAD will complain to stderr, and tests which check for an empty stderr will fail. You can work around this by setting TEST_NO_MALLOC_CHECK, which disables the feature entirely. But it's not obvious to know you need to do that. Instead, since this malloc checking is best-effort anyway, let's just automatically disable it when the LD_PRELOAD appears not to work. We can check it by running something simple that should work (and produce nothing on stderr) like "git version". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-11t5300: add test for 'show-index --object-format'Abhijeet Sonar1-0/+14
In 88a09a557c (builtin/show-index: provide options to determine hash algo), the flag --object-format was added to show-index builtin as a way to provide a hash algorithm explicitly. However, we do not have tests in place for that functionality. Add them. Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-11show-index: fix uninitialized hash functionAbhijeet Sonar1-0/+4
In c8aed5e8da (repository: stop setting SHA1 as the default object hash), we got rid of the default hash algorithm for the_repository. Due to this change, it is now the responsibility of the callers to set their own default when this is not present. As stated in the docs, show-index should use SHA1 as the default hash algorithm when run outside a repository. Make sure this promise is met by falling back to SHA1 when the_hash_algo is not present (i.e. when the command is run outside a repository). Also add a test that verifies this behavior. Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-08Merge branch 'jk/left-right-bitmap'Junio C Hamano1-0/+12
When called with '--left-right' and '--use-bitmap-index', 'rev-list' will produce output without any left/right markers, which has been corrected. * jk/left-right-bitmap: rev-list: skip bitmap traversal for --left-right
2024-11-08Merge branch 'ps/upgrade-clar'Junio C Hamano12-124/+205
Buildfix and upgrade of Clar to a newer version. * ps/upgrade-clar: cmake: set up proper dependencies for generated clar headers cmake: fix compilation of clar-based unit tests Makefile: extract script to generate clar declarations Makefile: adjust sed command for generating "clar-decls.h" t/unit-tests: update clar to 206accb
2024-11-07t/perf: add tests for git-describeJeff King1-0/+30
We don't have a perf script for git-describe, despite it often being accused of slowness. Let's add a few simple tests to start with. Rather than use the existing tags from our test repo, we'll make our own so that we have a known quantity and position. We'll add a "new" tag near the tip of HEAD, and an "old" one that is at the very bottom. And then our tests are: 1. Describing HEAD naively requires walking all the way down to the old tag as we collect candidates. This gives us a baseline for what "slow" looks like. 2. Doing the same with --candidates=1 can potentially be fast, because we can quie after finding "new". But we don't, and it's also slow. 3. Likewise we should be able to quit when there are no more tags to find. This can happen naturally if a repo has few tags, but also if you restrict the set of tags with --match. Here are the results running against linux.git. Note that I have a commit-graph built for the repo, so "slow" here is ~700ms. Without a commit graph it's more like 9s! Test HEAD -------------------------------------------------------------- 6100.2: describe HEAD 0.70(0.66+0.04) 6100.3: describe HEAD with one max candidate 0.70(0.66+0.04) 6100.4: describe HEAD with one tag 0.70(0.64+0.06) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07t6120: demonstrate weakness in disjoint-root handlingJeff King1-3/+11
Commit 30b1c7ad9d (describe: don't abort too early when searching tags, 2020-02-26) tried to fix a problem that happens when there are disjoint histories: to accurately compare the counts for different tags, we need to keep walking the history longer in order to find a common base. But its fix misses a case: we may still bail early if we hit the max_candidates limit, producing suboptimal output. You can see this in action by adding "--candidates=2" to the tests; we'll stop traversing as soon as we see the second tag and will produce the wrong answer. I hit this in practice while trying to teach git-describe not to keep looking for candidates after we've seen all tags in the repo (effectively adding --candidates=2, since these toy repos have only two tags each). This is probably fixable by continuing to walk after hitting the max-candidates limit, all the way down to a common ancestor of all candidates. But it's not clear in practice what the preformance implications would be (it would depend on how long the branches that hold the candidates are). So I'm punting on that for now, but I'd like to adjust the tests to be more resilient, and to document the findings. So this patch: 1. Adds an extra tag at the bottom of history. This shouldn't change the output, but does mean we are more resilient to low values of --candidates (e.g., if we start reducing it to the total number of tags). This is arguably closer to the real world anyway, where you're not going to have just 2 tags, but an arbitrarily long history going back in time, possibly with multiple irrelevant tags in it (I called the new tag "H" here for "history"). 2. Run the same tests with --candidates=2, which shows that even with the current code they can fail if we end the traversal early. That leaves a trail for anybody interested in trying to improve the behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07Merge branch 'ps/leakfixes-part-9' into ps/leakfixes-part-10Junio C Hamano22-2/+31
* ps/leakfixes-part-9: (22 commits) list-objects-filter-options: work around reported leak on error builtin/merge: release output buffer after performing merge dir: fix leak when parsing "status.showUntrackedFiles" t/helper: fix leaking buffer in "dump-untracked-cache" t/helper: stop re-initialization of `the_repository` sparse-index: correctly free EWAH contents dir: release untracked cache data combine-diff: fix leaking lost lines builtin/tag: fix leaking key ID on failure to sign transport-helper: fix leaking import/export marks builtin/commit: fix leaking cleanup config trailer: fix leaking strbufs when formatting trailers trailer: fix leaking trailer values builtin/commit: fix leaking change data contents upload-pack: fix leaking URI protocols pretty: clear signature check diff-lib: fix leaking diffopts in `do_diff_cache()` revision: fix leaking bloom filters builtin/grep: fix leak with `--max-count=0` grep: fix leak in `grep_splice_or()` ...
2024-11-06compat/mingw: support POSIX semantics for atomic renamesPatrick Steinhardt1-3/+5
By default, Windows restricts access to files when those files have been opened by another process. As explained in the preceding commits, these restrictions can be loosened such that reads, writes and/or deletes of files with open handles _are_ allowed. While we set up those sharing flags in most relevant code paths now, we still don't properly handle POSIX-style atomic renames in case the target path is open. This is failure demonstrated by t0610, where one of our tests spawns concurrent writes in a reftable-enabled repository and expects all of them to succeed. This test fails most of the time because the process that has acquired the "tables.list" lock is unable to rename it into place while other processes are busy reading that file. Windows 10 has introduced the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag that allows us to fix this usecase [1]. When set, it is possible to rename a file over a preexisting file even when the target file still has handles open. Those handles must have been opened with the `FILE_SHARE_DELETE` flag, which we have ensured in the preceding commits. Careful readers might have noticed that [1] does not mention the above flag, but instead mentions `FILE_RENAME_POSIX_SEMANTICS`. This flag is not for use with `SetFileInformationByHandle()` though, which is what we use. And while the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag exists, it is not documented on [2] or anywhere else as far as I can tell. Unfortunately, we still support Windows systems older than Windows 10 that do not yet have this new flag. Our `_WIN32_WINNT` SDK version still targets 0x0600, which is Windows Vista and later. And even though that Windows version is out-of-support, bumping the SDK version all the way to 0x0A00, which is Windows 10 and later, is not an option as it would make it impossible to compile on Windows 8.1, which is still supported. Instead, we have to manually declare the relevant infrastructure to make this feature available and have fallback logic in place in case we run on a Windows version that does not yet have this flag. On another note: `mingw_rename()` has a retry loop that is used in case deleting a file failed because it's still open in another process. One might be pressed to not use this loop anymore when we can use POSIX semantics. But unfortunately, we have to keep it around due to our dependence on the `FILE_SHARE_DELETE` flag. While we know to set that sharing flag now, other applications may not do so and may thus still cause sharing violations when we try to rename a file. This fixes concurrent writes in the reftable backend as demonstrated in t0610, but may also end up fixing other usecases where Git wants to perform renames. [1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_rename_information [2]: https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05fetch-pack: die if in commit graph but not obj dbJonathan Tan1-2/+2
When fetching, there is a step in which sought objects are first checked against the local repository; only objects that are not in the local repository are then fetched. This check first looks up the commit graph file, and returns "present" if the object is in there. However, the action of first looking up the commit graph file is not done everywhere in Git, especially if the type of the object at the time of lookup is not known. This means that in a repo corruption situation, a user may encounter an "object missing" error, attempt to fetch it, and still encounter the same error later when they reattempt their original action, because the object is present in the commit graph file but not in the object DB. Therefore, make it a fatal error when this occurs. (Note that we cannot proceed to include this object in the list of objects to be fetched without changing at least the fetch negotiation code: what would happen is that the client will send "want X" and "have X" and when I tested at $DAYJOB with a work server that uses JGit, the server reasonably returned an empty packfile. And changing the fetch negotiation code to only use the object DB when deciding what to report as "have" would be an unnecessary slowdown, I think.) This was discovered when a lazy fetch of a missing commit completed with nothing actually fetched, and the writing of the commit graph file after every fetch then attempted to read said missing commit, triggering a lazy fetch of said missing commit, resulting in an infinite loop with no user-visible indication (until they check the list of processes running on their computer). With this fix, there is no infinite loop. Note that although the repo corruption we discovered was caused by a bug in GC in a partial clone, the behavior that this patch teaches Git to warn about applies to any repo with commit graph enabled and with a missing commit, whether it is a partial clone or not. t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in lookup_commit_in_graph(), 2022-07-01), tests that an interaction between fetch and the commit graph does not cause an infinite loop. This patch changes the exit code in that situation, so that test had to be changed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04list-objects-filter-options: work around reported leak on errorPatrick Steinhardt1-0/+1
This one is a little bit more curious. In t6112, we have a test that exercises the `git rev-list --filter` option with invalid filters. We execute git-rev-list(1) via `test_must_fail`, which means that we check for leaks even though Git exits with an error code. This causes the following leak: Direct leak of 27 byte(s) in 1 object(s) allocated from: #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8 #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2 #3 0x5555558b7550 in strbuf_add strbuf.c:311:2 #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2 #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3 #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3 #7 0x555555884e20 in setup_revisions revision.c:3014:11 #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9 #9 0x5555555ec5e3 in run_builtin git.c:483:11 #10 0x5555555eb1e4 in handle_builtin git.c:749:13 #11 0x5555555ec001 in run_argv git.c:819:4 #12 0x5555555eaf94 in cmd_main git.c:954:19 #13 0x5555556fd569 in main common-main.c:64:11 #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d) #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208) #16 0x5555555ad064 in _start (git+0x59064) This leak is valid, as we call `die()` and do not clean up the memory at all. But what's curious is that this is the only leak reported, because we don't clean up any other allocated memory, either, and I have no idea why the leak sanitizer treats this buffer specially. In any case, we can work around the leak by shuffling things around a bit. Instead of calling `gently_parse_list_objects_filter()` and dying after we have modified the filter spec, we simply do so beforehand. Like this we don't allocate the buffer in the error case, which makes the reported leak go away. It's not pretty, but it manages to make t6112 leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04builtin/merge: release output buffer after performing mergePatrick Steinhardt1-0/+1
The `obuf` member of `struct merge_options` is used to buffer output in some cases. In order to not discard its allocated memory we only release its contents in `merge_finalize()` when we're not currently recursing into a subtree. This results in some situations where we seemingly do not release the buffer reliably. We thus have calls to `strbuf_release()` for this buffer scattered across the codebase. But we're missing one callsite in git-merge(1), which causes a memory leak. We should ideally refactor this interface so that callers don't have to know about any such internals. But for now, paper over the issue by adding one more `strbuf_release()` call. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04dir: fix leak when parsing "status.showUntrackedFiles"Patrick Steinhardt1-0/+1
We use `repo_config_get_string()` to read "status.showUntrackedFiles" from the config subsystem. This function allocates the result, but we never free the result after parsing it. The value never leaves the scope of the calling function, so refactor it to instead use `repo_config_get_string_tmp()`, which does not hand over ownership to the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04t/helper: fix leaking buffer in "dump-untracked-cache"Patrick Steinhardt1-0/+2
We never release the local `struct strbuf base` buffer, thus leaking memory. Fix this leak. This leak is exposed by t7063, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04t/helper: stop re-initialization of `the_repository`Patrick Steinhardt2-2/+1
While "common-main.c" already initializes `the_repository` for us, we do so a second time in the "read-cache" test helper. This causes a memory leak because the old repository's contents isn't released. Stop calling `initialize_repository()` to plug this leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04combine-diff: fix leaking lost linesPatrick Steinhardt1-0/+1
The `cnt` variable tracks the number of lines in a patch diff. It can happen though that there are no newlines, in which case we'd still end up allocating our array of `sline`s. In fact, we always allocate it with `cnt + 2` entries: one extra entry for the deletion hunk at the end, and another entry that we don't seem to ever populate at all but acts as a kind of sentinel value. When we loop through the array to clear it at the end of this function we only loop until `lno < cnt`, and thus we may not end up releasing whatever the two extra `sline`s contain. While that shouldn't matter for the sentinel value, it does matter for the extra deletion hunk sline. Regardless of that, plug this memory leak by releasing both extra entries, which makes the logic a bit easier to reason about. While at it, fix the formatting of a local comment, which incidentally also provides the necessary context for why we overallocate the `sline` array. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04builtin/tag: fix leaking key ID on failure to signPatrick Steinhardt1-0/+1
We do not free the key ID when signing a tag fails. Do so by using the common exit path. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04transport-helper: fix leaking import/export marksPatrick Steinhardt1-0/+1
Fix leaking import and export marks for transport helpers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04builtin/commit: fix leaking cleanup configPatrick Steinhardt1-0/+1
The cleanup string set by the config is leaking when it is being overridden by an option. Fix this by tracking these via two separate variables such that we can free the old value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04trailer: fix leaking strbufs when formatting trailersPatrick Steinhardt1-0/+1
When formatting trailer lines we iterate through each of the trailers and munge their respective token/value pairs according to the trailer options. When formatting a trailer that has its `item->token` pointer set we perform the munging in two local buffers. In the case where we figure out that the value is empty and `trim_empty` is set we just skip over the trailer item. But the buffers are local to the loop and we don't release their contents, leading to a memory leak. Plug this leak by lifting the buffers outside of the loop and releasing them on function return. This fixes the memory leaks, but also optimizes the loop as we don't have to reallocate the buffers on every single iteration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04builtin/commit: fix leaking change data contentsPatrick Steinhardt1-0/+1
While we free the worktree change data, we never free its contents. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04upload-pack: fix leaking URI protocolsPatrick Steinhardt1-0/+1
We don't clear `struct upload_pack::uri_protocols`, which causes a memory leak. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04pretty: clear signature checkPatrick Steinhardt4-0/+4
The signature check in the formatting context is never getting released. Fix this to plug the resulting memory leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04diff-lib: fix leaking diffopts in `do_diff_cache()`Patrick Steinhardt1-0/+1
In `do_diff_cache()` we initialize a new `rev_info` and then overwrite its `diffopt` with a user-provided set of options. This can leak memory because `repo_init_revisions()` may end up allocating memory for the `diffopt` itself depending on the configuration. And since that field is overwritten we won't ever free it. Plug the memory leak by releasing the diffopts before we overwrite them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04revision: fix leaking bloom filtersPatrick Steinhardt1-0/+1
The memory allocated by `prepare_to_use_bloom_filter()` is not released by `release_revisions()`, causing a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04builtin/grep: fix leak with `--max-count=0`Patrick Steinhardt1-0/+1
When executing with `--max-count=0` we'll return early from git-grep(1) without performing any cleanup, which causes memory leaks. Plug these. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04t/helper: fix leaks in "reach" test toolPatrick Steinhardt2-0/+11
The "reach" test tool doesn't bother to clean up any of its allocated resources, causing various leaks. Plug them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04upload-pack: fix ambiguous error messageElijah Newren1-0/+7
upload-pack.c takes any --shallow-exclude argument(s) from clone/fetch/etc. and passes them through expand_ref(). If it does not get back exactly one ref from the call to expand_ref(), it will die with the following error: fatal: git upload-pack: ambiguous deepen-not: %s Given that the documentation suggests to users that --shallow-exclude accepts a revision rather than a ref (which will be corrected in a subsequent commit), users may try to pass a revision. In such a case, expand_ref() will return 0 matches, but the error message we print will be misleading since "ambiguous" suggests there are multiple matches. Provide a clearer error message for such a case. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-03t1016: clean up styleAndrew Kreimer1-132/+130
Adhere to Documentation/CodingGuidelines: - Whitespace and redirect operator. - Case arms indentation. - Tabs for indentation. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02t5300: move --window clamp test next to unclampedJonathan Tan1-5/+5
A subsequent commit will change the behavior of "git index-pack --promisor", which is exercised in "build pack index for an existing pack", causing the unclamped and clamped versions of the --window test to exhibit different behavior. Move the clamp test closer to the unclamped test that it references. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02t0410: use from-scratch serverJonathan Tan1-1/+1
A subsequent commit will add functionality: when fetching from a promisor remote, existing non-promisor objects that are ancestors of any fetched object will be repacked into promisor packs (since if a promisor remote has an object, it also has all its ancestors). This means that sometimes, a fetch from a promisor remote results in 2 new promisor packs (instead of the 1 that you would expect). There is a test that fetches a descendant of a local object from a promisor remote, but also specifically tests that there is exactly 1 promisor pack as a result of the fetch. This means that this test will fail when the subsequent commit is added. Since the ancestry of the fetched object is not the concern of this test, make the fetched objects have no ancestry in common with the objets in the client repo. This is done by making the server from scratch, instead of using an existing repo that has objects in common with the client. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02t0410: make test description clearerJonathan Tan1-2/+2
Commit 9a4c507886 (t0410: test fetching from many promisor remotes, 2019-06-25) adds some tests that demonstrate not the automatic fetching of missing objects, but the direct fetching from another promisor remote (configured explicitly in one test and implicitly via --filter on the "git fetch" CLI invocation in the other test) - thus demonstrating support for multiple promisor remotes, as described in the commit message. Change the test descriptions accordingly to make this clearer. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-01Merge branch 'jk/dumb-http-finalize'Taylor Blau3-3/+35
The dumb-http code regressed when the result of re-indexing a pack yielded an *.idx file that differs in content from the *.idx file it downloaded from the remote. This has been corrected by no longer relying on the *.idx file we got from the remote. * jk/dumb-http-finalize: packfile: use oidread() instead of hashcpy() to fill object_id packfile: use object_id in find_pack_entry_one() packfile: convert find_sha1_pack() to use object_id http-walker: use object_id instead of bare hash packfile: warn people away from parse_packed_git() packfile: drop sha1_pack_index_name() packfile: drop sha1_pack_name() packfile: drop has_pack_index() dumb-http: store downloaded pack idx as tempfile t5550: count fetches in "previously-fetched .idx" test midx: avoid duplicate packed_git entries
2024-11-01Merge branch 'ak/more-typofixes'Taylor Blau26-37/+37
More typofixes. * ak/more-typofixes: t: fix typos
2024-11-01Merge branch 'rs/grep-lookahead'Taylor Blau1-0/+9
Fix 'git grep' regression on macOS by disabling lookahead when encountering invalid UTF-8 byte sequences. * rs/grep-lookahead: grep: disable lookahead on error
2024-11-01Merge branch 'ak/t1016-cleanup'Taylor Blau1-2/+2
Test cleanup. * ak/t1016-cleanup: t1016: clean up style
2024-11-01Merge branch 'ua/atoi'Taylor Blau2-0/+33
Replace various calls to atoi() with strtol_i() and strtoul_ui(), and add improved error handling. * ua/atoi: imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing merge: replace atoi() with strtol_i() for marker size validation daemon: replace atoi() with strtoul_ui() and strtol_i()
2024-11-01Merge branch 'sa/notes-edit'Taylor Blau1-0/+63
Teach 'git notes add' and 'git notes append' a new '-e' flag, instructing them to open the note in $GIT_EDITOR before saving. * sa/notes-edit: notes: teach the -e option to edit messages in editor
2024-11-01Merge branch 'sk/t9101-cleanup'Taylor Blau1-17/+17
Test cleanup. * sk/t9101-cleanup: t9101: ensure no whitespace after redirect
2024-11-01Merge branch 'kh/mv-breakage'Taylor Blau1-0/+12
Demonstrate an assertion failure in 'git mv'. * kh/mv-breakage: t7001: add failure test which triggers assertion
2024-11-01Merge branch 'ua/t3404-cleanup'Taylor Blau1-25/+50
Test update. * ua/t3404-cleanup: t3404: replace test with test_line_count() t3404: avoid losing exit status with focus on `git show` and `git cat-file`
2024-11-01Merge branch 'ps/platform-compat-fixes'Taylor Blau26-133/+260
Various platform compatibility fixes split out of the larger effort to use Meson as the primary build tool. * ps/platform-compat-fixes: t6006: fix prereq handling with `test_format ()` http: fix build error on FreeBSD builtin/credential-cache: fix missing parameter for stub function t7300: work around platform-specific behaviour with long paths on MinGW t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin t3404: work around platform-specific behaviour on macOS 10.15 t1401: make invocation of tar(1) work with Win32-provided one t/lib-gpg: fix setup of GNUPGHOME in MinGW t/lib-gitweb: test against the build version of gitweb t/test-lib: wire up NO_ICONV prerequisite t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE
2024-11-01rev-list: skip bitmap traversal for --left-rightJeff King1-0/+12
Running: git rev-list --left-right --use-bitmap-index one...two will produce output without any left-right markers, since the bitmap traversal returns only a single set of reachable commits. Instead we should refuse to use bitmaps here and produce the correct output using a traditional traversal. This is probably not the only remaining option that misbehaves with bitmaps, but it's particularly egregious in that it feels like it _could_ work. Doing two separate traversals for the left/right sides and then taking the symmetric set differences should yield the correct answer, but our traversal code doesn't know how to do that. It's not clear if naively doing two separate traversals would always be a performance win. A traditional traversal only needs to walk down to the merge base, but bitmaps always fill out the full reachability set. So depending on your bitmap coverage, we could end up walking old bits of history twice to fill out the same uninteresting bits on both sides. We'd also of course end up with a very large --boundary set, if the user asked for that. So this might or might not be something worth implementing later. But for now, let's make sure we don't produce the wrong answer if somebody tries it. The test covers this, but also the same thing with "--count" (which is what I originally tried in a real-world case). Ironically the try_bitmap_count() code already realizes that "--left-right" won't work there. But that just causes us to fall back to the regular bitmap traversal code, which itself doesn't handle counting (we produce a list of objects rather than a count). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-30Merge branch 'sk/t7011-cleanup'Taylor Blau1-11/+11
Test cleanup. * sk/t7011-cleanup: t7011: ensure no whitespace after redirect
2024-10-30Merge branch 'co/t6050-pipefix'Taylor Blau1-47/+86
Avoid losing exit status by having Git command being tested on the upstream side of a pipe. * co/t6050-pipefix: t6050: avoid pipes with upstream Git commands
2024-10-30Merge branch 'ks/t4205-fixup'Taylor Blau1-1/+1
Testfix. * ks/t4205-fixup: t4205: fix typo in 'NUL termination with --stat'
2024-10-30Merge branch 'ps/reftable-strbuf'Taylor Blau9-210/+218
Implements a new reftable-specific strbuf replacement to reduce reftable's dependency on Git-specific data structures. * ps/reftable-strbuf: reftable: handle trivial `reftable_buf` errors reftable/stack: adapt `stack_filename()` to handle allocation failures reftable/record: adapt `reftable_record_key()` to handle allocation failures reftable/stack: adapt `format_name()` to handle allocation failures t/unit-tests: check for `reftable_buf` allocation errors reftable/blocksource: adapt interface name reftable: convert from `strbuf` to `reftable_buf` reftable/basics: provide new `reftable_buf` interface reftable: stop using `strbuf_addf()` reftable: stop using `strbuf_addbuf()`
2024-10-28t6006: fix prereq handling with `test_format ()`Patrick Steinhardt1-4/+4
In df383b5842 (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we have introduced a new NO_ICONV prerequisite that makes us skip tests in case Git is not compiled with support for iconv. This change subtly broke t6006: while the test suite still passes, some of its tests won't execute because they run into an error. ./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found The broken tests use `test_format ()`, and the mentioned commit simply prepended the new prerequisite to its arguments. But that does not work, as the function is not aware of prereqs at all and will now treat all of its arguments incorrectly. Fix this by making the function aware of prereqs by accepting an optional fourth argument. Adapt the callsites accordingly. Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>