aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2025-02-12Merge branch 'da/help-autocorrect-one-fix'Junio C Hamano3-11/+18
"git -c help.autocorrect=0 psuh" shows the suggested typofix, unlike the previous attempt in the base topic. * da/help-autocorrect-one-fix: help: add "show" as a valid configuration value help: show the suggested command when help.autocorrect is false
2025-02-12Merge branch 'sc/help-autocorrect-one'Junio C Hamano2-16/+35
"[help] autocorrect = 1" used to be a way to say "please wait for 0.1 second after suggesting a typofix of the command name before running that command"; now it means "yes, if there is a plausible typofix for the command name, please run it immediately". * sc/help-autocorrect-one: help: interpret boolean string values for help.autocorrect
2025-02-12Merge branch 'ms/remote-valid-remote-name'Junio C Hamano4-11/+12
Code shuffling. * ms/remote-valid-remote-name: remote: relocate valid_remote_name
2025-02-12Merge branch 'ms/refspec-cleanup'Junio C Hamano6-220/+244
Code clean-up. cf. <Z6G-toOJjMmK8iJG@pks.im> * ms/refspec-cleanup: refspec: relocate apply_refspecs and related funtions refspec: relocate matching related functions remote: rename query_refspecs functions refspec: relocate refname_matches_negative_refspec_item remote: rename function omit_name_by_refspec
2025-02-12Merge branch 'jp/doc-trailer-config'Junio C Hamano3-135/+140
Documentaiton updates. * jp/doc-trailer-config: config.txt: add trailer.* variables
2025-02-12Merge branch 'zh/gc-expire-to'Junio C Hamano3-2/+47
"git gc" learned the "--expire-to" option and passes it down to underlying "git repack". * zh/gc-expire-to: gc: add `--expire-to` option
2025-02-12Merge branch 'js/libgit-rust'Junio C Hamano24-81/+673
Foreign language interface for Rust into our code base has been added. * js/libgit-rust: libgit: add higher-level libgit crate libgit-sys: also export some config_set functions libgit-sys: introduce Rust wrapper for libgit.a common-main: split init and exit code into new files
2025-02-12Merge branch 'ac/t5401-use-test-path-is-file'Junio C Hamano1-8/+8
Test clean-up. * ac/t5401-use-test-path-is-file: t5401: prefer test_path_is_* helper function
2025-02-12Merge branch 'ac/t6423-unhide-git-exit-status'Junio C Hamano1-3/+6
Test clean-up. * ac/t6423-unhide-git-exit-status: t6423: fix suppression of Git’s exit code in tests
2025-02-12Merge branch 'ps/repack-keep-unreachable-in-unpacked-repo'Junio C Hamano2-1/+20
"git repack --keep-unreachable" to send unreachable objects to the main pack "git repack -ad" produces did not work when there is no existing packs, which has been corrected. * ps/repack-keep-unreachable-in-unpacked-repo: builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-12Merge branch 'ds/name-hash-tweaks'Junio C Hamano22-16/+389
"git pack-objects" and its wrapper "git repack" learned an option to use an alternative path-hash function to improve delta-base selection to produce a packfile with deeper history than window size. * ds/name-hash-tweaks: pack-objects: prevent name hash version change test-tool: add helper for name-hash values p5313: add size comparison test pack-objects: add GIT_TEST_NAME_HASH_VERSION repack: add --name-hash-version option pack-objects: add --name-hash-version option pack-objects: create new name-hash function version
2025-02-10The ninth batchJunio C Hamano1-0/+16
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10Merge branch 'jk/ci-coverity-update'Junio C Hamano1-1/+1
CI update to make Coverity job work again. * jk/ci-coverity-update: ci: set CI_JOB_IMAGE for coverity job
2025-02-10Merge branch 'sk/unit-tests-0130'Junio C Hamano9-353/+348
Convert a handful of unit tests to work with the clar framework. * sk/unit-tests-0130: t/unit-tests: convert strcmp-offset test to use clar test framework t/unit-tests: convert strbuf test to use clar test framework t/unit-tests: adapt example decorate test to use clar test framework t/unit-tests: convert hashmap test to use clar test framework
2025-02-10Merge branch 'ps/hash-cleanup'Junio C Hamano23-202/+226
Further code clean-up on the use of hash functions. Now the context object knows what hash function it is working with. * ps/hash-cleanup: global: adapt callers to use generic hash context helpers hash: provide generic wrappers to update hash contexts hash: stop typedeffing the hash context hash: convert hashing context to a structure
2025-02-10Merge branch 'jt/gitlab-ci-base-fix'Junio C Hamano1-2/+2
Two CI tasks, whitespace check and style check, work on the difference from the base version and the version being checked, but the base was computed incorrectly in GitLab CI in some cases, which has been corrected. * jt/gitlab-ci-base-fix: ci: fix base commit fallback for check-whitespace and check-style
2025-02-10Merge branch 'pw/apply-ulong-overflow-check'Junio C Hamano2-0/+16
"git apply" internally uses unsigned long for line numbers and uses strtoul() to parse numbers on the hunk headers. It however forgot to check parse errors. * pw/apply-ulong-overflow-check: apply: detect overflow when parsing hunk header
2025-02-10Merge branch 'ps/setup-reinit-fixes'Junio C Hamano2-11/+27
"git init" to reinitialize a repository that already exists cannot change the hash function and ref backends; such a request is silently ignored now. * ps/setup-reinit-fixes: setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT t0001: remove duplicate test
2025-02-06The eighth batchJunio C Hamano1-0/+10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06Merge branch 'ps/leakfixes-0129'Junio C Hamano2-2/+6
A few more leakfixes. * ps/leakfixes-0129: scalar: free result of `remote_default_branch()` unix-socket: fix memory leak when chdir(3p) fails
2025-02-06Merge branch 'ps/zlib-ng'Junio C Hamano20-140/+107
The code paths to interact with zlib has been cleaned up in preparation for building with zlib-ng. * ps/zlib-ng: ci: make "linux-musl" job use zlib-ng ci: switch linux-musl to use Meson compat/zlib: allow use of zlib-ng as backend git-zlib: cast away potential constness of `next_in` pointer compat/zlib: provide stubs for `deflateSetHeader()` compat/zlib: provide `deflateBound()` shim centrally git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" compat: introduce new "zlib.h" header git-compat-util: drop `z_const` define compat: drop `uncompress2()` compatibility shim
2025-02-06Merge branch 'js/bundle-unbundle-fd-reuse-fix'Junio C Hamano3-1/+6
The code path used when "git fetch" fetches from a bundle file closed the same file descriptor twice, which sometimes broke things unexpectedly when the file descriptor was reused, which has been corrected. * js/bundle-unbundle-fd-reuse-fix: bundle: avoid closing file descriptor twice
2025-02-06Merge branch 'ps/ci-misc-updates'Junio C Hamano7-90/+95
CI updates (containerization, dropping stale ones, etc.). * ps/ci-misc-updates: ci: remove stale code for Azure Pipelines ci: use latest Ubuntu release ci: stop special-casing for Ubuntu 16.04 gitlab-ci: add linux32 job testing against i386 gitlab-ci: remove the "linux-old" job github: simplify computation of the job's distro github: convert all Linux jobs to be containerized github: adapt containerized jobs to be rootless t7422: fix flaky test caused by buffered stdout t0060: fix EBUSY in MinGW when setting up runtime prefix
2025-02-04builtin/repack: fix `--keep-unreachable` when there are no packsPatrick Steinhardt2-1/+20
The "--keep-unreachable" flag is supposed to append any unreachable objects to the newly written pack. This flag is explicitly documented as appending both packed and loose unreachable objects to the new packfile. And while this works alright when repacking with preexisting packfiles, it stops working when the repository does not have any packfiles at all. The root cause are the conditions used to decide whether or not we want to append "--pack-loose-unreachable" to git-pack-objects(1). There are a couple of conditions here: - `has_existing_non_kept_packs()` checks whether there are existing packfiles. This condition makes sense to guard "--keep-pack=", "--unpack-unreachable" and "--keep-unreachable", because all of these flags only make sense in combination with existing packfiles. But it does not make sense to disable `--pack-loose-unreachable` when there aren't any preexisting packfiles, as loose objects can be packed into the new packfile regardless of that. - `delete_redundant` checks whether we want to delete any objects or packs that are about to become redundant. The documentation of `--keep-unreachable` explicitly says that `git repack -ad` needs to be executed for the flag to have an effect. It is not immediately obvious why such redundant objects need to be deleted in order for "--pack-unreachable-objects" to be effective. But as things are working as documented this is nothing we'll change for now. - `pack_everything & PACK_CRUFT` checks that we're not creating a cruft pack. This condition makes sense in the context of "--pack-loose-unreachable", as unreachable objects would end up in the cruft pack anyway. So while the second and third condition are sensible, it does not make any sense to condition `--pack-loose-unreachable` on the existence of packfiles. Fix the bug by splitting out the "--pack-loose-unreachable" and only making it depend on the second and third condition. Like this, loose unreachable objects will be packed regardless of any preexisting packfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04remote: relocate valid_remote_nameMeet Soni4-11/+12
Move the `valid_remote_name()` function from the refspec subsystem to the remote subsystem to better align with the separation of concerns. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04refspec: relocate apply_refspecs and related funtionsMeet Soni4-42/+44
Move the functions `apply_refspecs()` and `apply_negative_refspecs()` from `remote.c` to `refspec.c`. These functions focus on applying refspecs, so centralizing them in `refspec.c` improves code organization by keeping refspec-related logic in one place. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04refspec: relocate matching related functionsMeet Soni3-122/+139
Move the functions `refspec_find_match()`, `refspec_find_all_matches()` and `refspec_find_negative_match()` from `remote.c` to `refspec.c`. These functions focus on matching refspecs, so centralizing them in `refspec.c` improves code organization by keeping refspec-related logic in one place. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04remote: rename query_refspecs functionsMeet Soni3-12/+12
Rename functions related to handling refspecs in preparation for their move from `remote.c` to `refspec.c`. Update their names to better reflect their intent: - `query_refspecs()` -> `refspec_find_match()` for clarity, as it finds a single matching refspec. - `query_refspecs_multiple()` -> `refspec_find_all_matches()` to better reflect that it collects all matching refspecs instead of returning just the first match. - `query_matches_negative_refspec()` -> `refspec_find_negative_match()` for consistency with the updated naming convention, even though this static function didn't strictly require renaming. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04refspec: relocate refname_matches_negative_refspec_itemMeet Soni3-48/+57
Move the functions `refname_matches_negative_refspec_item()`, `refspec_match()`, and `match_name_with_pattern()` from `remote.c` to `refspec.c`. These functions focus on refspec matching, so placing them in `refspec.c` aligns with the separation of concerns. Keep refspec-related logic in `refspec.c` and remote-specific logic in `remote.c` for better code organization. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-04remote: rename function omit_name_by_refspecMeet Soni3-10/+6
Rename the function `omit_name_by_refspec()` to `refname_matches_negative_refspec_item()` to provide clearer intent. The previous function name was vague and did not accurately describe its purpose. By using `refname_matches_negative_refspec_item`, make the function's purpose more intuitive, clarifying that it checks if a reference name matches any negative refspec. Rename function parameters for consistency with existing naming conventions. Use `refname` instead of `name` to align with terminology in `refs.h`. Remove the redundant doc comment since the function name is now self-explanatory. Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03t6423: fix suppression of Git’s exit code in testsAyush Chandekar1-3/+6
Some test in t6423 supress Git's exit code, which can cause test failures go unnoticed. Specifically using git <subcommand> | <other-command> masks potential failures of the Git command. This commit ensures that Git's exit status is correctly propogated by: - Avoiding pipes that suppress exit codes. Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03help: add "show" as a valid configuration valueDavid Aguilar3-2/+4
Add a literal value for showing the suggested autocorrection for consistency with the rest of the help.autocorrect options. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03help: show the suggested command when help.autocorrect is falseDavid Aguilar3-11/+16
Make the handling of false boolean values for help.autocorrect consistent with the handling of value 0 by showing the suggested commands but not running them. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03Merge branch 'sc/help-autocorrect-one' into da/help-autocorrect-one-fixJunio C Hamano2-16/+35
* sc/help-autocorrect-one: help: interpret boolean string values for help.autocorrect
2025-02-03t5401: prefer test_path_is_* helper functionambar chakravartty1-8/+8
"test -f" does not provide a nice error message when we hit test failures, so use test_path_is_file instead. Signed-off-by: ambar chakravartty <amch9605@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03The seventh batchJunio C Hamano1-0/+15
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03Merge branch 'kn/reflog-migration-fix-fix'Junio C Hamano4-10/+30
Fix bugs in an earlier attempt to fix "git refs migration". * kn/reflog-migration-fix-fix: refs/reftable: fix uninitialized memory access of `max_index` reftable: write correct max_update_index to header
2025-02-03Merge branch 'kn/pack-write-with-reduced-globals'Junio C Hamano9-74/+106
Code clean-up. * kn/pack-write-with-reduced-globals: pack-write: pass hash_algo to internal functions pack-write: pass hash_algo to `write_rev_file()` pack-write: pass hash_algo to `write_idx_file()` pack-write: pass repository to `index_pack_lockfile()` pack-write: pass hash_algo to `fixup_pack_header_footer()`
2025-02-03Merge branch 'ps/build-meson-fixes'Junio C Hamano7-38/+218
More build fixes and enhancements on meson based build procedure. * ps/build-meson-fixes: ci: wire up Visual Studio build with Meson ci: raise error when Meson generates warnings meson: fix compilation with Visual Studio meson: make the CSPRNG backend configurable meson: wire up fuzzers meson: wire up generation of distribution archive meson: wire up development environments meson: fix dependencies for generated headers meson: populate project version via GIT-VERSION-GEN GIT-VERSION-GEN: allow running without input and output files GIT-VERSION-GEN: simplify computing the dirty marker
2025-02-03Merge branch 'ps/3.0-remote-deprecation'Junio C Hamano21-59/+132
Following the procedure we established to introduce breaking changes for Git 3.0, allow an early opt-in for removing support of $GIT_DIR/branches/ and $GIT_DIR/remotes/ directories to configure remotes. * ps/3.0-remote-deprecation: remote: announce removal of "branches/" and "remotes/" builtin/pack-redundant: remove subcommand with breaking changes ci: repurpose "linux-gcc" job for deprecations ci: merge linux-gcc-default into linux-gcc Makefile: wire up build option for deprecated features
2025-02-03Merge branch 'jk/combine-diff-cleanup'Junio C Hamano4-184/+102
Code clean-up for code paths around combined diff. * jk/combine-diff-cleanup: tree-diff: make list tail-passing more explicit tree-diff: simplify emit_path() list management tree-diff: use the name "tail" to refer to list tail tree-diff: drop list-tail argument to diff_tree_paths() combine-diff: drop public declaration of combine_diff_path_size() tree-diff: inline path_appendnew() tree-diff: pass whole path string to path_appendnew() tree-diff: drop path_appendnew() alloc optimization run_diff_files(): de-mystify the size of combine_diff_path struct diff: add a comment about combine_diff_path.parent.path combine-diff: use pointer for parent paths tree-diff: clear parent array in path_appendnew() combine-diff: add combine_diff_path_new() run_diff_files(): delay allocation of combine_diff_path
2025-02-03Merge branch 'tb/unsafe-hash-cleanup'Junio C Hamano12-70/+107
The API around choosing to use unsafe variant of SHA-1 implementation has been updated in an attempt to make it harder to abuse. * tb/unsafe-hash-cleanup: hash.h: drop unsafe_ function variants csum-file: introduce hashfile_checkpoint_init() t/helper/test-hash.c: use unsafe_hash_algo() csum-file.c: use unsafe_hash_algo() hash.h: introduce `unsafe_hash_algo()` csum-file.c: extract algop from hashfile_checksum_valid() csum-file: store the hash algorithm as a struct field t/helper/test-tool: implement sha1-unsafe helper
2025-02-03ci: set CI_JOB_IMAGE for coverity jobJeff King1-1/+1
The main GitHub Actions workflow switched away from the "$distro" variable in b133d3071a (github: simplify computation of the job's distro, 2025-01-10). Since the Coverity job also depends on our ci/install-dependencies.sh script, it needs to likewise set CI_JOB_IMAGE to find the correct dependencies (without this patch, we don't install curl and the build fails). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03Merge branch 'ps/ci-misc-updates' into jk/ci-coverity-updateJunio C Hamano7-95/+100
* ps/ci-misc-updates: ci: remove stale code for Azure Pipelines ci: use latest Ubuntu release ci: stop special-casing for Ubuntu 16.04 gitlab-ci: add linux32 job testing against i386 gitlab-ci: remove the "linux-old" job github: simplify computation of the job's distro github: convert all Linux jobs to be containerized github: adapt containerized jobs to be rootless t7422: fix flaky test caused by buffered stdout t0060: fix EBUSY in MinGW when setting up runtime prefix
2025-01-31t/unit-tests: convert strcmp-offset test to use clar test frameworkSeyi Kuforiji4-37/+47
Adapt strcmp-offset test script to clar framework by using clar assertions where necessary. Introduce `test_strcmp_offset__empty()` to verify `check_strcmp_offset()` behavior when both input strings are empty. This ensures the function correctly handles edge cases and returns expected values. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31t/unit-tests: convert strbuf test to use clar test frameworkSeyi Kuforiji4-124/+121
Adapt strbuf test script to clar framework by using clar assertions where necessary. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31t/unit-tests: adapt example decorate test to use clar test frameworkSeyi Kuforiji4-76/+66
Introduce `test_example_decorate__initialize()` to explicitly set up object IDs and retrieve corresponding objects before tests run. This ensures a consistent and predictable test state without relying on data from previous tests. Add `test_example_decorate__cleanup()` to clear decorations after each test, preventing interference between tests and ensuring each runs in isolation. Adapt example decorate test script to clar framework by using clar assertions where necessary. Previously, tests relied on data written by earlier tests, leading to unintended dependencies between them. This explicitly initializes the necessary state within `test_example_decorate__readd`, ensuring it does not depend on prior test executions. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31t/unit-tests: convert hashmap test to use clar test frameworkSeyi Kuforiji3-116/+114
Adapts hashmap test script to clar framework by using clar assertions where necessary. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31ci: fix base commit fallback for check-whitespace and check-styleJustin Tobler1-2/+2
The check-whitespace and check-style CI scripts require a base commit. In GitLab CI, the base commit can be provided by several different predefined CI variables depending on the type of pipeline being performed. In 30c4f7e350 (check-whitespace: detect if no base_commit is provided, 2024-07-23), the GitLab check-whitespace CI job was modified to support CI_MERGE_REQUEST_DIFF_BASE_SHA as a fallback base commit if CI_MERGE_REQUEST_TARGET_BRANCH_SHA was not provided. The same fallback strategy was also implemented for the GitLab check-style CI job in bce7e52d4e (ci: run style check on GitHub and GitLab, 2024-07-23). The base commit fallback is implemented using shell parameter expansion where, if the first variable is unset, the second variable is used as fallback. In GitLab CI, these variables can be set but null. This has the unintended effect of selecting an empty first variable which results in CI jobs providing an invalid base commit and failing. Fix the issue by defaulting to the fallback variable if the first is unset or null. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31global: adapt callers to use generic hash context helpersPatrick Steinhardt19-107/+103
Adapt callers to use generic hash context helpers instead of using the hash algorithm to update them. This makes the callsites easier to reason about and removes the possibility that the wrong hash algorithm is used to update the hash context's state. And as a nice side effect this also gets rid of a bunch of users of `the_hash_algo`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31hash: provide generic wrappers to update hash contextsPatrick Steinhardt2-0/+27
The hash context is supposed to be updated via the `git_hash_algo` structure, which contains a list of function pointers to update, clone or finalize a hashing context. This requires the callers to track which algorithm was used to initialize the context and continue to use the exact same algorithm. If they fail to do that correctly, it can happen that we start to access context state of one hash algorithm with functions of a different hash algorithm. The result would typically be a segfault, as could be seen e.g. in the patches part of 98422943f0 (Merge branch 'ps/weak-sha1-for-tail-sum-fix', 2025-01-01). The situation was significantly improved starting with 04292c3796 (hash.h: drop unsafe_ function variants, 2025-01-23) and its parent commits. These refactorings ensure that it is not possible to mix up safe and unsafe variants of the same hash algorithm anymore. But in theory, it is still possible to mix up different hash algorithms with each other, even though this is a lot less likely to happen. But still, we can do better: instead of asking the caller to remember the hash algorithm used to initialize a context, we can instead make the context itself remember which algorithm it has been initialized with. If we do so, callers can use a set of generic helpers to update the context and don't need to be aware of the hash algorithm at all anymore. Adapt the context initialization functions to store the hash algorithm in the hashing context and introduce these generic helpers. Callers will be adapted in the subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31hash: stop typedeffing the hash contextPatrick Steinhardt22-74/+73
We generally avoid using `typedef` in the Git codebase. One exception though is the `git_hash_ctx`, likely because it used to be a union rather than a struct until the preceding commit refactored it. But now that it is a normal `struct` there isn't really a need for a typedef anymore. Drop the typedef and adapt all callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31hash: convert hashing context to a structurePatrick Steinhardt2-21/+22
The `git_hash_context` is a union containing the different hash-specific states for SHA1, its unsafe variant as well as SHA256. We know that only one of these states will ever be in use at the same time because hash contexts cannot be used for multiple different hashes at the same point in time. We're about to extend the structure though to keep track of the hash algorithm used to initialize the context, which is impossible to do while the context is a union. Refactor it to instead be a structure that contains the union of context states. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31Merge branch 'tb/unsafe-hash-cleanup' into ps/hash-cleanupJunio C Hamano12-70/+107
* tb/unsafe-hash-cleanup: hash.h: drop unsafe_ function variants csum-file: introduce hashfile_checkpoint_init() t/helper/test-hash.c: use unsafe_hash_algo() csum-file.c: use unsafe_hash_algo() hash.h: introduce `unsafe_hash_algo()` csum-file.c: extract algop from hashfile_checksum_valid() csum-file: store the hash algorithm as a struct field t/helper/test-tool: implement sha1-unsafe helper
2025-01-31The sixth batchJunio C Hamano1-0/+7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31Merge branch 'jc/show-index-h-update'Junio C Hamano2-2/+2
Doc and short-help text for "show-index" has been clarified to stress that the command reads its data from the standard input. * jc/show-index-h-update: show-index: the short help should say the command reads from its input
2025-01-31Merge branch 'ja/doc-notes-markup-updates'Junio C Hamano2-111/+112
Doc mark-up updates. * ja/doc-notes-markup-updates: doc: convert git-notes to new documentation format
2025-01-31Merge branch 'sk/strlen-returns-size_t'Junio C Hamano1-3/+3
Code clean-up. * sk/strlen-returns-size_t: date.c: Fix type missmatch warings from msvc
2025-01-31Merge branch 'ja/doc-restore-markup-update'Junio C Hamano1-55/+55
Doc mark-up updates. * ja/doc-restore-markup-update: doc: convert git-restore to new style format
2025-01-30setup: fix reinit of repos with incompatible GIT_DEFAULT_HASHPatrick Steinhardt2-1/+15
The exact same issue as described in the preceding commit also exists for GIT_DEFAULT_HASH. Thus, reinitializing a repository that e.g. uses SHA1 with `GIT_DEFAULT_HASH=sha256 git init` will cause the object format of that repository to change to SHA256. This is of course bogus as any existing objects and refs will not be converted, thus causing repository corruption: $ git init repo Initialized empty Git repository in /tmp/repo/.git/ $ cd repo/ $ git commit --allow-empty -m message [main (root-commit) 35a7344] message $ GIT_DEFAULT_HASH=sha256 git init Reinitialized existing Git repository in /tmp/repo/.git/ $ git show fatal: your current branch appears to be broken Fix the issue by ignoring the environment variable in case the repo has already been initialized with an object hash. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMATPatrick Steinhardt2-1/+12
The GIT_DEFAULT_REF_FORMAT environment variable can be set to influence the default ref format that new repostiories shall be initialized with. While this is the expected behaviour when creating a new repository, it is not when reinitializing a repository: we should retain the ref format currently used by it in that case. This doesn't work correctly right now: $ git init --ref-format=files repo Initialized empty Git repository in /tmp/repo/.git/ $ GIT_DEFAULT_REF_FORMAT=reftable git init repo fatal: could not open '/tmp/repo/.git/refs/heads' for writing: Is a directory Instead of retaining the current ref format, the reinitialization tries to reinitialize the repository with the different format. This action fails when git-init(1) tries to write the ".git/refs/heads" stub, which in the context of the reftable backend is always written as a file so that we can detect clients which inadvertently try to access the repo with the wrong ref format. Seems like the protection mechanism works for this case, as well. Fix the issue by ignoring the environment variable in case the repo has already been initialized with a ref storage format. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30t0001: remove duplicate testPatrick Steinhardt1-9/+0
The test in question is an exact copy of the testcase preceding it. Remove it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30apply: detect overflow when parsing hunk headerPhillip Wood2-0/+16
"git apply" uses strtoul() to parse the numbers in the hunk header but silently ignores overflows. As LONG_MAX is a legitimate return value for strtoul() we need to set errno to zero before the call to strtoul() and check that it is still zero afterwards. The error message we display is not particularly helpful as it does not say what was wrong. However, it seems pretty unlikely that users are going to trigger this error in practice and we can always improve it later if needed. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30scalar: free result of `remote_default_branch()`Patrick Steinhardt1-1/+3
We don't free the result of `remote_default_branch()`, leading to a memory leak. This leak is exposed by t9211, but only when run with Meson with the `-Db_sanitize=leak` option: Direct leak of 5 byte(s) in 1 object(s) allocated from: #0 0x5555555cfb93 in malloc (scalar+0x7bb93) #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8 #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8 #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9 #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16 #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9 #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14 #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28 #8 0x5555555d196b in cmd_main ../scalar.c:992:14 #9 0x5555557c4059 in main ../common-main.c:64:11 #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #12 0x555555592054 in _start (scalar+0x3e054) DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s). As the `branch` variable may contain a string constant obtained from parsing command line arguments we cannot free the leaking variable directly. Instead, introduce a new `branch_to_free` variable that only ever gets assigned the allocated string and free that one to plug the leak. It is unclear why the leak isn't flagged when running the test via our Makefile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-30unix-socket: fix memory leak when chdir(3p) failsPatrick Steinhardt1-1/+3
When trying to create a Unix socket in a path that exceeds the maximum socket name length we try to first change the directory into the parent folder before creating the socket to reduce the length of the name. When this fails we error out of `unix_sockaddr_init()` with an error code, which indicates to the caller that the context has not been initialized. Consequently, they don't release that context. This leads to a memory leak: when we have already populated the context with the original directory that we need to chdir(3p) back into, but then the chdir(3p) into the socket's parent directory fails, then we won't release the original directory's path. The leak is exposed by t0301, but only when running tests in a directory hierarchy whose path is long enough to make the socket name length exceed the maximum socket name length: Direct leak of 129 byte(s) in 1 object(s) allocated from: #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8 #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2 #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3 #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7 #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6 #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11 #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6 #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3 #9 0x555555700547 in run_builtin ../git.c:480:11 #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9 #11 0x5555556ffee8 in run_argv ../git.c:807:4 #12 0x5555556fee6b in cmd_main ../git.c:947:19 #13 0x55555593f689 in main ../common-main.c:64:11 #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0) #16 0x5555555ad1d4 in _start (git+0x591d4) DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s). Fix this leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29libgit: add higher-level libgit crateCalvin Wan13-8/+242
The C functions exported by libgit-sys do not provide an idiomatic Rust interface. To make it easier to use these functions via Rust, add a higher-level "libgit" crate, that wraps the lower-level configset API with an interface that is more Rust-y. This combination of $X and $X-sys crates is a common pattern for FFI in Rust, as documented in "The Cargo Book" [1]. [1] https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages Co-authored-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29libgit-sys: also export some config_set functionsJosh Steadmon3-1/+76
In preparation for implementing a higher-level Rust API for accessing Git configs, export some of the upstream configset API via libgitpub and libgit-sys. Since this will be exercised as part of the higher-level API in the next commit, no tests have been added for libgit-sys. While we're at it, add git_configset_alloc() and git_configset_free() functions in libgitpub so that callers can manage config_set structs on the heap. This also allows non-C external consumers to treat config_sets as opaque structs. Co-authored-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29The fifth batchJunio C Hamano1-0/+21
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29Merge branch 'ps/reflog-migration-with-logall-fix'Junio C Hamano2-1/+18
The "git refs migrate" command did not migrate the reflog for refs/stash, which is the contents of the stashes, which has been corrected. * ps/reflog-migration-with-logall-fix: refs: fix migration of reflogs respecting "core.logAllRefUpdates"
2025-01-29Merge branch 'am/trace2-with-valueless-true'Junio C Hamano5-6/+18
The trace2 code was not prepared to show a configuration variable that is set to true using the valueless true syntax, which has been corrected. * am/trace2-with-valueless-true: trace2: prevent segfault on config collection with valueless true
2025-01-29Merge branch 'kn/reflog-symref-fix'Junio C Hamano2-3/+9
reflog entries for symbolic ref updates were broken, which has been corrected. * kn/reflog-symref-fix: refs: fix creation of reflog entries for symrefs
2025-01-29Merge branch 'rs/ref-fitler-used-atoms-value-fix'Junio C Hamano8-72/+121
"git branch --sort=..." and "git for-each-ref --format=... --sort=..." did not work as expected with some atoms, which has been corrected. * rs/ref-fitler-used-atoms-value-fix: ref-filter: remove ref_format_clear() ref-filter: move is-base tip to used_atom ref-filter: move ahead-behind bases into used_atom
2025-01-29Merge branch 'ja/doc-commit-markup-updates'Junio C Hamano5-159/+161
Doc updates. * ja/doc-commit-markup-updates: doc: migrate git-commit manpage secondary files to new format doc: convert git commit config to new format doc: make more direct explanations in git commit options doc: the mode param of -u of git commit is optional doc: apply new documentation guidelines to git commit
2025-01-29Merge branch 'ds/path-walk-1'Junio C Hamano12-0/+1220
Introduce a new API to visit objects in batches based on a common path, or by type. * ds/path-walk-1: path-walk: drop redundant parse_tree() call path-walk: reorder object visits path-walk: mark trees and blobs as UNINTERESTING path-walk: visit tags and cached objects path-walk: allow consumer to specify object types t6601: add helper for testing path-walk API test-lib-functions: add test_cmp_sorted path-walk: introduce an object walk by path
2025-01-28libgit-sys: introduce Rust wrapper for libgit.aJosh Steadmon10-0/+263
Introduce libgit-sys, a Rust wrapper crate that allows Rust code to call functions in libgit.a. This initial patch defines build rules and an interface that exposes user agent string getter functions as a proof of concept. This library can be tested with `cargo test`. In later commits, a higher-level library containing a more Rust-friendly interface will be added at `contrib/libgit-rs`. Symbols in libgit can collide with symbols from other libraries such as libgit2. We avoid this by first exposing library symbols in public_symbol_export.[ch]. These symbols are prepended with "libgit_" to avoid collisions and set to visible using a visibility pragma. In build.rs, Rust builds contrib/libgit-rs/libgit-sys/libgitpub.a, which also contains libgit.a and other dependent libraries, with -fvisibility=hidden to hide all symbols within those libraries that haven't been exposed with a visibility pragma. Co-authored-by: Kyle Lippincott <spectral@google.com> Co-authored-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28common-main: split init and exit code into new filesJosh Steadmon6-81/+101
Currently, object files in libgit.a reference common_exit(), which is contained in common-main.o. However, common-main.o also includes main(), which references cmd_main() in git.o, which in turn depends on all the builtin/*.o objects. We would like to allow external users to link libgit.a without needing to include so many extra objects. Enable this by splitting common_exit() and check_bug_if_BUG() into a new file common-exit.c, and add common-exit.o to LIB_OBJS so that these are included in libgit.a. This split has previously been proposed ([1], [2]) to support fuzz tests and unit tests by avoiding conflicting definitions for main(). However, both of those issues were resolved by other methods of avoiding symbol conflicts. Now we are trying to make libgit.a more self-contained, so hopefully we can revisit this approach. Additionally, move the initialization code out of main() into a new init_git() function in its own file. Include this in libgit.a as well, so that external users can share our setup code without calling our main(). [1] https://lore.kernel.org/git/Yp+wjCPhqieTku3X@google.com/ [2] https://lore.kernel.org/git/20230517-unit-tests-v2-v2-1-21b5b60f4b32@google.com/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28ci: make "linux-musl" job use zlib-ngPatrick Steinhardt1-1/+1
We don't yet have any test coverage for the new zlib-ng backend as part of our CI. Add it by installing zlib-ng in Alpine Linux, which causes Meson to pick it up automatically. Note that we are somewhat limited with regards to where we run that job: Debian-based distributions don't have zlib-ng in their repositories, Fedora has it but doesn't run tests, and Alma Linux doesn't have the package either. Alpine Linux does have it available and is running our test suite, which is why it was picked. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28ci: switch linux-musl to use MesonPatrick Steinhardt7-9/+9
Switch over the "linux-musl" job to use Meson instead of Makefiles. This is done due to multiple reasons: - It simplifies our CI infrastructure a bit as we don't have to manually specify a couple of build options anymore. - It verifies that Meson detects and sets those build options automatically. - It makes it easier for us to wire up a new CI job using zlib-ng as backend. One platform compatibility that Meson cannot easily detect automatically is the `GIT_TEST_UTF8_LOCALE` variable used in tests. Wire up a build option for it, which we set via a new "MESONFLAGS" environment variable. Note that we also drop the CC variable, which is set to "gcc". We already default to GCC when CC is unset in "ci/lib.sh", so this is not needed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28compat/zlib: allow use of zlib-ng as backendPatrick Steinhardt4-15/+64
The zlib-ng library is a hard fork of the old and venerable zlib library. It describes itself as zlib replacement with optimizations for "next generation" systems. As such, it contains several implementations of central algorithms using for example SSE2, AVX2 and other vectorized CPU intrinsics that supposedly speed up in- and deflating data. And indeed, compiling Git against zlib-ng leads to a significant speedup when reading objects. The following benchmark uses git-cat-file(1) with `--batch --batch-all-objects` in the Git repository: Benchmark 1: zlib Time (mean ± σ): 52.085 s ± 0.141 s [User: 51.500 s, System: 0.456 s] Range (min … max): 52.004 s … 52.335 s 5 runs Benchmark 2: zlib-ng Time (mean ± σ): 40.324 s ± 0.134 s [User: 39.731 s, System: 0.490 s] Range (min … max): 40.135 s … 40.484 s 5 runs Summary zlib-ng ran 1.29 ± 0.01 times faster than zlib So we're looking at a ~25% speedup compared to zlib. This is of course an extreme example, as it makes us read through all objects in the repository. But regardless, it should be possible to see some sort of speedup in most commands that end up accessing the object database. The zlib-ng library provides a compatibility layer that makes it a proper drop-in replacement for zlib: nothing needs to change in the build system to support it. Unfortunately though, this mode isn't easy to use on most systems because distributions do not allow you to install zlib-ng in that way, as that would mean that the zlib library would be globally replaced. Instead, many distributions provide a package that installs zlib-ng without the compatibility layer. This version does provide effectively the same APIs like zlib does, but all of the symbols are prefixed with `zng_` to avoid symbol collisions. Implement a new build option that allows us to link against zlib-ng directly. If set, we redefine zlib symbols so that we use the `zng_` prefixed versions thereof provided by that library. Like this, it becomes possible to install both zlib and zlib-ng (without the compat layer) and then pick whichever library one wants to link against for Git. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28git-zlib: cast away potential constness of `next_in` pointerPatrick Steinhardt1-1/+2
The `struct git_zstream::next_in` variable points to the input data and is used in combination with `struct z_stream::next_in`. While that latter field is not marked as a constant in zlib, it is marked as such in zlib-ng. This causes a couple of compiler errors when we try to assign these fields to one another due to mismatching constness. Fix the issue by casting away the potential constness of `next_in`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28compat/zlib: provide stubs for `deflateSetHeader()`Patrick Steinhardt2-4/+19
The function `deflateSetHeader()` has been introduced with zlib v1.2.2.1, so we don't use it when linking against an older version of it. Refactor the code to instead provide a central stub via "compat/zlib.h" so that we can adapt it based on whether or not we use zlib-ng in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28compat/zlib: provide `deflateBound()` shim centrallyPatrick Steinhardt2-4/+4
The `deflateBound()` function has only been introduced with zlib 1.2.0. When linking against a zlib version older than that we thus provide our own compatibility shim. Move this shim into "compat/zlib.h" so that we can adapt it based on whether or not we use zlib-ng in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"Patrick Steinhardt8-4/+8
We include "compat/zlib.h" in "git-compat-util.h", which is unnecessarily broad given that we only have a small handful of files that use the zlib library. Move the header into "git-zlib.h" instead and adapt users of zlib to include that header. One exception is the reftable library, as we don't want to use the Git-specific wrapper of zlib there, so we include "compat/zlib.h" instead. Furthermore, we move the include into "reftable/system.h" so that users of the library other than Git can wire up zlib themselves. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28compat: introduce new "zlib.h" headerPatrick Steinhardt3-2/+8
Introduce a new "compat/zlib-compat.h" header that we include instead of including <zlib.h> directly. This will allow us to wire up zlib-ng as an alternative backend for zlib compression in a subsequent commit. Note that we cannot just call the file "compat/zlib.h", as that may otherwise cause us to include that file instead of <zlib.h>. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28git-compat-util: drop `z_const` definePatrick Steinhardt1-1/+0
Before including <zlib.h> we explicitly define `z_const` to an empty value. This has the effect that the `z_const` macro in "zconf.h" itself will remain empty instead of being defined as `const`, which effectively adapts a couple of APIs so that their parameters are not marked as being constants. It is dubious though whether this is something we actually want: not marking a parameter as a constant doesn't make it any less constant than it was. The define was added via 07564773c2 (compat: auto-detect if zlib has uncompress2(), 2022-01-24), where it was seemingly carried over from our internal compatibility shim for `uncompress2()` that was removed in the preceding commit. The commit message doesn't mention why we carry over the define and make it public, either, and I cannot think of any reason for why we would want to have it. Drop the define. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28compat: drop `uncompress2()` compatibility shimPatrick Steinhardt4-107/+0
Our compat library has an implementation of zlib's `uncompress2()` function that gets used when linking against an old version of zlib that doesn't yet have it. The last user of `uncompress2()` got removed in 15a60b747e (reftable/block: open-code call to `uncompress2()`, 2024-04-08), so the compatibility code is not required anymore. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28The fourth batchJunio C Hamano1-0/+20
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28Merge branch 'jp/t8002-printf-fix'Junio C Hamano1-2/+2
Test fix. * jp/t8002-printf-fix: t8002: fix ambiguous printf conversion specifications
2025-01-28Merge branch 'ps/reftable-sign-compare'Junio C Hamano19-151/+157
The reftable/ library code has been made -Wsign-compare clean. * ps/reftable-sign-compare: reftable: address trivial -Wsign-compare warnings reftable/blocksource: adjust `read_block()` to return `ssize_t` reftable/blocksource: adjust type of the block length reftable/block: adjust type of the restart length reftable/block: adapt header and footer size to return a `size_t` reftable/basics: adjust `hash_size()` to return `uint32_t` reftable/basics: adjust `common_prefix_size()` to return `size_t` reftable/record: handle overflows when decoding varints reftable/record: drop unused `print` function pointer meson: stop disabling -Wsign-compare
2025-01-28Merge branch 'mh/credential-cache-authtype-request-fix'Junio C Hamano2-2/+17
The "cache" credential back-end did not handle authtype correctly, which has been corrected. * mh/credential-cache-authtype-request-fix: credential-cache: respect authtype capability
2025-01-28Merge branch 'jk/pack-header-parse-alignment-fix'Junio C Hamano6-50/+64
It was possible for "git unpack-objects" and "git index-pack" to make an unaligned access, which has been corrected. * jk/pack-header-parse-alignment-fix: index-pack, unpack-objects: use skip_prefix to avoid magic number index-pack, unpack-objects: use get_be32() for reading pack header parse_pack_header_option(): avoid unaligned memory writes packfile: factor out --pack_header argument parsing bswap.h: squelch potential sparse -Wcast-truncate warnings
2025-01-28Merge branch 'ps/build-meson-subtree'Junio C Hamano6-10/+95
The meson-driven build is now aware of "git-subtree" housed in contrib/subtree hierarchy. * ps/build-meson-subtree: meson: wire up the git-subtree(1) command meson: introduce build option for contrib contrib/subtree: fix building docs
2025-01-28Merge branch 'mh/connect-sign-compare'Junio C Hamano1-12/+11
The code in connect.c has been updated to work around complaints from -Wsign-compare. * mh/connect-sign-compare: connect: address -Wsign-compare warnings
2025-01-28Merge branch 'sk/unit-tests'Junio C Hamano8-147/+137
Move a few more unit tests to the clar test framework. * sk/unit-tests: t/unit-tests: convert reftable tree test to use clar test framework t/unit-tests: adapt priority queue test to use clar test framework t/unit-tests: convert mem-pool test to use clar test framework t/unit-tests: handle dashes in test suite filenames
2025-01-28Merge branch 'jc/show-usage-help'Junio C Hamano41-64/+121
The help text from "git $cmd -h" appear on the standard output for some $cmd and the standard error for others. The built-in commands have been fixed to show them on the standard output consistently. * jc/show-usage-help: builtin: send usage() help text to standard output oddballs: send usage() help text to standard output builtins: send usage_with_options() help text to standard output usage: add show_usage_if_asked() parse-options: add show_usage_with_options_if_asked() t0012: optionally check that "-h" output goes to stdout
2025-01-27pack-objects: prevent name hash version changeDerrick Stolee1-0/+8
When the --name-hash-version option is used in 'git pack-objects', it can change from the initial assignment to when it is used based on interactions with other arguments. Specifically, when writing or reading bitmaps, we must force version 1 for now. This could change in the future when the bitmap format can store a name hash version value, indicating which was used during the writing of the packfile. Protect the 'git pack-objects' process from getting confused by failing with a BUG() statement if the value of the name hash version changes between calls to pack_name_hash_fn(). Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27test-tool: add helper for name-hash valuesDerrick Stolee6-0/+87
Add a new test-tool helper, name-hash, to output the value of the name-hash algorithms for the input list of strings, one per line. Since the name-hash values can be stored in the .bitmap files, it is important that these hash functions do not change across Git versions. Add a simple test to t5310-pack-bitmaps.sh to provide some testing of the current values. Due to how these functions are implemented, it would be difficult to change them without disturbing these values. The paths used for this test are carefully selected to demonstrate some of the behavior differences of the two current name hash versions, including which conditions will cause them to collide. Create a performance test that uses test_size to demonstrate how collisions occur for these hash algorithms. This test helps inform someone as to the behavior of the name-hash algorithms for their repo based on the paths at HEAD. My copy of the Git repository shows modest statistics around the collisions of the default name-hash algorithm: Test this tree -------------------------------------------------- 5314.1: paths at head 4.5K 5314.2: distinct hash value: v1 4.1K 5314.3: maximum multiplicity: v1 13 5314.4: distinct hash value: v2 4.2K 5314.5: maximum multiplicity: v2 9 Here, the maximum collision multiplicity is 13, but around 10% of paths have a collision with another path. In a more interesting example, the microsoft/fluentui [1] repo had these statistics at time of committing: Test this tree -------------------------------------------------- 5314.1: paths at head 19.5K 5314.2: distinct hash value: v1 8.2K 5314.3: maximum multiplicity: v1 279 5314.4: distinct hash value: v2 17.8K 5314.5: maximum multiplicity: v2 44 [1] https://github.com/microsoft/fluentui That demonstrates that of the nearly twenty thousand path names, they are assigned around eight thousand distinct values. 279 paths are assigned to a single value, leading the packing algorithm to sort objects from those paths together, by size. With the v2 name hash function, the maximum multiplicity lowers to 44, leaving some room for further improvement. In a more extreme example, an internal monorepo had a much worse collision rate: Test this tree -------------------------------------------------- 5314.1: paths at head 227.3K 5314.2: distinct hash value: v1 72.3K 5314.3: maximum multiplicity: v1 14.4K 5314.4: distinct hash value: v2 166.5K 5314.5: maximum multiplicity: v2 138 Here, we can see that the v2 name hash function provides somem improvements, but there are still a number of collisions that could lead to repacking problems at this scale. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27p5313: add size comparison testDerrick Stolee1-0/+70
As custom options are added to 'git pack-objects' and 'git repack' to adjust how compression is done, use this new performance test script to demonstrate their effectiveness in performance and size. The recently-added --name-hash-version option allows for testing different name hash functions. Version 2 intends to preserve some of the locality of version 1 while more often breaking collisions due to long filenames. Distinguishing objects by more of the path is critical when there are many name hash collisions and several versions of the same path in the full history, giving a significant boost to the full repack case. The locality of the hash function is critical to compressing something like a shallow clone or a thin pack representing a push of a single commit. This can be seen by running pt5313 on the open source fluentui repository [1]. Most commits will have this kind of output for the thin and big pack cases, though certain commits (such as [2]) will have problematic thin pack size for other reasons. [1] https://github.com/microsoft/fluentui [2] a637a06df05360ce5ff21420803f64608226a875 Checked out at the parent of [2], I see the following statistics: Test HEAD --------------------------------------------------------------- 5313.2: thin pack with version 1 0.37(0.44+0.02) 5313.3: thin pack size with version 1 1.2M 5313.4: big pack with version 1 2.04(7.77+0.23) 5313.5: big pack size with version 1 20.4M 5313.6: shallow fetch pack with version 1 1.41(2.94+0.11) 5313.7: shallow pack size with version 1 34.4M 5313.8: repack with version 1 95.70(676.41+2.87) 5313.9: repack size with version 1 439.3M 5313.10: thin pack with version 2 0.12(0.12+0.06) 5313.11: thin pack size with version 2 22.0K 5313.12: big pack with version 2 2.80(5.43+0.34) 5313.13: big pack size with version 2 25.9M 5313.14: shallow fetch pack with version 2 1.77(2.80+0.19) 5313.15: shallow pack size with version 2 33.7M 5313.16: repack with version 2 33.68(139.52+2.58) 5313.17: repack size with version 2 160.5M To make comparisons easier, I will reformat this output into a different table style: | Test | V1 Time | V2 Time | V1 Size | V2 Size | |--------------|---------|---------|---------|---------| | Thin Pack | 0.37 s | 0.12 s | 1.2 M | 22.0 K | | Big Pack | 2.04 s | 2.80 s | 20.4 M | 25.9 M | | Shallow Pack | 1.41 s | 1.77 s | 34.4 M | 33.7 M | | Repack | 95.70 s | 33.68 s | 439.3 M | 160.5 M | The v2 hash function successfully differentiates the CHANGELOG.md files from each other, which leads to significant improvements in the thin pack (simulating a push of this commit) and the full repack. There is some bloat in the "big pack" scenario and essentially the same results for the shallow pack. In the case of the Git repository, these numbers show some of the issues with this approach: | Test | V1 Time | V2 Time | V1 Size | V2 Size | |--------------|---------|---------|---------|---------| | Thin Pack | 0.02 s | 0.02 s | 1.1 K | 1.1 K | | Big Pack | 1.69 s | 1.95 s | 13.5 M | 14.5 M | | Shallow Pack | 1.26 s | 1.29 s | 12.0 M | 12.2 M | | Repack | 29.51 s | 29.01 s | 237.7 M | 238.2 M | Here, the attempts to remove conflicts in the v2 function seem to cause slight bloat to these sizes. This shows that the Git repository benefits a lot from cross-path delta pairs. The results are similar with the nodejs/node repo: | Test | V1 Time | V2 Time | V1 Size | V2 Size | |--------------|---------|---------|---------|---------| | Thin Pack | 0.02 s | 0.02 s | 1.6 K | 1.6 K | | Big Pack | 4.61 s | 3.26 s | 56.0 M | 52.8 M | | Shallow Pack | 7.82 s | 7.51 s | 104.6 M | 107.0 M | | Repack | 88.90 s | 73.75 s | 740.1 M | 764.5 M | Here, the v2 name-hash causes some size bloat more often than it reduces the size, but it also universally improves performance time, which is an interesting reversal. This must mean that it is helping to short-circuit some delta computations even if it is not finding the most efficient ones. The performance improvement cannot be explained only due to the I/O cost of writing the resulting packfile. The Linux kernel repository was the initial target of the default name hash value, and its naming conventions are practically build to take the most advantage of the default name hash values: | Test | V1 Time | V2 Time | V1 Size | V2 Size | |--------------|----------|----------|---------|---------| | Thin Pack | 0.17 s | 0.07 s | 4.6 K | 4.6 K | | Big Pack | 17.88 s | 12.35 s | 201.1 M | 159.1 M | | Shallow Pack | 11.05 s | 22.94 s | 269.2 M | 273.8 M | | Repack | 727.39 s | 566.95 s | 2.5 G | 2.5 G | Here, the thin and big packs gain some performance boosts in time, with a modest gain in the size of the big pack. The shallow pack, however, is more expensive to compute, likely because similarly-named files across different directories are farther apart in the name hash ordering in v2. The repack also gains benefits in computation time but no meaningful change to the full size. Finally, an internal Javascript repo of moderate size shows significant gains when repacking with --name-hash-version=2 due to it having many name hash collisions. However, it's worth noting that only the full repack case has significant differences from the v1 name hash: | Test | V1 Time | V2 Time | V1 Size | V2 Size | |-----------|-----------|----------|---------|---------| | Thin Pack | 8.28 s | 7.28 s | 16.8 K | 16.8 K | | Big Pack | 12.81 s | 11.66 s | 29.1 M | 29.1 M | | Shallow | 4.86 s | 4.06 s | 42.5 M | 44.1 M | | Repack | 3126.50 s | 496.33 s | 6.2 G | 855.6 M | Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27pack-objects: add GIT_TEST_NAME_HASH_VERSIONDerrick Stolee9-10/+41
Add a new environment variable to opt-in to different values of the --name-hash-version=<n> option in 'git pack-objects'. This allows for extra testing of the feature without repeating all of the test scenarios. Unlike many GIT_TEST_* variables, we are choosing to not add this to the linux-TEST-vars CI build as that test run is already overloaded. The behavior exposed by this test variable is of low risk and should be sufficient to allow manual testing when an issue arises. But this option isn't free. There are a few tests that change behavior with the variable enabled. First, there are a few tests that are very sensitive to certain delta bases being picked. These are both involving the generation of thin bundles and then counting their objects via 'git index-pack --fix-thin' which pulls the delta base into the new packfile. For these tests, disable the option as a decent long-term option. Second, there are some tests that compare the exact output of a 'git pack-objects' process when using bitmaps. The warning that ignores the --name-hash-version=2 and forces version 1 causes these tests to fail. Disable the environment variable to get around this issue. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27repack: add --name-hash-version optionDerrick Stolee5-3/+48
The new '--name-hash-version' option for 'git repack' is a simple pass-through to the underlying 'git pack-objects' subcommand. However, this subcommand may have other options and a temporary filename as part of the subcommand execution that may not be predictable or could change over time. The existing test_subcommand method requires an exact list of arguments for the subcommand. This is too rigid for our needs here, so create a new method, test_subcommand_flex. Use it to check that the --name-hash-version option is passing through. Since we are modifying the 'git repack' command, let's bring its usage in line with the Documentation's synopsis. This removes it from the allow list in t0450 so it will remain in sync in the future. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27pack-objects: add --name-hash-version optionDerrick Stolee3-6/+109
The previous change introduced a new pack_name_hash_v2() function that intends to satisfy much of the hash locality features of the existing pack_name_hash() function while also distinguishing paths with similar final components of their paths. This change adds a new --name-hash-version option for 'git pack-objects' to allow users to select their preferred function version. This use of an integer version allows for future expansion and a direct way to later store a name hash version in the .bitmap format. For now, let's consider how effective this mechanism is when repacking a repository with different name hash versions. Specifically, we will execute 'git pack-objects' the same way a 'git repack -adf' process would, except we include --name-hash-version=<n> for testing. On the Git repository, we do not expect much difference. All path names are short. This is backed by our results: | Stage | Pack Size | Repack Time | |-----------------------|-----------|-------------| | After clone | 260 MB | N/A | | --name-hash-version=1 | 127 MB | 129s | | --name-hash-version=2 | 127 MB | 112s | This example demonstrates how there is some natural overhead coming from the cloned copy because the server is hosting many forks and has not optimized for exactly this set of reachable objects. But the full repack has similar characteristics for both versions. Let's consider some repositories that are hitting too many collisions with version 1. First, let's explore the kinds of paths that are commonly causing these collisions: * "/CHANGELOG.json" is 15 characters, and is created by the beachball [1] tool. Only the final character of the parent directory can differentiate different versions of this file, but also only the two most-significant digits. If that character is a letter, then this is always a collision. Similar issues occur with the similar "/CHANGELOG.md" path, though there is more opportunity for differences In the parent directory. * Localization files frequently have common filenames but differentiates via parent directories. In C#, the name "/strings.resx.lcl" is used for these localization files and they will all collide in name-hash. [1] https://github.com/microsoft/beachball I've come across many other examples where some internal tool uses a common name across multiple directories and is causing Git to repack poorly due to name-hash collisions. One open-source example is the fluentui [2] repo, which uses beachball to generate CHANGELOG.json and CHANGELOG.md files, and these files have very poor delta characteristics when comparing against versions across parent directories. | Stage | Pack Size | Repack Time | |-----------------------|-----------|-------------| | After clone | 694 MB | N/A | | --name-hash-version=1 | 438 MB | 728s | | --name-hash-version=2 | 168 MB | 142s | [2] https://github.com/microsoft/fluentui In this example, we see significant gains in the compressed packfile size as well as the time taken to compute the packfile. Using a collection of repositories that use the beachball tool, I was able to make similar comparisions with dramatic results. While the fluentui repo is public, the others are private so cannot be shared for reproduction. The results are so significant that I find it important to share here: | Repo | --name-hash-version=1 | --name-hash-version=2 | |----------|-----------------------|-----------------------| | fluentui | 440 MB | 161 MB | | Repo B | 6,248 MB | 856 MB | | Repo C | 37,278 MB | 6,755 MB | | Repo D | 131,204 MB | 7,463 MB | Future changes could include making --name-hash-version implied by a config value or even implied by default during a full repack. It is important to point out that the name hash value is stored in the .bitmap file format, so we must force --name-hash-version=1 when bitmaps are being read or written. Later, the bitmap format could be updated to be aware of the name hash version so deltas can be quickly computed across the bitmapped/not-bitmapped boundary. To promote the safety of this parameter, the validate_name_hash_version() method will die() if the given name-hash version is incorrect and will disable newer versions if not yet compatible with other features, such as --write-bitmap-index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27pack-objects: create new name-hash function versionJonathan Tan1-0/+28
As we will explore in later changes, the default name-hash function used in 'git pack-objects' has a tendency to cause collisions and cause poor delta selection. This change creates an alternative that avoids some collisions while preserving some amount of hash locality. The pack_name_hash() method has not been materially changed since it was introduced in ce0bd64 (pack-objects: improve path grouping heuristics., 2006-06-05). The intention here is to group objects by path name, but also attempt to group similar file types together by making the most-significant digits of the hash be focused on the final characters. Here's the crux of the implementation: /* * This effectively just creates a sortable number from the * last sixteen non-whitespace characters. Last characters * count "most", so things that end in ".c" sort together. */ while ((c = *name++) != 0) { if (isspace(c)) continue; hash = (hash >> 2) + (c << 24); } As the comment mentions, this only cares about the last sixteen non-whitespace characters. This cause some filenames to collide more than others. This collision is somewhat by design in order to promote hash locality for files that have similar types (.c, .h, .json) or could be the same file across a directory rename (a/foo.txt to b/foo.txt). This leads to decent cross-path deltas in cases like shallow clones or packing a repository with very few historical versions of files that share common data with other similarly-named files. However, when the name-hash instead leads to a large number of name-hash collisions for otherwise unrelated files, this can lead to confusing the delta calculation to prefer cross-path deltas over previous versions of the same file. The new pack_name_hash_v2() function attempts to fix this issue by taking more of the directory path into account through its hash function. Its naming implies that we will later wire up details for choosing a name-hash function by version. The first change is to be more careful about paths using non-ASCII characters. With these characters in mind, reverse the bits in the byte as the least-significant bits have the highest entropy and we want to maximize their influence. This is done with some bit manipulation that swaps the two halves, then the quarters within those halves, and then the bits within those quarters. The second change is to perform hash composition operations at every level of the path. This is done by storing a 'base' hash value that contains the hash of the parent directory. When reaching a directory boundary, we XOR the current level's name-hash value with a downshift of the previous level's hash. This perturbation intends to create low-bit distinctions for paths with the same final 16 bytes but distinct parent directory structures. The collision rate and effectiveness of this hash function will be explored in later changes as the function is integrated with 'git pack-objects' and 'git repack'. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27refs/reftable: fix uninitialized memory access of `max_index`Karthik Nayak1-3/+3
When migrating reflogs between reference backends, maintaining the original order of the reflog entries is crucial. To achieve this, an `index` field is stored within the `ref_update` struct that encodes the relative order of reflog entries. This field is used by the reftable backend as update index for the respective reflog entries to maintain that ordering. These update indices must be respected when writing table headers, which encode the minimum and maximum update index of contained records in the header and footer. This logic was added in commit bc67b4ab5f (reftable: write correct max_update_index to header, 2025-01-15), which started to use `reftable_writer_set_limits()` to propagate the mininum and maximum update index of all records contained in a ref transaction. However, we only set the maximum update index for the first transaction argument, even though there can be multiple such arguments. This is the case when we write to multiple stacks in a single transaction, e.g. when updating references in two different worktrees at once. Consequently, the update index for all but the first argument remain uninitialized, which may cause undefined behaviour. Fix this by moving the assignment of the maximum update index in `reftable_be_transaction_finish()` inside the loop, which ensures that all elements of the array are correctly initialized. Furthermore, initialize the `max_index` field to 0 when queueing a new transaction argument. This is not strictly necessary, as all elements of `write_transaction_table_arg.max_index` are now assigned correctly. However, this initialization is added for consistency and to safeguard against potential future changes that might inadvertently introduce uninitialized memory access. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-25bundle: avoid closing file descriptor twiceJohannes Schindelin3-1/+6
Already when introduced in c7a8a16239 (Add bundle transport, 2007-09-10), the `bundle` transport had a bug where it would open a file descriptor to the bundle file and then close it _twice_: First, the file descriptor (`data->fd`) is passed to `unbundle()`, which would use it as the `stdin` of the `index-pack` process, which as a consequence would close it via `start_command()`. However, `data->fd` would still hold the numerical value of the file descriptor, and `close_bundle()` would see that and happily close it again. This seems not to have caused too many problems in almost two decades, but I encountered a situation today where it _does_ cause problems: In i686 variants of Git for Windows, it seems that file descriptors are reused quickly after they have been closed. In the particular scenario I faced, `git fetch <bundle> <ref>` gets the same file descriptor value when opening the bundle file and importing its embedded packfile (which implicitly closes the file descriptor) and then when opening a pack file in `fetch_and_consume_refs()` while looking up an object's header. Later on, after the bundle has been imported (and the `close_bundle()` function erroneously closes the file descriptor that has _already_ been closed when using it as `stdin` for `git index-pack`), the same file descriptor value has now been reused via `use_pack()`. Now, when either the recursive fetch (which defaults to "on", unfortunately) or a commit-graph update needs to `mmap()` the packfile, it fails due to a now-invalid file descriptor that _should_ point to the pack file but doesn't anymore. To fix that, let's invalidate `data->fd` after calling `unbundle()`. That way, `close_bundle()` does not close a file descriptor that may have been reused for something different. While at it, document that `unbundle()` closes the file descriptor, and ensure that it also does that when failing to verify the bundle. Luckily, this bug does not affect the bundle URI feature, it only affects the `git fetch <bundle>` code path. Note that this patch does not _completely_ clarifies who is responsible to close that file descriptor, as `run_command()` may fail _without_ closing `cmd->in`. Addressing this issue thoroughly, however, would require a rather thorough re-design of the `start_command()` and `finish_command()` functionality to make it a lot less murky who is responsible for what file descriptors. At least this here patch is relatively easy to reason about, and addresses a hard failure (`fatal: mmap: could not determine filesize`) at the expense of leaking a file descriptor under very rare circumstances in which `git fetch` would error out anyway. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24gc: add `--expire-to` optionZheNing Hu3-2/+47
This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. Previously, this feature was implemented in 91badeba32 (builtin/repack.c: implement `--expire-to` for storing pruned objects, 2022-10-24), which allowing users to specify a directory where unreachable and expired cruft packs are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve similar results within `git gc`. By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned. Due to the original `git gc --prune=now` deleting all unreachable objects by passing the `-a` parameter to git repack. With the addition of the `--cruft` and `--expire-to` options, it is necessary to modify this default behavior: instead of deleting these unreachable objects, they should be merged into a cruft pack and collected in a specified directory. Therefore, we do not pass `-a` to the repack command but instead pass `--cruft`, `--expire-to`, and `--cruft-expiration=now` to repack. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24config.txt: add trailer.* variablesJulian Prein3-135/+140
The trailer.* configuration variables are currently only described in git-interpret-trailers(1) but affect git-commit and git-tag as well. Move that section into its own config/trailer.txt file and also include it in git-config(1). Signed-off-by: Julian Prein <julian@druckdev.xyz> Acked-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-24remote: announce removal of "branches/" and "remotes/"Patrick Steinhardt9-43/+99
Back when Git was in its infancy, remotes were configured via separate files in "branches/" (back in 2005). This mechanism was replaced later that year with the "remotes/" directory. Both mechanisms have eventually been replaced by config-based remotes, and it is very unlikely that anybody still uses these directories to configure their remotes. Both of these directories have been marked as deprecated, one in 2005 and the other one in 2011. Follow through with the deprecation and finally announce the removal of these features in Git 3.0. Signed-off-by: Patrick Steinhardt <ps@pks.im> [jc: with a small tweak to the help message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23The third batchJunio C Hamano1-0/+22
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23Merge branch 'jc/cli-doc-option-and-config'Junio C Hamano1-0/+17
Doc update. * jc/cli-doc-option-and-config: gitcli: document that command line trumps config and env
2025-01-23Merge branch 'mh/doc-credential-helpers-with-pat'Junio C Hamano2-12/+46
Document that it is insecure to use Personal Access Tokens, which some hosting providers take as username/password, embedded in URLs. * mh/doc-credential-helpers-with-pat: docs: discuss caching personal access tokens docs: list popular credential helpers
2025-01-23Merge branch 'ak/instaweb-python-port-binding-fix'Junio C Hamano1-2/+2
The "instaweb" bound only to local IP address without "--local" and to all addresses with "--local", which was the other way around, when using Python's http.server class, which has been corrected. * ak/instaweb-python-port-binding-fix: instaweb: fix ip binding for the python http.server
2025-01-23Merge branch 'sj/meson-doc-technical-dependency-fix'Junio C Hamano1-0/+1
The meson build procedure for Documentation/technical/ hierarchy was missing necessary dependencies, which has been corrected. * sj/meson-doc-technical-dependency-fix: meson: fix missing deps for technical articles
2025-01-23Merge branch 'tc/meson-use-our-version-def-h'Junio C Hamano2-2/+9
The meson build procedure looked for the 'version-def.h' file in a wrong directory, which has been corrected. * tc/meson-use-our-version-def-h: meson: ensure correct version-def.h is used
2025-01-23Merge branch 'en/object-name-with-funny-refname-fix'Junio C Hamano3-5/+113
Extended SHA-1 expression parser did not work well when a branch with an unusual name (e.g. "foo{bar") is involved. * en/object-name-with-funny-refname-fix: object-name: be more strict in parsing describe-like output object-name: fix resolution of object names containing curly braces
2025-01-23hash.h: drop unsafe_ function variantsTaylor Blau2-30/+0
Now that all callers have been converted from: the_hash_algo->unsafe_init_fn(); to unsafe_hash_algo(the_hash_algo)->init_fn(); and similar, we can remove the scaffolding for the unsafe_ function variants and force callers to use the new unsafe_hash_algo() mechanic instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23csum-file: introduce hashfile_checkpoint_init()Taylor Blau4-4/+15
In 106140a99f (builtin/fast-import: fix segfault with unsafe SHA1 backend, 2024-12-30) and 9218c0bfe1 (bulk-checkin: fix segfault with unsafe SHA1 backend, 2024-12-30), we observed the effects of failing to initialize a hashfile_checkpoint with the same hash function implementation as is used by the hashfile it is used to checkpoint. While both 106140a99f and 9218c0bfe1 work around the immediate crash, changing the hash function implementation within the hashfile API to, for example, the non-unsafe variant would re-introduce the crash. This is a result of the tight coupling between initializing hashfiles and hashfile_checkpoints. Introduce and use a new function which ensures that both parts of a hashfile and hashfile_checkpoint pair use the same hash function implementation to avoid such crashes. A few things worth noting: - In the change to builtin/fast-import.c::stream_blob(), we can see that by removing the explicit reference to 'the_hash_algo->unsafe_init_fn()', we are hardened against the hashfile API changing away from the_hash_algo (or its unsafe variant) in the future. - The bulk-checkin code no longer needs to explicitly zero-initialize the hashfile_checkpoint, since it is now done as a result of calling 'hashfile_checkpoint_init()'. - Also in the bulk-checkin code, we add an additional call to prepare_to_stream() outside of the main loop in order to initialize 'state->f' so we know which hash function implementation to use when calling 'hashfile_checkpoint_init()'. This is OK, since subsequent 'prepare_to_stream()' calls are noops. However, we only need to call 'prepare_to_stream()' when we have the HASH_WRITE_OBJECT bit set in our flags. Without that bit, calling 'prepare_to_stream()' does not assign 'state->f', so we have nothing to initialize. - Other uses of the 'checkpoint' in 'deflate_blob_to_pack()' are appropriately guarded. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23t/helper/test-hash.c: use unsafe_hash_algo()Taylor Blau1-12/+5
Remove a series of conditionals within the shared cmd_hash_impl() helper that powers the 'sha1' and 'sha1-unsafe' helpers. Instead, replace them with a single conditional that transforms the specified hash algorithm into its unsafe variant. Then all subsequent calls can directly use whatever function it wants to call without having to decide between the safe and unsafe variants. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23csum-file.c: use unsafe_hash_algo()Taylor Blau1-11/+11
Instead of calling the unsafe_ hash function variants directly, make use of the shared 'algop' pointer by initializing it to: f->algop = unsafe_hash_algo(the_hash_algo); , thus making all calls use the unsafe variants directly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23hash.h: introduce `unsafe_hash_algo()`Taylor Blau2-1/+38
In 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants, 2024-09-26), we introduced "unsafe" variants of the SHA-1 hashing functions by introducing new functions like "unsafe_init_fn()" and so on. This approach has a major shortcoming that callers must remember to consistently use one variant or the other. Failing to consistently use (or not use) the unsafe variants can lead to crashes at best, or subtle memory corruption issues at worst. In the hashfile API, this isn't difficult to achieve, but verifying that all callers consistently use the unsafe variants is somewhat of a chore given how spread out all of the callers are. In the sha1 and sha1-unsafe test helpers, all of the calls to various hash functions are guarded by an "if (unsafe)" conditional, which is repetitive and cumbersome. Address these issues by introducing a new pattern whereby one 'git_hash_algo' can return a pointer to another 'git_hash_algo' that represents the unsafe version of itself. So instead of having something like: if (unsafe) the_hash_algo->init_fn(...); the_hash_algo->update_fn(...); the_hash_algo->final_fn(...); else the_hash_algo->unsafe_init_fn(...); the_hash_algo->unsafe_update_fn(...); the_hash_algo->unsafe_final_fn(...); we can instead write: struct git_hash_algo *algop = the_hash_algo; if (unsafe) algop = unsafe_hash_algo(algop); algop->init_fn(...); algop->update_fn(...); algop->final_fn(...); This removes the existing shortcoming by no longer forcing the caller to "remember" which variant of the hash functions it wants to call, only to hold onto a 'struct git_hash_algo' pointer that is initialized once. Similarly, while there currently is still a way to "mix" safe and unsafe functions, this too will go away after subsequent commits remove all direct calls to the unsafe_ variants. Note that hash_algo_by_ptr() needs an adjustment to allow passing in the unsafe variant of a hash function. All other query functions on the hash_algos array will continue to return the safe variants of any function. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23csum-file.c: extract algop from hashfile_checksum_valid()Taylor Blau1-6/+7
Perform a similar transformation as in the previous commit, but focused instead on hashfile_checksum_valid(). This function does not work with a hashfile structure itself, and instead validates the raw contents of a file written using the hashfile API. We'll want to be prepared for a similar change to this function in the future, so prepare ourselves for that by extracting 'the_hash_algo' into its own field for use within this function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23csum-file: store the hash algorithm as a struct fieldTaylor Blau2-9/+12
Throughout the hashfile API, we rely on a reference to 'the_hash_algo', and call its _unsafe function variants directly. Prepare for a future change where we may use a different 'git_hash_algo' pointer (instead of just relying on 'the_hash_algo' throughout) by making the 'git_hash_algo' pointer a member of the 'hashfile' structure itself. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23t/helper/test-tool: implement sha1-unsafe helperTaylor Blau6-23/+45
With the new "unsafe" SHA-1 build knob, it is convenient to have a test-tool that can exercise Git's unsafe SHA-1 wrappers for testing, similar to 't/helper/test-tool sha1'. Implement that helper by altering the implementation of that test-tool (in cmd_hash_impl(), which is generic and parameterized over different hash functions) to conditionally run the unsafe variants of the chosen hash function, and expose the new behavior via a new 'sha1-unsafe' test helper. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23trace2: prevent segfault on config collection with valueless trueAdam Murray5-6/+18
When TRACE2 analytics is enabled, a configuration variable set to "valueless true" causes a segfault. Steps to Reproduce GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.* git -c status.relativePaths version Expected Result git version 2.46.0 Actual Result zsh: segmentation fault GIT_TRACE2=true Add checks to prevent the segfault and instead show that the variable without value. Signed-off-by: Adam Murray <ad@canva.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-23refs: fix creation of reflog entries for symrefsKarthik Nayak2-3/+9
The commit 297c09eabb (refs: allow multiple reflog entries for the same refname, 2024-12-16) added logic to exit early in `lock_ref_for_update()` after obtaining the required lock. This was added as a performance optimization on a false assumption that no further processing was required for reflog-only updates. However the assumption was wrong. For a symref's reflog entry, the update needs to be populated with the old_oid value, but the early exit skipped this necessary step. This caused a bug in Git 2.48 in the files backend where target references of symrefs being updated would create a corrupted reflog entry for the symref since the old_oid is not populated. Everything the early exit skipped in the code path is necessary for both regular and symbolic ref, so eliminate the mistaken optimization, and also add a test to ensure that such an issue doesn't arise in the future. Reported-by: Nika Layzell <nika@thelayzells.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22path-walk: drop redundant parse_tree() callJeff King1-1/+0
This call to parse_tree() was flagged by Coverity for ignoring the return value. But if we look a little further up the function, we can see that there is already a call to parse_tree_gently(), and we'll return early if that fails. So by this point the tree will always be parsed, and the call is redundant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22Merge branch 'ps/build-meson-fixes' into ps/zlib-ngJunio C Hamano7-38/+218
* ps/build-meson-fixes: ci: wire up Visual Studio build with Meson ci: raise error when Meson generates warnings meson: fix compilation with Visual Studio meson: make the CSPRNG backend configurable meson: wire up fuzzers meson: wire up generation of distribution archive meson: wire up development environments meson: fix dependencies for generated headers meson: populate project version via GIT-VERSION-GEN GIT-VERSION-GEN: allow running without input and output files GIT-VERSION-GEN: simplify computing the dirty marker
2025-01-22ci: wire up Visual Studio build with MesonPatrick Steinhardt2-0/+90
Add a new job to GitHub Actions and GitLab CI that builds and tests Meson-based builds with Visual Studio. A couple notes: - While the build job is mandatory, the test job is marked as "manual" on GitLab so that it doesn't run by default. We already have a bunch of Windows-based jobs, and the computational overhead that these cause is simply out of proportion to run the test suite twice. The same isn't true for GitHub as I could not find a way to make a subset of jobs manually triggered. - We disable Perl. This is because we pick up Perl from Git for Windows, which outputs different paths ("/c/" instead of "C:\") than what we expect in our tests. - We don't use the Git for Windows SDK. Instead, the build only depends on Visual Studio, Meson and Git for Windows. All the other dependencies like curl, pcre2 and zlib get pulled in and compiled automatically by Meson and thus do not have to be provided by the system. - We open-code "ci/run-test-slice.sh". This is because we only have direct access to PowerShell, so we manually implement the logic. There is an upstream pull request for the Meson build system [1] to implement test slicing in Meson directly. - We don't process test artifacts for failed CI jobs. This is done to keep down prerequisites to a minimum. All tests are passing. [1]: https://github.com/mesonbuild/meson/pull/14092 Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22ci: raise error when Meson generates warningsPatrick Steinhardt1-0/+1
Meson prints warnings in several cases, like for example when using a feature supported by the current version of Meson, but not yet supported by the minimum required version as declared by the project. These warnings will not cause the setup to fail by default, which makes it quite easy to miss them. Improve this by passing `--fatal-meson-warnings` to `meson setup` so that our CI jobs will fail on warnings. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: fix compilation with Visual StudioPatrick Steinhardt1-0/+8
The Visual Studio compiler defaults to C89 unless explicitly asked to use a different version of the C standard. We don't specify any C standard at all though in our Meson build, and consequently compiling Git fails: ...\git\git-compat-util.h(14): fatal error C1189: #error: "Required C99 support is in a test phase. Please see git-compat-util.h for more details." Fix the issue by specifying the project's C standard. Funny enough, specifying C99 does not work because apparently, `__STDC_VERSION__` is not getting defined in that version at all. Instead, we have to specify C11 as the project's C standard, which is also done in our CMake build instructions. We don't want to generally enforce C11 though, as our requiremets only state that a C99 compiler is required. In fact, we don't even require plain C99, but rather the GNU variant thereof. Meson allows us to handle this case rather easily by specifying "gnu99,c11", which will cause it to fall back to C11 in case GNU C99 is unsupported. This feature has only been introduced with Meson 1.3.0 though, and we support 0.61.0 and newer. In case we use such an oldish version though we fall back to requiring GNU99 unconditionally. This means that Windows essentially requires Meson 1.3.0 and newer when using Visual Studio, but I doubt that this is ever going to be a real problem. Tested-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: make the CSPRNG backend configurablePatrick Steinhardt2-7/+23
The CSPRNG backend is not configurable in Meson and isn't quite discoverable, either. Make it configurable and add the actual backend used to the summary. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: wire up fuzzersPatrick Steinhardt4-1/+28
Meson does not yet know to build our fuzzers. Introduce a new build option "fuzzers" and wire up the fuzzers in case it is enabled. Adapt our CI jobs so that they build the fuzzers by default. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: wire up generation of distribution archivePatrick Steinhardt1-0/+13
Meson knows to generate distribution archives via `meson dist`. In addition to generating the archive itself, this target also knows to compile and execute tests from that archive, which helps to ensure that the result is an adequate drop-in replacement for the versioned project. While this already works as-is, one omission is that we don't propagate the commit that this is built from into the resulting archive. This can be fixed though by adding a distribution script that propagates the version into the "version" file, which GIT-VERSION-GEN knows to read if present. Use GIT-VERSION-GEN to populate that file. As the script is executed in the build directory, not in the directory where we generate the archive, we have to use a shell to resolve the "MESON_DIST_ROOT" environment variable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: wire up development environmentsPatrick Steinhardt1-0/+8
The Meson build system is able to wire up development environments. The intent is to make build artifacts of the project available. This is typically used to export e.g. paths to linkable libraries, which isn't all that interesting in our context given that we don't have an official library interface. But what we can use this mechanism for is to expose the built Git executables as well as the build directory. This allows users to play around with the built Git version in the devenv, and allows them to execute our test scripts directly with the built distribution. Wire up this feature, which can then be used via `meson devenv` in the build directory. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: fix dependencies for generated headersPatrick Steinhardt1-9/+9
We generate a couple of headers from our documentation. These headers are added to the libgit sources, but two of them aren't used by the library, but instead by our builtins. This can cause parallel builds to fail because the builtin object may be compiled before the header was generated. Fix the issue by adding both "config-list.h" and "hook-list.h" to the list of builtin sources. While "command-list.h" is generated similarly, it is used by "help.c" and thus part of the libgit sources indeed. Reported-by: Evan Martin <evan.martin@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22meson: populate project version via GIT-VERSION-GENPatrick Steinhardt1-1/+8
The Git version for Meson is currently wired up manually. It can thus grow (and already has grown) stale quite easily, as having multiple sources of truth is never a good idea. This issue is mostly of cosmetic nature as we don't use the project version anywhere, and instead use the GIT-VERSION-GEN script to propagate the correct version into our build. But it is somewhat puzzling when `meson setup` announces to build an old Git release. There are a couple of alternatives for how to solve this: - We can keep the version undefined, but this makes Meson output "undefined" for the version, as well. - We can use GIT-VERSION-GEN to generate the version for us. At the point of configuring the project we haven't yet figured out host details though, and thus we didn't yet set up the shell environment. While not an issue for Unix-based systems, this would be an issue in Windows, where the shell typically gets provided via Git for Windows and thus requires some special setup. - We can pull the default version out of GIT-VERSION-GEN and move it into its own file. This likely requires some adjustments for scripts that bump the version, but allows Meson to read the version from that file trivially. Pick the second option and use GIT-VERSION-GEN as it gives us the most accurate version. In order to fix the bootstrapping issue on Windows systems we simply set the version to 'unknown' in case no shell was found. As the version is only of cosmetic value this isn't really much of an issue. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22GIT-VERSION-GEN: allow running without input and output filesPatrick Steinhardt1-15/+29
The GIT-VERSION-GEN script requires an input file containing formatting directives to be replaced as well as an output file that will get overwritten in case the file contents have changed. When computing the project version for Meson we don't want to have either though: - We only want to compute the version without anything else, but don't have an input file that would match that exact format. While we could of course introduce a new file just for that usecase, it feels suboptimal to add another file every time we want to have a slightly different format for versioned data. - The computed version needs to be read from stdout so that Meson can wire it up for the project. Extend the script to handle both usecases by recognizing `--format=` as alternative to providing an input path and by writing to stdout in case no output file was given. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22GIT-VERSION-GEN: simplify computing the dirty markerPatrick Steinhardt1-5/+1
The GIT-VERSION-GEN script computes the version that Git is being built from. When building from a commit with an unclean worktree it knows to append "-dirty" to that version to indicate that there were custom changes applied and that it isn't the exact same as that commit. The dirtiness check is done manually via git-diff-index(1), which is somewhat puzzling though: we already use git-describe(1) to compute the version, which also knows to compute dirtiness via the "--dirty" flag. But digging back in history explains why: the "-dirty" suffix was added in 31e0b2ca81 (GIT 1.5.4.3, 2008-02-23), and git-describe(1) didn't yet have support for "--dirty" back then. Refactor the script to use git-describe(1). Despite being simpler, it also results in a small speedup: Benchmark 1: git describe --dirty --match "v[0-9]*" Time (mean ± σ): 12.5 ms ± 0.3 ms [User: 6.3 ms, System: 8.8 ms] Range (min … max): 12.0 ms … 13.5 ms 200 runs Benchmark 2: git describe --match "v[0-9]*" HEAD && git update-index -q --refresh && git diff-index --name-only HEAD -- Time (mean ± σ): 17.9 ms ± 1.1 ms [User: 8.8 ms, System: 14.4 ms] Range (min … max): 17.0 ms … 30.6 ms 148 runs Summary git describe --dirty --match "v[0-9]*" ran 1.43 ± 0.09 times faster than git describe --match "v[0-9]*" && git update-index -q --refresh && git diff-index --name-only HEAD -- While the speedup doesn't really matter on Unix-based systems, where filesystem operations are typically fast, they do matter on Windows where the commands take a couple hundred milliseconds. A quick and dirty check on that system shows a speedup from ~800ms to ~400ms. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22builtin/pack-redundant: remove subcommand with breaking changesPatrick Steinhardt3-0/+10
The git-pack-redundant(1) subcommand has been castrated to require the "--i-still-use-this" option to do anything since 4406522b (pack-redundant: escalate deprecation warning to an error, 2023-03-23), which appeared in Git 2.41 and was announced for removal with 53a92c9552 (Documentation/BreakingChanges: announce removal of git-pack-redundant(1), 2024-09-02). Stop compiling the subcommand in case the `WITH_BREAKING_CHANGES` build flag is set. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22ci: repurpose "linux-gcc" job for deprecationsPatrick Steinhardt3-3/+4
The "linux-gcc" job isn't all that interesting by itself and can be considered more or less the "standard" job: it is running with a reasonably up-to-date image and uses GCC as a compiler, both of which we already cover in other jobs. There is one exception though: we change the default branch to be "main" instead of "master", so it is forging ahead a bit into the future to make sure that this change does not cause havoc. So let's expand on this a bit and also add the new "WITH_BREAKING_CHANGES" flag to the mix. Rename the job to "linux-breaking-changes" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22ci: merge linux-gcc-default into linux-gccPatrick Steinhardt3-13/+0
The "linux-gcc-default" job is mostly doing the same as the "linux-gcc" job, except for a couple of minor differences: - We use an explicit GCC version instead of the default version provided by the distribution. We have other jobs that test with "gcc-8", making this distinction pointless. - We don't set up the Python version explicitly, and instead use the default Python version. Python 2 has been end-of-life for quite a while now though, making this distinction less interesting. - We set up the default branch name to be "main" in "linux-gcc". We have other testcases that don't and also some that explicitly use "master". - We use "ubuntu:20.04" in one job and "ubuntu:latest" in another. We already have a couple other jobs testing these respectively. So overall, the job does not add much to our test coverage. Drop the "linux-gcc-default" job and adapt "linux-gcc" to start using the default GCC compiler, effectively merging those two jobs into one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22Makefile: wire up build option for deprecated featuresPatrick Steinhardt6-0/+19
With 57ec9254eb (docs: introduce document to announce breaking changes, 2024-06-14), we have introduced a new document that tracks upcoming breaking changes in the Git project. In 2454970930 (BreakingChanges: early adopter option, 2024-10-11) we have amended the document a bit to mention that any introduced breaking changes must be accompanied by logic that allows us to enable the breaking change at compile-time. While we already have two breaking changes lined up, neither of them has such a switch because they predate those instructions. Introduce the proposed `WITH_BREAKING_CHANGES` preprocessor macro and wire it up with both our Makefiles and Meson. This does not yet wire up the build flag for existing deprecations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22refs: fix migration of reflogs respecting "core.logAllRefUpdates"Patrick Steinhardt2-1/+18
In 246cebe320 (refs: add support for migrating reflogs, 2024-12-16) we have added support to git-refs(1) to migrate reflogs between reference backends. It was reported [1] though that not we don't migrate reflogs for a subset of references, most importantly "refs/stash". This issue is caused by us still honoring "core.logAllRefUpdates" when trying to migrate reflogs: we do queue the updates, but depending on the value of that config we may decide to just skip writing the reflog entry altogether. And given that: - The default for "core.logAllRefUpdates" is to only create reflogs for branches, remotes, note refs and "HEAD" - "refs/stash" is neither of these ref types. We end up skipping the reflog creation for that particular reference. Fix the bug by setting `REF_FORCE_CREATE_REFLOG`, which instructs the ref backends to create the reflog entry regardless of the config or any preexisting state. [1]: <Z5BTQRlsOj1sygun@tapette.crustytoothpaste.net> Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable: address trivial -Wsign-compare warningsPatrick Steinhardt3-14/+7
Address the last couple of trivial -Wsign-compare warnings in the reftable library and remove the DISABLE_SIGN_COMPARE_WARNINGS macro that we have in "reftable/system.h". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/blocksource: adjust `read_block()` to return `ssize_t`Patrick Steinhardt4-24/+31
The `block_source_read_block()` function and its implementations return an integer as a result that reflects either the number of bytes read, or an error. As such its return type, a signed integer, isn't wrong, but it doesn't give the reader a good hint what it actually returns. Refactor the function to return an `ssize_t` instead, which is typical for functions similar to read(3p) and should thus give readers a better signal what they can expect as a result. Adjust callers to better handle the returned value to avoid warnings with -Wsign-compare. One of these callers is `reader_get_block()`, whose return value is only ever used by its callers to figure out whether or not the read was successful. So instead of bubbling up the `ssize_t` there, too, we adapt it to only indicate success or errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/blocksource: adjust type of the block lengthPatrick Steinhardt1-1/+1
The block length is used to track the number of bytes available in a specific block. As such, it is never set to a negative value, but is still represented by a signed integer. Adjust the type of the variable to be `size_t`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/block: adjust type of the restart lengthPatrick Steinhardt2-8/+6
The restart length is tracked as a positive integer even though it cannot ever be negative. Furthermore, it is effectively capped via the MAX_RESTARTS variable. Adjust the type of the variable to be `uint32_t`. While this type is excessive given that MAX_RESTARTS fits into an `uint16_t`, other places already use 32 bit integers for restarts, so this type is being more consistent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/block: adapt header and footer size to return a `size_t`Patrick Steinhardt3-5/+5
The functions `header_size()` and `footer_size()` return a positive integer representing the size of the header and footer, respectively, dependent on the version of the reftable format. Similar to the preceding commit, these functions return a signed integer though, which is nonsensical given that there is no way for these functions to return negative. Adapt the functions to return a `size_t` instead to fix a couple of sign comparison warnings. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/basics: adjust `hash_size()` to return `uint32_t`Patrick Steinhardt9-50/+44
The `hash_size()` function returns the number of bytes used by the hash function. Weirdly enough though, it returns a signed integer for its size even though the size obviously cannot ever be negative. The only case where it could be negative is if the function returned an error when asked for an unknown hash, but we assert(3p) instead. Adjust the type of `hash_size()` to be `uint32_t` and adapt all places that use signed integers for the hash size to follow suit. This also allows us to get rid of a couple asserts that we had which verified that the size was indeed positive, which further stresses the point that this refactoring makes sense. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/basics: adjust `common_prefix_size()` to return `size_t`Patrick Steinhardt5-13/+10
The `common_prefix_size()` function computes the length of the common prefix between two buffers. As such its return value will always be an unsigned integer, as the length cannot be negative. Regardless of that, the function returns a signed integer, which is nonsensical and causes a couple of -Wsign-compare warnings all over the place. Adjust the function to return a `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/record: handle overflows when decoding varintsPatrick Steinhardt3-32/+53
The logic to decode varints isn't able to detect integer overflows: as long as the buffer still has more data available, and as long as the current byte has its 0x80 bit set, we'll continue to add up these values to the result. This will eventually cause the `uint64_t` to overflow, at which point we'll return an invalid result. Refactor the function so that it is able to detect such overflows. The implementation is basically copied from Git's own `decode_varint()`, which already knows to handle overflows. The only adjustment is that we also take into account the string view's length in order to not overrun it. The reftable documentation explicitly notes that those two encoding schemas are supposed to be the same: Varint encoding ^^^^^^^^^^^^^^^ Varint encoding is identical to the ofs-delta encoding method used within pack files. Decoder works as follows: .... val = buf[ptr] & 0x7f while (buf[ptr] & 0x80) { ptr++ val = ((val + 1) << 7) | (buf[ptr] & 0x7f) } .... While at it, refactor `put_var_int()` in the same way by copying over the implementation of `encode_varint()`. While `put_var_int()` doesn't have an issue with overflows, it generates warnings with -Wsign-compare. The implementation of `encode_varint()` doesn't, is battle-tested and at the same time way simpler than what we currently have. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/record: drop unused `print` function pointerPatrick Steinhardt1-3/+0
In 42c424d69d (t/helper: inline printing of reftable records, 2024-08-22) we stopped using the `print` function of the reftable record vtable and instead moved its implementation into the single user of it. We didn't remove the function itself from the vtable though. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21meson: stop disabling -Wsign-comparePatrick Steinhardt1-1/+0
In 4f9264b0cd (config.mak.dev: drop `-Wno-sign-compare`, 2024-12-06) we have started an effort to make our codebase compile with -Wsign-compare. But while we removed the -Wno-sign-compare flag from "config.mak.dev", we didn't adjust the Meson build instructions in the same way. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21t8002: fix ambiguous printf conversion specificationsJan Palus1-2/+2
In e7fb2ca945 (builtin/blame: fix out-of-bounds write with blank boundary commits, 2025-01-10), we have introduced two new tests that expect a certain amount of padding. This padding is generated via printf using the "%0.s" conversion specification. That directive is ambiguous because it might be interpreted as field width (most shells) or 0-padding flag for numeric fields (coreutils). Fix this issue by using "%${N}s" instead, which is already being used in other tests (i.e. t5300, t0450) and is unambiguous. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jan Palus <jpalus@fastmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to internal functionsKarthik Nayak1-14/+16
The internal functions `write_rev_trailer()`, `write_rev_trailer()`, `write_mtimes_header()` and write_mtimes_trailer()` use the global `the_hash_algo` variable to access the repository's hash function. Pass the hash_algo down from callers, all of which already have access to the variable. This removes all global variables from the 'pack-write.c' file, so remove the 'USE_THE_REPOSITORY_VARIABLE' macro. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to `write_rev_file()`Karthik Nayak4-16/+29
The `write_rev_file()` function uses the global `the_hash_algo` variable to access the repository's hash_algo. To avoid global variable usage, pass a hash_algo from the layers above. Also modify children functions `write_rev_file_order()` and `write_rev_header()` to accept 'the_hash_algo'. Altough the layers above could have access to the hash_algo internally, simply pass in `the_hash_algo`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. However, in `midx-write.c`, since all usage of global variables is removed, don't reintroduce them and instead use the `repo` available in the context. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to `write_idx_file()`Karthik Nayak6-16/+27
The `write_idx_file()` function uses the global `the_hash_algo` variable to access the repository's hash_algo. To avoid global variable usage, pass a hash_algo from the layers above. Since `stage_tmp_packfiles()` also resides in 'pack-write.c' and calls `write_idx_file()`, update it to accept a `struct git_hash_algo` as a parameter and pass it through to the callee. Altough the layers above could have access to the hash_algo internally, simply pass in `the_hash_algo`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass repository to `index_pack_lockfile()`Karthik Nayak4-6/+8
The `index_pack_lockfile()` function uses the global `the_repository` variable to access the repository. To avoid global variable usage, pass the repository from the layers above. Altough the layers above could have access to the repository internally, simply pass in `the_repository`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21pack-write: pass hash_algo to `fixup_pack_header_footer()`Karthik Nayak6-22/+26
The `fixup_pack_header_footer()` function uses the global `the_hash_algo` variable to access the repository's hash function. To avoid global variable usage, pass a hash_algo from the layers above. Altough the layers above could have access to the hash_algo internally, simply pass in `the_hash_algo`. This avoids any compatibility issues and bubbles up global variable usage to upper layers which can be eventually resolved. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21ref-filter: remove ref_format_clear()René Scharfe6-18/+0
Now that ref_format_clear() no longer releases any memory we don't need it anymore. Remove it and its counterpart, ref_format_init(). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21ref-filter: move is-base tip to used_atomRené Scharfe3-28/+62
The string_list "is_base_tips" in struct ref_format stores the committish part of "is-base:<committish>". It has the same problems that its sibling string_list "bases" had. Fix them the same way as the previous commit did for the latter, by replacing the string_list with fields in "used_atom". Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21ref-filter: move ahead-behind bases into used_atomRené Scharfe4-26/+59
verify_ref_format() parses a ref-filter format string and stores recognized items in the static array "used_atom". For "ahead-behind:<committish>" it stores the committish part in a string_list member "bases" of struct ref_format. ref_sorting_options() also parses bare ref-filter format items and stores stores recognized ones in "used_atom" as well. The committish parts go to a dummy struct ref_format in parse_sorting_atom(), though, and are leaked and forgotten. If verify_ref_format() is called before ref_sorting_options(), like in git for-each-ref, then all works well if the sort key is included in the format string. If it isn't then sorting cannot work as the committishes are missing. If ref_sorting_options() is called first, like in git branch, then we have the additional issue that if the sort key is included in the format string then filter_ahead_behind() can't see its committish, will not generate any results for it and thus it will be expanded to an empty string. Fix those issues by replacing the string_list with a field in used_atom for storing the committish. This way it can be shared for handling both ref-filter format strings and sorting options in the same command. Reported-by: Ross Goldberg <ross.goldberg@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21The second batchJunio C Hamano1-0/+22
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21Merge branch 'sk/unit-test-hash'Junio C Hamano3-25/+50
Test update. * sk/unit-test-hash: t/unit-tests: convert hash to use clar test framework
2025-01-21Merge branch 'mh/gitattr-doc-markup-fix'Junio C Hamano1-1/+1
Doc markup fix. * mh/gitattr-doc-markup-fix: docs: fix typesetting of merge driver placeholders
2025-01-21Merge branch 'dk/zsh-config-completion-fix'Junio C Hamano1-6/+11
Completion script updates for zsh * dk/zsh-config-completion-fix: completion: repair config completion for Zsh
2025-01-21Merge branch 'aj/difftool-config-doc-fix'Junio C Hamano2-2/+2
Docfix. * aj/difftool-config-doc-fix: difftool docs: restore correct position of tool list
2025-01-21Merge branch 'ps/the-repository'Junio C Hamano74-299/+407
More code paths have a repository passed through the callchain, instead of assuming the primary the_repository object. * ps/the-repository: match-trees: stop using `the_repository` graph: stop using `the_repository` add-interactive: stop using `the_repository` tmp-objdir: stop using `the_repository` resolve-undo: stop using `the_repository` credential: stop using `the_repository` mailinfo: stop using `the_repository` diagnose: stop using `the_repository` server-info: stop using `the_repository` send-pack: stop using `the_repository` serve: stop using `the_repository` trace: stop using `the_repository` pager: stop using `the_repository` progress: stop using `the_repository`
2025-01-21Merge branch 'jt/fsck-skiplist-parse-fix'Junio C Hamano4-3/+13
A misconfigured "fsck.skiplist" configuration variable was not diagnosed as an error, which has been corrected. * jt/fsck-skiplist-parse-fix: fsck: reject misconfigured fsck.skipList
2025-01-21Merge branch 'ps/reftable-get-random-fix'Junio C Hamano6-21/+33
The code to compute "unique" name used git_rand() which can fail or get stuck; the callsite does not require cryptographic security. Introduce the "insecure" mode and use it appropriately. * ps/reftable-get-random-fix: reftable/stack: accept insecure random bytes wrapper: allow generating insecure random bytes
2025-01-21Merge branch 'jk/t7407-use-test-grep'Junio C Hamano1-2/+2
Test clean-up. * jk/t7407-use-test-grep: t7407: use test_grep
2025-01-21Merge branch 'jk/lsan-race-ignore-false-positive'Junio C Hamano2-11/+11
The code to check LSan results has been simplified and made more robust. * jk/lsan-race-ignore-false-positive: test-lib: add a few comments to LSan log checking test-lib: simplify lsan results check test-lib: invert return value of check_test_results_san_file_empty
2025-01-21index-pack, unpack-objects: use skip_prefix to avoid magic numberJeff King2-6/+6
When parsing --pack_header=, we manually skip 14 bytes to the data. Let's use skip_prefix() to do this automatically. Note that we overwrite our pointer to the front of the string, so we have to add more context to the error message. We could avoid this by declaring an extra pointer to hold the value, but I think the modified message is actually preferable; it should give translators a bit more context. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21index-pack, unpack-objects: use get_be32() for reading pack headerJeff King3-12/+16
Both of these commands read the incoming pack into a static unsigned char buffer in BSS, and then parse it by casting the start of the buffer to a struct pack_header. This can result in SIGBUS on some platforms if the compiler doesn't place the buffer in a position that is properly aligned for 4-byte integers. This reportedly happens with unpack-objects (but not index-pack) on sparc64 when compiled with clang (but not gcc). But we are definitely in the wrong in both spots; since the buffer's type is unsigned char, we can't depend on larger alignment. When it works it is only because we are lucky. We'll fix this by switching to get_be32() to read the headers (just like the last few commits similarly switched us to put_be32() for writing into the same buffer). It would be nice to factor this out into a common helper function, but the interface ends up quite awkward. Either the caller needs to hardcode how many bytes we'll need, or it needs to pass us its fill()/use() functions as pointers. So I've just fixed both spots in the same way; this is not code that is likely to be repeated a third time (most of the pack reading code uses an mmap'd buffer, which should be properly aligned). I did make one tweak to the shared code: our pack_version_ok() macro expects us to pass the big-endian value we'd get by casting. We can introduce a "native" variant which uses the host integer ordering. Reported-by: Koakuma <koachan@protonmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21parse_pack_header_option(): avoid unaligned memory writesJeff King1-6/+9
In order to recreate a pack header in our in-memory buffer, we cast the buffer to a "struct pack_header" and assign the individual fields. This is reported to cause SIGBUS on sparc64 due to alignment issues. We can work around this by using put_be32() which will write individual bytes into the buffer. Reported-by: Koakuma <koachan@protonmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21packfile: factor out --pack_header argument parsingJeff King4-23/+30
Both index-pack and unpack-objects accept a --pack_header argument. This is an undocumented internal argument used by receive-pack and fetch to pass along information about the header of the pack, which they've already read from the incoming stream. In preparation for a bugfix, let's factor the duplicated code into a common helper. The callers are still responsible for identifying the option. While this could likewise be factored out, it is more flexible this way (e.g., if they ever started using parse-options and wanted to handle both the stuck and unstuck forms). Likewise, the callers are responsible for reporting errors, though they both just call die(). I've tweaked unpack-objects to match index-pack in marking the error for translation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21bswap.h: squelch potential sparse -Wcast-truncate warningsJunio C Hamano1-12/+12
In put_be32(), we right-shift a uint32_t value various amounts and then assign the low 8-bits to individual "unsigned char" bytes, throwing away the high bits. For shifts smaller than 24 bits, those thrown away bits will be arbitrary bits from the original uint32_t. This works exactly as we want, but if you feed a constant, then sparse complains. For example if we write this (which we plan to do in a future patch): put_be32(hdr, PACK_SIGNATURE); then "make sparse" produces: compat/bswap.h:175:22: error: cast truncates bits from constant value (5041 becomes 41) compat/bswap.h:176:22: error: cast truncates bits from constant value (504143 becomes 43) compat/bswap.h:177:22: error: cast truncates bits from constant value (5041434b becomes 4b) And the same issue exists in the other put_be*() functions, when used with a constant. We can silence this warning by explicitly masking off the truncated bits. The compiler is smart enough to know the result is the same, and the asm generated by gcc (with both -O0 and -O2) is identical. Curiously this line already exists: put_be32(&hdr_version, INDEX_EXTENSION_VERSION2); in the fsmonitor.c file, but it does not get flagged because the CPP macro expands to a small integer (2). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17t/unit-tests: convert reftable tree test to use clar test frameworkSeyi Kuforiji3-21/+13
Adapts reftable tree test script to clar framework by using clar assertions where necessary. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17t/unit-tests: adapt priority queue test to use clar test frameworkSeyi Kuforiji4-93/+96
Convert the prio-queue test script to clar framework by using clar assertions where necessary. Test functions are created as a standalone to test different cases. update the type of the variable `j` from int to `size_t`, this ensures compatibility with the type used for result_size, which is also size_t, preventing a potential warning or error caused by comparisons between signed and unsigned integers. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17t/unit-tests: convert mem-pool test to use clar test frameworkSeyi Kuforiji4-33/+27
Adapt the mem-pool test script to use clar framework by using clar assertions where necessary.Test functions are created as a standalone to test different test cases. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17t/unit-tests: handle dashes in test suite filenamesSeyi Kuforiji1-0/+1
"generate-clar-decls.sh" script is designed to extract function signatures that match a specific pattern derived from the unit test file's name. The script does not know to massage file names with dashes, which will make it search for functions that look like, for example, `test_mem-pool_*`. Having dashes in function names is not allowed though, so these patterns won't ever match a legal function name. Adapt script to translate dashes (`-`) in test suite filenames to underscores (`_`) to correctly extract the function signatures and run the corresponding tests. This will be used by subsequent commits which follows the same construct. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17builtin: send usage() help text to standard outputJunio C Hamano19-32/+37
Using the show_usage_and_exit_if_asked() helper we introduced earlier, fix callers of usage() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them. These are the bog standard "if we got only '-h', then that is a request for help" callers. Their if (argc == 2 && !strcmp(argv[1], "-h")) usage(message); are simply replaced with show_usage_and_exit_if_asked(argc, argv, message); With this, the built-ins tested by t0012 all send their help text to their standard output stream, so the check in t0012 that was half tightened earlier is now fully tightened to insist on standard error stream being empty. Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17oddballs: send usage() help text to standard outputJunio C Hamano5-4/+13
Using the show_usage_if_asked() helper we introduced earlier, fix callers of usage() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them. The callers in this step are oddballs in that their invocations of usage() are *not* guarded by if (argc == 2 && !strcmp(argv[1], "-h") usage(...); There are (unnecessarily) being clever ones that do things like if (argc != 2 || !strcmp(argv[1], "-h") usage(...); to say "I know I take only one argument, so argc != 2 is always an error regardless of what is in argv[]. Ah, by the way, even if argc is 2, "-h" is a request for usage text, so we do the same". Some like "git var -h" just do not treat "-h" any specially, and let it take the same error code paths as a parameter error. Now we cannot do the same, so these callers are rewrittin to do the show_usage_and_exit_if_asked() first and then handle the usage error the way they used to. Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17builtins: send usage_with_options() help text to standard outputJunio C Hamano14-32/+31
Using the show_usage_with_options_if_asked() helper we introduced earlier, fix callers of usage_with_options() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them. The test in t7600 for "git merge -h" may want to be retired, as the same is covered by t0012 already, but it is specifically testing that the "-h" option gets a response even with a corrupt index file, so for now let's leave it there. Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17usage: add show_usage_if_asked()Junio C Hamano2-3/+26
Some commands call usage() when they are asked to give the help message with "git cmd -h", but this has the same problem as we fixed with callers of usage_with_options() for the same purpose. Introduce a helper function that captures the common pattern if (argc == 2 && !strcmp(argv[1], "-h")) usage(usage); and replaces it with show_usage_if_asked(argc, argv, usage); to help correct these code paths. Note that this helper function still exits with status 129, and t0012 insists on it. After converting all the mistaken callers of usage_with_options() to call this new helper, we may want to address it---the end user is asking us to give the help text, and we are doing exactly as asked, so there is no reason to exit with non-zero status. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17parse-options: add show_usage_with_options_if_asked()Junio C Hamano2-0/+14
Many commands call usage_with_options() when they are asked to give the help message, but it sends the help text to the standard error stream. When the user asked for it with "git cmd -h", the help message is the primary output from the command, hence we should send it to the standard output stream, instead. Introduce a helper function that captures the common pattern if (argc == 2 && !strcmp(argv[1], "-h")) usage_with_options(usage, options); and replaces it with show_usage_with_options_if_asked(argc, argv, usage, options); to help correct code paths. Note that this helper function still exits with status 129, and t0012 insists on it. After converting all the mistaken callers of usage_with_options() to call this new helper, we may want to address it---the end user is asking us to give the help text, and we are doing exactly as asked, so there is no reason to exit with non-zero status. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17t0012: optionally check that "-h" output goes to stdoutJeff King1-2/+9
For most commands, "git foo -h" will send the help output to stdout, as this is what parse-options.c does. But some commands send it to stderr instead. This is usually because they call usage_with_options(), and should be switched to show_usage_help_and_exit_if_asked(). Currently t0012 is permissive and allows either behavior. We'd like it to eventually enforce that help goes to stdout, and teaching it to do so identifies the commands that need to be changed. But during the transition period, we don't want to enforce that for most test runs. So let's introduce a flag that will let most test runs use the permissive behavior, and people interested in converting commands can run: GIT_TEST_HELP_MUST_BE_STDOUT=1 ./t0012-help.sh to see the failures. Eventually (when all builtins have been converted) we'll remove this flag entirely and always check the strict behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17gitcli: document that command line trumps config and envJunio C Hamano1-0/+17
We centrally explain that "--no-whatever" is the way to countermand the "--whatever" option. Explain that a configured default and the value specified by an environment variable can be overridden by the corresponding command line option, too. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17meson: wire up the git-subtree(1) commandPatrick Steinhardt2-1/+73
Wire up the git-subtree(1) command, which is part of "contrib/". Note that we have to move around the exact location where we include the "contrib/" subdirectory so that it comes after building the docs so that we have access to some of the common functionality. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17meson: introduce build option for contribPatrick Steinhardt2-1/+5
We unconditionally wire up building command completion present in the "contrib/" directory. This may or may not be what users want, and we don't provide a way to disable it. Introduce a new "contrib" build option. This option is introduced as an array so that users can manually pick which exact features they want to include from the "contrib" directory. By default, we build and install shell completions, which is a commonly used feature and also the current default. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17contrib/subtree: fix building docsPatrick Steinhardt2-8/+17
In a38edab7c8 (Makefile: generate doc versions via GIT-VERSION-GEN, 2024-12-06), we have refactored how we build our documentation by injecting the Git version into the Asciidoc and AsciiDoctor config files instead of doing so via arguments. As such, the original config files were removed, where the expectation is that they get generated via `GIT-VERSION-GEN` now. Whie the git-subtree(1) command part of "contrib/" also builds docs using these same config files, its Makefile wasn't adjusted accordingly and thus building the docs is broken. Fix this by using `GIT-VERSION-GEN` to generate those files. Reported-by: Renato Botelho <garga@FreeBSD.org> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17connect: address -Wsign-compare warningsMike Hommey1-12/+11
Most of the warnings were about loop variables being declared as ints with a condition using a size_t, whereby switching the variable to size_t fixes the warning. One other case was comparing the result of strlen to an int passed as an argument, which turns out could just as well be passed as a size_t, albeit trickling to other functions. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-16The first batchJunio C Hamano1-0/+18
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-16Merge branch 'mb/t7110-use-test-path-helper'Junio C Hamano1-6/+6
Test modernization. * mb/t7110-use-test-path-helper: t7110: replace `test -f` with `test_path_is_*` helpers
2025-01-16Merge branch 'ps/meson-weak-sha1-build'Junio C Hamano2-15/+44
meson-based build now supports the unsafe-sha1 build knob. * ps/meson-weak-sha1-build: meson: provide a summary of configured backends meson: wire up unsafe SHA1 backend meson: add missing dots for build options meson: simplify conditions for HTTPS and SHA1 dependencies meson: require SecurityFramework when it's used as SHA1 backend meson: deduplicate access to SHA1/SHA256 backend options meson: consistenlty spell 'CommonCrypto'
2025-01-16Merge branch 'ps/more-sign-compare'Junio C Hamano13-106/+112
More -Wsign-compare fixes. * ps/more-sign-compare: sign-compare: avoid comparing ptrdiff with an int/unsigned commit-reach: use `size_t` to track indices when computing merge bases shallow: fix -Wsign-compare warnings builtin/log: fix remaining -Wsign-compare warnings builtin/log: use `size_t` to track indices commit-reach: use `size_t` to track indices in `get_reachable_subset()` commit-reach: use `size_t` to track indices in `remove_redundant()` commit-reach: fix type of `min_commit_date` commit-reach: fix index used to loop through unsigned integer prio-queue: fix type of `insertion_ctr`
2025-01-16Merge branch 'ps/object-collision-check'Junio C Hamano1-24/+42
CI jobs gave sporadic failures, which turns out that that the object finalization code was giving an error when it did not have to. * ps/object-collision-check: object-file: retry linking file into place when occluding file vanishes object-file: don't special-case missing source file in collision check object-file: rename variables in `check_collision()` object-file: fix race in object collision check
2025-01-16Merge branch 'as/long-option-help-i18n'Junio C Hamano1-3/+40
Tweak the help text used for the option value placeholders by parse-options API so that translations can customize the "<>" placeholder signal (e.g. "--option=<value>"). * as/long-option-help-i18n: parse-options: localize mark-up of placeholder text in the short help
2025-01-16Merge branch 're/submodule-parse-opt'Junio C Hamano1-111/+105
"git submodule" learned various ways to spell the same option, e.g. "--branch=B" can be spelled "--branch B" or "-bB". * re/submodule-parse-opt: git-submodule.sh: rename some variables git-submodule.sh: improve variables readability git-submodule.sh: add some comments git-submodule.sh: get rid of unused variable git-submodule.sh: get rid of isnumber git-submodule.sh: improve parsing of short options git-submodule.sh: improve parsing of some long options
2025-01-15doc: migrate git-commit manpage secondary files to new formatJean-Noël Avila2-6/+6
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-15doc: convert git commit config to new formatJean-Noël Avila1-10/+15
Also prevent git-commit manpage to refer to itself in the config description by using a variable. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>