git.git - The core git plumbing

Age	Commit message (Collapse)	Author	Files	Lines
2024-06-17	Merge branch 'jk/am-retry'	Junio C Hamano	1	-0/+3
	"git am" has a safety feature to prevent it from starting a new session when there already is a session going. It reliably triggers when a mbox is given on the command line, but it has to rely on the tty-ness of the standard input. Add an explicit way to opt out of this safety with a command line option. * jk/am-retry: test-terminal: drop stdin handling am: add explicit "--retry" option
2024-06-17	Merge branch 'ps/ref-storage-migration'	Junio C Hamano	3	-2/+77
	A new command has been added to migrate a repository that uses the files backend for its ref storage to use the reftable backend, with limitations. * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version
2024-06-12	Merge branch 'jk/sparse-leakfix'	Junio C Hamano	1	-17/+32
	Many memory leaks in the sparse-checkout code paths have been plugged. * jk/sparse-leakfix: sparse-checkout: free duplicate hashmap entries sparse-checkout: free string list after displaying sparse-checkout: free pattern list in sparse_checkout_list() sparse-checkout: free sparse_filename after use sparse-checkout: refactor temporary sparse_checkout_patterns sparse-checkout: always free "line" strbuf after reading input sparse-checkout: reuse --stdin buffer when reading patterns dir.c: always copy input to add_pattern() dir.c: free removed sparse-pattern hashmap entries sparse-checkout: clear patterns when init() sees existing sparse file dir.c: free strings in sparse cone pattern hashmaps sparse-checkout: pass string literals directly to add_pattern() sparse-checkout: free string list in write_cone_to_file()
2024-06-12	Merge branch 'gt/t-hash-unit-test'	Junio C Hamano	1	-3/+1
	A pair of test helpers that essentially are unit tests on hash algorithms have been rewritten using the unit-tests framework. * gt/t-hash-unit-test: t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash strbuf: introduce strbuf_addstrings() to repeatedly add a string
2024-06-10	Merge branch 'jk/leakfixes'	Junio C Hamano	1	-24/+26
	Memory leaks in "git mv" has been plugged. * jk/leakfixes: mv: replace src_dir with a strvec mv: factor out empty src_dir removal mv: move src_dir cleanup to end of cmd_mv() t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL t-strvec: use va_end() to match va_start()
2024-06-06	Merge branch 'rs/difftool-env-simplify'	Junio C Hamano	1	-8/+4
	Code simplification. * rs/difftool-env-simplify: difftool: add env vars directly in run_file_diff()
2024-06-06	Merge branch 'ps/leakfixes'	Junio C Hamano	12	-431/+563
	Leakfixes. * ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name
2024-06-06	am: add explicit "--retry" option	Jeff King	1	-0/+3
	After a patch fails, you can ask "git am" to try applying it again with new options by running without any of the resume options. E.g.: git am <patch # oops, it failed; let's try again git am --3way But since this second command has no explicit resume option (like "--continue"), it looks just like an invocation to read a fresh patch from stdin. To avoid confusing the two cases, there are some heuristics, courtesy of 8d18550318 (builtin-am: reject patches when there's a session in progress, 2015-08-04): if (in_progress) { /* * Catch user error to feed us patches when there is a session * in progress: * * 1. mbox path(s) are provided on the command-line. * 2. stdin is not a tty: the user is trying to feed us a patch * from standard input. This is somewhat unreliable -- stdin * could be /dev/null for example and the caller did not * intend to feed us a patch but wanted to continue * unattended. */ if (argc \|\| (resume_mode == RESUME_FALSE && !isatty(0))) die(_("previous rebase directory %s still exists but mbox given."), state.dir); if (resume_mode == RESUME_FALSE) resume_mode = RESUME_APPLY; [...] So if no resume command is given, then we require that stdin be a tty, and otherwise complain about (potentially) receiving an mbox on stdin. But of course you might not actually have a terminal available! And sadly there is no explicit way to hit this same code path; this is the only place that sets RESUME_APPLY. So you're stuck, and scripts like our test suite have to bend over backwards to create a pseudo-tty. Let's provide an explicit option to trigger this mode. The code turns out to be quite simple; just setting "resume_mode" to RESUME_FALSE is enough to dodge the tty check, and then our state is the same as it would be with the heuristic case (which we'll continue to allow). When we don't have a session in progress, there's already code to complain when resume_mode is set (but we'll add a new test to cover that). To test the new option, we'll convert the existing tests that rely on the fake stdin tty. That lets us test them on more platforms, and will let us simplify test_terminal a bit in a future patch. It does, however, mean we're not testing the tty heuristic at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06	builtin/refs: new command to migrate ref storage formats	Patrick Steinhardt	1	-0/+75
	Introduce a new command that allows the user to migrate a repository between ref storage formats. This new command is implemented as part of a new git-refs(1) executable. This is due to two reasons: - There is no good place to put the migration logic in existing commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is not the correct place to put it, either. - I had it in my mind to create a new low-level command for accessing refs for quite a while already. git-refs(1) is that command and can over time grow more functionality relating to refs. This should help discoverability by consolidating low-level access to refs into a single executable. As mentioned in the preceding commit that introduces the ref storage format migration logic, the new `git refs migrate` command still has a bunch of restrictions. These restrictions are documented accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06	refs: convert ref storage format to an enum	Patrick Steinhardt	2	-2/+2
	The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: free duplicate hashmap entries	Jeff King	1	-1/+8
	In insert_recursive_pattern(), we create a new pattern_entry to insert into the parent_hashmap. If we find that the same entry already exists in the hashmap, we skip adding the new one. But we forget to free the new one, creating a leak. We can fix it by cleaning up the discarded entry. It would probably be possible to avoid creating it in the first place, but it's non-trivial. We'd have to define a "keydata" struct that lets us compare the existing entries to the broken-out fields. It's probably not worth the complexity, so we'll punt on that for now. There is one subtlety here: our insertion is happening in a loop, with each iteration looking at the pattern we just inserted (hence the "recursive" in the name). So if we skip insertion, what do we look at? The obvious answer is that we should remember the existing duplicate we found and use that. But I _think_ in that case, we probably already have all of the recursive bits already (from when the original entry was added). And so just breaking out of the loop would be correct. But I'm not 100% sure on that; after all, the original leaky code could have done the same break, but it didn't. So I went with the "obvious answer" above, which has no chance of changing the behavior aside from fixing the leak. With this patch, t1091 can now be marked leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: free string list after displaying	Jeff King	1	-0/+2
	In sparse_checkout_list(), we put the hashmap entries into a string_list so we can sort them. But after printing, we forget to free the list. This patch drops 5 leaks from t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: free pattern list in sparse_checkout_list()	Jeff King	1	-3/+2
	In sparse_checkout_list(), we create a pattern_list that needs to eventually be cleared. We remember to do so in the regular code path, but the cone-mode path does an early return, and forgets to clean up. We could fix the leak by adding a new call to clear_pattern_list(). But we can simplify even further by just skipping the early return, pushing the other code path (which consists now of only one line!) into an else block. That also matches the same cone/non-cone if/else used in some other functions. This fixes 15 leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: free sparse_filename after use	Jeff King	1	-0/+2
	We allocate a heap buffer via get_sparse_checkout_filename(). Most calls remember to free it, but sparse_checkout_init() forgets to, causing a leak. Ironically, it remembers to do so in the error return paths, but not in the path that makes it all the way to the function end! Fixing this clears up 6 leaks from t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: refactor temporary sparse_checkout_patterns	Jeff King	1	-1/+8
	In update_working_directory(), we take in a pattern_list, attach it to the repository index by assigning it to index->sparse_checkout_patterns, and then call unpack_trees. Afterwards, we remove it by setting index->sparse_checkout_patterns back to NULL. But there are two possible leaks here: 1. If the index already had a populated sparse_checkout_patterns, we've obliterated it. We can fix this by saving and restoring it, rather than always setting it back to NULL. 2. We may call the function with a NULL pattern_list, expecting it to use the on-disk sparse file. In that case, the index routines will lazy-load the sparse patterns automatically. But now at the end of the function when we restore the patterns, we'll leak those lazy-loaded ones! We can fix this by freeing the pattern list before overwriting its pointer whenever it does not match what was passed in (in practice this should only happen when the passed-in list is NULL, but this is erring on the defensive side). Together these remove 48 indirect leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: always free "line" strbuf after reading input	Jeff King	1	-0/+1
	In add_patterns_from_input(), we may read lines from a file with a loop like this: while (!strbuf_getline(&line, file)) { ... strbuf_to_cone_pattern(&line, pl); } /* we don't strbuf_release(&line) here! */ This generally is OK because strbuf_to_cone_pattern() consumes the buffer via strbuf_detach(). But we can leak in a few cases: 1. We don't always consume the buffer! If the line ends up empty after trimming, we leave strbuf_to_cone_pattern() without detaching. In most cases this is OK, because a subsequent getline() call will use the same buffer. But if you had an empty line at the end of file, for example, it would leak. 2. Even if strbuf_to_cone_pattern() always consumed the buffer, there's a subtle issue with strbuf_getline(). As we saw in 94e2aa555e (strbuf: fix leak when `appendwholeline()` fails with EOF, 2024-05-27), it's possible for it to return EOF with an allocated buffer (e.g., if the underlying getdelim() call saw an error). So we should always strbuf_release() after finishing a read loop like this. Note that even the code to read patterns from argv has the same problem. Because that also uses strbuf_to_cone_pattern(), we stuff each argv entry into a strbuf. It uses the same "line" strbuf as the getline code, but we should position the strbuf_release() to cover both code paths. This fixes at least 9 leaks found in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05	sparse-checkout: reuse --stdin buffer when reading patterns	Jeff King	1	-5/+4
	When we read patterns from --stdin, we loop on strbuf_getline(), and detach each line we read to pass into add_pattern(). This used to be necessary because add_pattern() required that the pattern strings remain valid while the pattern_list was in use. But it also created a leak, since we didn't record the detached buffers anywhere else. Now that add_pattern() has been modified to make its own copy of the strings, we can stop detaching and fix the leak. This fixes 4 leaks detected in t1091. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04	sparse-checkout: clear patterns when init() sees existing sparse file	Jeff King	1	-0/+1
	In sparse_checkout_init(), we first try to load patterns from an existing file. If we found any, we return immediately, but end up leaking the patterns we parsed. Fixing this reduces the number of leaks in t7002 from 9 down to 5. Note that there are two other exits from the function, but they don't need the same treatment: - if we can't resolve HEAD, we write out a hard-coded sparse file and return. But we know the pattern list is empty there, since we didn't find any in the on-disk file and we haven't yet added any of our own. - otherwise, we do populate the list and then tail-call into write_patterns_and_update(). But that function frees the pattern_list itself, so we don't need to. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04	sparse-checkout: pass string literals directly to add_pattern()	Jeff King	1	-8/+3
	The add_pattern() function takes a pattern string, but neither makes a copy of it nor takes ownership of the memory. So it is the caller's responsibility to make sure the string hangs around as long as the pattern_list which references it. There are a few cases in sparse-checkout where we use string literal patterns by stuffing them into a strbuf, detaching the buffer, and then passing the result into add_pattern(). This creates a leak when the pattern_list is eventually cleared, since we don't retain a copy of the detached buffer to free. But we can observe that the whole strbuf dance is unnecessary. The point was presumably[1] to satisfy the lifetime requirement of the string. But string literals have static duration; we can count on them lasting for the whole program. So we can fix the leak by just passing them directly. And as a bonus, that simplifies the code. The leaks can be seen in t7002, which drops from 25 leaks to 22 with this patch. It also makes t3602 and t1090 leak-free. In the long run, we will also want to clean up this (undocumented!) memory lifetime requirement of add_pattern(). But that can come in a later patch; passing the string literals directly will be the right thing either way. [1] The code in question comes from 416adc8711 (sparse-checkout: update working directory in-process for 'init', 2019-11-21) and 99dfa6f970 (sparse-checkout: use in-process update for disable subcommand, 2019-11-21), but I didn't see anything in their commit messages or on the list explaining the strbufs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04	sparse-checkout: free string list in write_cone_to_file()	Jeff King	1	-0/+2
	We use a string list to hold sorted and de-duped patterns, but don't free it before leaving the function, causing a leak. This drops the number of leaks found in t7002 from 27 to 25. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30	Merge branch 'jc/fix-2.45.1-and-friends-for-maint'	Junio C Hamano	1	-11/+2
	Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent maintenance track. * jc/fix-2.45.1-and-friends-for-maint: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-30	Merge branch 'jc/undecided-is-not-necessarily-sha1-fix'	Junio C Hamano	3	-0/+26
	The base topic started to make it an error for a command to leave the hash algorithm unspecified, which revealed a few commands that were not ready for the change. Give users a knob to revert back to the "default is sha-1" behaviour as an escape hatch, and start fixing these breakages. * jc/undecided-is-not-necessarily-sha1-fix: apply: fix uninitialized hash function builtin/hash-object: fix uninitialized hash function builtin/patch-id: fix uninitialized hash function t1517: test commands that are designed to be run outside repository setup: add an escape hatch for "no more default hash algorithm" change
2024-05-30	Merge branch 'ps/refs-without-the-repository-updates'	Junio C Hamano	14	-25/+38
	Further clean-up the refs subsystem to stop relying on the_repository, and instead use the repository associated to the ref_store object. * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-30	Merge branch 'ps/undecided-is-not-necessarily-sha1'	Junio C Hamano	5	-7/+19
	Before discovering the repository details, We used to assume SHA-1 as the "default" hash function, which has been corrected. Hopefully this will smoke out codepaths that rely on such an unwarranted assumptions. * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes
2024-05-30	mv: replace src_dir with a strvec	Jeff King	1	-6/+4
	We manually manage the src_dir array with ALLOC_GROW. Using a strvec is a little more ergonomic, and makes the memory ownership more clear. It does mean that we copy the strings (which were otherwise just pointers into the "sources" strvec), but using the same rationale as 9fcd9e4e72 (builtin/mv duplicate string list memory, 2024-05-27), it's just not enough to be worth worrying about here. As a bonus, this gets rid of some "int"s used for allocation management (though in practice these were limited to command-line sizes and thus not overflowable). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30	mv: factor out empty src_dir removal	Jeff King	1	-19/+23
	This pulls the loop added by b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09) into a sub-function. That reduces clutter in cmd_mv() and makes it easier to see that the lifetime of the a_src_dir strbuf is limited to this code (and thus its cleanup doesn't need to go after the "out" label). Another option would be to just declare the strbuf inside the loop, since it is only used there. But this refactor retains the existing property that we can reuse the allocated buffer for each iteration of the loop. That optimization is probably overkill, but I think the sub-function is more readable anyway, and then keeping the optimization is basically free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30	mv: move src_dir cleanup to end of cmd_mv()	Jeff King	1	-1/+1
	Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09) added an auxiliary array where we store directory arguments that we see while processing the incoming arguments. After actually moving things, we then use that array to remove now-empty directories, and then immediately free the array. But if the actual move queues any errors in only_match_skip_worktree, that can cause us to jump straight to the "out" label to clean up, skipping the free() and leaking the array. Let's push the free() down past the "out" label so that we always clean up (the array is initialized to NULL, so this is always safe). We'll hold on to the memory a little longer than necessary, but clarity is more important than micro-optimizing here. Note that the adjacent "a_src_dir" strbuf does not suffer the same problem; it is only allocated during the removal step. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30	Merge branch 'ps/leakfixes' into jk/leakfixes	Junio C Hamano	12	-431/+563
	* ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name
2024-05-29	strbuf: introduce strbuf_addstrings() to repeatedly add a string	Ghanshyam Thakkar	1	-3/+1
	In a following commit we are going to port code from "t/helper/test-sha256.c", t/helper/test-hash.c and "t/t0015-hash.sh" to a new "t/unit-tests/t-hash.c" file using the recently added unit test framework. To port code like: perl -e "$\| = 1; print q{aaaaaaaaaa} for 1..100000;" we are going to need a new strbuf_addstrings() function that repeatedly adds the same string a number of times to a buffer. Such a strbuf_addstrings() function would already be useful in "json-writer.c" and "builtin/submodule-helper.c" as both of these files already have code that repeatedly adds the same string. So let's introduce such a strbuf_addstrings() function in "strbuf.{c,h}" and use it in both "json-writer.c" and "builtin/submodule-helper.c". We use the "strbuf_addstrings" name as this way strbuf_addstr() and strbuf_addstrings() would be similar for strings as strbuf_addch() and strbuf_addchars() for characters. Helped-by: Junio C Hamano <gitster@pobox.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Co-authored-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Achu Luma <ach.lumap@gmail.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-28	Merge branch 'jc/format-patch-more-aggressive-range-diff'	Junio C Hamano	1	-1/+1
	The default "creation-factor" used by "git format-patch" has been raised to make it more aggressively find matching commits. * jc/format-patch-more-aggressive-range-diff: format-patch: run range-diff with larger creation-factor
2024-05-28	Merge branch 'tb/pack-bitmap-write-cleanups'	Junio C Hamano	1	-6/+13
	The pack bitmap code saw some clean-up to prepare for a follow-up topic. * tb/pack-bitmap-write-cleanups: pack-bitmap: introduce `bitmap_writer_free()` pack-bitmap-write.c: avoid uninitialized 'write_as' field pack-bitmap: drop unused `max_bitmaps` parameter pack-bitmap: avoid use of static `bitmap_writer` pack-bitmap-write.c: move commit_positions into commit_pos fields object.h: add flags allocated by pack-bitmap.h
2024-05-28	Merge branch 'ps/builtin-config-cleanup'	Junio C Hamano	1	-429/+541
	Code clean-up to reduce inter-function communication inside builtin/config.c done via the use of global variables. * ps/builtin-config-cleanup: (21 commits) builtin/config: pass data between callbacks via local variables builtin/config: convert flags to a local variable builtin/config: track "fixed value" option via flags only builtin/config: convert `key` to a local variable builtin/config: convert `key_regexp` to a local variable builtin/config: convert `regexp` to a local variable builtin/config: convert `value_pattern` to a local variable builtin/config: convert `do_not_match` to a local variable builtin/config: move `respect_includes_opt` into location options builtin/config: move default value into display options builtin/config: move type options into display options builtin/config: move display options into local variables builtin/config: move location options into local variables builtin/config: refactor functions to have common exit paths config: make the config source const builtin/config: check for writeability after source is set up builtin/config: move actions into `cmd_config_actions()` builtin/config: move legacy options into `cmd_config()` builtin/config: move subcommand options into `cmd_config()` builtin/config: move legacy mode into its own function ...
2024-05-28	Merge branch 'ps/pseudo-ref-terminology'	Junio C Hamano	1	-1/+1
	Terminology to call various ref-like things are getting straightened out. * ps/pseudo-ref-terminology: refs: refuse to write pseudorefs ref-filter: properly distinuish pseudo and root refs refs: pseudorefs are no refs refs: classify HEAD as a root ref refs: do not check ref existence in `is_root_ref()` refs: rename `is_special_ref()` to `is_pseudo_ref()` refs: rename `is_pseudoref()` to `is_root_ref()` Documentation/glossary: define root refs as refs Documentation/glossary: clarify limitations of pseudorefs Documentation/glossary: redefine pseudorefs as special refs
2024-05-27	builtin/mv: fix leaks for submodule gitfile paths	Patrick Steinhardt	1	-19/+25
	Similar to the preceding commit, we have effectively given tracking memory ownership of submodule gitfile paths. Refactor the code to start tracking allocated strings in a separate `struct strvec` such that we can easily plug those leaks. Mark now-passing tests as leak free. Note that ideally, we wouldn't require two separate data structures to track those paths. But we do need to store `NULL` pointers for the gitfile paths such that we can indicate that its corresponding entries in the other arrays do not have such a path at all. And given that `struct strvec`s cannot store `NULL` pointers we cannot use them to store this information. There is another small gotcha that is easy to miss: you may be wondering why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This is because this is a mere sentinel value and not actually a string at all. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/mv: refactor to use `struct strvec`	Patrick Steinhardt	1	-65/+60
	Memory allocation patterns in git-mv(1) are extremely hard to follow: We copy around string pointers into manually-managed arrays, some of which alias each other, but only sometimes, while we also drop some of those strings at other times without ever daring to free them. While this may be my own subjective feeling, it seems like others have given up as the code has multiple calls to `UNLEAK()`. These are not sufficient though, and git-mv(1) is still leaking all over the place even with them. Refactor the code to instead track strings in `struct strvec`. While this has the effect of effectively duplicating some of the strings without an actual need, it is way easier to reason about and fixes all of the aliasing of memory that has been going on. It allows us to get rid of the `UNLEAK()` calls and also fixes leaks that those calls did not paper over. Mark tests which are now leak-free accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/mv duplicate string list memory	Patrick Steinhardt	1	-6/+13
	makes the next patch easier, where we will migrate to the paths being owned by a strvec. given that we are talking about command line parameters here it's also not like we have tons of allocations that this would save while at it, fix a memory leak Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/mv: refactor `add_slash()` to always return allocated strings	Patrick Steinhardt	1	-18/+20
	The `add_slash()` function will only conditionally return an allocated string when the passed-in string did not yet have a trailing slash. This makes the memory ownership harder to track than really necessary. It's dubious whether this optimization really buys us all that much. The number of times we execute this function is bounded by the number of arguments to git-mv(1), so in the typical case we may end up saving an allocation or two. Simplify the code to unconditionally return allocated strings. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/credential: clear credential before exit	Patrick Steinhardt	1	-0/+2
	We never release memory associated with `struct credential`. Fix this and mark the corresponding test as leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	config: clarify memory ownership in `git_config_string()`	Patrick Steinhardt	6	-15/+15
	The out parameter of `git_config_string()` is a `const char ` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char `. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/log: stop using globals for format config	Patrick Steinhardt	1	-202/+265
	This commit does the exact same as the preceding commit, only for the format configuration instead of the log configuration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	builtin/log: stop using globals for log config	Patrick Steinhardt	1	-103/+156
	We're using global variables to store the log configuration. Many of these can be set both via the command line and via the config, and depending on how they are being set, they may contain allocated strings. This leads to hard-to-track memory ownership and memory leaks. Refactor the code to instead use a `struct log_config` that is being allocated on the stack. This allows us to more clearly scope the variables, track memory ownership and ultimately release the memory. This also prepares us for a change to `git_config_string()`, which will be adapted to have a `char ` out parameter instead of `const char `. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	config: clarify memory ownership in `git_config_pathname()`	Patrick Steinhardt	5	-6/+6
	The out parameter of `git_config_pathname()` is a `const char ` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char `. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	checkout: clarify memory ownership in `unique_tracking_name()`	Patrick Steinhardt	2	-15/+19
	The function `unique_tracking_name()` returns an allocated string, but does not clearly indicate this because its return type is `const char ` instead of `char `. This has led to various callsites where we never free its returned memory at all, which causes memory leaks. Plug those leaks and mark now-passing tests as leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27	difftool: add env vars directly in run_file_diff()	René Scharfe	1	-8/+4
	Add the environment variables of the child process directly using strvec_push() instead of building an array out of them and then adding that using strvec_pushv(). The new code is shorter and avoids magic array index values and fragile array padding. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24	Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint	Junio C Hamano	1	-11/+2
	* fixes/2.45.1/2.44: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-24	Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44	Junio C Hamano	1	-11/+2
	* fixes/2.45.1/2.43: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-24	Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43	Junio C Hamano	1	-11/+2
	* fixes/2.45.1/2.42: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-24	Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42	Junio C Hamano	1	-11/+2
	* fixes/2.45.1/2.41: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-24	Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41	Junio C Hamano	1	-11/+2
	* fixes/2.45.1/2.40: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-24	Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40	Junio C Hamano	1	-11/+2
	Revert overly aggressive "layered defence" that went into 2.45.1 and friends, which broke "git-lfs", "git-annex", and other use cases, so that we can rebuild necessary counterparts in the open. * jc/fix-2.45.1-and-friends-for-2.39: Revert "fsck: warn about symlink pointing inside a gitdir" Revert "Add a helper function to compare file contents" clone: drop the protections where hooks aren't run tests: verify that `clone -c core.hooksPath=/dev/null` works again Revert "core.hooksPath: add some protection while cloning" init: use the correct path of the templates directory again hook: plug a new memory leak ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable send-email: avoid creating more than one Term::ReadLine object send-email: drop FakeTerm hack
2024-05-23	Merge branch 'la/hide-trailer-info'	Junio C Hamano	1	-6/+6
	The trailer API has been reshuffled a bit. * la/hide-trailer-info: trailer unit tests: inspect iterator contents trailer: document parse_trailers() usage trailer: retire trailer_info_get() from API trailer: make trailer_info struct private trailer: make parse_trailers() return trailer_info pointer interpret-trailers: access trailer_info with new helpers sequencer: use the trailer iterator trailer: teach iterator about non-trailer lines trailer: add unit tests for trailer iterator Makefile: sort UNIT_TEST_PROGRAMS
2024-05-23	Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration	Junio C Hamano	1	-1/+1
	* ps/pseudo-ref-terminology: refs: refuse to write pseudorefs ref-filter: properly distinuish pseudo and root refs refs: pseudorefs are no refs refs: classify HEAD as a root ref refs: do not check ref existence in `is_root_ref()` refs: rename `is_special_ref()` to `is_pseudo_ref()` refs: rename `is_pseudoref()` to `is_root_ref()` Documentation/glossary: define root refs as refs Documentation/glossary: clarify limitations of pseudorefs Documentation/glossary: redefine pseudorefs as special refs
2024-05-23	Merge branch 'ps/refs-without-the-repository-updates' into ↵	Junio C Hamano	14	-25/+38
	ps/ref-storage-migration * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-21	clone: drop the protections where hooks aren't run	Johannes Schindelin	1	-11/+1
	As part of the security bug-fix releases v2.39.4, ..., v2.45.1, I introduced logic to safeguard `git clone` from running hooks that were installed _during_ the clone operation. The rationale was that Git's CVE-2024-32002, CVE-2021-21300, CVE-2019-1354, CVE-2019-1353, CVE-2019-1352, and CVE-2019-1349 should have been low-severity vulnerabilities but were elevated to critical/high severity by the attack vector that allows a weakness where files inside `.git/` can be inadvertently written during a `git clone` to escalate to a Remote Code Execution attack by virtue of installing a malicious `post-checkout` hook that Git will then run at the end of the operation without giving the user a chance to see what code is executed. Unfortunately, Git LFS uses a similar strategy to install its own `post-checkout` hook during a `git clone`; In fact, Git LFS is installing four separate hooks while running the `smudge` filter. While this pattern is probably in want of being improved by introducing better support in Git for Git LFS and other tools wishing to register hooks to be run at various stages of Git's commands, let's undo the clone protections to unbreak Git LFS-enabled clones. This reverts commit 8db1e8743c0 (clone: prevent hooks from running during a clone, 2024-03-28). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21	apply: fix uninitialized hash function	Junio C Hamano	1	-0/+10
	"git apply" can work outside a repository as a better "GNU patch", but when it does so, it still assumed that it can access the_hash_algo, which is no longer true in the new world order. Make sure we explicitly fall back to SHA-1 algorithm for backward compatibility. It is of dubious value to make this configurable to other hash algorithms, as the code does not use the_hash_algo for hashing purposes when working outside a repository (which is how the_hash_algo is left to NULL)---it is only used to learn the max length of the hash when parsing the object names on the "index" line, but failing to parse the "index" line is not a hard failure, and the program does not support operations like applying binary patches and --3way fallback that requires object access outside a repository. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21	builtin/hash-object: fix uninitialized hash function	Patrick Steinhardt	1	-0/+3
	The git-hash-object(1) command allows users to hash an object even without a repository. Starting with c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07), this will make us hit an uninitialized hash function, which subsequently leads to a segfault. Fix this by falling back to SHA-1 explicitly when running outside of a Git repository. Users can use GIT_DEFAULT_HASH environment to specify what hash algorithm they want, so arguably this code should not be needed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21	builtin/patch-id: fix uninitialized hash function	Patrick Steinhardt	1	-0/+13
	In c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07), we have adapted `initialize_repository()` to no longer set up a default hash function. As this function is also used to set up `the_repository`, the consequence is that `the_hash_algo` will now by default be a `NULL` pointer unless the hash algorithm was configured properly. This is done as a mechanism to detect cases where we may be using the wrong hash function by accident. This change now causes git-patch-id(1) to segfault when it's run outside of a repository. As this command can read diffs from stdin, it does not necessarily need a repository, but then relies on `the_hash_algo` to compute the patch ID itself. It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in the first place. Quoting its manpage: A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a patch, with line numbers ignored. As such, it’s "reasonably stable", but at the same time also reasonably unique, i.e., two patches that have the same "patch ID" are almost guaranteed to be the same thing. We explicitly document patch IDs to be using SHA-1. Furthermore, patch IDs are supposed to be stable for most of the part. But even with the same input, the patch IDs will now be different depending on the repo's configured object hash. Work around the issue by setting up SHA-1 when there was no startup repository for now. This is arguably not the correct fix, but for now we rather want to focus on getting the segfault fixed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-20	Merge branch 'kn/ref-transaction-symref'	Junio C Hamano	13	-16/+20
	Updates to symbolic refs can now be made as a part of ref transaction. * kn/ref-transaction-symref: refs: remove `create_symref` and associated dead code refs: rename `refs_create_symref()` to `refs_update_symref()` refs: use transaction in `refs_create_symref()` refs: add support for transactional symref updates refs: move `original_update_refname` to 'refs.c' refs: support symrefs in 'reference-transaction' hook files-backend: extract out `create_symref_lock()` refs: accept symref values in `ref_transaction_update()`
2024-05-17	refs: remove `dwim_log()`	Patrick Steinhardt	1	-1/+1
	Remove `dwim_log()` in favor of `repo_dwim_log()` so that we can get rid of one more dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: drop `git_default_branch_name()`	Patrick Steinhardt	2	-2/+5
	The `git_default_branch_name()` function is a thin wrapper around `repo_default_branch_name()` with two differences: - We implicitly rely on `the_repository`. - We cache the default branch name. None of the callsites of `git_default_branch_name()` are hot code paths though, so the caching of the branch name is not really required. Refactor the callsites to use `repo_default_branch_name()` instead and drop `git_default_branch_name()`, thus getting rid of one more case where we rely on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: pass repo when peeling objects	Patrick Steinhardt	5	-7/+7
	Both `peel_object()` and `peel_iterated_oid()` implicitly rely on `the_repository` to look up objects. Despite the fact that we want to get rid of `the_repository`, it also leads to some restrictions in our ref iterators when trying to retrieve the peeled value for a repository other than `the_repository`. Refactor these functions such that both take a repository as argument and remove the now-unnecessary restrictions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: pass ref store when detecting dangling symrefs	Patrick Steinhardt	2	-2/+4
	Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the ref store via `the_repository`. Adapt them to instead take in the ref store as a parameter. While at it, rename the functions to have a `ref_` prefix to align them with other functions that take a ref store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: convert iteration over replace refs to accept ref store	Patrick Steinhardt	1	-5/+8
	The function `for_each_replace_ref()` is a bit of an oddball across the refs interfaces as it accepts a pointer to the repository instead of a pointer to the ref store. The only reason for us to accept a repository is so that we can eventually pass it back to the callback function that the caller has provided. This is somewhat arbitrary though, as callers that need the repository can instead make it accessible via the callback payload. Refactor the function to instead accept the ref store and adjust callers accordingly. This allows us to get rid of some of the boilerplate that we had to carry to pass along the repository and brings us in line with the other functions that iterate through refs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: refactor `resolve_gitlink_ref()` to accept a repository	Patrick Steinhardt	2	-5/+8
	In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to look up the submodule ref store. Now that we can look up submodule ref stores for arbitrary repositories we can improve this function to instead accept a repository as parameter for which we want to resolve the gitlink. Do so and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: pass repo when retrieving submodule ref store	Patrick Steinhardt	1	-2/+4
	Looking up submodule ref stores has two deficiencies: - The initialized subrepo will be attributed to `the_repository`. - The submodule ref store will be tracked in a global map. This makes it impossible to have submodule ref stores for a repository other than `the_repository`. Modify the function to accept the parent repository as parameter and move the global map into `struct repository`. Like this it becomes possible to look up submodule ref stores for arbitrary repositories. Note that this also adds a new reference to `the_repository` in `resolve_gitlink_ref()`, which is part of the refs interfaces. This will get adjusted in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17	refs: rename `init_db` callback to avoid confusion	Patrick Steinhardt	1	-1/+1
	Reference backends have two callbacks `init` and `init_db`. The similarity of these two callbacks has repeatedly confused me whenever I was looking at them, where I always had to look up which of them does what. Rename the `init_db` callback to `create_on_disk`, which should hopefully be clearer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16	Merge branch 'ps/refs-without-the-repository'	Junio C Hamano	34	-239/+384
	The refs API lost functions that implicitly assumes to work on the primary ref_store by forcing the callers to pass a ref_store as an argument. * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
2024-05-16	Merge branch 'ps/refs-without-the-repository' into ↵	Junio C Hamano	34	-239/+384
	ps/refs-without-the-repository-updates * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
2024-05-15	Merge branch 'jp/tag-trailer'	Junio C Hamano	2	-26/+32
	"git tag" learned the "--trailer" option to futz with the trailers in the same way as "git commit" does. * jp/tag-trailer: builtin/tag: add --trailer option builtin/commit: refactor --trailer logic builtin/commit: use ARGV macro to collect trailers
2024-05-15	Merge branch 'ps/config-subcommands'	Junio C Hamano	1	-98/+414
	The operation mode options (like "--get") the "git config" command uses have been deprecated and replaced with subcommands (like "git config get"). * ps/config-subcommands: builtin/config: display subcommand help builtin/config: introduce "edit" subcommand builtin/config: introduce "remove-section" subcommand builtin/config: introduce "rename-section" subcommand builtin/config: introduce "unset" subcommand builtin/config: introduce "set" subcommand builtin/config: introduce "get" subcommand builtin/config: introduce "list" subcommand builtin/config: pull out function to handle `--null` builtin/config: pull out function to handle config location builtin/config: use `OPT_CMDMODE()` to specify modes builtin/config: move "fixed-value" option to correct group builtin/config: move option array around config: clarify memory ownership when preparing comment strings
2024-05-15	ref-filter: properly distinuish pseudo and root refs	Patrick Steinhardt	1	-1/+1
	The ref-filter interfaces currently define root refs as either a detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so let's properly distinguish those ref types. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: pass data between callbacks via local variables	Patrick Steinhardt	1	-38/+52
	We use several global variables to pass data between callers and callbacks in `get_color()` and `get_colorbool()`. Convert those to use callback data structures instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert flags to a local variable	Patrick Steinhardt	1	-19/+29
	Both the `do_all` and `use_key_regexp` bits essentially act like flags to `get_value()`. Let's convert them to actual flags so that we can get rid of the last two remaining global variables that track options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: track "fixed value" option via flags only	Patrick Steinhardt	1	-7/+7
	We track the "fixed value" option via two separate bits: once via the global variable `fixed_value`, and once via the CONFIG_FLAGS_FIXED_VALUE bit in `flags`. This is confusing and may easily lead to issues when one is not aware that this is tracked via two separate mechanisms. Refactor the code to use the flag exclusively. We already pass it to all the required callsites anyway, except for `collect_config()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert `key` to a local variable	Patrick Steinhardt	1	-2/+5
	The `key` variable is used by the `get_value()` function for two purposes: - It is used to store the result of `git_config_parse_key()`, which is then passed on to `collect_config()`. - It is used as a store to convert the provided key to an all-lowercase key when `use_key_regexp` is set. Neither of these cases warrant a global variable at all. In the former case we can pass the key via `struct collect_config_data`. And in the latter case we really only want to have it as a temporary local variable such that we can free associated memory. Refactor the code accordingly to reduce our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert `key_regexp` to a local variable	Patrick Steinhardt	1	-8/+8
	The `key_regexp` variable is used by the `format_config()` callback when `use_key_regexp` is set. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert `regexp` to a local variable	Patrick Steinhardt	1	-9/+9
	The `regexp` variable is used by the `format_config()` callback when `CONFIG_FLAGS_FIXED_VALUE` is not set. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert `value_pattern` to a local variable	Patrick Steinhardt	1	-3/+3
	The `value_pattern` variable is used by the `format_config()` callback when `CONFIG_FLAGS_FIXED_VALUE` is used. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: convert `do_not_match` to a local variable	Patrick Steinhardt	1	-3/+3
	The `do_not_match` variable is used by the `format_config()` callback as an indicator whether or not the passed regular expression is negated. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move `respect_includes_opt` into location options	Patrick Steinhardt	1	-7/+12
	The variable tracking whether or not we want to honor includes is tracked via a global variable. Move it into the location options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move default value into display options	Patrick Steinhardt	1	-8/+11
	The default value is tracked via a global variable. Move it into the display options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move type options into display options	Patrick Steinhardt	1	-31/+29
	The type options are tracked via a global variable. Move it into the display options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move display options into local variables	Patrick Steinhardt	1	-70/+101
	The display options are tracked via a set of global variables. Move them into a self-contained structure so that we can easily parse all relevant options and hand them over to the various functions that require them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move location options into local variables	Patrick Steinhardt	1	-137/+176
	The location options are tracked via a set of global variables. Move them into a self-contained structure so that we can easily parse all relevant options and hand them over to the various functions that require them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: refactor functions to have common exit paths	Patrick Steinhardt	1	-26/+38
	Refactor functions to have a single exit path. This will make it easier in subsequent commits to add common cleanup code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: check for writeability after source is set up	Patrick Steinhardt	1	-5/+5
	The `check_write()` function verifies that we do not try to write to a config source that cannot be written to, like for example stdin. But while the new subcommands do call this function, they do so before calling `handle_config_location()`. Consequently, we only end up checking the default config location for writeability, not the location that was actually specified by the caller of git-config(1). Fix this by calling `check_write()` after `handle_config_location()`. We will further clarify the relationship between those two functions in a subsequent commit where we remove the global state that both implicitly rely on. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move actions into `cmd_config_actions()`	Patrick Steinhardt	1	-25/+23
	We only use actions in the legacy mode. Convert them to an enum and move them into `cmd_config_actions()` to clearly demonstrate that they are not used anywhere else. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move legacy options into `cmd_config()`	Patrick Steinhardt	1	-30/+30
	Move the legacy options as well some of the variables it references into `cmd_config_action()`. This reduces our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move subcommand options into `cmd_config()`	Patrick Steinhardt	1	-14/+14
	Move the subcommand options as well as the `subcommand` variable into `cmd_config()`. This reduces our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: move legacy mode into its own function	Patrick Steinhardt	1	-19/+24
	In `cmd_config()` we first try to parse the provided arguments as subcommands and, if this is successful, call the respective functions of that subcommand. Otherwise we continue with the "legacy" mode that uses implicit actions and/or flags. Disentangle this by moving the legacy mode into its own function. This allows us to move the options into the respective functions and clearly separates concerns. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	builtin/config: stop printing full usage on misuse	Patrick Steinhardt	1	-17/+11
	When invoking git-config(1) with a wrong set of arguments we end up calling `usage_builtin_config()` after printing an error message that says what was wrong. As that function ends up printing the full list of options, which is quite long, the actual error message will be buried by a wall of text. This makes it really hard to figure out what exactly caused the error. Furthermore, now that we have recently introduced subcommands, the usage information may actually be misleading as we unconditionally print options of the subcommand-less mode. Fix both of these issues by just not printing the options at all anymore. Instead, we call `usage()` that makes us report in a single line what has gone wrong. This should be way more discoverable for our users and addresses the inconsistency. Furthermore, this change allow us to inline the options into the respective functions that use them to parse the command line. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: introduce `bitmap_writer_free()`	Taylor Blau	1	-1/+2
	Now that there is clearer memory ownership around the bitmap_writer structure, introduce a bitmap_writer_free() function that callers may use to free any memory associated with their instance of the bitmap_writer structure. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: drop unused `max_bitmaps` parameter	Taylor Blau	1	-2/+1
	The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was introduced back in 7cc8f97108 (pack-objects: implement bitmap writing, 2013-12-21), making it original to the bitmap implementation in Git itself. When that patch was merged via 0f9e62e084 (Merge branch 'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c passed a value of "-1" for `max_bitmaps`, indicating no limit. Since then, the only other caller (in midx.c, added via c528e17966 (pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value of "-1" for `max_bitmaps`. Since no callers have needed a finite limit for the `max_bitmaps` parameter in the nearly decade that has passed since 0f9e62e084, let's remove the parameter and any dead pieces of code connected to it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: avoid use of static `bitmap_writer`	Taylor Blau	1	-6/+13
	The pack-bitmap machinery uses a structure called 'bitmap_writer' to collect the data necessary to write out .bitmap files. Since its introduction in 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), there has been a single static bitmap_writer structure, which is responsible for all bitmap writing-related operations. In practice, this is OK, since we are only ever writing a single .bitmap file in a single process (e.g., `git multi-pack-index write --bitmap`, `git pack-objects --write-bitmap-index`, `git repack -b`, etc.). However, having a single static variable makes issues like data ownership unclear, when to free variables, what has/hasn't been initialized unclear. Refactor this code to be written in terms of a given bitmap_writer structure instead of relying on a static global. Note that this exposes the structure definition of the bitmap_writer at the pack-bitmap.h level. We could work around this by, e.g., forcing callers to declare their writers as: struct bitmap_writer writer; bitmap_writer_init(&bitmap_writer); and then declaring `bitmap_writer_init()` as taking in a double-pointer like so: void bitmap_writer_init(struct bitmap_writer *writer); which would avoid us having to expose the definition of the structure itself. This patch takes a different approach, since future patches (like for the ongoing pseudo-merge bitmaps work) will want to modify the innards of this structure (in the previous example, via pseudo-merge.c). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	Sync with Git 2.45.1	Junio C Hamano	3	-8/+131
	* tag 'v2.45.1': (42 commits) Git 2.45.1 Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks ...
2024-05-13	Merge branch 'ps/undecided-is-not-necessarily-sha1' into ↵	Junio C Hamano	5	-7/+19
	jc/undecided-is-not-necessarily-sha1-fix * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes
2024-05-10	Merge branch 'ps/config-subcommands' into ps/builtin-config-cleanup	Junio C Hamano	1	-98/+414
	* ps/config-subcommands: builtin/config: display subcommand help builtin/config: introduce "edit" subcommand builtin/config: introduce "remove-section" subcommand builtin/config: introduce "rename-section" subcommand builtin/config: introduce "unset" subcommand builtin/config: introduce "set" subcommand builtin/config: introduce "get" subcommand builtin/config: introduce "list" subcommand builtin/config: pull out function to handle `--null` builtin/config: pull out function to handle config location builtin/config: use `OPT_CMDMODE()` to specify modes builtin/config: move "fixed-value" option to correct group builtin/config: move option array around config: clarify memory ownership when preparing comment strings
2024-05-08	Merge branch 'ps/the-index-is-no-more'	Junio C Hamano	29	-375/+359
	The singleton index_state instance "the_index" has been eliminated by always instantiating "the_repository" and replacing references to "the_index" with references to its .index member. * ps/the-index-is-no-more: repository: drop `initialize_the_repository()` repository: drop `the_index` variable builtin/clone: stop using `the_index` repository: initialize index in `repo_init()` builtin: stop using `the_index` t/helper: stop using `the_index`
2024-05-08	Merge branch 'bc/credential-scheme-enhancement'	Junio C Hamano	4	-8/+41
	The credential helper protocol, together with the HTTP layer, have been enhanced to support authentication schemes different from username & password pair, like Bearer and NTLM. * bc/credential-scheme-enhancement: credential: add method for querying capabilities credential-cache: implement authtype capability t: add credential tests for authtype credential: add support for multistage credential rounds t5563: refactor for multi-stage authentication docs: set a limit on credential line length credential: enable state capability credential: add an argument to keep state http: add support for authtype and credential docs: indicate new credential protocol fields credential: add a field called "ephemeral" credential: gate new fields on capability credential: add a field for pre-encoded credentials http: use new headers for each object request remote-curl: reset headers on new request credential: add an authtype field
2024-05-07	cocci: apply rules to rewrite callers of "refs" interfaces	Patrick Steinhardt	34	-239/+384
	Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: add `exclude_patterns` parameter to `for_each_fullref_in()`	Patrick Steinhardt	2	-4/+4
	The `for_each_fullref_in()` function is supposedly the ref-store-less equivalent of `refs_for_each_fullref_in()`, but the latter has gained a new parameter `exclude_patterns` over time. Bring these two functions back in sync again by adding the parameter to the former function, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/tag: add --trailer option	John Passaro	1	-9/+29
	git-tag supports interpreting trailers from an annotated tag message, using --list --format="%(trailers)". However, the available methods to add a trailer to a tag message (namely -F or --editor) are not as ergonomic. In a previous patch, we moved git-commit's implementation of its --trailer option to the trailer.h API. Let's use that new function to teach git-tag the same --trailer option, emulating as much of git-commit's behavior as much as possible. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/commit: refactor --trailer logic	John Passaro	1	-8/+2
	git-commit adds user trailers to the commit message by passing its `--trailer` arguments to a child process running `git-interpret-trailers --in-place`. This logic is broadly useful, not just for git-commit but for other commands constructing message bodies (e.g. git-tag). Let's move this logic from git-commit to a new function in the trailer API, so that it can be re-used in other commands. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/commit: use ARGV macro to collect trailers	John Passaro	1	-9/+1
	Replace git-commit's callback for --trailer with the standard OPT_PASSTHRU_ARGV macro. The callback only adds its values to a strvec and sanity-checks that `unset` is always false; both of these are already implemented in the parse-option API. Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: rename `refs_create_symref()` to `refs_update_symref()`	Karthik Nayak	2	-2/+2
	The `refs_create_symref()` function is used to update/create a symref. But it doesn't check the old target of the symref, if existing. It force updates the symref. In this regard, the name `refs_create_symref()` is a bit misleading. So let's rename it to `refs_update_symref()`. This is akin to how 'git-update-ref(1)' also allows us to create apart from update. While we're here, rename the arguments in the function to clarify what they actually signify and reduce confusion. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: accept symref values in `ref_transaction_update()`	Karthik Nayak	6	-4/+8
	The function `ref_transaction_update()` obtains ref information and flags to create a `ref_update` and add them to the transaction at hand. To extend symref support in transactions, we need to also accept the old and new ref targets and process it. This commit adds the required parameters to the function and modifies all call sites. The two parameters added are `new_target` and `old_target`. The `new_target` is used to denote what the reference should point to when the transaction is applied. Some functions allow this parameter to be NULL, meaning that the reference is not changed. The `old_target` denotes the value the reference must have before the update. Some functions allow this parameter to be NULL, meaning that the old value of the reference is not checked. We also update the internal function `ref_transaction_add_update()` similarly to take the two new parameters. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/shortlog: don't set up revisions without repo	Patrick Steinhardt	1	-1/+1
	It is possible to run git-shortlog(1) outside of a repository by passing it output from git-log(1) via standard input. Obviously, as there is no repository in that context, it is thus unsupported to pass any revisions as arguments. Regardless of that we still end up calling `setup_revisions()`. While that works alright, it is somewhat strange. Furthermore, this is about to cause problems when we unset the default object hash. Refactor the code to only call `setup_revisions()` when we have a repository. This is safe to do as we already verify that there are no arguments when running outside of a repository anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/diff: explicitly set hash algo when there is no repo	Patrick Steinhardt	1	-0/+9
	The git-diff(1) command can be used outside repositories to diff two files with each other. But even if there is no repository we will end up hashing the files that we are diffing so that we can print the "index" line: ``` diff --git a/a b/b index 7898192..6178079 100644 --- a/a +++ b/b @@ -1 +1 @@ -a +b ``` We implicitly use SHA1 to calculate the hash here, which is because `the_repository` gets initialized with SHA1 during the startup routine. We are about to stop doing this though such that `the_repository` only ever has a hash function when it was properly initialized via a repo's configuration. To give full control to our users, we would ideally add a new switch to git-diff(1) that allows them to specify the hash function when executed outside of a repository. But for now, we only convert the code to make this explicit such that we can stop setting the default hash algorithm for `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/bundle: abort "verify" early when there is no repository	Patrick Steinhardt	1	-0/+5
	Verifying a bundle requires us to have a repository. This is encoded in `verify_bundle()`, which will return an error if there is no repository. We call `open_bundle()` before we call `verify_bundle()` though, which already performs some verifications even though we may ultimately abort due to a missing repository. This is problematic because `open_bundle()` already reads the bundle header and verifies that it contains a properly formatted hash. When there is no repository we have no clue what hash function to expect though, so we always end up assuming SHA1 here, which may or may not be correct. Furthermore, we are about to stop initializing `the_hash_algo` when there is no repository, which will lead to segfaults. Check early on whether we have a repository to fix this issue. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/blame: don't access potentially unitialized `the_hash_algo`	Patrick Steinhardt	1	-3/+2
	We access `the_hash_algo` in git-blame(1) before we have executed `parse_options_start()`, which may not be properly set up in case we have no repository. This is fine for most of the part because all the call paths that lead to it (git-blame(1), git-annotate(1) as well as git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository. There is one exception though, namely when passing `-h` to print the help. Here we will access `the_hash_algo` even if there is no repo. This works fine right now because `the_hash_algo` gets sets up to point to the SHA1 algorithm via `initialize_repository()`. But we're about to stop doing this, and thus the code would lead to a `NULL` pointer exception. Prepare the code for this and only access `the_hash_algo` after we are sure that there is a proper repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/rev-parse: allow shortening to more than 40 hex characters	Patrick Steinhardt	1	-3/+2
	The `--short=` option for git-rev-parse(1) allows the user to specify to how many characters object IDs should be shortened to. The option is broken though for SHA256 repositories because we set the maximum allowed hash size to `the_hash_algo->hexsz` before we have even set up the repo. Consequently, `the_hash_algo` will always be SHA1 and thus we truncate every hash after at most 40 characters. Fix this by accessing `the_hash_algo` only after we have set up the repo. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1	Junio C Hamano	29	-375/+359
	* ps/the-index-is-no-more: repository: drop `initialize_the_repository()` repository: drop `the_index` variable builtin/clone: stop using `the_index` repository: initialize index in `repo_init()` builtin: stop using `the_index` t/helper: stop using `the_index`
2024-05-06	format-patch: run range-diff with larger creation-factor	Junio C Hamano	1	-1/+1
	We see too often that a range-diff added to format-patch output shows too many "unmatched" patches. This is because the default value for creation-factor is set to a relatively low value. It may be justified for other uses (like you have a yet-to-be-sent new iteration of your series, and compare it against the 'seen' branch that has an older iteration, probably with the '--left-only' option, to pick out only your patches while ignoring the others) of "range-diff" command, but when the command is run as part of the format-patch, the user _knows_ and expects that the patches in the old and the new iterations roughly correspond to each other, so we can and should use a much higher default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: display subcommand help	Patrick Steinhardt	1	-2/+3
	Until now, `git config -h` would have printed help for the old-style syntax. Now that all modes have proper subcommands though it is preferable to instead display the subcommand help. Drop the `NO_INTERNAL_HELP` flag to do so. While at it, drop the help mismatch in t0450 and add the `--get-colorbool` option to the usage such that git-config(1)'s synopsis and `git config -h` match. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "edit" subcommand	Patrick Steinhardt	1	-26/+55
	Introduce a new "edit" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "remove-section" subcommand	Patrick Steinhardt	1	-0/+32
	Introduce a new "remove-section" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "rename-section" subcommand	Patrick Steinhardt	1	-0/+32
	Introduce a new "rename-section" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "unset" subcommand	Patrick Steinhardt	1	-0/+39
	Introduce a new "unset" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "set" subcommand	Patrick Steinhardt	1	-0/+63
	Introduce a new "set" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "get" subcommand	Patrick Steinhardt	1	-9/+60
	Introduce a new "get" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "list" subcommand	Patrick Steinhardt	1	-12/+78
	While git-config(1) has several modes, those modes are not exposed with subcommands but instead by specifying action flags like `--unset` or `--list`. This user interface is not really in line with how our more modern commands work, where it is a lot more customary to say e.g. `git remote list`. Furthermore, to add to the confusion, git-config(1) also allows the user to request modes implicitly by just specifying the correct number of arguments. Thus, `git config foo.bar` will retrieve the value of "foo.bar" while `git config foo.bar baz` will set it to "baz". Overall, this makes for a confusing interface that could really use a makeover. It hurts discoverability of what you can do with git-config(1) and is comparatively easy to get wrong. Converting the command to have subcommands instead would go a long way to help address these issues. One concern in this context is backwards compatibility. Luckily, we can introduce subcommands without breaking backwards compatibility at all. This is because all the implicit modes of git-config(1) require that the first argument is a properly formatted config key. And as config keys _must_ have a dot in their name, any value without a dot would have been discarded by git-config(1) previous to this change. Thus, given that none of the subcommands do have a dot, they are unambiguous. Introduce the first such new subcommand, which is "git config list". To retain backwards compatibility we only conditionally use subcommands and will fall back to the old syntax in case no subcommand was detected. This should help to transition to the new-style syntax until we eventually deprecate and remove the old-style syntax. Note that the way we handle this we're duplicating some functionality across old and new syntax. While this isn't pretty, it helps us to ensure that there really is no change in behaviour for the old syntax. Amend tests such that we run them both with old and new style syntax. As tests are now run twice, state from the first run may be still be around in the second run and thus cause tests to fail. Add cleanup logic as required to fix such tests. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: pull out function to handle `--null`	Patrick Steinhardt	1	-6/+9
	Pull out function to handle the `--null` option, which we are about to reuse in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: pull out function to handle config location	Patrick Steinhardt	1	-65/+68
	There's quite a bunch of options to git-config(1) that allow the user to specify which config location to use when reading or writing config options. The logic to handle this is thus by necessity also quite involved. Pull it out into a separate function so that we can reuse it in subsequent commits which introduce proper subcommands. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: use `OPT_CMDMODE()` to specify modes	Patrick Steinhardt	1	-18/+14
	The git-config(1) command has various different modes which are accessible via e.g. `--get-urlmatch` or `--unset-all`. These modes are declared with `OPT_BIT()`, which causes two minor issues: - The respective modes also have a negated form `--no-get-urlmatch`, which is unintended. - We have to manually handle exclusiveness of the modes. Switch these options to instead use `OPT_CMDMODE()`, which is made exactly for this usecase. Remove the now-unneeded check that only a single mode is given, which is now handled by the parse-options interface. While at it, format optional placeholders for arguments to conform to our style guidelines by using `[<placeholder>]`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: move "fixed-value" option to correct group	Patrick Steinhardt	1	-1/+1
	The `--fixed-value` option can be used to alter how the value-pattern parameter is interpreted for the various actions of git-config(1). But while it is an option, it is currently listed as part of the actions group, which is wrong. Move the option to the "Other" group, which hosts the various options known to git-config(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: move option array around	Patrick Steinhardt	1	-48/+48
	Move around the option array. This will help us with a follow-up commit that introduces subcommands to git-config(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	config: clarify memory ownership when preparing comment strings	Patrick Steinhardt	1	-5/+6
	The ownership of memory returned when preparing a comment string is quite intricate: when the returned value is different than the passed value, then the caller is responsible to free the memory. This is quite subtle, and it's even easier to miss because the returned value is in fact a `const char *`. Adapt the function to always return either `NULL` or a newly allocated string. The function is called at most once per git-config(1), so it's not like this micro-optimization really matters. Thus, callers are now always responsible for freeing the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02	trailer: make parse_trailers() return trailer_info pointer	Linus Arver	1	-2/+2
	This is the second and final preparatory commit for making the trailer_info struct private to the trailer implementation. Make trailer_info_get() do the actual work of allocating a new trailer_info struct, and return a pointer to it. Because parse_trailers() wraps around trailer_info_get(), it too can return this pointer to the caller. From the trailer API user's perspective, the call to trailer_info_new() can be replaced with parse_trailers(); do so in interpret-trailers. Because trailer_info_new() is no longer called by interpret-trailers, remove this function from the trailer API. With this change, we no longer allocate trailer_info on the stack --- all uses of it are via a pointer where the actual data is always allocated at runtime through trailer_info_new(). Make trailer_info_release() free this dynamically allocated memory. Finally, due to the way the function signatures of parse_trailers() and trailer_info_get() have changed, update the callsites in format_trailers_from_commit() and trailer_iterator_init() accordingly. Helped-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Linus Arver <linus@ucla.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-02	interpret-trailers: access trailer_info with new helpers	Linus Arver	1	-6/+6
	Instead of directly accessing trailer_info members, access them indirectly through new helper functions exposed by the trailer API. This is the first of two preparatory commits which will allow us to use the so-called "pimpl" (pointer to implementation) idiom for the trailer API, by making the trailer_info struct private to the trailer implementation (and thus hidden from the API). Helped-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Linus Arver <linus@ucla.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-30	Merge branch 'js/for-each-repo-keep-going'	Junio C Hamano	2	-5/+15
	A scheduled "git maintenance" job is expected to work on all repositories it knows about, but it stopped at the first one that errored out. Now it keeps going. * js/for-each-repo-keep-going: maintenance: running maintenance should not stop on errors for-each-repo: optionally keep going on an error
2024-04-30	Merge branch 'aj/stash-staged-fix'	Junio C Hamano	1	-2/+2
	"git stash -S" did not handle binary files correctly, which has been corrected. * aj/stash-staged-fix: stash: fix "--staged" with binary files
2024-04-30	Merge branch 'jc/format-patch-rfc-more'	Junio C Hamano	1	-4/+22
	The "--rfc" option of "git format-patch" learned to take an optional string value to be used in place of "RFC" to tweak the "[PATCH]" on the subject header. * jc/format-patch-rfc-more: format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)] format-patch: allow --rfc to optionally take a value, like --rfc=WIP
2024-04-30	Merge branch 'ds/format-patch-rfc-and-k'	Junio C Hamano	1	-1/+3
	The "-k" and "--rfc" options of "format-patch" will now error out when used together, as one tells us not to add anything to the title of the commit, and the other one tells us to add "RFC" in addition to "PATCH". * ds/format-patch-rfc-and-k: format-patch: ensure that --rfc and -k are mutually exclusive
2024-04-30	Merge branch 'xx/disable-replace-when-building-midx'	Junio C Hamano	1	-0/+3
	The procedure to build multi-pack-index got confused by the replace-refs mechanism, which has been corrected by disabling the latter. * xx/disable-replace-when-building-midx: midx: disable replace objects
2024-04-29	Sync with 2.44.1	Johannes Schindelin	3	-8/+131
	* maint-2.44: (41 commits) Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel ...
2024-04-25	Merge branch 'rj/add-i-leak-fix'	Junio C Hamano	1	-3/+6
	Leakfix. * rj/add-i-leak-fix: add: plug a leak on interactive_add add-patch: plug a leak handling the '/' command add-interactive: plug a leak in get_untracked_files apply: plug a leak in apply_data
2024-04-24	maintenance: running maintenance should not stop on errors	Johannes Schindelin	1	-3/+4
	In https://github.com/microsoft/git/issues/623, it was reported that maintenance stops on a missing repository, omitting the remaining repositories that were scheduled for maintenance. This is undesirable, as it should be a best effort type of operation. It should still fail due to the missing repository, of course, but not leave the non-missing repositories in unmaintained shapes. Let's use `for-each-repo`'s shiny new `--keep-going` option that we just introduced for that very purpose. This change will be picked up when running `git maintenance start`, which is run implicitly by `scalar reconfigure`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24	for-each-repo: optionally keep going on an error	Johannes Schindelin	1	-2/+11
	In https://github.com/microsoft/git/issues/623, it was reported that the regularly scheduled maintenance stops if one repo in the middle of the list was found to be missing. This is undesirable, and points out a gap in the design of `git for-each-repo`: We need a mode where that command does not stop on an error, but continues to try running the specified command with the other repositories. Imitating the `--keep-going` option of GNU make, this commit teaches `for-each-repo` the same trick: to continue with the operation on all the remaining repositories in case there was a problem with one repository, still setting the exit code to indicate an error occurred. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23	Merge branch 'ps/run-auto-maintenance-in-receive-pack'	Junio C Hamano	1	-11/+10
	The "receive-pack" program (which responds to "git push") was not converted to run "git maintenance --auto" when other codepaths that used to run "git gc --auto" were updated, which has been corrected. * ps/run-auto-maintenance-in-receive-pack: builtin/receive-pack: convert to use git-maintenance(1) run-command: introduce function to prepare auto-maintenance process
2024-04-23	Merge branch 'ta/fast-import-parse-path-fix'	Junio C Hamano	1	-78/+84
	The way "git fast-import" handles paths described in its input has been tightened up and more clearly documented. * ta/fast-import-parse-path-fix: fast-import: make comments more precise fast-import: forbid escaped NUL in paths fast-import: document C-style escapes for paths fast-import: improve documentation for path quoting fast-import: remove dead strbuf fast-import: allow unquoted empty path for root fast-import: directly use strbufs for paths fast-import: tighten path unquoting
2024-04-23	format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]	Junio C Hamano	1	-2/+6
	In the previous step, the "--rfc" option of "format-patch" learned to take an optional string value to prepend to the subject prefix, so that --rfc=WIP can give "[WIP PATCH]". There may be cases in which the extra string wants to come after the subject prefix. Extend the mechanism to allow "--rfc=-(WIP)" [] to signal that the extra string is to be appended instead of getting prepended, resulting in "[PATCH (WIP)]". In the documentation, discourage (ab)using "--rfc=-RFC" to say "[PATCH RFC]" just to be different, when "[RFC PATCH]" is the norm. [Footnote] The syntax takes inspiration from Perl's open syntax that opens pipes "open fh, '\|-', 'cmd'", where the dash signals "the other stuff comes here". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23	format-patch: allow --rfc to optionally take a value, like --rfc=WIP	Junio C Hamano	1	-4/+19
	With the "--rfc" option, we can tweak the "[PATCH]" (or whatever string specified with the "--subject-prefix" option, instead of "PATCH") that we prefix the title of the commit with into "[RFC PATCH]", but some projects may want "[rfc PATCH]". Adding a new option, e.g., "--rfc-lowercase", to support such need every time somebody wants to use different strings would lead to insanity of accumulating unbounded number of such options. Allow an optional value specified for the option, so that users can use "--rfc=rfc" (think of "--rfc" without value as a short-hand for "--rfc=RFC") if they wanted to. This can of course be (ab)used to make the prefix "[WIP PATCH]" by passing "--rfc=WIP". Passing an empty string, i.e., "--rfc=", is the same as "--no-rfc" to override an option given earlier on the same command line. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22	add: plug a leak on interactive_add	Rubén Justo	1	-3/+6
	Plug a leak we have since 5a76aff1a6 (add: convert to use parse_pathspec, 2013-07-14). This leak can be triggered with: $ git add -p anything Fixing this leak allows us to mark as leak-free the following tests: + t3701-add-interactive.sh + t7514-commit-patch.sh Mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix promply any new leak that may be introduced and triggered by them in the future. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22	stash: fix "--staged" with binary files	Adam Johnson	1	-2/+2
	"git stash --staged" errors out when given binary files, after saving the stash. This behaviour dates back to the addition of the feature in 41a28eb6c1 (stash: implement '--staged' option for 'push' and 'save', 2021-10-18). Adding the "--binary" option of "diff-tree" fixes this. The "diff-tree" call in stash_patch() also omits "--binary", but that is fine since binary files cannot be selected interactively. Helped-By: Jeff King <peff@peff.net> Helped-By: Randall S. Becker <randall.becker@nexbridge.ca> Signed-off-by: Adam Johnson <me@adamj.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-19	format-patch: ensure that --rfc and -k are mutually exclusive	Dragan Simic	1	-1/+3
	Fix a bug that allows the "--rfc" and "-k" options to be specified together when "git format-patch" is executed, which was introduced in the commit e0d7db7423a9 ("format-patch: --rfc honors what --subject-prefix sets"). Add a couple of additional tests to t4014, to cover additional cases of the mutual exclusivity between different "git format-patch" options. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-19	Sync with 2.43.4	Johannes Schindelin	3	-8/+131
	* maint-2.43: (40 commits) Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories ...
2024-04-19	Sync with 2.42.2	Johannes Schindelin	3	-8/+131
	* maint-2.42: (39 commits) Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' ...
2024-04-19	Sync with 2.41.1	Johannes Schindelin	3	-8/+131
	* maint-2.41: (38 commits) Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs ...
2024-04-19	Sync with 2.40.2	Johannes Schindelin	4	-26/+135
	* maint-2.40: (39 commits) Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default ...
2024-04-19	Sync with 2.39.4	Johannes Schindelin	4	-26/+135
	* maint-2.39: (38 commits) Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default fetch/clone: detect dubious ownership of local repositories ...
2024-04-19	Merge branch 'ownership-checks-in-local-clones'	Johannes Schindelin	1	-5/+34
	This topic addresses two CVEs: - CVE-2024-32020: Local clones may end up hardlinking files into the target repository's object database when source and target repository reside on the same disk. If the source repository is owned by a different user, then those hardlinked files may be rewritten at any point in time by the untrusted user. - CVE-2024-32021: When cloning a local source repository that contains symlinks via the filesystem, Git may create hardlinks to arbitrary user-readable files on the same filesystem as the target repository in the objects/ directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19	clone: prevent hooks from running during a clone	Johannes Schindelin	1	-1/+11
	Critical security issues typically combine relatively common vulnerabilities such as case confusion in file paths with other weaknesses in order to raise the severity of the attack. One such weakness that has haunted the Git project in many a submodule-related CVE is that any hooks that are found are executed during a clone operation. Examples are the `post-checkout` and `fsmonitor` hooks. However, Git's design calls for hooks to be disabled by default, as only disabled example hooks are copied over from the templates in `<prefix>/share/git-core/templates/`. As a defense-in-depth measure, let's prevent those hooks from running. Obviously, administrators can choose to drop enabled hooks into the template directory, though, _and_ it is also possible to override `core.hooksPath`, in which case the new check needs to be disabled. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-18	builtin/clone: stop using `the_index`	Patrick Steinhardt	1	-4/+3
	Convert git-clone(1) to use `the_repository->index` instead of `the_index`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18	builtin: stop using `the_index`	Patrick Steinhardt	28	-371/+356
	Convert builtins to use `the_repository->index` instead of `the_index`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17	init: refactor the template directory discovery into its own function	Johannes Schindelin	1	-18/+4
	We will need to call this function from `hook.c` to be able to prevent hooks from running that were written as part of a `clone` but did not originate from the template directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	submodule: require the submodule path to contain directories only	Johannes Schindelin	1	-1/+31
	Submodules are stored in subdirectories of their superproject. When these subdirectories have been replaced with symlinks by a malicious actor, all kinds of mayhem can be caused. This _should_ not be possible, but many CVEs in the past showed that _when_ possible, it allows attackers to slip in code that gets executed during, say, a `git clone --recursive` operation. Let's add some defense-in-depth to disallow submodule paths to have anything except directories in them. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	clone_submodule: avoid using `access()` on directories	Johannes Schindelin	1	-1/+1
	In 0060fd1511b (clone --recurse-submodules: prevent name squatting on Windows, 2019-09-12), I introduced code to verify that a git dir either does not exist, or is at least empty, to fend off attacks where an inadvertently (and likely maliciously) pre-populated git dir would be used while cloning submodules recursively. The logic used `access(<path>, X_OK)` to verify that a directory exists before calling `is_empty_dir()` on it. That is a curious way to check for a directory's existence and might well fail for unwanted reasons. Even the original author (it was I ;-) ) struggles to explain why this function was used rather than `stat()`. This code was _almost_ copypastad in the previous commit, but that `access()` call was caught during review. Let's use `stat()` instead also in the code that was almost copied verbatim. Let's not use `lstat()` because in the unlikely event that somebody snuck a symbolic link in, pointing to a crafted directory, we want to verify that that directory is empty. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	submodules: submodule paths must not contain symlinks	Johannes Schindelin	1	-0/+35
	When creating a submodule path, we must be careful not to follow symbolic links. Otherwise we may follow a symbolic link pointing to a gitdir (which are valid symbolic links!) e.g. while cloning. On case-insensitive filesystems, however, we blindly replace a directory that has been created as part of the `clone` operation with a symlink when the path to the latter differs only in case from the former's path. Let's simply avoid this situation by expecting not ever having to overwrite any existing file/directory/symlink upon cloning. That way, we won't even replace a directory that we just created. This addresses CVE-2024-32002. Reported-by: Filip Hejsek <filip.hejsek@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	clone: prevent clashing git dirs when cloning submodule in parallel	Filip Hejsek	1	-0/+17
	While it is expected to have several git dirs within the `.git/modules/` tree, it is important that they do not interfere with each other. For example, if one submodule was called "captain" and another submodule "captain/hooks", their respective git dirs would clash, as they would be located in `.git/modules/captain/` and `.git/modules/captain/hooks/`, respectively, i.e. the latter's files could clash with the actual Git hooks of the former. To prevent these clashes, and in particular to prevent hooks from being written and then executed as part of a recursive clone, we introduced checks as part of the fix for CVE-2019-1387 in a8dee3ca61 (Disallow dubiously-nested submodule git directories, 2019-10-01). It is currently possible to bypass the check for clashing submodule git dirs in two ways: 1. parallel cloning 2. checkout --recurse-submodules Let's check not only before, but also after parallel cloning (and before checking out the submodule), that the git dir is not clashing with another one, otherwise fail. This addresses the parallel cloning issue. As to the parallel checkout issue: It requires quite a few manual steps to create clashing git dirs because Git itself would refuse to initialize the inner one, as demonstrated by the test case. Nevertheless, let's teach the recursive checkout (namely, the `submodule_move_head()` function that is used by the recursive checkout) to be careful to verify that it does not use a clashing git dir, and if it does, disable it (by deleting the `HEAD` file so that subsequent Git calls won't recognize it as a git dir anymore). Note: The parallel cloning test case contains a `cat err` that proved to be highly useful when analyzing the racy nature of the operation (the operation can fail with three different error messages, depending on timing), and was left on purpose to ease future debugging should the need arise. Signed-off-by: Filip Hejsek <filip.hejsek@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	upload-pack: disable lazy-fetching by default	Jeff King	1	-0/+2
	The upload-pack command tries to avoid trusting the repository in which it's run (e.g., by not running any hooks and not using any config that contains arbitrary commands). But if the server side of a fetch or a clone is a partial clone, then either upload-pack or its child pack-objects may run a lazy "git fetch" under the hood. And it is very easy to convince fetch to run arbitrary commands. The "server" side can be a local repository owned by someone else, who would be able to configure commands that are run during a clone with the current user's permissions. This issue has been designated CVE-2024-32004. The fix in this commit's parent helps in this scenario, as well as in related scenarios using SSH to clone, where the untrusted .git directory is owned by a different user id. But if you received one as a zip file, on a USB stick, etc, it may be owned by your user but still untrusted. This has been designated CVE-2024-32465. To mitigate the issue more completely, let's disable lazy fetching entirely during `upload-pack`. While fetching from a partial repository should be relatively rare, it is certainly not an unreasonable workflow. And thus we need to provide an escape hatch. This commit works by respecting a GIT_NO_LAZY_FETCH environment variable (to skip the lazy-fetch), and setting it in upload-pack, but only when the user has not already done so (which gives us the escape hatch). The name of the variable is specifically chosen to match what has already been added in 'master' via e6d5479e7a (git: extend --no-lazy-fetch to work across subprocesses, 2024-02-27). Since we're building this fix as a backport for older versions, we could cherry-pick that patch and its earlier steps. However, we don't really need the niceties (like a "--no-lazy-fetch" option) that it offers. By using the same name, everything should just work when the two are eventually merged, but here are a few notes: - the blocking of the fetch in e6d5479e7a is incomplete! It sets fetch_if_missing to 0 when we setup the repository variable, but that isn't enough. pack-objects in particular will call prefetch_to_pack() even if that variable is 0. This patch by contrast checks the environment variable at the lowest level before we call the lazy fetch, where we can be sure to catch all code paths. Possibly the setting of fetch_if_missing from e6d5479e7a can be reverted, but it may be useful to have. For example, some code may want to use that flag to change behavior before it gets to the point of trying to start the fetch. At any rate, that's all outside the scope of this patch. - there's documentation for GIT_NO_LAZY_FETCH in e6d5479e7a. We can live without that here, because for the most part the user shouldn't need to set it themselves. The exception is if they do want to override upload-pack's default, and that requires a separate documentation section (which is added here) - it would be nice to use the NO_LAZY_FETCH_ENVIRONMENT macro added by e6d5479e7a, but those definitions have moved from cache.h to environment.h between 2.39.3 and master. I just used the raw string literals, and we can replace them with the macro once this topic is merged to master. At least with respect to CVE-2024-32004, this does render this commit's parent commit somewhat redundant. However, it is worth retaining that commit as defense in depth, and because it may help other issues (e.g., symlink/hardlink TOCTOU races, where zip files are not really an interesting attack vector). The tests in t0411 still pass, but now we have _two_ mechanisms ensuring that the evil command is not run. Let's beef up the existing ones to check that they failed for the expected reason, that we refused to run upload-pack at all with an alternate user id. And add two new ones for the same-user case that both the restriction and its escape hatch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	midx: disable replace objects	Xing Xin	1	-0/+3
	We observed a series of clone failures arose in a specific set of repositories after we fully enabled the MIDX bitmap feature within our Codebase service. These failures were accompanied with error messages such as: Cloning into bare repository 'clone.git'... remote: Enumerating objects: 8, done. remote: Total 8 (delta 0), reused 0 (delta 0), pack-reused 8 (from 1) Receiving objects: 100% (8/8), done. fatal: did not receive expected object ... fatal: fetch-pack: invalid index-pack output Temporarily disabling the MIDX feature eliminated the reported issues. After some investigation we found that all repositories experiencing failures contain replace references, which seem to be improperly acknowledged by the MIDX bitmap generation logic. A more thorough explanation about the root cause from Taylor Blau says: Indeed, the pack-bitmap-write machinery does not itself call disable_replace_refs(). So when it generates a reachability bitmap, it is doing so with the replace refs in mind. You can see that this is indeed the cause of the problem by looking at the output of an instrumented version of Git that indicates what bits are being set during the bitmap generation phase. With replace refs (incorrectly) enabled, we get: [2, 4, 6, 8, 13, 3, 6, 7, 3, 4, 6, 8] and doing the same after calling disable_replace_refs(), we instead get: [2, 5, 6, 13, 3, 6, 7, 3, 4, 6, 8] Single pack bitmaps are unaffected by this issue because we generate them from within pack-objects, which does call disable_replace_refs(). This patch updates the MIDX logic to disable replace objects within the multi-pack-index builtin, and a test showing a clone (which would fail with MIDX bitmap) is added to demonstrate the bug. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17	builtin/receive-pack: convert to use git-maintenance(1)	Patrick Steinhardt	1	-11/+10
	In 850b6edefa (auto-gc: extract a reusable helper from "git fetch", 2020-05-06), we have introduced a helper function `run_auto_gc()` that kicks off `git gc --auto`. The intent of this function was to pass down the "--quiet" flag to git-gc(1) as required without duplicating this at all callsites. In 7c3e9e8cfb (auto-gc: pass --quiet down from am, commit, merge and rebase, 2020-05-06) we then converted callsites that need to pass down this flag to use the new helper function. This has the notable omission of git-receive-pack(1), which is the only remaining user of `git gc --auto` that sets up the proccess manually. This is probably because it unconditionally passes down the `--quiet` flag and thus didn't benefit much from the new helper function. In a95ce12430 (maintenance: replace run_auto_gc(), 2020-09-17) we then replaced `run_auto_gc()` with `run_auto_maintenance()` which invokes git-maintenance(1) instead of git-gc(1). This command is the modern replacement for git-gc(1) and is both more thorough and also more flexible because administrators can configure which tasks exactly to run during maintenance. But due to git-receive-pack(1) not using `run_auto_gc()` in the first place it did not get converted to use git-maintenance(1) like we do everywhere else now. Address this oversight and start to use the newly introduced function `prepare_auto_maintenance()`. This will also make it easier for us to adapt this code together with all the other callsites that invoke auto-maintenance in the future. This removes the last internal user of `git gc --auto`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16	credential: add method for querying capabilities	brian m. carlson	2	-0/+16
	Right now, there's no specific way to determine whether a credential helper or git credential itself supports a given set of capabilities. It would be helpful to have such a way, so let's let credential helpers and git credential take an argument, "capability", which has it list the capabilities and a version number on standard output. Specifically choose a format that is slightly different from regular credential output and assume that no capabilities are supported if a non-zero exit status occurs or the data deviates from the format. It is common for users to write small shell scripts as the argument to credential.helper, which will almost never be designed to emit capabilities. We want callers to gracefully handle this case by assuming that they are not capable of extended support because that is almost certainly the case, and specifying the error behavior up front does this and preserves backwards compatibility in a graceful way. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16	credential-cache: implement authtype capability	brian m. carlson	1	-3/+17
	Now that we have full support in Git for the authtype capability, let's add support to the cache credential helper. When parsing data, we always set the initial capabilities because we're the helper, and we need both the initial and helper capabilities to be set in order to have the helper capabilities take effect. When emitting data, always emit the supported capability and make sure we emit items only if we have them and they're supported by the caller. Since we may no longer have a username or password, be sure to emit those conditionally as well so we don't segfault on a NULL pointer. Similarly, when comparing credentials, consider both the password and credential fields when we're matching passwords. Adjust the partial credential detection code so that we can store credentials missing a username or password as long as they have an authtype and credential. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16	credential: add support for multistage credential rounds	brian m. carlson	1	-0/+1
	Over HTTP, NTLM and Kerberos require two rounds of authentication on the client side. It's possible that there are custom authentication schemes that also implement this same approach. Since these are tricky schemes to implement and the HTTP library in use may not always handle them gracefully on all systems, it would be helpful to allow the credential helper to implement them instead for increased portability and robustness. To allow this to happen, add a boolean flag, continue, that indicates that instead of failing when we get a 401, we should retry another round of authentication. However, this necessitates some changes in our current credential code so that we can make this work. Keep the state[] headers between iterations, but only use them to send to the helper and only consider the new ones we read from the credential helper to be valid on subsequent iterations. That avoids us passing stale data when we finally approve or reject the credential. Similarly, clear the multistage and wwwauth[] values appropriately so that we don't pass stale data or think we're trying a multiround response when we're not. Remove the credential values so that we can actually fill a second time with new responses. Limit the number of iterations of reauthentication we do to 3. This means that if there's a problem, we'll terminate with an error message instead of retrying indefinitely and not informing the user (and possibly conducting a DoS on the server). In our tests, handle creating multiple response output files from our helper so we can verify that each of the messages sent is correct. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16	credential: gate new fields on capability	brian m. carlson	3	-5/+7
	We support the new credential and authtype fields, but we lack a way to indicate to a credential helper that we'd like them to be used. Without some sort of indication, the credential helper doesn't know if it should try to provide us a username and password, or a pre-encoded credential. For example, the helper might prefer a more restricted Bearer token if pre-encoded credentials are possible, but might have to fall back to more general username and password if not. Let's provide a simple way to indicate whether Git (or, for that matter, the helper) is capable of understanding the authtype and credential fields. We send this capability when we generate a request, and the other side may reply to indicate to us that it does, too. For now, don't enable sending capabilities for the HTTP code. In a future commit, we'll introduce appropriate handling for that code, which requires more in-depth work. The logic for determining whether a capability is supported may seem complex, but it is not. At each stage, we emit the capability to the following stage if all preceding stages have declared it. Thus, if the caller to git credential fill didn't declare it, then we won't send it to the helper, and if fill's caller did send but the helper doesn't understand it, then we won't send it on in the response. If we're an internal user, then we know about all capabilities and will request them. For "git credential approve" and "git credential reject", we set the helper capability before calling the helper, since we assume that the input we're getting from the external program comes from a previous call to "git credential fill", and thus we'll invoke send a capability to the helper if and only if we got one from the standard input, which is the correct behavior. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17	builtin/clone: refuse local clones of unsafe repositories	Patrick Steinhardt	1	-0/+14
	When performing a local clone of a repository we end up either copying or hardlinking the source repository into the target repository. This is significantly more performant than if we were to use git-upload-pack(1) and git-fetch-pack(1) to create the new repository and preserves both disk space and compute time. Unfortunately though, performing such a local clone of a repository that is not owned by the current user is inherently unsafe: - It is possible that source files get swapped out underneath us while we are copying or hardlinking them. While we do perform some checks here to assert that we hardlinked the expected file, they cannot reliably thwart time-of-check-time-of-use (TOCTOU) style races. It is thus possible for an adversary to make us copy or hardlink unexpected files into the target directory. Ideally, we would address this by starting to use openat(3P), fstatat(3P) and friends. Due to platform compatibility with Windows we cannot easily do that though. Furthermore, the scope of these fixes would likely be quite broad and thus not fit for an embargoed security release. - Even if we handled TOCTOU-style races perfectly, hardlinking files owned by a different user into the target repository is not a good idea in general. It is possible for an adversary to rewrite those files to contain whatever data they want even after the clone has completed. Address these issues by completely refusing local clones of a repository that is not owned by the current user. This reuses our existing infra we have in place via `ensure_valid_ownership()` and thus allows a user to override the safety guard by adding the source repository path to the "safe.directory" configuration. This addresses CVE-2024-32020. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	builtin/clone: abort when hardlinked source and target file differ	Patrick Steinhardt	1	-1/+20
	When performing local clones with hardlinks we refuse to copy source files which are symlinks as a mitigation for CVE-2022-39253. This check can be raced by an adversary though by changing the file to a symlink after we have checked it. Fix the issue by checking whether the hardlinked destination file matches the source file and abort in case it doesn't. This addresses CVE-2024-32021. Reported-by: Apple Product Security <product-security@apple.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17	builtin/clone: stop resolving symlinks when copying files	Patrick Steinhardt	1	-5/+1
	When a user performs a local clone without `--no-local`, then we end up copying the source repository into the target repository directly. To optimize this even further, we try to hardlink files into place instead of copying data over, which helps both disk usage and speed. There is an important edge case in this context though, namely when we try to hardlink symlinks from the source repository into the target repository. Depending on both platform and filesystem the resulting behaviour here can be different: - On macOS and NetBSD, calling link(3P) with a symlink target creates a hardlink to the file pointed to by the symlink. - On Linux, calling link(3P) instead creates a hardlink to the symlink itself. To unify this behaviour, 36596fd2df (clone: better handle symlinked files at .git/objects/, 2019-07-10) introduced logic to resolve symlinks before we try to link(3P) files. Consequently, the new behaviour was to always create a hard link to the target of the symlink on all platforms. Eventually though, we figured out that following symlinks like this can cause havoc when performing a local clone of a malicious repository, which resulted in CVE-2022-39253. This issue was fixed via 6f054f9fb3 (builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28), by refusing symlinks in the source repository. But even though we now shouldn't ever link symlinks anymore, the code that resolves symlinks still exists. In the best case the code does not end up doing anything because there are no symlinks anymore. In the worst case though this can be abused by an adversary that rewrites the source file after it has been checked not to be a symlink such that it actually is a symlink when we call link(3P). Thus, it is still possible to recreate CVE-2022-39253 due to this time-of-check-time-of-use bug. Remove the call to `realpath()`. This doesn't yet address the actual vulnerability, which will be handled in a subsequent commit. Reported-by: Apple Product Security <product-security@apple.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-16	Merge branch 'rs/date-mode-pass-by-value'	Junio C Hamano	1	-2/+2
	The codepaths that reach date_mode_from_type() have been updated to pass "struct date_mode" by value to make them thread safe. * rs/date-mode-pass-by-value: date: make DATE_MODE thread-safe
2024-04-15	Merge branch 'gt/add-u-commit-i-pathspec-check'	Junio C Hamano	3	-4/+17
	"git add -u <pathspec>" and "git commit [-i] <pathspec>" did not diagnose a pathspec element that did not match any files in certain situations, unlike "git add <pathspec>" did. * gt/add-u-commit-i-pathspec-check: builtin/add: error out when passing untracked path with -u builtin/commit: error out when passing untracked path with -i revision: optionally record matches with pathspec elements
2024-04-15	Merge branch 'ds/fetch-config-parse-microfix'	Junio C Hamano	1	-0/+1
	A config parser callback function fell through instead of returning after recognising and processing a variable, wasting cycles, which has been corrected. * ds/fetch-config-parse-microfix: fetch: return when parsing submodule.recurse
2024-04-15	Merge branch 'ma/win32-unix-domain-socket'	Junio C Hamano	2	-0/+5
	Windows binary used to decide the use of unix-domain socket at build time, but it learned to make the decision at runtime instead. * ma/win32-unix-domain-socket: Win32: detect unix socket support at runtime
2024-04-15	fast-import: make comments more precise	Thalia Archibald	1	-3/+3
	The former is somewhat imprecise. The latter became out of sync with the behavior in e814c39c2f (fast-import: refactor parsing of spaces, 2014-06-18). Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15	fast-import: forbid escaped NUL in paths	Thalia Archibald	1	-0/+2
	NUL cannot appear in paths. Even disregarding filesystem path limitations, the tree object format delimits with NUL, so such a path cannot be encoded by Git. When a quoted path is unquoted, it could possibly contain NUL from "\000". Forbid it so it isn't truncated. fast-import still has other issues with NUL, but those will be addressed later. Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15	fast-import: remove dead strbuf	Thalia Archibald	1	-5/+0
	The strbuf in `note_change_n` is to copy the remainder of `p` before potentially invalidating it when reading the next line. However, `p` is not used after that point. It has been unused since the function was created in a8dd2e7d2b (fast-import: Add support for importing commit notes, 2009-10-09) and looks to be a fossil from adapting `file_change_m`. Remove it. Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15	fast-import: allow unquoted empty path for root	Thalia Archibald	1	-3/+0
	Ever since filerename was added in f39a946a1f (Support wholesale directory renames in fast-import, 2007-07-09) and filecopy in b6f3481bb4 (Teach fast-import to recursively copy files/directories, 2007-07-15), both have produced an error when the destination path is empty. Later, when support for targeting the root directory with an empty string was added in 2794ad5244 (fast-import: Allow filemodify to set the root, 2010-10-10), this had the effect of allowing the quoted empty string (`""`), but forbidding its unquoted variant (``). This seems to have been intended as simple data validation for parsing two paths, rather than a syntax restriction, because it was not extended to the other operations. All other occurrences of paths (in filemodify, filedelete, the source of filecopy and filerename, and ls) allow both. For most of this feature's lifetime, the documentation has not prescribed the use of quoted empty strings. In e5959106d6 (Documentation/fast-import: put explanation of M 040000 <dataref> "" in context, 2011-01-15), its documentation was changed from “`<path>` may also be an empty string (`""`) to specify the root of the tree” to “The root of the tree can be represented by an empty string as `<path>`”. Thus, we should assume that some front-ends have depended on this behavior. Remove this restriction for the destination paths of filecopy and filerename and change tests targeting the root to test `""` and ``. Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15	fast-import: directly use strbufs for paths	Thalia Archibald	1	-37/+27
	Previously, one case would not write the path to the strbuf: when the path is unquoted and at the end of the string. It was essentially copy-on-write. However, with the logic simplification of the previous commit, this case was eliminated and the strbuf is always populated. Directly use the strbufs now instead of an alias. Since this already changes all the lines that use the strbufs, rename them from `uq` to be more descriptive. That they are unquoted is not their most important property, so name them after what they carry. Additionally, `file_change_m` no longer needs to copy the path before reading inline data. Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15	fast-import: tighten path unquoting	Thalia Archibald	1	-43/+65
	Path parsing in fast-import is inconsistent and many unquoting errors are suppressed or not checked. <path> appears in the grammar in these places: filemodify ::= 'M' SP <mode> (<dataref> \| 'inline') SP <path> LF filedelete ::= 'D' SP <path> LF filecopy ::= 'C' SP <path> SP <path> LF filerename ::= 'R' SP <path> SP <path> LF ls ::= 'ls' SP <dataref> SP <path> LF ls-commit ::= 'ls' SP <path> LF and fast-import.c parses them in five different ways: 1. For filemodify and filedelete: Try to unquote <path>. If it unquotes without errors, use the unquoted version; otherwise, treat it as literal bytes to the end of the line (including any number of SP). 2. For filecopy (source) and filerename (source): Try to unquote <path>. If it unquotes without errors, use the unquoted version; otherwise, treat it as literal bytes up to, but not including, the next SP. 3. For filecopy (dest) and filerename (dest): Like 1., but an unquoted empty string is forbidden. 4. For ls: If <path> starts with `"`, unquote it and report parse errors; otherwise, treat it as literal bytes to the end of the line (including any number of SP). 5. For ls-commit: Unquote <path> and report parse errors. (It must start with `"` to disambiguate from ls.) In the first three, any errors from trying to unquote a string are suppressed, so a quoted string that contains invalid escapes would be interpreted as literal bytes. For example, `"\xff"` would fail to unquote (because hex escapes are not supported), and it would instead be interpreted as the byte sequence '"', '\\', 'x', 'f', 'f', '"', which is certainly not intended. Some front-ends erroneously use their language's standard quoting routine instead of matching Git's, which could silently introduce escapes that would be incorrectly parsed due to this and lead to data corruption. The documentation states “To use a source path that contains SP the path must be quoted.”, so it is expected that some implementations depend on spaces being allowed in paths in the final position. Thus we have two documented ways to parse paths, so simplify the implementation to that. Now we have: 1. `parse_path_eol` for filemodify, filedelete, filecopy (dest), filerename (dest), ls, and ls-commit: If <path> starts with `"`, unquote it and report parse errors; otherwise, treat it as literal bytes to the end of the line (including any number of SP). 2. `parse_path_space` for filecopy (source) and filerename (source): If <path> starts with `"`, unquote it and report parse errors; otherwise, treat it as literal bytes up to, but not including, the next SP. It must be followed by SP. There remain two special cases: The dest <path> in filecopy and rename cannot be an unquoted empty string (this will be addressed subsequently) and <path> in ls-commit must be quoted to disambiguate it from ls. Signed-off-by: Thalia Archibald <thalia@archibald.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12	Merge branch 'jc/checkout-detach-wo-tracking-report'	Junio C Hamano	1	-1/+2
	"git checkout/switch --detach foo", after switching to the detached HEAD state, gave the tracking information for the 'foo' branch, which was pointless. Tested-by: M Hickford <mirth.hickford@gmail.com> cf. <CAGJzqsmE9FDEBn=u3ge4LA3ha4fDbm4OWiuUbMaztwjELBd7ug@mail.gmail.com> * jc/checkout-detach-wo-tracking-report: checkout: omit "tracking" information on a detached HEAD
2024-04-12	Merge branch 'js/merge-tree-3-trees'	Junio C Hamano	1	-1/+1
	Match the option argument type in the help text to the correct type updated by a recent series. * js/merge-tree-3-trees: merge-tree: fix argument type of the `--merge-base` option
2024-04-12	merge-tree: fix argument type of the `--merge-base` option	Johannes Schindelin	1	-1/+1
	In 5f43cf5b2e4 (merge-tree: accept 3 trees as arguments, 2024-01-28), I taught `git merge-tree` to perform three-way merges on trees. This commit even changed the manual page to state that the `--merge-base` option takes a tree-ish rather than requiring a commit. But I forgot to adjust the in-program help text. This patch fixes that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10	Merge branch 'kn/clarify-update-ref-doc'	Junio C Hamano	1	-13/+13
	Doc update, as a preparation to enhance "git update-ref --stdin". * kn/clarify-update-ref-doc: githooks: use {old,new}-oid instead of {old,new}-value update-ref: use {old,new}-oid instead of {old,new}value
2024-04-09	Merge branch 'rj/use-adv-if-enabled'	Junio C Hamano	1	-11/+7
	Use advice_if_enabled() API to rewrite a simple pattern to call advise() after checking advice_enabled(). * rj/use-adv-if-enabled: add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
2024-04-09	Merge branch 'ps/pack-refs-auto'	Junio C Hamano	2	-48/+69
	"git pack-refs" learned the "--auto" option, which is a useful addition to be triggered from "git gc --auto". Acked-by: Karthik Nayak <karthik.188@gmail.com> cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com> * ps/pack-refs-auto: builtin/gc: pack refs when using `git maintenance run --auto` builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs t6500: extract objects with "17" prefix builtin/gc: move `struct maintenance_run_opts` builtin/pack-refs: introduce new "--auto" flag builtin/pack-refs: release allocated memory refs/reftable: expose auto compaction via new flag refs: remove `PACK_REFS_ALL` flag refs: move `struct pack_refs_opts` to where it's used t/helper: drop pack-refs wrapper refs/reftable: print errors on compaction failure reftable/stack: gracefully handle failed auto-compaction due to locks reftable/stack: use error codes when locking fails during compaction reftable/error: discern locked/outdated errors reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-05	date: make DATE_MODE thread-safe	René Scharfe	1	-2/+2
	date_mode_from_type() modifies a static variable and returns a pointer to it. This is not thread-safe. Most callers of date_mode_from_type() use it via the macro DATE_MODE and pass its result on to functions like show_date(), which take a const pointer and don't modify the struct. Avoid the static storage by putting the variable on the stack and returning the whole struct date_mode. Change functions that take a constant pointer to expect the whole struct instead. Reduce the cost of passing struct date_mode around on 64-bit systems by reordering its members to close the hole between the 32-bit wide .type and the 64-bit aligned .strftime_fmt as well as the alignment hole at the end. sizeof reports 24 before and 16 with this change on x64. Keep .type at the top to still allow initialization without designator -- though that's only done in a single location, in builtin/blame.c. Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05	Merge branch 'jk/core-comment-string'	Junio C Hamano	9	-38/+39
	core.commentChar used to be limited to a single byte, but has been updated to allow an arbitrary multi-byte sequence. * jk/core-comment-string: config: add core.commentString config: allow multi-byte core.commentChar environment: drop comment_line_char compatibility macro wt-status: drop custom comment-char stringification sequencer: handle multi-byte comment characters when writing todo list find multi-byte comment chars in unterminated buffers find multi-byte comment chars in NUL-terminated strings prefer comment_line_str to comment_line_char for printing strbuf: accept a comment string for strbuf_add_commented_lines() strbuf: accept a comment string for strbuf_commented_addf() strbuf: accept a comment string for strbuf_stripspace() environment: store comment_line_char as a string strbuf: avoid shadowing global comment_line_char name commit: refactor base-case of adjust_comment_line_char() strbuf: avoid static variables in strbuf_add_commented_lines() strbuf: simplify comment-handling in add_lines() helper config: forbid newline as core.commentChar
2024-04-05	Merge branch 'rs/config-comment'	Junio C Hamano	4	-12/+22
	"git config" learned "--comment=<message>" option to leave a comment immediately after the "variable = value" on the same line in the configuration file. * rs/config-comment: config: allow tweaking whitespace between value and comment config: fix --comment formatting config: add --comment option to add a comment
2024-04-05	fetch: return when parsing submodule.recurse	Derrick Stolee	1	-0/+1
	When parsing config keys, the normal pattern is to return 0 after completing the logic for a specific config key, since no other key will match. One instance, for "submodule.recurse", was missing this case in builtin/fetch.c. This is a very minor change, and will have minimal impact to performance. This particular block was edited recently in 56e8bb4fb4 (fetch: use `fetch_config` to store "fetch.recurseSubmodules" value, 2023-05-17), which led to some hesitation that perhaps this omission was on purpose. However, no later cases within git_fetch_config() will match the key if equal to "submodule.recurse" and neither will any key matches within the catch-all git_default_config(). Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03	builtin/add: error out when passing untracked path with -u	Ghanshyam Thakkar	1	-1/+8
	When passing untracked path with -u option, it silently succeeds. There is no error message and the exit code is zero. This is inconsistent with other instances of git commands where the expected argument is a known path. In those other instances, we error out when the path is not known. Fix this by passing a character array to add_files_to_cache() to collect the pathspec matching information and report the error if a pathspec does not match any cache entry. Also add a testcase to cover this scenario. Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03	builtin/commit: error out when passing untracked path with -i	Ghanshyam Thakkar	1	-1/+6
	When we provide a pathspec which does not match any tracked path alongside --include, we do not error like without --include. If there is something staged, it will commit the staged changes and ignore the pathspec which does not match any tracked path. And if nothing is staged, it will print the status. Exit code is 0 in both cases (unlike without --include). This is also described in the TODO comment before the relevant testcase. Fix this by passing a character array to add_files_to_cache() to collect the pathspec matching information and error out if the given path is untracked. Also, amend the testcase to check for the error message and remove the TODO comment. Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03	revision: optionally record matches with pathspec elements	Junio C Hamano	3	-4/+5
	Unlike "git add" and other end-user facing commands, where it is diagnosed as an error to give a pathspec with an element that does not match any path, the diff machinery does not care if some elements of the pathspec do not match. Given that the diff machinery is heavily used in pathspec-limited "git log" machinery, and it is common for a path to come and go while traversing the project history, this is usually a good thing. However, in some cases we would want to know if all the pathspec elements matched. For example, "git add -u <pathspec>" internally uses the machinery used by "git diff-files" to decide contents from what paths to add to the index, and as an end-user facing command, "git add -u" would want to report an unmatched pathspec element. Add a new .ps_matched member next to the .prune_data member in "struct rev_info" so that we can optionally keep track of the use of .prune_data pathspec elements that can be inspected by the caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03	Win32: detect unix socket support at runtime	Matthias Aßhauer	2	-0/+5
	Windows 10 build 17063 introduced support for unix sockets to Windows. bb390b1 (git-compat-util: include declaration for unix sockets in windows, 2021-09-14) introduced a way to build git with unix socket support on Windows, but you still had to decide at build time which Windows version the compiled executable was supposed to run on. We can detect at runtime wether the operating system supports unix sockets and act accordingly for all supported Windows versions. This fixes https://github.com/git-for-windows/git/issues/3892 Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03	Merge branch 'bl/cherry-pick-empty'	Junio C Hamano	2	-7/+47
	Allow git-cherry-pick(1) to automatically drop redundant commits via a new `--empty` option, similar to the `--empty` options for git-rebase(1) and git-am(1). Includes a soft deprecation of `--keep-redundant-commits` as well as some related docs changes and sequencer code cleanup. * bl/cherry-pick-empty: cherry-pick: add `--empty` for more robust redundant commit handling cherry-pick: enforce `--keep-redundant-commits` incompatibility sequencer: do not require `allow_empty` for redundant commit options sequencer: handle unborn branch with `--allow-empty` rebase: update `--empty=ask` to `--empty=stop` docs: clean up `--empty` formatting in git-rebase(1) and git-am(1) docs: address inaccurate `--empty` default with `--exec`
2024-04-03	Merge branch 'rs/strbuf-expand-bad-format'	Junio C Hamano	3	-26/+10
	Code clean-up. * rs/strbuf-expand-bad-format: cat-file: use strbuf_expand_bad_format() factor out strbuf_expand_bad_format()
2024-04-02	update-ref: use {old,new}-oid instead of {old,new}value	Karthik Nayak	1	-13/+13
	The `git-update-ref` command is used to modify references. The usage of {old,new}value in the documentation refers to the OIDs. This is fine since the command only works with regular references which hold OIDs. But if the command is updated to support symrefs, we'd also be dealing with {old,new}-refs. To improve clarity around what exactly {old,new}value mean, let's rename it to {old,new}-oid. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01	Merge branch 'jk/rebase-apply-leakfix'	Junio C Hamano	1	-3/+3
	Leakfix. * jk/rebase-apply-leakfix: rebase: use child_process_clear() to clean
2024-04-01	Merge branch 'pb/advice-merge-conflict'	Junio C Hamano	1	-5/+9
	Hints that suggest what to do after resolving conflicts can now be squelched by disabling advice.mergeConflict. Acked-by: Phillip Wood <phillip.wood123@gmail.com> cf. <e040c631-42d9-4501-a7b8-046f8dac6309@gmail.com> * pb/advice-merge-conflict: builtin/am: allow disabling conflict advice sequencer: allow disabling conflict advice
2024-04-01	Merge branch 'jk/pretty-subject-cleanup'	Junio C Hamano	3	-3/+3
	Code clean-up in the "git log" machinery that implements custom log message formatting. * jk/pretty-subject-cleanup: format-patch: fix leak of empty header string format-patch: simplify after-subject MIME header handling format-patch: return an allocated string from log_write_email_headers() log: do not set up extra_headers for non-email formats pretty: drop print_email_subject flag pretty: split oneline and email subject printing shortlog: stop setting pp.print_email_subject
2024-04-01	Merge branch 'pw/checkout-conflict-errorfix'	Junio C Hamano	1	-22/+38
	"git checkout --conflict=bad" reported a bad conflictStyle as if it were given to a configuration variable; it has been corrected to report that the command line option is bad. * pw/checkout-conflict-errorfix: checkout: fix interaction between --conflict and --merge checkout: cleanup --conflict=<style> parsing merge options: add a conflict style member merge-ll: introduce LL_MERGE_OPTIONS_INIT xdiff-interface: refactor parsing of merge.conflictstyle