git.git - The core git plumbing

Age	Commit message (Collapse)	Author	Files	Lines
2024-11-21	strvec: introduce new `strvec_splice()` function	Patrick Steinhardt	3	-0/+93
	Introduce a new `strvec_splice()` function that can replace a range of strings in the vector with another array of strings. This function will be used in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	line-log: fix leak when rewriting commit parents	Patrick Steinhardt	2	-0/+2
	In `process_ranges_merge_commit()` we try to figure out which of the parents can be blamed for the given line changes. When we figure out that none of the files in the line-log have changed we assign the complete blame to that commit and rewrite the parents of the current commit to only use that single parent. This is done via `commit_list_append()`, which is misleadingly _not_ appending to the list of parents. Instead, we overwrite the parents with the blamed parent. This makes us lose track of the old pointers, creating a memory leak. Fix this issue by freeing the parents before we overwrite them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix various cases where we leak commit list items	Patrick Steinhardt	2	-8/+23
	There are various cases where we leak commit list items because we evict items from the list, but don't free them. Plug those. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix leaking commit list items in `check_merge_base()`	Patrick Steinhardt	1	-2/+2
	While we free the result commit list at the end of `check_merge_base()`, we forget to free any items that we have already iterated over. Fix this by using a separate variable to iterate through them. This leak is exposed by t6030, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix multiple leaks in `bisect_next_all()`	Patrick Steinhardt	1	-2/+3
	There are multiple leaks in `bisect_next_all()`. For one we don't free the `tried` commit list. Second, one of the branches uses a direct return instead of jumping to the cleanup code. Fix these by freeing the commit list and converting the return to a goto. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix leaking `current_bad_oid`	Patrick Steinhardt	1	-0/+1
	When reading bisect refs we read the reference mapping to the "bad" term into the global `current_bad_oid` variable. This is an allocated string, but because it is global we never have to free it. This changes though when `register_ref()` is being called multiple times, at which point we'll overwrite the previous pointer and thus make it unreachable. Fix this issue by freeing the previous value. This leak is exposed by t6030, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix leaking string in `handle_bad_merge_base()`	Patrick Steinhardt	1	-0/+2
	When handling a bad merge base we print an error, which includes the set of good revisions joined by spaces. This string is allocated, but never freed. Fix this memory leak. Note that the local `bad_hex` varible also looks like a string that we should free. But in fact, `oid_to_hex()` returns an address to a static variable even though it is declared to return a non-constant string. The function signature is thus quite misleading and really should be fixed, but doing so is outside of the scope of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	bisect: fix leaking good/bad terms when reading multipe times	Patrick Steinhardt	3	-8/+12
	Even though `read_bisect_terms()` is declared as assigning string constants, it in fact assigns allocated strings to the `read_bad` and `read_good` out parameters. The only callers of this function assign the result to global variables and thus don't have to free them in order to be leak-free. But that changes when executing the function multiple times because we'd then overwrite the previous value and thus make it unreachable. Fix the function signature and free the previous values. This leak is exposed by t0630, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21	builtin/blame: fix leaking blame entries with `--incremental`	Patrick Steinhardt	4	-7/+10
	When passing `--incremental` to git-blame(1) we exit early by jumping to the `cleanup` label. But some of the cleanups we perform are handled between the `goto` and its label, and thus we leak the data. Move the cleanups after the `cleanup` label. While at it, move the logic to free the scoreboard's `final_buf` into `cleanup_scoreboard()` and drop its `const` declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-07	Merge branch 'ps/leakfixes-part-9' into ps/leakfixes-part-10	Junio C Hamano	38	-35/+115
	* ps/leakfixes-part-9: (22 commits) list-objects-filter-options: work around reported leak on error builtin/merge: release output buffer after performing merge dir: fix leak when parsing "status.showUntrackedFiles" t/helper: fix leaking buffer in "dump-untracked-cache" t/helper: stop re-initialization of `the_repository` sparse-index: correctly free EWAH contents dir: release untracked cache data combine-diff: fix leaking lost lines builtin/tag: fix leaking key ID on failure to sign transport-helper: fix leaking import/export marks builtin/commit: fix leaking cleanup config trailer: fix leaking strbufs when formatting trailers trailer: fix leaking trailer values builtin/commit: fix leaking change data contents upload-pack: fix leaking URI protocols pretty: clear signature check diff-lib: fix leaking diffopts in `do_diff_cache()` revision: fix leaking bloom filters builtin/grep: fix leak with `--max-count=0` grep: fix leak in `grep_splice_or()` ...
2024-11-04	list-objects-filter-options: work around reported leak on error	Patrick Steinhardt	2	-10/+8
	This one is a little bit more curious. In t6112, we have a test that exercises the `git rev-list --filter` option with invalid filters. We execute git-rev-list(1) via `test_must_fail`, which means that we check for leaks even though Git exits with an error code. This causes the following leak: Direct leak of 27 byte(s) in 1 object(s) allocated from: #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8 #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2 #3 0x5555558b7550 in strbuf_add strbuf.c:311:2 #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2 #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3 #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3 #7 0x555555884e20 in setup_revisions revision.c:3014:11 #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9 #9 0x5555555ec5e3 in run_builtin git.c:483:11 #10 0x5555555eb1e4 in handle_builtin git.c:749:13 #11 0x5555555ec001 in run_argv git.c:819:4 #12 0x5555555eaf94 in cmd_main git.c:954:19 #13 0x5555556fd569 in main common-main.c:64:11 #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d) #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208) #16 0x5555555ad064 in _start (git+0x59064) This leak is valid, as we call `die()` and do not clean up the memory at all. But what's curious is that this is the only leak reported, because we don't clean up any other allocated memory, either, and I have no idea why the leak sanitizer treats this buffer specially. In any case, we can work around the leak by shuffling things around a bit. Instead of calling `gently_parse_list_objects_filter()` and dying after we have modified the filter spec, we simply do so beforehand. Like this we don't allocate the buffer in the error case, which makes the reported leak go away. It's not pretty, but it manages to make t6112 leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/merge: release output buffer after performing merge	Patrick Steinhardt	2	-0/+2
	The `obuf` member of `struct merge_options` is used to buffer output in some cases. In order to not discard its allocated memory we only release its contents in `merge_finalize()` when we're not currently recursing into a subtree. This results in some situations where we seemingly do not release the buffer reliably. We thus have calls to `strbuf_release()` for this buffer scattered across the codebase. But we're missing one callsite in git-merge(1), which causes a memory leak. We should ideally refactor this interface so that callers don't have to know about any such internals. But for now, paper over the issue by adding one more `strbuf_release()` call. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	dir: fix leak when parsing "status.showUntrackedFiles"	Patrick Steinhardt	2	-2/+3
	We use `repo_config_get_string()` to read "status.showUntrackedFiles" from the config subsystem. This function allocates the result, but we never free the result after parsing it. The value never leaves the scope of the calling function, so refactor it to instead use `repo_config_get_string_tmp()`, which does not hand over ownership to the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	t/helper: fix leaking buffer in "dump-untracked-cache"	Patrick Steinhardt	1	-0/+2
	We never release the local `struct strbuf base` buffer, thus leaking memory. Fix this leak. This leak is exposed by t7063, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	t/helper: stop re-initialization of `the_repository`	Patrick Steinhardt	2	-2/+1
	While "common-main.c" already initializes `the_repository` for us, we do so a second time in the "read-cache" test helper. This causes a memory leak because the old repository's contents isn't released. Stop calling `initialize_repository()` to plug this leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	sparse-index: correctly free EWAH contents	Patrick Steinhardt	1	-2/+5
	While we free the `fsmonitor_dirty` member of `struct index_state`, we do not free the contents of that EWAH. Do so by using `ewah_free()` instead of `FREE_AND_NULL()`. This leak is exposed by t7519, but plugging it alone does not make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	dir: release untracked cache data	Patrick Steinhardt	1	-0/+8
	There are several cases where we invalidate untracked cache directory entries where we do not free the underlying data, but reset the number of entries. This causes us to leak memory because `free_untracked()` will not iterate over any potential entries which we still had in the array. Fix this issue by freeing old entries. The leak is exposed by t7519, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	combine-diff: fix leaking lost lines	Patrick Steinhardt	2	-2/+4
	The `cnt` variable tracks the number of lines in a patch diff. It can happen though that there are no newlines, in which case we'd still end up allocating our array of `sline`s. In fact, we always allocate it with `cnt + 2` entries: one extra entry for the deletion hunk at the end, and another entry that we don't seem to ever populate at all but acts as a kind of sentinel value. When we loop through the array to clear it at the end of this function we only loop until `lno < cnt`, and thus we may not end up releasing whatever the two extra `sline`s contain. While that shouldn't matter for the sentinel value, it does matter for the extra deletion hunk sline. Regardless of that, plug this memory leak by releasing both extra entries, which makes the logic a bit easier to reason about. While at it, fix the formatting of a local comment, which incidentally also provides the necessary context for why we overallocate the `sline` array. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/tag: fix leaking key ID on failure to sign	Patrick Steinhardt	2	-1/+2
	We do not free the key ID when signing a tag fails. Do so by using the common exit path. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	transport-helper: fix leaking import/export marks	Patrick Steinhardt	2	-0/+3
	Fix leaking import and export marks for transport helpers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/commit: fix leaking cleanup config	Patrick Steinhardt	2	-5/+13
	The cleanup string set by the config is leaking when it is being overridden by an option. Fix this by tracking these via two separate variables such that we can free the old value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	trailer: fix leaking strbufs when formatting trailers	Patrick Steinhardt	2	-5/+8
	When formatting trailer lines we iterate through each of the trailers and munge their respective token/value pairs according to the trailer options. When formatting a trailer that has its `item->token` pointer set we perform the munging in two local buffers. In the case where we figure out that the value is empty and `trim_empty` is set we just skip over the trailer item. But the buffers are local to the loop and we don't release their contents, leading to a memory leak. Plug this leak by lifting the buffers outside of the loop and releasing them on function return. This fixes the memory leaks, but also optimizes the loop as we don't have to reallocate the buffers on every single iteration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	trailer: fix leaking trailer values	Patrick Steinhardt	1	-2/+8
	Fix leaking trailer values when replacing the value with a command or when the token value is empty. This leak is exposed by t7513, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/commit: fix leaking change data contents	Patrick Steinhardt	2	-1/+9
	While we free the worktree change data, we never free its contents. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	upload-pack: fix leaking URI protocols	Patrick Steinhardt	2	-0/+2
	We don't clear `struct upload_pack::uri_protocols`, which causes a memory leak. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	pretty: clear signature check	Patrick Steinhardt	5	-0/+5
	The signature check in the formatting context is never getting released. Fix this to plug the resulting memory leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	diff-lib: fix leaking diffopts in `do_diff_cache()`	Patrick Steinhardt	2	-0/+2
	In `do_diff_cache()` we initialize a new `rev_info` and then overwrite its `diffopt` with a user-provided set of options. This can leak memory because `repo_init_revisions()` may end up allocating memory for the `diffopt` itself depending on the configuration. And since that field is overwritten we won't ever free it. Plug the memory leak by releasing the diffopts before we overwrite them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	revision: fix leaking bloom filters	Patrick Steinhardt	2	-0/+6
	The memory allocated by `prepare_to_use_bloom_filter()` is not released by `release_revisions()`, causing a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/grep: fix leak with `--max-count=0`	Patrick Steinhardt	2	-3/+11
	When executing with `--max-count=0` we'll return early from git-grep(1) without performing any cleanup, which causes memory leaks. Plug these. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	grep: fix leak in `grep_splice_or()`	Patrick Steinhardt	1	-0/+1
	In `grep_splice_or()` we search for the next `TRUE` node in our tree of grep expressions and replace it with the given new expression. But we don't free the old node, which causes a memory leak. Plug it. This leak is exposed by t7810, but plugging it alone isn't sufficient to make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	t/helper: fix leaks in "reach" test tool	Patrick Steinhardt	2	-0/+11
	The "reach" test tool doesn't bother to clean up any of its allocated resources, causing various leaks. Plug them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	builtin/ls-remote: plug leaking server options	Patrick Steinhardt	1	-0/+1
	The list of server options populated via `OPT_STRING_LIST()` is never cleared, causing a memory leak. Plug it. This leak is exposed by t5702, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-01	The seventh batch	Taylor Blau	1	-0/+39

2024-11-01	Merge branch 'jk/dumb-http-finalize'	Taylor Blau	16	-101/+153
	The dumb-http code regressed when the result of re-indexing a pack yielded an .idx file that differs in content from the .idx file it downloaded from the remote. This has been corrected by no longer relying on the .idx file we got from the remote. jk/dumb-http-finalize: packfile: use oidread() instead of hashcpy() to fill object_id packfile: use object_id in find_pack_entry_one() packfile: convert find_sha1_pack() to use object_id http-walker: use object_id instead of bare hash packfile: warn people away from parse_packed_git() packfile: drop sha1_pack_index_name() packfile: drop sha1_pack_name() packfile: drop has_pack_index() dumb-http: store downloaded pack idx as tempfile t5550: count fetches in "previously-fetched .idx" test midx: avoid duplicate packed_git entries
2024-11-01	Merge branch 'kh/update-ref'	Taylor Blau	2	-27/+25
	Documentation updates to 'git-update-ref(1)'. * kh/update-ref: Documentation: mutually link update-ref and symbolic-ref Documentation/git-update-ref.txt: discuss symbolic refs Documentation/git-update-ref.txt: remove confusing paragraph Documentation/git-update-ref.txt: demote symlink to last section Documentation/git-update-ref.txt: remove safety paragraphs Documentation/git-update-ref.txt: drop “flag”
2024-11-01	Merge branch 'ak/more-typofixes'	Taylor Blau	26	-37/+37
	More typofixes. * ak/more-typofixes: t: fix typos
2024-11-01	Merge branch 'rs/grep-lookahead'	Taylor Blau	2	-10/+29
	Fix 'git grep' regression on macOS by disabling lookahead when encountering invalid UTF-8 byte sequences. * rs/grep-lookahead: grep: disable lookahead on error
2024-11-01	Merge branch 'ak/t1016-cleanup'	Taylor Blau	1	-2/+2
	Test cleanup. * ak/t1016-cleanup: t1016: clean up style
2024-11-01	Merge branch 'ua/atoi'	Taylor Blau	5	-11/+58
	Replace various calls to atoi() with strtol_i() and strtoul_ui(), and add improved error handling. * ua/atoi: imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing merge: replace atoi() with strtol_i() for marker size validation daemon: replace atoi() with strtoul_ui() and strtol_i()
2024-11-01	Merge branch 'sa/notes-edit'	Taylor Blau	3	-5/+76
	Teach 'git notes add' and 'git notes append' a new '-e' flag, instructing them to open the note in $GIT_EDITOR before saving. * sa/notes-edit: notes: teach the -e option to edit messages in editor
2024-11-01	Merge branch 'sk/t9101-cleanup'	Taylor Blau	1	-17/+17
	Test cleanup. * sk/t9101-cleanup: t9101: ensure no whitespace after redirect
2024-11-01	Merge branch 'ss/duplicate-typos'	Taylor Blau	11	-11/+11
	Typofixes. * ss/duplicate-typos: global: Fix duplicate word typos
2024-11-01	Merge branch 'ps/upload-pack-doc'	Taylor Blau	1	-1/+5
	Documentation update to clarify that 'uploadpack.allowAnySHA1InWant' implies both 'allowTipSHA1InWant' and 'allowReachableSHA1InWant'. * ps/upload-pack-doc: doc: document how uploadpack.allowAnySHA1InWant impact other allow options
2024-11-01	Merge branch 'kh/mv-breakage'	Taylor Blau	1	-0/+12
	Demonstrate an assertion failure in 'git mv'. * kh/mv-breakage: t7001: add failure test which triggers assertion
2024-11-01	Merge branch 'rj/cygwin-exit'	Taylor Blau	1	-1/+1
	Treat ECONNABORTED the same as ECONNRESET in 'git credential-cache' to work around a possible Cygwin regression. This resolves a race condition caused by changes in Cygwin's handling of socket closures, allowing the client to exit cleanly when encountering ECONNABORTED. * rj/cygwin-exit: credential-cache: treat ECONNABORTED like ECONNRESET
2024-11-01	Merge branch 'ua/t3404-cleanup'	Taylor Blau	1	-25/+50
	Test update. * ua/t3404-cleanup: t3404: replace test with test_line_count() t3404: avoid losing exit status with focus on `git show` and `git cat-file`
2024-11-01	Merge branch 'ps/platform-compat-fixes'	Taylor Blau	30	-138/+275
	Various platform compatibility fixes split out of the larger effort to use Meson as the primary build tool. * ps/platform-compat-fixes: t6006: fix prereq handling with `test_format ()` http: fix build error on FreeBSD builtin/credential-cache: fix missing parameter for stub function t7300: work around platform-specific behaviour with long paths on MinGW t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin t3404: work around platform-specific behaviour on macOS 10.15 t1401: make invocation of tar(1) work with Win32-provided one t/lib-gpg: fix setup of GNUPGHOME in MinGW t/lib-gitweb: test against the build version of gitweb t/test-lib: wire up NO_ICONV prerequisite t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE
2024-11-01	Merge branch 'jc/breaking-changes-early-adopter-option'	Taylor Blau	1	-1/+20
	Describe the policy to introduce breaking changes. * jc/breaking-changes-early-adopter-option: BreakingChanges: early adopter option
2024-10-30	The sixth batch	Taylor Blau	1	-0/+11

2024-10-30	Merge branch 'sk/t7011-cleanup'	Taylor Blau	1	-11/+11
	Test cleanup. * sk/t7011-cleanup: t7011: ensure no whitespace after redirect
2024-10-30	Merge branch 'co/t6050-pipefix'	Taylor Blau	1	-47/+86
	Avoid losing exit status by having Git command being tested on the upstream side of a pipe. * co/t6050-pipefix: t6050: avoid pipes with upstream Git commands
2024-10-30	Merge branch 'ks/t4205-fixup'	Taylor Blau	1	-1/+1
	Testfix. * ks/t4205-fixup: t4205: fix typo in 'NUL termination with --stat'
2024-10-30	Merge branch 'kh/submitting-patches'	Taylor Blau	1	-3/+3
	Docfix. * kh/submitting-patches: SubmittingPatches: tags -> trailers
2024-10-30	Merge branch 'ps/ref-filter-sort'	Taylor Blau	1	-8/+21
	Teaches the ref-filter machinery to recognize and avoid cases where sorting would be redundant. * ps/ref-filter-sort: ref-filter: format iteratively with lexicographic refname sorting
2024-10-30	Merge branch 'ps/reftable-strbuf'	Taylor Blau	24	-452/+728
	Implements a new reftable-specific strbuf replacement to reduce reftable's dependency on Git-specific data structures. * ps/reftable-strbuf: reftable: handle trivial `reftable_buf` errors reftable/stack: adapt `stack_filename()` to handle allocation failures reftable/record: adapt `reftable_record_key()` to handle allocation failures reftable/stack: adapt `format_name()` to handle allocation failures t/unit-tests: check for `reftable_buf` allocation errors reftable/blocksource: adapt interface name reftable: convert from `strbuf` to `reftable_buf` reftable/basics: provide new `reftable_buf` interface reftable: stop using `strbuf_addf()` reftable: stop using `strbuf_addbuf()`
2024-10-28	t6006: fix prereq handling with `test_format ()`	Patrick Steinhardt	1	-4/+4
	In df383b5842 (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we have introduced a new NO_ICONV prerequisite that makes us skip tests in case Git is not compiled with support for iconv. This change subtly broke t6006: while the test suite still passes, some of its tests won't execute because they run into an error. ./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found The broken tests use `test_format ()`, and the mentioned commit simply prepended the new prerequisite to its arguments. But that does not work, as the function is not aware of prereqs at all and will now treat all of its arguments incorrectly. Fix this by making the function aware of prereqs by accepting an optional fourth argument. Adapt the callsites accordingly. Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: use oidread() instead of hashcpy() to fill object_id	Jeff King	1	-1/+1
	When chasing a REF_DELTA, we need to pull the raw hash bytes out of the mmap'd packfile into an object_id struct. We do that with a raw hashcpy() of the appropriate length (that happens directly now, though before the previous commit it happened inside find_pack_entry_one(), also using a hashcpy). But I think this creates a potentially dangerous situation due to d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in the latter part of the object_id.hash buffer, which could fool oideq(), etc. We should use oidread() instead, which correctly zero-pads the extra bytes, as of c98d762ed9 (global: ensure that object IDs are always padded, 2024-06-14). As far as I can see, this has not been a problem in practice because the object_id we feed to find_pack_entry_one() is never used with oideq(), etc. It is being compared to the bytes mmap'd from a pack idx file, which of course do not have the extra padding bytes themselves. So there's no bug here, but this just puzzled me while looking at the code. We should do the more obviously safe thing, both for future-proofing and to avoid confusing readers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: use object_id in find_pack_entry_one()	Jeff King	7	-18/+18
	The main function we use to search a pack index for an object is find_pack_entry_one(). That function still takes a bare pointer to the hash, despite the fact that its underlying bsearch_pack() function needs an object_id struct. And so we end up making an extra copy of the hash into the struct just to do a lookup. As it turns out, all callers but one already have such an object_id. So we can just take a pointer to that struct and use it directly. This avoids the extra copy and provides a more type-safe interface. The one exception is get_delta_base() in packfile.c, when we are chasing a REF_DELTA from inside the pack (and thus we have a pointer directly to the mmap'd pack memory, not a struct). We can just bump the hashcpy() from inside find_pack_entry_one() to this one caller that needs it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: convert find_sha1_pack() to use object_id	Jeff King	5	-12/+15
	The find_sha1_pack() function has a few problems: - it's badly named, since it works with any object hash - it takes the hash as a bare pointer rather than an object_id struct We can fix both of these easily, as all callers actually have a real object_id anyway. I also found the existence of this function somewhat confusing, as it is about looking in an arbitrary set of linked packed_git structs. It's good for things like dumb-http which are looking in downloaded remote packs, and not our local packs. But despite the name, it is not a good way to find the pack which contains a local object (it skips the use of the midx, the pack mru list, and so on). So let's also add an explanatory comment above the declaration that may point people in the right direction. I suspect the calls in fast-import.c, which use the packed_git list from the repository struct, could actually just be using find_pack_entry(). But since we'd need to keep it anyway for dumb-http, I didn't dig further there. If we eventually drop dumb-http support, then it might be worth examining them to see if we can get rid of the function entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	http-walker: use object_id instead of bare hash	Jeff King	3	-16/+17
	We long ago switched most code to using object_id structs instead of bare "unsigned char *" hashes. This gives us more type safety from the compiler, and generally makes it easier to understand what we expect in each parameter. But the dumb-http code has lagged behind. And indeed, the whole "walker" subsystem interface has the same problem, though http-walker is the only user left. So let's update the walker interface to pass object_id structs (which we already have anyway at all call sites!), and likewise use those within the http-walker methods that it calls. This cleans up the dumb-http code a bit, but will also let us fix a few more commonly used helper functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: warn people away from parse_packed_git()	Jeff King	1	-1/+10
	With a name like parse_packed_git(), you might think it's the right way to access a local pack index and its associated objects. But not so! It's a one-off used by the dumb-http code to access pack idx files we've downloaded from the remote, but whose packs we might not have. There's only one caller left for this function, and ideally we'd drop it completely and just inline it there. But that would require exposing other internals from packfile.[ch], like alloc_packed_git() and check_packed_git_idx(). So let's leave it be for now, and just warn people that it's probably not what they're looking for. Perhaps in the long run if we eventually drop dumb-http support, we can remove the function entirely then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: drop sha1_pack_index_name()	Jeff King	3	-14/+4
	Like sha1_pack_name() that we dropped in the previous commit, this function uses an error-prone static strbuf and the somewhat misleading name "sha1". The only caller left is in pack-redundant.c. While this command is marked for potential removal in our BreakingChanges document, we still have it for now. But it's simple enough to convert it to use its own strbuf with the underlying odb_pack_name() function, letting us drop the otherwise obsolete function. Note that odb_pack_name() does its own strbuf_reset(), so it's safe to use directly within a loop like this. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: drop sha1_pack_name()	Jeff King	3	-14/+2
	The sha1_pack_name() function has a few ugly bits: - it writes into a static strbuf (and not even a ring buffer of them), which can lead to subtle invalidation problems - it uses the term "sha1", but it's really using the_hash_algo, which could be sha256 There's only one caller of it left. And in fact that caller is better off using the underlying odb_pack_name() function itself, since it's just copying the result into its own strbuf anyway. Converting that caller lets us get rid of this now-obselete function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: drop has_pack_index()	Jeff King	3	-17/+8
	The has_pack_index() function has several oddities that may make it surprising if you are trying to find out if we have a pack with some $hash: - it is not looking for a valid pack that we found while searching object directories. It just looks for any pack-$hash.idx file in the pack directory. - it only looks in the local directory, not any alternates - it takes a bare "unsigned char" hash, which we try to avoid these days The only caller it has is in the dumb http code; it wants to know if we already have the pack idx in question. This can happen if we downloaded the pack (and generated its index) during a previous fetch. Before the previous patch ("dumb-http: store downloaded pack idx as tempfile"), it could also happen if we downloaded the .idx from the remote but didn't get the matching .pack. But since that patch, we don't hold on to those .idx files. So there's no need to look for the .idx file in the filesystem; we can just scan through the packed_git list to see if we have it. That lets us simplify the dumb http code a bit, as we know that if we have the .idx we have the matching .pack already. And it lets us get rid of this odd function that is unlikely to be needed again. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	dumb-http: store downloaded pack idx as tempfile	Jeff King	3	-7/+41
	This patch fixes a regression in b1b8dfde69 (finalize_object_file(): implement collision check, 2024-09-26) where fetching a v1 pack idx file over the dumb-http protocol would cause the fetch to fail. The core of the issue is that dumb-http stores the idx we fetch from the remote at the same path that will eventually hold the idx we generate from "index-pack --stdin". The sequence is something like this: 0. We realize we need some object X, which we don't have locally, and nor does the other side have it as a loose object. 1. We download the list of remote packs from objects/info/packs. 2. For each entry in that file, we download each pack index and store it locally in .git/objects/pack/pack-$hash.idx (the $hash is not something we can verify yet and is given to us by the remote). 3. We check each pack index we got to see if it has object X. When we find a match, we download the matching .pack file from the remote to a tempfile. We feed that to "index-pack --stdin", which reindexes the pack, rather than trusting that it has what the other side claims it does. In most cases, this will end up generating the exact same (byte-for-byte) pack index which we'll store at the same pack-$hash.idx path, because the index generation and $hash id are computed based on what's in the packfile. But: a. The other side might have used other options to generate the index. For instance we use index v2 by default, but long ago it was v1 (and you can still ask for v1 explicitly). b. The other side might even use a different mechanism to determine $hash. E.g., long ago it was based on the sorted list of objects in the packfile, but we switched to using the pack checksum in 1190a1acf8 (pack-objects: name pack files after trailer hash, 2013-12-05). The regression we saw in the real world was (3a). A recent client fetching from a server with a v1 index downloaded that index, then complained about trying to overwrite it with its own v2 index. This collision is otherwise harmless; we know we want to replace the remote version with our local one, but the collision check doesn't realize that. There are a few options to fix it: - we could teach index-pack a command-line option to ignore only pack idx collisions, and use it when the dumb-http code invokes index-pack. This would be an awkward thing to expose users to and would involve a lot of boilerplate to get the option down to the collision code. - we could delete the remote .idx file right before running index-pack. It should be redundant at that point (since we've just downloaded the matching pack). But it feels risky to delete something from our own .git/objects based on what the other side has said. I'm not entirely positive that a malicious server couldn't lie about which pack-$hash.idx it has and get us to delete something precious. - we can stop co-mingling the downloaded idx files in our local objects directory. This is a slightly bigger change but I think fixes the root of the problem more directly. This patch implements the third option. The big design questions are: where do we store the downloaded files, and how do we manage their lifetimes? There are some additional quirks to the dumb-http system we should consider. Remember that in step 2 we downloaded every pack index, but in step 3 we may only download some of the matching packs. What happens to those other idx files now? They sit in the .git/objects/pack directory, possibly waiting to be used at a later date. That may save bandwidth for a subsequent fetch, but it also creates a lot of weird corner cases: - our local object directory now has semi-untrusted .idx files sitting around, without their matching .pack - in case 3b, we noted that we might not generate the same hash as the other side. In that case even if we download the matching pack, our index-pack invocation will store it in a different pack-$hash.idx file. And the unmatched .idx will sit there forever. - if the server repacks, it may delete the old packs. Now we have these orphaned .idx files sitting around locally that will never be used (nor deleted). - if we repack locally we may delete our local version of the server's pack index and not realize we have it. So we'll download it again, even though we have all of the objects it mentions. I think the right solution here is probably some more complex cache management system: download the remote .idx files to their own storage directory, mark them as "seen" when we get their matching pack (to avoid re-downloading even if we repack), and then delete them when the server's objects/info/refs no longer mentions them. But since the dumb http protocol is so ancient and so inferior to the smart http protocol, I don't think it's worth spending a lot of time creating such a system. For this patch I'm just downloading the idx files to .git/objects/tmp_pack_*, and marking them as tempfiles to be deleted when we exit (and due to the name, any we miss due to a crash, etc, should eventually be removed by "git gc" runs based on timestamps). That is slightly worse for one case: if we download an idx but not the matching pack, we won't retain that idx for subsequent runs. But the flip side is that we're making other cases better (we never hold on to useless idx files forever). I suspect that worse case does not even come up often, since it implies that the packs are generated to match distinct parts of history (i.e., in practice even in a repo with many packs you're going to end up grabbing all of those packs to do a clone). If somebody really cares about that, I think the right path forward is a managed cache directory as above, and this patch is providing the first step in that direction anyway (by moving things out of the objects/pack/ directory). There are two test changes. One demonstrates the broken v1 index case (it double-checks the resulting clone with fsck to be careful, but prior to this patch it actually fails at the clone step). The other tweaks the expectation for a test that covers the "slightly worse" case to accommodate the extra index download. The code changes are fairly simple. We stop using finalize_object_file() to copy the remote's index file into place, and leave it as a tempfile. We give the tempfile a real ".idx" name, since the packfile code expects that, and thus we make sure it is out of the usual packs/ directory (so we'd never mistake it for a real local .idx). We also have to change parse_pack_index(), which creates a temporary packed_git to access our index (we need this because all of the pack idx code assumes we have that struct). It reads the index data from the tempfile, but prior to this patch would speculatively write the finalized name into the packed_git struct using the pack-$hash we expect to use. I was mildly surprised that this worked at all, since we call verify_pack_index() on the packed_git which mentions the final name before moving the file into place! But it works because parse_pack_index() leaves the mmap-ed data in the struct, so the lazy-open in verify_pack_index() never triggers, and we read from the tempfile, ignoring the filename in the struct completely. Hacky, but it works. After this patch, parse_pack_index() now uses the index filename we pass in to derive a matching .pack name. This is OK to change because there are only two callers, both in the dumb http code (and the other passes in an existing pack-$hash.idx name, so the derived name is going to be pack-$hash.pack, which is what we were using anyway). I'll follow up with some more cleanups in that area, but this patch is sufficient to fix the regression. Reported-by: fox <fox.gbr@townlong-yak.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	t5550: count fetches in "previously-fetched .idx" test	Jeff King	1	-2/+16
	We have a test in t5550 that looks at index fetching over dumb http. It creates two branches, each of which is completely stored in its own pack, then fetches the branches independently. What should (and does) happen is that the first fetch grabs both .idx files and one .pack file, and then the fetch of the second branch re-uses the previously downloaded .idx files (fetching none) and grabs the now-required .pack file. Since the next few patches will be touching this area of the code, let's beef up the test a little by checking that we're downloading the expected items at each step. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	midx: avoid duplicate packed_git entries	Jeff King	2	-3/+25
	When we scan a pack directory to load the idx entries we find into the packed_git list, we skip any of them that are contained in a midx. We then load them later lazily if we actually need to access the corresponding pack, referencing them both from the midx struct and the packed_git list. The lazy-load in the midx code checks to see if the midx already mentions the pack, but doesn't otherwise check the packed_git list. This makes sense, since we should have added any pack to both lists. But there's a loophole! If we call close_object_store(), that frees the midx entirely, but _not_ the packed_git structs, which we must keep around for Reasons[1]. If we then try to look up more objects, we'll auto-load the midx again, which won't realize that we've already loaded those packs, and will create duplicate entries in the packed_git list. This is possibly inefficient, because it means we may open and map the pack redundantly. But it can also lead to weird user-visible behavior. The case I found is in "git repack", which closes and reopens the midx after repacking and then calls update_server_info(). We end up writing the duplicate entries into objects/info/packs. We could obviously de-dup them while writing that file, but it seems like a violation of more core assumptions that we end up with these duplicate structs at all. We can avoid the duplicates reasonably efficiently by checking their names in the pack_map hash. This annoyingly does require a little more than a straight hash lookup due to the naming conventions, but it should only happen when we are about to actually open a pack. I don't think one extra malloc will be noticeable there. [1] I'm not entirely sure of all the details, except that we generally assume the packed_git structs never go away. We noted this restriction in the comment added by 6f1e9394e2 (object: fix leaking packfiles when closing object store, 2024-08-08), but it's somewhat vague. At any rate, if you try freeing the structs in close_object_store(), you can observe segfaults all over the test suite. So it might be fixable, but it's not trivial. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	The fifth batch	Taylor Blau	1	-0/+22

2024-10-25	Merge branch 'wm/shortlog-hash'	Taylor Blau	2	-0/+16
	Teaches 'shortlog' to explicitly use SHA-1 when operating outside of a repository. * wm/shortlog-hash: builtin/shortlog: explicitly set hash algo when there is no repo
2024-10-25	Merge branch 'sk/msvc-warnings'	Taylor Blau	3	-12/+21
	Fixes compile time warnings with 64-bit MSVC. * sk/msvc-warnings: mingw.c: Fix complier warnings for a 64 bit msvc
2024-10-25	Merge branch 'jc/a-commands-without-the-repo'	Taylor Blau	4	-10/+10
	Commands that can also work outside Git have learned to take the repository instance "repo" when we know we are in a repository, and NULL when we are not, in a parameter. The uses of the_repository variable in a few of them have been removed using the new calling convention. * jc/a-commands-without-the-repo: archive: remove the_repository global variable annotate: remove usage of the_repository global git: pass in repo to builtin based on setup_git_directory_gently
2024-10-25	Merge branch 'pb/clar-build-fix'	Taylor Blau	1	-0/+1
	Build fix. * pb/clar-build-fix: Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o
2024-10-25	Merge branch 'bf/t-readme-mention-reftable'	Taylor Blau	1	-2/+3
	Doc update. * bf/t-readme-mention-reftable: t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT
2024-10-25	Merge branch 'ak/typofix'	Taylor Blau	9	-14/+14
	More typofixes. * ak/typofix: t: fix typos
2024-10-25	Merge branch 'ak/typofixes'	Taylor Blau	15	-18/+18
	Typofixes. * ak/typofixes: t: fix typos t/helper: fix a typo t/perf: fix typos t/unit-tests: fix typos contrib: fix typos compat: fix typos
2024-10-25	Merge branch 'ps/ci-gitlab-windows'	Taylor Blau	5	-12/+87
	Enable Windows-based CI in GitLab. * ps/ci-gitlab-windows: gitlab-ci: exercise Git on Windows gitlab-ci: introduce stages and dependencies ci: handle Windows-based CI jobs in GitLab CI ci: create script to set up Git for Windows SDK t7300: work around platform-specific behaviour with long paths on MinGW
2024-10-25	Merge branch 'db/submodule-fetch-with-remote-name-fix'	Taylor Blau	2	-1/+28
	A "git fetch" from the superproject going down to a submodule used a wrong remote when the default remote names are set differently between them. * db/submodule-fetch-with-remote-name-fix: submodule: correct remote name with fetch
2024-10-24	imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing	Usman Akinyemi	1	-5/+8
	Replace unsafe uses of atoi() with strtol_i() to improve error handling when parsing UIDVALIDITY, UIDNEXT, and APPENDUID in IMAP commands. Invalid values, such as those with letters, now trigger error messages and prevent malformed status responses. I did not add any test for this commit as we do not have any test for git-imap-send(1) at this point. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24	merge: replace atoi() with strtol_i() for marker size validation	Usman Akinyemi	2	-2/+17
	Replace atoi() with strtol_i() for parsing conflict-marker-size to improve error handling. Invalid values, such as those containing letters now trigger a clear error message. Update the test to verify invalid input handling. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24	daemon: replace atoi() with strtoul_ui() and strtol_i()	Usman Akinyemi	2	-4/+33
	Replace atoi() with strtoul_ui() for --timeout and --init-timeout (non-negative integers) and with strtol_i() for --max-connections (signed integers). This improves error handling and input validation by detecting invalid values and providing clear error messages. Update tests to ensure these arguments are properly validated. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-24	t: fix typos	Andrew Kreimer	26	-37/+37
	Fix typos and grammar in documentation, comments, etc. Via codespell. Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23	t7001: add failure test which triggers assertion	Kristoffer Haugsbakk	1	-0/+12
	`git mv a/a.txt a b/` is a nonsense instruction. Instead of failing gracefully the command trips over itself,[1] leaving behind unfinished work: 1. first it moves `a/a.txt` to `b/a.txt`; then 2. tries to move `a/`, including `a/a.txt`; then 3. figures out that it’s in a bad state (assertion); and finally 4. aborts. Now you’re left with a partially-updated index. The command should instead fail gracefully and make no changes to the index until it knows that it can complete a sensible action. For now just add a failing test since this has been known about for a while.[2] † 1: Caused by a `pos >= 0` assertion [2]: https://lore.kernel.org/git/d1f739fe-b28e-451f-9e01-3d2e24a0fe0d@app.fastmail.com/ Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-23	t9101: ensure no whitespace after redirect	Seyi Kuforiji	1	-17/+17
	This change updates the script to conform to the coding standards outlined in the Git project's documentation. According to the guidelines in Documentation/CodingGuidelines under "Redirection operators", there should be no whitespace after redirection operators. Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22	t7011: ensure no whitespace after redirect	Seyi Kuforiji	1	-11/+11
	This change updates the script to conform to the coding standards outlined in the Git project's documentation. According to the guidelines in Documentation/CodingGuidelines under "Redirection operators", there should be no whitespace after redirection operators. Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22	The third batch	Taylor Blau	1	-0/+8
	Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22	Merge branch 'cw/worktree-relative'	Taylor Blau	8	-64/+312
	An extra worktree attached to a repository points at each other to allow finding the repository from the worktree and vice versa possible. Turn this linkage to relative paths. * cw/worktree-relative: worktree: add test for path handling in linked worktrees worktree: link worktrees with relative paths worktree: refactor infer_backlink() to use *strbuf worktree: repair copied repository and linked worktrees
2024-10-22	Merge branch 'ps/cache-tree-w-broken-index-entry'	Taylor Blau	5	-43/+97
	Fail gracefully instead of crashing when attempting to write the contents of a corrupt in-core index as a tree object. * ps/cache-tree-w-broken-index-entry: unpack-trees: detect mismatching number of cache-tree/index entries cache-tree: detect mismatching number of index entries cache-tree: refactor verification to return error codes
2024-10-22	t6050: avoid pipes with upstream Git commands	Chizoba ODINAKA	1	-47/+86
	In pipes, the exit code of a chain of commands is determined by the final command. In order not to miss the exit code of a failed Git command, avoid pipes instead write output of Git commands into a file. For better debugging experience, instances of "grep" were changed to "test_grep". "test_grep" provides more context in case of a failed "grep". Signed-off-by: Chizoba ODINAKA <chizobajames21@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22	grep: disable lookahead on error	René Scharfe	2	-10/+29
	regexec(3) can fail. E.g. on macOS it fails if it is used with an UTF-8 locale to match a valid regex against a buffer containing invalid UTF-8 characters. git grep has two ways to search for matches in a file: Either it splits its contents into lines and matches them separately, or it matches the whole content and figures out line boundaries later. The latter is done by look_ahead() and it's quicker in the common case where most files don't contain a match. Fall back to line-by-line matching if look_ahead() encounters an regexec(3) error by propagating errors out of patmatch() and bailing out of look_ahead() if there is one. This way we at least can find matches in lines that contain only valid characters. That matches the behavior of grep(1) on macOS. pcre2match() dies if pcre2_jit_match() or pcre2_match() fail, but since we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8 characters gracefully. So implement the fall-back only for regexec(3) and leave the PCRE2 matching unchanged. Reported-by: David Gstir <david@sigma-star.at> Signed-off-by: René Scharfe <l.s.r@web.de> Tested-by: David Gstir <david@sigma-star.at> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-22	t1016: clean up style	Andrew Kreimer	1	-2/+2
	Use `test_config`. Remove whitespace after redirect operator. Reported-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	t4205: fix typo in 'NUL termination with --stat'	Kousik Sanagavarapu	1	-1/+1
	Correct "expected" to rightly terminate with NUL ie '\0' instead of '0' which may have been typoed. We didn't notice this before because the test is run with "test_expect_failure", meaning the test would have been marked broken anyways. Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	SubmittingPatches: tags -> trailers	Kristoffer Haugsbakk	1	-3/+3
	“Trailer” is the preferred nomenclature in this project. Also add a definite article where I think it makes sense. As we can see the rest of the document already prefers this term. This just gets rid of the last stragglers. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation: mutually link update-ref and symbolic-ref	Kristoffer Haugsbakk	2	-0/+8
	These two commands are similar enough to acknowledge each other on their documentation pages. See the previous commit where we discussed that option-less update-ref does not support updating symbolic refs but symbolic-ref does. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation/git-update-ref.txt: discuss symbolic refs	Kristoffer Haugsbakk	1	-0/+6
	Add a paragraph which just emphasizes that the command without any options does not support refs in the final arguments. This is clear already from the names `<new-oid>` and `<old-oid>` but the right balance of redundancy makes documentation robust against stray interpretation. This is also a good place to mention why `--stdin` has those `symref-*` commands. Suggested-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation/git-update-ref.txt: remove confusing paragraph	Kristoffer Haugsbakk	1	-4/+0
	This paragraph interrupts the flow of the section by going into detail about what a symbolic ref file is and how it is implemented. It is not clear what the purpose is since symbolic refs were already mentioned prior (“possibly dereferencing the symbolic refs”). Worse, it can confuse the reader about what argument can be a symbolic ref since it just says “it” and not which of the parameters; in turn the reader can be lead to try `<new-oid>` and then get a confusing error since update-ref will just say that it is not a valid SHA1. gitglossary(7) already documents what a symref is, concretely, and quite well at that. Reported-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation/git-update-ref.txt: demote symlink to last section	Kristoffer Haugsbakk	1	-8/+11
	Move the discussion of file system symbolic links to a new “Notes” section (inspired by the one in git-symbolic-ref(1)) since this is mostly of historical note at this point, not something that is needed in the main section of the documentation. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation/git-update-ref.txt: remove safety paragraphs	Kristoffer Haugsbakk	1	-15/+0
	Remove paragraphs which explain that using this command is safer than echoing the branch name into `HEAD`. Evoking the echo strategy is wrong now under the reftable backend since this file does not exist. And the ref file backend majority user base use porcelain commands to manage `HEAD` unless they are intentionally poking at the implementation. Maybe this warning was relevant for the usage patterns when it was added[1] but now it just takes up space. † 1: 129056370ab (Add missing documentation., 2005-10-04) Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	Documentation/git-update-ref.txt: drop “flag”	Kristoffer Haugsbakk	1	-1/+1
	The other paragraphs on options say “With <option>,”. Let’s be uniform. Also add missing word “that”. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	ref-filter: format iteratively with lexicographic refname sorting	Patrick Steinhardt	1	-8/+21
	In bd98f9774e (ref-filter.c: filter & format refs in the same callback, 2023-11-14), we have introduced logic into the ref-filter subsystem that determines whether or not we can output references iteratively instead of first collecting all references, post-processing them and printing them once done. This has the advantage that we don't have to store all refs in memory and, when used with e.g. `--count=1`, that we don't have to read all refs in the first place. One restriction we have in place for that is that caller must not ask for sorted refs, because there is no way to sort the refs without first reading them all into an array. So the benefits can only be reaped when explicitly asking for output not to be sorted. But there is one exception here where we _can_ get away with sorting refs while streaming: ref backends sort references returned by their iterators in lexicographic order. So if the following conditions are all true we can do iterative streaming: - There must be at most a single sorting specification, as otherwise we're not using plain lexicographic ordering. - The sorting specification must use the "refname". - The sorting specification must not be using any flags, like case-insensitive sorting. Now the resulting logic does feel quite fragile overall, which makes me a bit uneasy. But after thinking about this for a while I couldn't find any obvious gaps in my reasoning. Furthermore, given that lexicographic sorting order is the default in git-for-each-ref(1), this is likely to benefit a whole lot of usecases out there. The following benchmark executes git-for-each-ref(1) in a crafted repo with 1 million references: Benchmark 1: git for-each-ref (revision = HEAD~) Time (mean ± σ): 6.756 s ± 0.014 s [User: 3.004 s, System: 3.541 s] Range (min … max): 6.738 s … 6.784 s 10 runs Benchmark 2: git for-each-ref (revision = HEAD) Time (mean ± σ): 6.479 s ± 0.017 s [User: 2.858 s, System: 3.422 s] Range (min … max): 6.450 s … 6.519 s 10 runs Summary git for-each-ref (revision = HEAD) 1.04 ± 0.00 times faster than git for-each-ref (revision = HEAD~) The change results in a slight performance improvement, but nothing that would really stand out. Something that cannot be seen in the benchmark though is peak memory usage, which went from 404.5MB to 68.96kB. A more interesting benchmark is printing a single referenence with `--count=1`: Benchmark 1: git for-each-ref --count=1 (revision = HEAD~) Time (mean ± σ): 6.655 s ± 0.018 s [User: 2.865 s, System: 3.576 s] Range (min … max): 6.630 s … 6.680 s 10 runs Benchmark 2: git for-each-ref --count=1 (revision = HEAD) Time (mean ± σ): 8.6 ms ± 1.3 ms [User: 2.3 ms, System: 6.1 ms] Range (min … max): 6.7 ms … 14.4 ms 266 runs Summary git git for-each-ref --count=1 (revision = HEAD) 770.58 ± 116.19 times faster than git for-each-ref --count=1 (revision = HEAD~) Whereas we scaled with the number of references before, we now print the first reference and exit immediately, which provides a massive win. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	global: Fix duplicate word typos	Sven Strickroth	11	-11/+11
	Used regex to find these typos: (?<!struct )(?<=\s)([a-z]{1,}) \1(?=\s) Signed-off-by: Sven Strickroth <email@cs-ware.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	notes: teach the -e option to edit messages in editor	Abraham Samuel Adekunle	3	-5/+76
	Notes can be added to a commit using: - "-m" to provide a message on the command line. - -C to copy a note from a blob object. - -F to read the note from a file. When these options are used, Git does not open an editor, it simply takes the content provided via these options and attaches it to the commit as a note. Improve flexibility to fine-tune the note before finalizing it by allowing the messages to be prefilled in the editor and edited after the messages have been provided through -[mF]. Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-21	doc: document how uploadpack.allowAnySHA1InWant impact other allow options	Piotr Szlazak	1	-1/+5
	Document how setting of `uploadpack.allowAnySHA1InWant` influences other `uploadpack` options - `allowTipSHA1InWant` and `allowReachableSHA1InWant`. Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18	credential-cache: treat ECONNABORTED like ECONNRESET	Ramsay Jones	1	-1/+1
	On Cygwin, t0301 fails because "git credential-cache exit" returns a non-zero exit code. What's supposed to happen here is: 1. The client (the "credential-cache" invocation above) connects to a previously-spawned credential-cache--daemon. 2. The client sends an "exit" command to the daemon. 3. The daemon unlinks the socket and then exits, closing the descriptor back to the client. 4. The client sees EOF on the descriptor and exits successfully. That works on most platforms, and even _used_ to work on Cygwin. But that changed in Cygwin's ef95c03522 (Cygwin: select: Fix FD_CLOSE handling, 2021-04-06). After that commit, the client sees a read error with errno set to ECONNABORTED, and it reports the error and dies. It's not entirely clear if this is a Cygwin bug. It seems that calling fclose() on the filehandles pointing to the sockets is sufficient to avoid this error return, even though exiting should in general look the same from the client's perspective. However, we can't just call fclose() here. It's important in step 3 above to unlink the socket before closing the descriptor to avoid the race mentioned by 7d5e9c9849 (credential-cache--daemon: clarify "exit" action semantics, 2016-03-18). The client will exit as soon as it sees the descriptor close, and the daemon may or may not have actually unlinked the socket by then. That makes test code like this: git credential exit && test_path_is_missing .git-credential-cache racy. So we probably _could_ fix this by calling: delete_tempfile(&socket_file); fclose(in); fclose(out); before we exit(). Or by replacing the exit() with a return up the stack, in which case the fclose() happens as we unwind. But in that case we'd still need to call delete_tempfile() here to avoid the race. But simpler still is that we can notice that we already special-case ECONNRESET on the client side, courtesy of 1f180e5eb9 (credential-cache: interpret an ECONNRESET as an EOF, 2017-07-27). We can just do the same thing here (I suspect that prior to the Cygwin commit that introduced this problem, we were really just seeing ECONNRESET instead of ECONNABORTED, so the "new" problem is just the switch of the errno values). There's loads more debugging in this thread: https://lore.kernel.org/git/9dc3e85f-a532-6cff-de11-1dfb2e4bc6b6@ramsayjones.plus.com/ but I've tried to summarize the useful bits in this commit message. [jk: commit message] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18	The third batch	Taylor Blau	1	-0/+13
	Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-18	Merge branch 'ps/maintenance-start-crash-fix'	Taylor Blau	2	-2/+21
	"git maintenance start" crashed due to an uninitialized variable reference, which has been corrected. * ps/maintenance-start-crash-fix: builtin/gc: fix crash when running `git maintenance start`
2024-10-18	Merge branch 'xx/protocol-v2-doc-markup-fix'	Taylor Blau	1	-2/+2
	Docfix. * xx/protocol-v2-doc-markup-fix: Documentation/gitprotocol-v2.txt: fix a slight inconsistency in format
2024-10-18	Merge branch 'tc/bundle-uri-leakfix'	Taylor Blau	1	-5/+13
	Leakfix. * tc/bundle-uri-leakfix: bundle-uri: plug leak in unbundle_from_file()
2024-10-18	Merge branch 'kh/checkout-ignore-other-docfix'	Taylor Blau	2	-5/+5
	Doc updates. * kh/checkout-ignore-other-docfix: checkout: refer to other-worktree branch, not ref
2024-10-18	Merge branch 'kh/merge-tree-doc'	Taylor Blau	1	-3/+9
	Docfix. * kh/merge-tree-doc: doc: merge-tree: improve example script
2024-10-18	Merge branch 'ng/rebase-merges-branch-name-as-label'	Taylor Blau	5	-26/+53
	"git rebase --rebase-merges" now uses branch names as labels when able. * ng/rebase-merges-branch-name-as-label: rebase-merges: try and use branch names as labels rebase-update-refs: extract load_branch_decorations load_branch_decorations: fix memory leak with non-static filters
2024-10-18	Merge branch 'kn/loose-object-layer-wo-global-hash'	Taylor Blau	1	-4/+3
	Code clean-up. * kn/loose-object-layer-wo-global-hash: loose: don't rely on repository global state
2024-10-18	Merge branch 'jc/doc-refspec-syntax'	Taylor Blau	1	-3/+4
	Doc updates. * jc/doc-refspec-syntax: doc: clarify <src> in refspec syntax
2024-10-18	Merge branch 'aa/t7300-modernize'	Taylor Blau	1	-185/+185
	Test modernization. * aa/t7300-modernize: t7300-clean.sh: use test_path_* helper functions for error logging
2024-10-17	reftable: handle trivial `reftable_buf` errors	Patrick Steinhardt	6	-41/+107
	Convert the reftable library such that we handle failures with the new `reftable_buf` interfaces. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable/stack: adapt `stack_filename()` to handle allocation failures	Patrick Steinhardt	1	-17/+45
	The `stack_filename()` function cannot pass any errors to the caller as it has a `void` return type. Adapt it and its callers such that we can handle errors and start handling allocation failures. There are two interesting edge cases in `reftable_stack_destroy()` and `reftable_addition_close()`. Both of these are trying to tear down their respective structures, and while doing so they try to unlink some of the tables they have been keeping alive. Any earlier attempts to do that may fail on Windows because it keeps us from deleting such tables while they are still open, and thus we re-try on close. It's okay and even expected that this can fail when the tables are still open by another process, so we handle the allocation failures gracefully and just skip over any file whose name we couldn't figure out. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable/record: adapt `reftable_record_key()` to handle allocation failures	Patrick Steinhardt	5	-22/+47
	The `reftable_record_key()` function cannot pass any errors to the caller as it has a `void` return type. Adapt it and its callers such that we can handle errors and start handling allocation failures. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable/stack: adapt `format_name()` to handle allocation failures	Patrick Steinhardt	1	-9/+19
	The `format_name()` function cannot pass any errors to the caller as it has a `void` return type. Adapt it and its callers such that we can handle errors and start handling allocation failures. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	t/unit-tests: check for `reftable_buf` allocation errors	Patrick Steinhardt	5	-44/+44
	Adapt our unit tests to check for allocations errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable/blocksource: adapt interface name	Patrick Steinhardt	6	-36/+36
	Adapt the name of the `strbuf` block source to no longer relate to this interface, but instead to the `reftable_buf` interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable: convert from `strbuf` to `reftable_buf`	Patrick Steinhardt	24	-371/+374
	Convert the reftable library to use the `reftable_buf` interface instead of the `strbuf` interface. This is mostly a mechanical change via sed(1) with some manual fixes where functions for `strbuf` and `reftable_buf` differ. The converted code does not yet handle allocation failures. This will be handled in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable/basics: provide new `reftable_buf` interface	Patrick Steinhardt	2	-0/+130
	Implement a new `reftable_buf` interface that will replace Git's own `strbuf` interface. This is done due to three reasons: - The `strbuf` interfaces do not handle memory allocation failures and instead causes us to die. This is okay in the context of Git, but is not in the context of the reftable library, which is supposed to be usable by third-party applications. - The `strbuf` interface is quite deeply tied into Git, which makes it hard to use the reftable library as a standalone library. Any dependent would have to carefully extract the relevant parts of it to make things work, which is not all that sensible. - The `strbuf` interface does not use the pluggable allocators that can be set up via `reftable_set_alloc()`. So we have good reasons to use our own type, and the implementation is rather trivial. Implement our own type. Conversion of the reftable library will be handled in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable: stop using `strbuf_addf()`	Patrick Steinhardt	4	-37/+50
	We're about to introduce our own `reftable_buf` type to replace `strbuf`. One function we'll have to convert is `strbuf_addf()`, which is used in a handful of places. This function uses `snprintf()` internally, which makes porting it a bit more involved: - It is not available on all platforms. - Some platforms like Windows have broken implementations. So by using `snprintf()` we'd also push the burden on downstream users of the reftable library to make available a properly working version of it. Most callsites of `strbuf_addf()` are trivial to convert to not using it. We do end up using `snprintf()` in our unit tests, but that isn't much of a problem for downstream users of the reftable library. While at it, remove a useless call to `strbuf_reset()` in `t_reftable_stack_auto_compaction_with_locked_tables()`. We don't write to the buffer before this and initialize it with `STRBUF_INIT`, so there is no need to reset anything. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	reftable: stop using `strbuf_addbuf()`	Patrick Steinhardt	3	-7/+8
	We're about to introduce our own `reftable_buf` type to replace `strbuf`. Get rid of the seldomly-used `strbuf_addbuf()` function such that we have to reimplement one less function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	t: fix typos	Andrew Kreimer	9	-14/+14
	Fix typos in documentation, comments, etc. Via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	builtin/shortlog: explicitly set hash algo when there is no repo	Wolfgang Müller	2	-0/+16
	Whilst git-shortlog(1) does not explicitly need any repository information when run without reference to one, it still parses some of its arguments with parse_revision_opt() which assumes that the hash algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07) we stopped setting up a default hash algorithm and instead require commands to set it up explicitly. This was done for most other commands like in ab274909d4 (builtin/diff: explicitly set hash algo when there is no repo, 2024-05-07) but was missed for builtin/shortlog, making git-shortlog(1) segfault outside of a repository when given arguments like --author that trigger a call to parse_revision_opt(). Fix this for now by explicitly setting the hash algorithm to SHA1. Also add a regression test for the segfault. Thanks-to: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-17	mingw.c: Fix complier warnings for a 64 bit msvc	Sören Krecker	3	-12/+21
	Remove some complier warnings from msvc in compat/mingw.c for value truncation from 64 bit to 32 bit integers. Compiling compat/mingw.c under a 64 bit version of msvc produces warnings. An "int" is 32 bit, and ssize_t or size_t should be 64 bit long. Prepare compat/vcbuild/include/unistd.h to have a 64 bit type _ssize_t, when _WIN64 is defined and 32 bit otherwise. Further down in this include file, as before, ssize_t is defined as _ssize_t, if needed. Use size_t instead of int for all variables that hold the result of strlen() or wcslen() (which cannot be negative). Use ssize_t to hold the return value of read(). Signed-off-by: Sören Krecker <soekkle@freenet.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	http: fix build error on FreeBSD	Patrick Steinhardt	1	-4/+6
	The `result` parameter passed to `http_request_reauth()` may either point to a `struct strbuf` or a `FILE `, where the `target` parameter tells us which of either it actually is. To accommodate for both types the pointer is a `void `, which we then pass directly to functions without doing a cast. This is fine on most platforms, but it breaks on FreeBSD because `fileno()` is implemented as a macro that tries to directly access the `FILE ` structure. Fix this issue by storing the `FILE ` in a local variable before we pass it on to other functions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	builtin/credential-cache: fix missing parameter for stub function	Patrick Steinhardt	1	-1/+2
	When not compiling the credential cache we may use a stub function for `cmd_credential_cache()`. With commit 9b1cb5070f (builtin: add a repository parameter for builtin functions, 2024-09-13), we have added a new parameter to all of those top-level `cmd_*()` functions, and did indeed adapt the non-stubbed-out `cmd_credential_cache()`. But we didn't adapt the stubbed-out variant, so the code does not compile. Fix this by adding the missing parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t7300: work around platform-specific behaviour with long paths on MinGW	Patrick Steinhardt	1	-1/+1
	Windows by default has a restriction in place to only allow paths up to 260 characters. This restriction can nowadays be lifted by setting a registry key, but is still active by default. In t7300 we have one test that exercises the behaviour of git-clean(1) with such long paths. Interestingly enough, this test fails on my system that uses Windows 10 with mingw-w64 installed via MSYS2: instead of observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent across multiple different environments I have tried. I cannot say why exactly we observe a different error here, but I would not be surprised if this was either dependent on the Windows version, the version of MinGW, the current working directory of Git or any kind of combination of these. Work around the issue by handling both errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin	Patrick Steinhardt	2	-6/+19
	Parsing repositories which contain '[::1]' is broken on Cygwin. It seems as if Cygwin is confusing those as drive letter prefixes or something like this, but I couldn't deduce the actual root cause. Mark those tests as broken for now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t3404: work around platform-specific behaviour on macOS 10.15	Patrick Steinhardt	1	-14/+12
	Two of our tests in t3404 use indented HERE docs where leading tabs on some of the lines are actually relevant. The tabs do get removed though, and we try to fix this up by using sed(1) to replace leading tabs in the actual output, as well. But macOS 10.15 uses an oldish version of sed(1) that has BSD lineage, which does not understand "\t", and thus we fail to strip those leading tabs and fail the test. Address this issue by using `q_to_tab` such that we do not have to strip leading tabs from the actual output. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t1401: make invocation of tar(1) work with Win32-provided one	Patrick Steinhardt	1	-1/+1
	Windows nowadays provides a tar(1) binary in "C:\Windows\system32". This version of tar(1) doesn't seem to handle the case where directory paths end with a trailing forward slash. And as we do that in t1401 the result is that the test fails. Drop the trailing slash. Other tests that use tar(1) work alright, this is the only instance where it has been failing. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t/lib-gpg: fix setup of GNUPGHOME in MinGW	Patrick Steinhardt	1	-1/+1
	In "t/lib-gpg.sh" we set up the "GNUPGHOME" environment variable to point to a test-specific directory. This is done by using "$PWD/gpghome" as value, where "$PWD" is the current test's trash directory. This is broken for MinGW though because "$PWD" will use Windows-style paths that contain drive letters. What we really want in this context is a Unix-style path, which we can get by using `$(pwd)` instead. It is somewhat puzzling that nobody ever hit this issue, but it may easily be that nobody ever tests on Windows with GnuPG installed, which would make us skip those tests. Adapt the code accordingly to fix tests using this library. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t/lib-gitweb: test against the build version of gitweb	Patrick Steinhardt	1	-2/+2
	When testing gitweb we set up the CGI script as "gitweb.perl", which is the source file of the build target "gitweb.cgi". This file doesn't have a patched shebang and still contains `++REPLACEMENT++` markers, but things generally work because we replace the configuration with our own test configuration. But this only works as long as "$GIT_BUILD_DIR" actually points to the source tree, because "gitweb.cgi" and "gitweb.perl" happen to sit next to each other. This is not the case though once you have out-of-tree builds like with CMake, where the source and built versions live in different directories. Consequently, "$GIT_BUILD_DIR/gitweb/gitweb.perl" won't exist there. While we could ask build systems with out-of-tree builds to instead set up GITWEB_TEST_INSTALLED, which allows us to override the location of the script, it goes against the spirit of this environment variable. We _don't_ want to test against an installed version, we want to use the version we have just built. Fix this by using "gitweb.cgi" instead. This means that you cannot run test scripts without building that file, but in general we do expect developers to build stuff before they test it anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t/test-lib: wire up NO_ICONV prerequisite	Patrick Steinhardt	22	-106/+229
	The iconv library is used by Git to reencode files, commit messages and other things. As such it is a rather integral part, but given that many platforms nowadays use UTF-8 everywhere you can live without support for reencoding in many situations. It is thus optional to build Git with iconv, and some of our platforms wired up in "config.mak.uname" disable it. But while we support building without it, running our test suite with "NO_ICONV=Yes" causes many test failures. Wire up a new test prerequisite ICONV that gets populated via our GIT-BUILD-OPTIONS. Annotate failing tests accordingly. Note that this commit does not do a deep dive into every single test to assess whether the failure is expected or not. Most of the tests do smell like the expected kind of failure though. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-16	t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE	Patrick Steinhardt	1	-1/+1
	When assembling our LSAN_OPTIONS that configure the leak sanitizer we end up prepending the string with various different colon-separated options via calls to `prepend_var`. One of the settings we add is the path where the sanitizer should store logs, which can be an arbitrary filesystem path. Naturally, filesystem paths may contain whitespace characters. And while it does seem as if we were quoting the value, we use escaped quotes and consequently split up the value if it does contain spaces. This leads to the following error in t0000 when having a value with whitespaces: .../t/test-lib.sh: eval: line 64: unexpected EOF while looking for matching `"' ++ return 1 error: last command exited with $?=1 not ok 5 - subtest: 3 passing tests The error itself is a bit puzzling at first. The basic problem is that the code sees the leading escaped quote during eval, but because we truncate everything after the space character it doesn't see the trailing escaped quote and thus fails to parse the string. Properly quote the value to fix the issue while using single-quotes to quote the inner value passed to eval. The issue can be reproduced by t0000 with such a path that contains spaces. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-15	The second batch	Taylor Blau	1	-0/+12
	Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-15	Merge branch 'jk/fsmonitor-event-listener-race-fix'	Taylor Blau	7	-18/+98
	On macOS, fsmonitor can fall into a race condition that results in a client waiting forever to be notified for an event that have already happened. This problem has been corrected. * jk/fsmonitor-event-listener-race-fix: fsmonitor: initialize fs event listener before accepting clients simple-ipc: split async server initialization and running
2024-10-15	Merge branch 'xx/remote-server-option-config'	Taylor Blau	12	-8/+184
	A new configuration variable remote.<name>.serverOption makes the transport layer act as if the --serverOption=<value> option is given from the command line. * xx/remote-server-option-config: ls-remote: leakfix for not clearing server_options fetch: respect --server-option when fetching multiple remotes transport.c::handshake: make use of server options from remote remote: introduce remote.<name>.serverOption configuration transport: introduce parse_transport_option() method
2024-10-15	Merge branch 'js/doc-platform-support-link-fix'	Taylor Blau	1	-2/+2
	Docfix. * js/doc-platform-support-link-fix: docs: fix the `maintain-git` links in `technical/platform-support`
2024-10-15	Merge branch 'jh/config-unset-doc-fix'	Taylor Blau	2	-3/+3
	Docfix. * jh/config-unset-doc-fix: git-config.1: remove value from positional args in unset usage
2024-10-14	t3404: replace test with test_line_count()	Usman Akinyemi	1	-14/+14
	Refactor t3404 to replace instances of `test` with `test_line_count()` for checking line counts. This improves readability and aligns with Git's current test practices. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-14	t3404: avoid losing exit status with focus on `git show` and `git cat-file`	Usman Akinyemi	1	-23/+48
	The exit code of the preceding command in a pipe is disregarded. So if that preceding command is a Git command that fails, the test would not fail. Instead, by saving the output of that Git command to a file, and removing the pipe, we make sure the test will fail if that Git command fails. This particular patch focuses on all `git show` and some instances of `git cat-file`. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-11	BreakingChanges: early adopter option	Junio C Hamano	1	-1/+20
	Discussing the desire to make breaking changes, declaring that breaking changes are made at a certain version boundary, and recording these decisions in this document, are necessary but not sufficient. We need to make sure that we can implement, test, and deploy such impactful changes. Earlier we considered to guard the breaking changes with a run-time check of the `feature.git<version>` configuration to allow brave users and developers to opt into them as early adoptors. But the engineering cost to support such a run-time switch, covering new and disappearing git subcommands and how "git help" would adjust the documentation to the run-time switch, would be unrealistically high to be worth it. Formalize the mechanism based on a compile-time switch to allow early adopters to opt into the breaking change in a version of Git before the planned version for the breaking change. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11	t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT	Bence Ferdinandy	1	-2/+3
	The documentation only lists "files" as a possible value, but "reftable" is also valid. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11	Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o	Philippe Blain	1	-0/+1
	The clar source file '$(UNIT_TEST_DIR)/clar/clar.c' includes the generated 'clar.suite', but this dependency is not taken into account by our Makefile, so that it is possible for a parallel build to fail if Make tries to build 'clar.o' before 'clar.suite' is generated. Correctly specify the dependency. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11	archive: remove the_repository global variable	John Cai	1	-3/+2
	As part of the effort to get rid of global state due to the global the_repository variable, replace the_repository with the repository argument that gets passed down through the builtin function. The repo might be NULL, but we should be safe in write_archive() because it detects if we are outside of a repository and calls setup_git_directory() which will error. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11	annotate: remove usage of the_repository global	John Cai	1	-3/+2
	As part of the effort to get rid of global state due to the_repository variable, remove the the_repository with the repository argument that gets passed down through the builtin function. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-11	git: pass in repo to builtin based on setup_git_directory_gently	John Cai	2	-4/+6
	The current code in run_builtin() passes in a repository to the builtin based on whether cmd_struct's option flag has RUN_SETUP. This is incorrect, however, since some builtins that only have RUN_SETUP_GENTLY can potentially take a repository. setup_git_directory_gently() tells us whether or not a command is being run inside of a repository. Use the output of setup_git_directory_gently() to help determine whether or not there is a repository to pass to the builtin. If not, then we just pass NULL. As part of this patch, we need to modify add to check for a NULL repo before calling repo_git_config(), since add -h can be run outside of a repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	Start the 2.48 cycle	Junio C Hamano	3	-2/+37
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	Merge branch 'jk/output-prefix-cleanup'	Junio C Hamano	7	-43/+28
	Code clean-up. * jk/output-prefix-cleanup: diff: store graph prefix buf in git_graph struct diff: return line_prefix directly when possible diff: return const char from output_prefix callback diff: drop line_prefix_length field line-log: use diff_line_prefix() instead of custom helper
2024-10-10	Merge branch 'ps/leakfixes-part-8'	Junio C Hamano	62	-123/+329
	More leakfixes. * ps/leakfixes-part-8: (23 commits) builtin/send-pack: fix leaking list of push options remote: fix leaking push reports t/helper: fix leaks in proc-receive helper pack-write: fix return parameter of `write_rev_file_order()` revision: fix leaking saved parents revision: fix memory leaks when rewriting parents midx-write: fix leaking buffer pack-bitmap-write: fix leaking OID array pseudo-merge: fix leaking strmap keys pseudo-merge: fix various memory leaks line-log: fix several memory leaks diff: improve lifecycle management of diff queues builtin/revert: fix leaking `gpg_sign` and `strategy` config t/helper: fix leaking repository in partial-clone helper builtin/clone: fix leaking repo state when cloning with bundle URIs builtin/pack-redundant: fix various memory leaks builtin/stash: fix leaking `pathspec_from_file` submodule: fix leaking submodule entry list wt-status: fix leaking buffer with sparse directories shell: fix leaking strings ...
2024-10-10	Merge branch 'ds/line-log-asan-fix'	Junio C Hamano	2	-6/+32
	Use after free and double freeing at the end in "git log -L... -p" had been identified and fixed. * ds/line-log-asan-fix: line-log: protect inner strbuf from free
2024-10-10	Merge branch 'sk/doc-maintenance-schedule'	Junio C Hamano	1	-1/+3
	Doc update to clarify how periodical maintenance are scheduled, spread across time to avoid thundering hurds. * sk/doc-maintenance-schedule: doc: add a note about staggering of maintenance
2024-10-10	Merge branch 'tb/notes-amlog-doc'	Junio C Hamano	1	-0/+44
	Document "amlog" notes. * tb/notes-amlog-doc: Documentation: mention the amlog in howto/maintain-git.txt
2024-10-10	Merge branch 'ps/reftable-alloc-failures'	Junio C Hamano	38	-408/+893
	The reftable library is now prepared to expect that the memory allocation function given to it may fail to allocate and to deal with such an error. * ps/reftable-alloc-failures: (26 commits) reftable/basics: fix segfault when growing `names` array fails reftable/basics: ban standard allocator functions reftable: introduce `REFTABLE_FREE_AND_NULL()` reftable: fix calls to free(3P) reftable: handle trivial allocation failures reftable/tree: handle allocation failures reftable/pq: handle allocation failures when adding entries reftable/block: handle allocation failures reftable/blocksource: handle allocation failures reftable/iter: handle allocation failures when creating indexed table iter reftable/stack: handle allocation failures in auto compaction reftable/stack: handle allocation failures in `stack_compact_range()` reftable/stack: handle allocation failures in `reftable_new_stack()` reftable/stack: handle allocation failures on reload reftable/reader: handle allocation failures in `reader_init_iter()` reftable/reader: handle allocation failures for unindexed reader reftable/merged: handle allocation failures in `merged_table_init_iter()` reftable/writer: handle allocation failures in `reftable_new_writer()` reftable/writer: handle allocation failures in `writer_index_hash()` reftable/record: handle allocation failures when decoding records ...
2024-10-10	Merge branch 'ja/doc-synopsis-markup'	Junio C Hamano	8	-107/+209
	The way AsciiDoc is used for SYNOPSIS part of the manual pages has been revamped. The sources, at least for the simple cases, got vastly pleasant to work with. * ja/doc-synopsis-markup: doc: apply synopsis simplification on git-clone and git-init doc: update the guidelines to reflect the current formatting rules doc: introduce a synopsis typesetting
2024-10-10	t: fix typos	Andrew Kreimer	4	-4/+4
	Fix typos via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	t/helper: fix a typo	Andrew Kreimer	1	-1/+1
	Fix a typo in comments: bellow -> below. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	t/perf: fix typos	Andrew Kreimer	2	-3/+3
	Fix typos via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	t/unit-tests: fix typos	Andrew Kreimer	2	-2/+2
	Fix typos via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	contrib: fix typos	Andrew Kreimer	3	-4/+4
	Fix typos via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	compat: fix typos	Andrew Kreimer	3	-4/+4
	Fix typos and grammar. Reported-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	checkout: refer to other-worktree branch, not ref	Kristoffer Haugsbakk	2	-5/+5
	We can only check out commits or branches, not refs in general. And the problem here is if another worktree is using the branch that we want to check out. Let’s be more direct and just talk about branches instead of refs. Also replace “be held” with “in use”. Further, “in use” is not restricted to a branch being checked out (e.g. the branch could be busy on a rebase), hence generalize to “or otherwise in use” in the option description. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	Documentation/gitprotocol-v2.txt: fix a slight inconsistency in format	Xing Xin	1	-2/+2
	Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Acked-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	bundle-uri: plug leak in unbundle_from_file()	Toon Claes	1	-5/+13
	The function `unbundle_from_file()` has two memory leaks: - We do not release the `struct bundle_header header` when hitting errors because we return early without any cleanup. - We do not release the `struct strbuf bundle_ref` at all. Plug these leaks by creating a common exit path where both of these variables are released. While at it, refactor the code such that the variable assignments do not happen inside the conditional statement itself according to our coding style. Signed-off-by: Toon Claes <toon@iotcl.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10	builtin/gc: fix crash when running `git maintenance start`	Patrick Steinhardt	2	-2/+21
	It was reported on the mailing list that running `git maintenance start` immediately segfaults starting with b6c3f8e12c (builtin/maintenance: fix leak in `get_schedule_cmd()`, 2024-09-26). And indeed, this segfault is trivial to reproduce up to a point where one is scratching their head why we didn't catch this regression in our test suite. The root cause of this error is `get_schedule_cmd()`, which does not populate the `out` parameter in all cases anymore starting with the mentioned commit. Callers do assume it to always be populated though and will e.g. call `strvec_split()` on the returned value, which will of course segfault when the variable is uninitialized. So why didn't we catch this trivial regression? The reason is that our tests always set up the "GIT_TEST_MAINT_SCHEDULER" environment variable via "t/test-lib.sh", which allows us to override the scheduler command with a custom one so that we don't accidentally modify the developer's system. But the faulty code where we don't set the `out` parameter will only get hit in case that environment variable is _not_ set, which is never the case when executing our tests. Fix the regression by again unconditionally allocating the value in the `out` parameter, if provided. Add a test that unsets the environment variable to catch future regressions in this area. Reported-by: Shubham Kanodia <shubham.kanodia10@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	doc: clarify <src> in refspec syntax	Junio C Hamano	1	-3/+4
	We explicitly avoid saying "ref <src>" when introducing the source side of a refspec, because it can be a fully-spelled hexadecimal object name, and it also can be a pattern that is not quite a "ref". But we are loose when we introduce <dst> and say "ref <dst>", even though it can also be a pattern. Let's omit "ref" also from the destination side. Clarify that <src> can be a ref, a (limited glob) pattern, or an object name. Even though the very original design of refspec expected that '' was used only at the end (e.g., "refs/heads/" was expected, but not "refs/heads/-wip"), the code and its use evolved to handle a single '' anywhere in the pattern. Update the text to remove the mention of "the same prefix". Anything that matches the pattern are named by such a (limited glob) pattern in <src>. Also put a bit more stress on the fact that we accept only one '' in the pattern by saying "one and only one ``". Helped-by: Monika Kairaitytė <monika@kibit.lt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	t7300-clean.sh: use test_path_* helper functions for error logging	Abraham Samuel Adekunle	1	-185/+185
	This test script uses "test - [def]", but when a test fails because the file passed to it does not exist, it fails silently without an error message. Use test_path_* helper functions, which are designed to give better error messages when their expectations are not met. I have added a mechanical validation that applies the same transformation done in this patch, when the test script is passed to a sed script as shown below. sed -e 's/^$ $test -f /\1test_path_is_file /' \ -e 's/^$ $test -d /\1test_path_is_dir /' \ -e 's/^$ $test -e /\1test_path_exists /' \ -e 's/^$ $! test -[edf] /\1test_path_is_missing /' \ -e 's/^$ *$test ! -[edf] /\1test_path_is_missing /' \ "$1" >foo.sh Reviewers can use the sed script to tranform the original test script and compare the result in foo.sh with the results of applying the patch. You will see an instance of "!(test -e 3)" which was manually replaced with ""test_path_is_missing 3", and everything else should match. Careful and deliberate observation was done to check instances where "test ! - [df] foo" was used in the test script to make sure that the test instances were expecting foo to EITHER be a file or a directory, and NOT a possibility of being both as this would make replacing "test ! -f foo" with "test_path_is_missing foo" unreasonable. In the tests control flow, foo has been created as EITHER a reguar file OR a directory and should NOT exist after "git clean" or "git clean -d", as the case maybe, has been called. This made it reasonable to replace "test ! -[df] foo" with "test_path_is_missing foo". Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	loose: don't rely on repository global state	Karthik Nayak	1	-4/+3
	In `loose.c`, we rely on the global variable `the_hash_algo`. As such we have guarded the file with the 'USE_THE_REPOSITORY_VARIABLE' definition. Let's derive the hash algorithm from the available repository variable and remove this guard. This brings us one step closer to removing the global 'the_repository' variable. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	gitlab-ci: exercise Git on Windows	Patrick Steinhardt	1	-0/+33
	Add jobs that exercise Git on Windows. Unfortunately, building and especially testing Git on Windows is inherently slower compared to other Unix-like systems, mostly because spawning processes is way slower. We thus use the same layout as we use in GitHub Actions, where we have one build job, and then pass on the resulting build artifacts to ten test jobs that split up the work across each other. Unfortunately, the GitLab runners for Windows machines are embarassingly slow by themselves. So while this strategy leads to around 20 minutes of build time in GitHub Actions, the same pipeline takes around an hour in GitLab CI. Still, having late coverage is certainly better than having none at all. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	gitlab-ci: introduce stages and dependencies	Patrick Steinhardt	1	-0/+18
	We're about to add a couple of jobs for Windows. As the Windows runners are quite slow, we will split those up across two stages: one stage to build the artifacts, and one stage that runs test slices in parallel. Introduce stages and "needs" dependencies for the preexisting jobs as a preparatory step. The stages will lead to a more natural representation of jobs in the UI, whereas the "needs" dependency ensures that jobs do not have to wait for all jobs in the preceding stage to finish. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	ci: handle Windows-based CI jobs in GitLab CI	Patrick Steinhardt	1	-5/+13
	We try to abstract away any differences between different CI platforms in "ci/lib.sh", such that knowledge specific to e.g. GitHub Actions or GitLab CI is neatly encapsulated in a single place. Next to some generic variables, we also set up some variables that are specific to the actual platform that the CI operates on, e.g. Linux or macOS. We do not yet support Windows runners on GitLab CI. Unfortunately, those systems do not use the same "CI_JOB_IMAGE" environment variable as both Linux and macOS do. Instead, we can use the "OS" variable, which should have a value of "Windows_NT" on Windows platforms. Handle the combination of "$OS,$CI_JOB_IMAGE" and introduce support for Windows. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	ci: create script to set up Git for Windows SDK	Patrick Steinhardt	2	-6/+22
	In order to build and test Git, we have to first set up the Git for Windows SDK, which contains various required tools and libraries. The SDK is basically a clone of [1], but that repository is quite large due to all the binaries it contains. We thus use both shallow clones and sparse checkouts to speed up the setup. To handle this complexity we use a GitHub action that is hosted externally at [2]. Unfortunately, this makes it rather hard to reuse the logic for CI platforms other than GitHub Actions. After chatting with Johannes Schindelin we came to the conclusion that it would be nice if the Git for Windows SDK would regularly publish releases that one can easily download and extract, thus moving all of the complexity into that single step. Like this, all that a CI job needs to do is to fetch and extract the resulting archive. This published release comes in the form of a new "ci-artifacts" tag that gets updated regularly [3]. Implement a new script that knows how to fetch and extract that script and convert GitHub Actions to use it. [1]: https://github.com/git-for-windows/git-sdk-64/ [2]: https://github.com/git-for-windows/setup-git-for-windows-sdk/ [3]: https://github.com/git-for-windows/git-sdk-64/releases/tag/ci-artifacts/ Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	t7300: work around platform-specific behaviour with long paths on MinGW	Patrick Steinhardt	1	-1/+1
	Windows by default has a restriction in place to only allow paths up to 260 characters. This restriction can nowadays be lifted by setting a registry key, but is still active by default. In t7300 we have one test that exercises the behaviour of git-clean(1) with such long paths. Interestingly enough, this test fails on my system that uses Windows 10 with mingw-w64 installed via MSYS2: instead of observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent across multiple different environments I have tried. I cannot say why exactly we observe a different error here, but I would not be surprised if this was either dependent on the Windows version, the version of MinGW, the current working directory of Git or any kind of combination of these. Work around the issue by handling both errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	rebase-merges: try and use branch names as labels	Nicolas Guichard	3	-16/+25
	When interactively rebasing merge commits, the commit message is parsed to extract a probably meaningful label name. For instance if the merge commit is “Merge branch 'feature0'”, then the rebase script will have thes lines: ``` label feature0 merge -C $sha feature0 # “Merge branch 'feature0' ``` This heuristic fails in the case of octopus merges or when the merge commit message is actually unrelated to the parent commits. An example that combines both is: ``` ---. 967bfa4 (HEAD -> integration) Integration \|\ \ \ \| \| \| 2135be1 (feature2, feat2) Feature 2 \| \|_\|/ \|/\| \| \| \| * c88b01a Feature 1 \| \|/ \|/\| \| * 75f3139 (feat0) Feature 0 \|/ * 25c86d0 (main) Initial commit ``` yields the labels Integration, Integration-2 and Integration-3. Fix this by using a branch name for each merge commit's parent that is the tip of at least one branch, and falling back to a label derived from the merge commit message otherwise. In the example above, the labels become feat0, Integration and feature2. Signed-off-by: Nicolas Guichard <nicolas@guichard.eu> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	rebase-update-refs: extract load_branch_decorations	Nicolas Guichard	3	-10/+23
	Extract load_branch_decorations from todo_list_add_update_ref_commands so it can be re-used in make_script_with_merges. Since it can now be called multiple times, use non-static lists and place it next to load_ref_decorations to re-use the decoration_loaded guard. Signed-off-by: Nicolas Guichard <nicolas@guichard.eu> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	load_branch_decorations: fix memory leak with non-static filters	Nicolas Guichard	1	-0/+5
	load_branch_decorations calls normalize_glob_ref on each string of filter's string_lists. This effectively replaces the potentially non-owning char* of those items with an owning char. Set the strdup_string flag on those string_lists. This was not caught until now because: - when passing string_lists already with the strdup_string already set, the behaviour was correct - when passing static string_lists, the new char remain reachable until program exit Signed-off-by: Nicolas Guichard <nicolas@guichard.eu> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	submodule: correct remote name with fetch	Daniel Black	2	-1/+28
	The code fetches the submodules remote based on the superproject remote name instead of the submodule remote name[1]. Instead of grabbing the default remote of the superproject repository, ask the default remote of the submodule we are going to run 'git fetch' in. 1. https://lore.kernel.org/git/ZJR5SPDj4Wt_gmRO@pweza/ Signed-off-by: Daniel Black <daniel@mariadb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-09	doc: merge-tree: improve example script	Kristoffer Haugsbakk	1	-3/+9
	• Provide a commit message in the example command. The command will hang since it is waiting for a commit message on stdin. Which is usable but not straightforward enough since this is example code. • Use `\|\|` directly since that is more straightforward than checking the last exit status. Also use `echo` and `exit` since `die` is not defined. • Expose variable declarations. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	git-config.1: remove value from positional args in unset usage	Josh Heinrichs	2	-3/+3
	The synopsis for `git config unset` mentions two positional arguments: `<name>` and `<value>`. While the first argument is correct, the second is not. Users are expected to provide the value via `--value=<value>`. Remove the positional argument. The `--value=<value>` option is already documented correctly, so this is all we need to do to fix the documentation. Signed-off-by: Josh Heinrichs <joshiheinrichs@gmail.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	fsmonitor: initialize fs event listener before accepting clients	Jeff King	3	-2/+12
	There's a racy hang in fsmonitor on macOS that we sometimes see in CI. When we serve a client, what's supposed to happen is: 1. The client thread calls with_lock__wait_for_cookie() in which we create a cookie file and then wait for a pthread_cond event 2. The filesystem event listener sees the cookie file creation, does some internal book-keeping, and then triggers the pthread_cond. But there's a problem: we start the listener that accepts client threads before we start the fs event thread. So it's possible for us to accept a client which creates the cookie file and starts waiting before the fs event thread is initialized, and we miss those filesystem events entirely. That leaves the client thread hanging forever. In CI, the symptom is that t9210 (which is testing scalar, which always enables fsmonitor under the hood) may hang forever in "scalar clone". It is waiting on "git fetch" which is waiting on the fsmonitor daemon. The race happens more frequently under load, but you can trigger it predictably with a sleep like this, which delays the start of the fs event thread: --- a/compat/fsmonitor/fsm-listen-darwin.c +++ b/compat/fsmonitor/fsm-listen-darwin.c @@ -510,6 +510,7 @@ void fsm_listen__loop(struct fsmonitor_daemon_state *state) FSEventStreamSetDispatchQueue(data->stream, data->dq); data->stream_scheduled = 1; + sleep(1); if (!FSEventStreamStart(data->stream)) { error(_("Failed to start the FSEventStream")); goto force_error_stop_without_loop; One solution might be to reverse the order of initialization: start the fs event thread before we start the thread listening for clients. But the fsmonitor code explicitly does it in the opposite direction. The fs event thread wants to refer to the ipc_server_data struct, so we need it to be initialized first. A further complication is that we need a signal from the fs event thread that it is actually ready and listening. And those details happen within backend-specific fsmonitor code, whereas the initialization is in the shared code. So instead, let's use the ipc_server init/start split added in the previous commit. The generic fsmonitor code will init the ipc_server but _not_ start it, leaving that to the backend specific code, which now needs to call ipc_server_start_async() at the right time. For macOS, that is right after we start the FSEventStream that you can see in the diff above. It's not clear to me if Windows suffers from the same problem (and we simply don't trigger it in CI), or if it is immune. Regardless, the obvious place to start accepting clients there is right after we've established the ReadDirectoryChanges watch. This makes the hangs go away in our macOS CI environment, even when compiled with the sleep() above. Helped-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Jeff King <peff@peff.net> Acked-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	simple-ipc: split async server initialization and running	Jeff King	5	-18/+88
	To start an async ipc server, you call ipc_server_run_async(). That initializes the ipc_server_data object, and starts all of the threads running, which may immediately start serving clients. This can create some awkward timing problems, though. In the fsmonitor daemon (the sole user of the simple-ipc system), we want to create the ipc server early in the process, which means we may start serving clients before the rest of the daemon is fully initialized. To solve this, let's break run_async() into two parts: an initialization which allocates all data and spawns the threads (without letting them run), and a start function which actually lets them begin work. Since we have two simple-ipc implementations, we have to handle this twice: - in ipc-unix-socket.c, we have a central listener thread which hands connections off to worker threads using a work_available mutex. We can hold that mutex after init, and release it when we're ready to start. We do need an extra "started" flag so that we know whether the main thread is holding the mutex or not (e.g., if we prematurely stop the server, we want to make sure all of the worker threads are released to hear about the shutdown). - in ipc-win32.c, we don't have a central mutex. So we'll introduce a new startup_barrier mutex, which we'll similarly hold until we're ready to let the threads proceed. We again need a "started" flag here to make sure that we release the barrier mutex when shutting down, so that the sub-threads can proceed to the finish. I've renamed the run_async() function to init_async() to make sure we catch all callers, since they'll now need to call the matching start_async(). We could leave run_async() as a wrapper that does both, but there's not much point. There are only two callers, one of which is fsmonitor, which will want to actually do work between the two calls. And the other is just a test-tool wrapper. For now I've added the start_async() calls in fsmonitor where they would otherwise have happened, so there should be no behavior change with this patch. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	worktree: add test for path handling in linked worktrees	Caleb White	1	-0/+19
	A failure scenario reported in an earlier patch series[1] that several `git worktree` subcommands failed or misbehaved when invoked from within linked worktrees that used relative paths. This adds a test that executes a `worktree prune` command inside both an internally and an externally linked worktree and asserts that the other worktree was not pruned. [1]: https://lore.kernel.org/git/CAPig+cQXFy=xPVpoSq6Wq0pxMRCjS=WbkgdO+3LySPX=q0nPCw@mail.gmail.com/ Reported-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	worktree: link worktrees with relative paths	Caleb White	5	-51/+223
	Git currently stores absolute paths to both the main repository and linked worktrees. However, this causes problems when moving repositories or working in containerized environments where absolute paths differ between systems. The worktree links break, and users are required to manually execute `worktree repair` to repair them, leading to workflow disruptions. Additionally, mapping repositories inside of containerized environments renders the repository unusable inside the containers, and this is not repairable as repairing the worktrees inside the containers will result in them being broken outside the containers. To address this, this patch makes Git always write relative paths when linking worktrees. Relative paths increase the resilience of the worktree links across various systems and environments, particularly when the worktrees are self-contained inside the main repository (such as when using a bare repository with worktrees). This improves portability, workflow efficiency, and reduces overall breakages. Although Git now writes relative paths, existing repositories with absolute paths are still supported. There are no breaking changes to workflows based on absolute paths, ensuring backward compatibility. At a low level, the changes involve modifying functions in `worktree.c` and `builtin/worktree.c` to use `relative_path()` when writing the worktree’s `.git` file and the main repository’s `gitdir` reference. Instead of hardcoding absolute paths, Git now computes the relative path between the worktree and the repository, ensuring that these links are portable. Locations where these respective file are read have also been updated to properly handle both absolute and relative paths. Generally, relative paths are always resolved into absolute paths before any operations or comparisons are performed. Additionally, `repair_worktrees_after_gitdir_move()` has been introduced to address the case where both the `<worktree>/.git` and `<repo>/worktrees/<id>/gitdir` links are broken after the gitdir is moved (such as during a re-initialization). This function repairs both sides of the worktree link using the old gitdir path to reestablish the correct paths after a move. The `worktree.path` struct member has also been updated to always store the absolute path of a worktree. This ensures that worktree consumers never have to worry about trying to resolve the absolute path themselves. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	worktree: refactor infer_backlink() to use *strbuf	Caleb White	1	-24/+24
	This lays the groundwork for the next patch, which needs the backlink returned from infer_backlink() as a `strbuf`. It seemed inefficient to convert from `strbuf` to `char` and back to `strbuf` again. This refactors infer_backlink() to return an integer result and use a pre-allocated `strbuf` for the inferred backlink path, replacing the previous `char` return type and improving efficiency. Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	Merge branch 'es/worktree-repair-copied' into cw/worktrees-relative	Junio C Hamano	3	-2/+59
	* es/worktree-repair-copied: worktree: repair copied repository and linked worktrees
2024-10-08	ls-remote: leakfix for not clearing server_options	Xing Xin	1	-0/+1
	Ensure `server_options` is properly cleared using `string_list_clear()` in `builtin/ls-remote.c:cmd_ls_remote`. Although we cannot yet enable `TEST_PASSES_SANITIZE_LEAK=true` for `t/t5702-protocol-v2.sh` due to other existing leaks, this fix ensures that "git-ls-remote" related server options tests pass the sanitize leak check: ... ok 12 - server-options are sent when using ls-remote ok 13 - server-options from configuration are used by ls-remote ... Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	fetch: respect --server-option when fetching multiple remotes	Xing Xin	2	-0/+12
	Fix an issue where server options specified via the command line (`--server-option` or `-o`) were not sent when fetching from multiple remotes using Git protocol v2. To reproduce the issue with a repository containing multiple remotes: GIT_TRACE_PACKET=1 git -c protocol.version=2 fetch --server-option=demo --all Observe that no server options are sent to any remote. The root cause was identified in `builtin/fetch.c:fetch_multiple`, which is invoked when fetching from more than one remote. This function forks a `git-fetch` subprocess for each remote but did not include the specified server options in the subprocess arguments. This commit ensures that command-line specified server options are properly passed to each subprocess. Relevant tests have been added. Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	transport.c::handshake: make use of server options from remote	Xing Xin	5	-0/+135
	Utilize the `server_options` from the corresponding remote during the handshake in `transport.c` when Git protocol v2 is detected. This helps initialize the `server_options` in `transport.h:transport` if no server options are set for the transport (typically via `--server-option` or `-o`). While another potential place to incorporate server options from the remote is in `transport.c:transport_get`, setting server options for a transport using a protocol other than v2 could lead to unexpected errors (see `transport.c:die_if_server_options`). Relevant tests and documentation have been updated accordingly. Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	remote: introduce remote.<name>.serverOption configuration	Xing Xin	3	-0/+19
	Currently, server options for Git protocol v2 can only be specified via the command line option "--server-option" or "-o", which is inconvenient when users want to specify a list of default options to send. Therefore, we are introducing a new configuration to hold a list of default server options, akin to the `push.pushOption` configuration for push options. Initially, I named the new configuration `fetch.serverOption` to align with `push.pushOption`. However, after discussing with Patrick, it was renamed to `remote.<name>.serverOption` as suggested, because: 1. Server options are designed to be server-specific, making it more logical to use a per-remote configuration. 2. Using "fetch." prefixed configurations in git-clone or git-ls-remote seems out of place and inconsistent in design. The parsing logic for `remote.<name>.serverOption` also relies on `transport.c:parse_transport_option`, similar to `push.pushOption`, and they follow the same priority design: 1. Server options set in lower-priority configuration files (e.g., /etc/gitconfig or $HOME/.gitconfig) can be overridden or unset in more specific repository configurations using an empty string. 2. Command-line specified server options take precedence over those from the configuration. Server options from configuration are stored to the corresponding `remote.h:remote` as a new field `server_options`. The field will be utilized in the subsequent commit to help initialize the `server_options` of `transport.h:transport`. And documentation have been updated accordingly. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Junio C Hamano <gitster@pobox.com> Reported-by: Liu Zhongbo <liuzhongbo.6666@bytedance.com> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08	transport: introduce parse_transport_option() method	Xing Xin	3	-8/+17
	Add the `parse_transport_option()` method to parse the `push.pushOption` configuration. This method will also be used in the next commit to handle the new `remote.<name>.serverOption` configuration for setting server options in Git protocol v2. Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07	docs: fix the `maintain-git` links in `technical/platform-support`	Johannes Schindelin	1	-2/+2
	These links should point to `.html` files, not to `.txt` ones. Compare also to 4945f046c7f (api docs: link to html version of api-trace2, 2022-09-16). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07	unpack-trees: detect mismatching number of cache-tree/index entries	Patrick Steinhardt	2	-2/+7
	Same as the preceding commit, we unconditionally dereference the index's cache entries depending on the number of cache-tree entries, which can lead to a segfault when the cache-tree is corrupted. Fix this bug. This also makes t4058 pass with the leak sanitizer enabled. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07	cache-tree: detect mismatching number of index entries	Patrick Steinhardt	2	-6/+11
	In t4058 we have some tests that exercise git-read-tree(1) when used with a tree that contains duplicate entries. While the expectation is that we fail, we ideally should fail gracefully without a segfault. But that is not the case: we never check that the number of entries in the cache-tree is less than or equal to the number of entries in the index. This can lead to an out-of-bounds read as we unconditionally access `istate->cache[idx]`, where `idx` is controlled by the number of cache-tree entries and the current position therein. The result is a segfault. Fix this segfault by adding a sanity check for the number of index entries before dereferencing them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-07	cache-tree: refactor verification to return error codes	Patrick Steinhardt	4	-35/+79
	The function `cache_tree_verify()` will `BUG()` whenever it finds that the cache-tree extension of the index is corrupt. The function is thus inherently untestable because the resulting call to `abort()` will be detected by our testing framework and labelled an error. And rightfully so: it shouldn't ever be possible to hit bugs, as they should indicate a programming error rather than corruption of on-disk state. Refactor the function to instead return error codes. This also ensures that the function can be used e.g. by git-fsck(1) without the whole process dying. Furthermore, this refactoring plugs some memory leaks when returning early by creating a common exit path. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-06	Git 2.47v2.47.0	Junio C Hamano	1	-1/+1
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-06	Merge tag 'l10n-2.47.0-rnd2' of https://github.com/git-l10n/git-po	Junio C Hamano	11	-1210/+9185
	l10n-2.47.0-rnd2 * tag 'l10n-2.47.0-rnd2' of https://github.com/git-l10n/git-po: l10n: Update German translation l10n: bg.po: Updated Bulgarian translation (5772t) l10n: vi: Updated translation for 2.47 l10n: zh_TW: Git 2.47 l10n: new lead for Catalan translation l10n: Update Catalan translation l10n: fr.po: 2.47.0 l10n: zh_CN: updated translation for 2.47 l10n: po-id for 2.47 l10n: tr: Update Turkish translations for 2.47.0 l10n: sv.po: Update Swedish translation
2024-10-06	Merge branch 'l10n-de-2.47' of github.com:ralfth/git	Jiang Xin	1	-49/+184
	* 'l10n-de-2.47' of github.com:ralfth/git: l10n: Update German translation
2024-10-06	Merge branch 'master' of github.com:alshopov/git-po	Jiang Xin	1	-54/+211
	* 'master' of github.com:alshopov/git-po: l10n: bg.po: Updated Bulgarian translation (5772t)