aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
34 hoursThe 10th batchHEADmastermainJunio C Hamano1-0/+4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
34 hoursMerge branch 'kh/doc-send-email-paragraph-fix'Junio C Hamano1-1/+0
Docfix. * kh/doc-send-email-paragraph-fix: doc: send-email: fix broken list continuation
34 hoursMerge branch 'mh/doc-config-gui-gcwarning'Junio C Hamano1-0/+5
Docfix. * mh/doc-config-gui-gcwarning: config: document 'gui.GCWarning'
34 hoursMerge branch 'kh/doc-pre-commit-fix'Junio C Hamano1-7/+4
Docfix. * kh/doc-pre-commit-fix: doc: join default pre-commit paragraphs
34 hoursMerge branch 'jc/capability-leak'Junio C Hamano3-0/+42
Leakfix. * jc/capability-leak: connect: plug protocol capability leak
3 daysThe ninth batchJunio C Hamano1-0/+10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 daysMerge branch 'rs/ban-mktemp'Junio C Hamano10-33/+26
Rewrite the only use of "mktemp()" that is subject to TOCTOU race and Stop using the insecure "mktemp()" function. * rs/ban-mktemp: compat: remove gitmkdtemp() banned.h: ban mktemp(3) compat: remove mingw_mktemp() compat: use git_mkdtemp() wrapper: add git_mkdtemp()
3 daysMerge branch 'gf/win32-pthread-cond-init'Junio C Hamano1-1/+1
Emulation code clean-up. * gf/win32-pthread-cond-init: win32: pthread_cond_init should return a value
3 daysMerge branch 'ps/object-read-stream'Junio C Hamano20-729/+779
The "git_istream" abstraction has been revamped to make it easier to interface with pluggable object database design. * ps/object-read-stream: streaming: drop redundant type and size pointers streaming: move into object database subsystem streaming: refactor interface to be object-database-centric streaming: move logic to read packed objects streams into backend streaming: move logic to read loose objects streams into backend streaming: make the `odb_read_stream` definition public streaming: get rid of `the_repository` streaming: rely on object sources to create object stream packfile: introduce function to read object info from a store streaming: move zlib stream into backends streaming: create structure for filtered object streams streaming: create structure for packed object streams streaming: create structure for loose object streams streaming: create structure for in-core object streams streaming: allocate stream inside the backend-specific logic streaming: explicitly pass packfile info when streaming a packed object streaming: propagate final object type via the stream streaming: drop the `open()` callback function streaming: rename `git_istream` into `odb_read_stream`
4 daysThe eighth batchJunio C Hamano1-0/+18
Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 daysMerge branch 'je/doc-data-model'Junio C Hamano1-2/+0
Docfix. * je/doc-data-model: doc: remove stray text in Git data model
4 daysMerge branch 'lo/repo-struct-z'Junio C Hamano3-7/+19
"git repo struct" learned to take "-z" as a synonym to "--format=nul". * lo/repo-struct-z: repo: add -z as an alias for --format=nul to git-repo-structure repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis repo: remove blank line from Documentation/git-repo.adoc
4 daysMerge branch 'kh/advise-w-git-help-in-branch'Junio C Hamano3-5/+5
A help message from "git branch" now mentions "git help" instead of "man" when suggesting to read some documentation. * kh/advise-w-git-help-in-branch: branch: advice using git-help(1) instead of man(1)
4 daysMerge branch 'je/doc-pull'Junio C Hamano1-2/+2
Doc fixup. * je/doc-pull: doc: git-pull: fix 'git --rebase abort' typo
4 daysMerge branch 'tc/meson-cross-compile-fix'Junio C Hamano2-2/+3
Build fix. * tc/meson-cross-compile-fix: meson: use is_cross_build() where possible meson: only detect ICONV_OMITS_BOM if possible meson: ignore subprojects/.wraplock
4 daysMerge branch 'js/last-modified-with-sparse-checkouts'Junio C Hamano2-1/+10
"git last-modified" used to mishandle "--" to mark the beginning of pathspec, which has been corrected. * js/last-modified-with-sparse-checkouts: last-modified: support sparse checkouts
4 daysMerge branch 'rs/diff-index-find-copies-harder-optim'Junio C Hamano3-7/+31
Halve the memory consumed by artificial filepairs created during "git diff --find-copioes-harder", also making the operation run faster. * rs/diff-index-find-copies-harder-optim: diff-index: don't queue unchanged filepairs with diff_change()
4 daysMerge branch 'tc/last-modified-active-paths-optimization'Junio C Hamano1-1/+1
Recent optimization to "last-modified" command introduced use of uninitialized block of memory, which has been corrected. * tc/last-modified-active-paths-optimization: last-modified: fix use of uninitialized memory
10 daysThe seventh batchJunio C Hamano1-0/+15
Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 daysMerge branch 'en/replay-doc-revision-range'Junio C Hamano2-8/+7
The use of "revision" (a connected set of commits) has been clarified in the "git replay" documentation. * en/replay-doc-revision-range: Documentation/git-replay.adoc: fix errors around revision range
10 daysMerge branch 'yc/xdiff-patience-optim'Junio C Hamano1-1/+4
The way patience diff finds LCS has been optimized. * yc/xdiff-patience-optim: xdiff: optimize patience diff's LCS search
10 daysMerge branch 'bc/zsh-testsuite'Junio C Hamano2-3/+3
A few tests have been updated to work under the shell compatible mode of zsh. * bc/zsh-testsuite: t5564: fix test hang under zsh's sh mode t0614: use numerical comparison with test_line_count
10 daysMerge branch 'pw/replay-exclude-gpgsig-fix'Junio C Hamano1-1/+1
"git replay" forgot to omit the "gpgsig-sha256" extended header from the resulting commit the same way it omits "gpgsig", which has been corrected. * pw/replay-exclude-gpgsig-fix: replay: do not copy "gpgsign-sha256" header
10 daysconfig: document 'gui.GCWarning'Matthew Hughes1-0/+5
While investigating the config options set by 'scalar' I noticed this one wasn't documented. Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 daysdoc: send-email: fix broken list continuationKristoffer Haugsbakk1-1/+0
The list continuation has to be “immediately adjacent to the block being attached”.[1] [1]: https://web.archive.org/web/20251208172615/https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/ Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 daysconnect: plug protocol capability leakJunio C Hamano3-0/+42
When pushing to a set of remotes using a nickname for the group, the client initializes the connection to each remote, talks to the remote and reads and parses capabilities line, and holds the capabilities in a file-scope static variable server_capabilities_v1. There are a few other such file-scope static variables, and these connections cannot be parallelized until they are refactored to a structure that keeps track of active connections. Which is *not* the theme of this patch ;-) For a single connection, the server_capabilities_v1 variable is initialized to NULL (at the program initialization), populated when we talk to the other side, used to look up capabilities of the other side possibly multiple times, and the memory is held by the variable until program exit, without leaking. When talking to multiple remotes, however, the server capabilities from the second connection overwrites without freeing the one from the first connection, which leaks. ==1080970==ERROR: LeakSanitizer: detected memory leaks Direct leak of 421 byte(s) in 2 object(s) allocated from: #0 0x5615305f849e in strdup (/home/gitster/g/git-jch/bin/bin/git+0x2b349e) (BuildId: 54d149994c9e85374831958f694bd0aa3b8b1e26) #1 0x561530e76cc4 in xstrdup /home/gitster/w/build/wrapper.c:43:14 #2 0x5615309cd7fa in process_capabilities /home/gitster/w/build/connect.c:243:27 #3 0x5615309cd502 in get_remote_heads /home/gitster/w/build/connect.c:366:4 #4 0x561530e2cb0b in handshake /home/gitster/w/build/transport.c:372:3 #5 0x561530e29ed7 in get_refs_via_connect /home/gitster/w/build/transport.c:398:9 #6 0x561530e26464 in transport_push /home/gitster/w/build/transport.c:1421:16 #7 0x561530800bec in push_with_options /home/gitster/w/build/builtin/push.c:387:8 #8 0x5615307ffb99 in do_push /home/gitster/w/build/builtin/push.c:442:7 #9 0x5615307fe926 in cmd_push /home/gitster/w/build/builtin/push.c:664:7 #10 0x56153065673f in run_builtin /home/gitster/w/build/git.c:506:11 #11 0x56153065342f in handle_builtin /home/gitster/w/build/git.c:779:9 #12 0x561530655b89 in run_argv /home/gitster/w/build/git.c:862:4 #13 0x561530652cba in cmd_main /home/gitster/w/build/git.c:984:19 #14 0x5615308dda0a in main /home/gitster/w/build/common-main.c:9:11 #15 0x7f051651bca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 SUMMARY: AddressSanitizer: 421 byte(s) leaked in 2 allocation(s). Free the capablities data for the previous server before overwriting it with the next server to plug this leak. The added test fails without the freeing with SANITIZE=leak; I somehow couldn't get it fail reliably with SANITIZE=leak,address though. Signed-off-by: Junio C Hamano <gitster@pobox.com>
10 daysdoc: join default pre-commit paragraphsKristoffer Haugsbakk1-7/+4
Join two paragraphs that start with the standard “The default <hook>, when enabled” into one and put it at the end of the “pre-commit” section. The trailing whitespace paragraph was added in the first commit for the doc, in 6d35cc76 (Document hooks., 2005-09-02). Then 3e14dd2c (mention use of "hooks.allownonascii" in "man githooks", 2019-02-20) updated the “pre-commit” section to mention the non-ASCII check that was added in d00e364d.[1] But this paragraph was added one-past the original “default” paragraph, after the env. variable paragraph, and starts exactly the same. That causes the flow of this section to feel off (paragraphs in order): 1. Invoked by <cmd> and what parameters it takes 2. The default 'pre-commit' hook catches introduction of trailing whitespace 3. `GIT_EDITOR=:` 4. The default pre-commit' hook catches introduction of non-ASCII filenames Let’s instead join these two paragrahs and explain the whole behavior of the default script. † 1: Extend sample pre-commit hook to check for non ascii filenames, 2009-05-19 Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 dayscompat: remove gitmkdtemp()René Scharfe5-14/+2
gitmkdtemp() has become a trivial wrapper around git_mkdtemp(). Remove this now unnecessary layer of indirection. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 daysbanned.h: ban mktemp(3)René Scharfe1-0/+3
Older versions of mktemp(3) generate easily guessable file names. The function checks if the generated name is used, which is unreliable, as a file with that name might then be created by some other process before we can do it ourselves. The function was dropped from POSIX due to its security problems. Forbid its use. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 dayscompat: remove mingw_mktemp()René Scharfe2-15/+0
Remove the mktemp(3) compatibility function now that its last caller was removed by the previous commit. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 dayscompat: use git_mkdtemp()René Scharfe1-3/+1
A file might appear at the path returned by mktemp(3) before we call mkdir(2). Use the more robust git_mkdtemp() instead, which retries a number of times and doesn't need to call lstat(2). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 dayswrapper: add git_mkdtemp()René Scharfe2-2/+21
Extend git_mkstemps_mode() to optionally call mkdir(2) instead of open(2), then use that ability to create a mkdtemp(3) replacement, git_mkdtemp(). We'll start using it in the next commit. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysThe sixth batchJunio C Hamano1-0/+31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysMerge branch 'rs/config-set-multi-error-message-fix'Junio C Hamano1-1/+1
The error message given by "git config set", when the variable being updated has more than one values defined, used old style "git config" syntax with an incorrect option in its hint, both of which have been corrected. * rs/config-set-multi-error-message-fix: config: fix suggestion for failed set of multi-valued option
13 daysMerge branch 'rs/config-unset-opthelp-fix'Junio C Hamano1-2/+2
The option help text given by "git config unset -h" described the "--all" option to "replace", not "unset", multiple variables, which has been corrected. * rs/config-unset-opthelp-fix: config: fix short help of unset flags
13 daysMerge branch 'ps/object-source-management'Junio C Hamano31-279/+414
Code refactoring around object database sources. * ps/object-source-management: odb: handle recreation of quarantine directories odb: handle changing a repository's commondir chdir-notify: add function to unregister listeners odb: handle initialization of sources in `odb_new()` http-push: stop setting up `the_repository` for each reference t/helper: stop setting up `the_repository` repeatedly builtin/index-pack: fix deferred fsck outside repos oidset: introduce `oidset_equal()` odb: move logic to disable ref updates into repo odb: refactor `odb_clear()` to `odb_free()` odb: adopt logic to close object databases setup: convert `set_git_dir()` to have file scope path: move `enter_repo()` into "setup.c"
13 daysMerge branch 'cc/fast-import-strip-if-invalid'Junio C Hamano8-29/+206
"git fast-import" learns "--strip-if-invalid" option to drop invalid cryptographic signature from objects. * cc/fast-import-strip-if-invalid: fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode> commit: refactor verify_commit_buffer() fast-import: refactor finalize_commit_buffer()
13 daysMerge branch 'js/ci-show-breakage-in-dockerized-jobs'Junio C Hamano1-1/+1
Dockerised jobs at the GitHub Actions CI have been taught to show more details of failed tests. * js/ci-show-breakage-in-dockerized-jobs: ci(dockerized): do show the result of failing tests again
13 daysMerge branch 'kh/doc-committer-date-is-author-date'Junio C Hamano2-0/+14
The "--committer-date-is-author-date" option of "git am/rebase" is a misguided one. The documentation is updated to discourage its use. * kh/doc-committer-date-is-author-date: doc: warn against --committer-date-is-author-date
13 daysMerge branch 'jc/optional-path'Junio C Hamano11-23/+102
"git config get --path" segfaulted on an ":(optional)path" that does not exist, which has been corrected. * jc/optional-path: config: really treat missing optional path as not configured config: really pretend missing :(optional) value is not there config: mark otherwise unused function as file-scope static
13 daysMerge branch 'js/strip-scalar-too'Junio C Hamano1-1/+1
"make strip" has been taught to strip "scalar" as well as "git". * js/strip-scalar-too: make strip: include `scalar`
13 daysMerge branch 'en/xdiff-cleanup-2'Junio C Hamano13-110/+336
Code clean-up. * en/xdiff-cleanup-2: xdiff: rename rindex -> reference_index xdiff: change rindex from long to size_t in xdfile_t xdiff: make xdfile_t.nreff a size_t instead of long xdiff: make xdfile_t.nrec a size_t instead of long xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash xdiff: use unambiguous types in xdl_hash_record() xdiff: use size_t for xrecord_t.size xdiff: make xrecord_t.ptr a uint8_t instead of char xdiff: use ptrdiff_t for dstart/dend doc: define unambiguous type mappings across C and Rust
14 daysrepo: add -z as an alias for --format=nul to git-repo-structureLucas Seiki Oshiro3-3/+16
Other Git commands that have nul-terminated output, such as git-config, git-status, git-ls-files, and git-repo-info have a flag `-z` for using the null character as the record separator. Add the `-z` flag to git-repo-structure as an alias for `--format=nul`, making it consistent with the behavior of the other commands. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 daysrepo: use [--format=... | -z] instead of [-z] in git-repo-info synopsisLucas Seiki Oshiro2-3/+3
The flag -z is only an alias for --format=null and even though --format and -z can be used together and repeated, only the last one is considered. Replace `[-z]` in the synopsis of git-repo-info by `[--format=... | -z]`, expliciting that the use of one of those flags replace the other. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 daysrepo: remove blank line from Documentation/git-repo.adocLucas Seiki Oshiro1-1/+0
There was an extra blank line in git-repo-structure documentation, which led to an unwawnted '+' character after generating an HTML or PDF from that page. This can be seen, for example, in Git 2.52.0 online docs [1]. Remove that extra line. [1] https://git-scm.com/docs/git-repo/2.52.0 Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 daysmeson: use is_cross_build() where possibleToon Claes1-1/+1
In previous commit the first use of meson.can_run_host_binaries() was introduced. This is a guard around compiler.run() to ensure it's actually possible to execute the provided. In other places we've been having the same issue, but here `not meson.is_cross_build()` is used as guard. This does the trick, but it also prevents the code from running even when an exe_wrapper is configured. Switch to using meson.can_run_host_binaries() here as well. There is another place left that still uses `not meson.is_cross_build()`, but here it's a guard around fs.exists(). That function will always run on the build machine, so checking for cross-compilation is still in place here. Signed-off-by: Toon Claes <toon@iotcl.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 daysmeson: only detect ICONV_OMITS_BOM if possibleToon Claes1-1/+1
In our Meson setup it automatically detects whether ICONV_OMITS_BOM should be defined. To check this, a piece of code is compiled and ran. When cross-compiling, it's not possible to run this piece of code. Guard this test with a can_run_host_binaries() check to ensure it can run. Signed-off-by: Toon Claes <toon@iotcl.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 daysmeson: ignore subprojects/.wraplockToon Claes1-0/+1
When asking Meson to wrap subprojects, it generates a .wraplock file in the subprojects/ directory. Ignore this file. See also https://github.com/mesonbuild/meson/issues/14948. Signed-off-by: Toon Claes <toon@iotcl.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03last-modified: support sparse checkoutsJohannes Schindelin2-1/+10
In a sparse checkout, a user might want to run `last-modified` on a directory outside the worktree. And even in non-sparse checkouts, a user might need to run that command on a directory that does not exist in the worktree. These use cases should be supported via the `--` separator between revision and file arguments, which is even advertised in the documentation. This patch fixes a tiny bug that prevents that from working. This fixes https://github.com/git-for-windows/git/issues/5978 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Derrick Stolee <stolee@gmail.com> Acked-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03doc: git-pull: fix 'git --rebase abort' typoJulia Evans1-2/+2
An earlier commit e9d221b0 (doc: git-pull: clarify how to exit a conflicted merge, 2025-10-15) misspelt `git rebase --abort` to `git --rebase abort`. Fix it. Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03doc: remove stray text in Git data modelJulia Evans1-2/+0
I meant to delete this sentence fragment when rewriting this paragraph, but accidentally left it in. It's repetitive (since it was meant to be deleted) and it's causing some formatting issues with the note. Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03branch: advice using git-help(1) instead of man(1)Kristoffer Haugsbakk3-5/+5
8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added an advice about checking git-check-ref-format(1) for the ref syntax rules. The advice uses man(1). But git(1) is a multi-platform tool and man(1) may not be available on some platforms. It might also be slightly jarring to see a suggestion for running a command which is not from the Git suite. Let’s instead use git-help(1) in order to stay inside the land of git(1). This also means that `help.format` (for `man`, `html` or other formats) will be used if set. Also change to using single quotes (') to quote the command since that is more conventional. While here let’s also update the test to use `{SQ}`, which is more readable and easier to edit. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-30The fifth batchJunio C Hamano1-0/+11
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-30Merge branch 'jk/asan-bonanza'Junio C Hamano7-40/+122
Various issues detected by Asan have been corrected. * jk/asan-bonanza: t: enable ASan's strict_string_checks option fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated fsck: remove redundant date timestamp check fsck: avoid strcspn() in fsck_ident() fsck: assert newline presence in fsck_ident() cache-tree: avoid strtol() on non-string buffer Makefile: turn on NO_MMAP when building with ASan pack-bitmap: handle name-hash lookups in incremental bitmaps compat/mmap: mark unused argument in git_munmap()
2025-11-30Merge branch 'je/doc-data-model'Junio C Hamano4-2/+311
Add a new manual that describes the data model. * je/doc-data-model: doc: add an explanation of Git's data model
2025-11-30Merge branch 'jc/whitespace-incomplete-line'Junio C Hamano9-88/+450
Both "git apply" and "git diff" learn a new whitespace error class, "incomplete-line". * jc/whitespace-incomplete-line: attr: enable incomplete-line whitespace error for this project diff: highlight and error out on incomplete lines apply: check and fix incomplete lines whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE apply: revamp the parsing of incomplete lines diff: update the way rewrite diff handles incomplete lines diff: call emit_callback ecbdata everywhere diff: refactor output of incomplete line diff: keep track of the type of the last line seen diff: correct suppress_blank_empty hack diff: emit_line_ws_markup() if/else style fix whitespace: correct bit assignment comments
2025-11-30Merge branch 'ja/doc-synopsis-style'Junio C Hamano10-405/+427
Doc mark-up updates. * ja/doc-synopsis-style: doc: pull-fetch-param typofix doc: convert git push to synopsis style doc: convert git pull to synopsis style doc: convert git fetch to synopsis style
2025-11-30Merge branch 'lo/repo-info-all'Junio C Hamano3-21/+69
"git repo info" learned "--all" option. * lo/repo-info-all: repo: add --all to git-repo-info repo: factor out field printing to dedicated function
2025-11-30diff-index: don't queue unchanged filepairs with diff_change()René Scharfe3-7/+31
diff_cache() queues unchanged filepairs if the flag find_copies_harder is set, and uses diff_change() for that. This function allocates a filespec for each side, does a few other things that are unnecessary for unchanged filepairs and always sets the diff_flag has_changes, which is simply misleading in this case. Add a new streamlined function for queuing unchanged filepairs and use it in show_modified(), which is called by diff_cache() via oneway_diff() and do_oneway_diff(). It allocates only a single filespec for each filepair and uses it twice with reference counting. This has a measurable effect if there are a lot of them, like in the Linux repo: Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder Time (mean ± σ): 31.8 ms ± 0.2 ms [User: 24.2 ms, System: 6.3 ms] Range (min … max): 31.5 ms … 32.3 ms 85 runs Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder Time (mean ± σ): 23.9 ms ± 0.2 ms [User: 18.1 ms, System: 4.6 ms] Range (min … max): 23.5 ms … 24.4 ms 111 runs Summary ./git -C ../linux diff --cached --find-copies-harder ran 1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-29last-modified: fix use of uninitialized memoryToon Claes1-1/+1
git-last-modified(1) uses a scratch bitmap to keep track of paths that have been changed between commits. To avoid reallocating a bitmap on each call of process_parent(), the scratch bitmap is kept and reused. Although, it seems an incorrect length is passed to memset(3). `struct bitmap` uses `eword_t` to for internal storage. This type is typedef'd to uint64_t. To fully zero the memory used by the bitmap, multiply the length (saved in `struct bitmap::word_alloc`) by the size of `eword_t`. Reported-by: Anders Kaseorg <andersk@mit.edu> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Toon Claes <toon@iotcl.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-28Documentation/git-replay.adoc: fix errors around revision rangeElijah Newren2-8/+7
There was significant confusion in the git-replay manual about what constitutes a revision range. As noted in f302c1e4aa09 (revisions(7): clarify that most commands take a single revision range, 2021-05-18): Commands that are specifically designed to take two distinct ranges (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they are exceptions. Unless otherwise noted, all "git" commands that operate on a set of commits work on a single revision range. `git replay` is not an exception, but a few places in the manual were written as though it were. These appear to have come in revisions to the original series, between v3->v4 (see https://lore.kernel.org/git/CAP8UFD3bpLrVW97DH7j=V9H2GsTSAkksC9L3QujQERFk_kLnZA@mail.gmail.com/ , "More than one <revision-range> can be passed") and between v6->v7 (https://lore.kernel.org/git/20231115143327.2441397-1-christian.couder@gmail.com/, "Takes ranges of commits"), and I missed both of these revisions when reviewing. Fix them now. There was also a reference to the "Commit Limiting options below", but this page has no such section of options; strike the misleading reference. It is worth noting that we are documenting existing behavior, rather than optimal behavior. Junio has multiple times suggested introducing alternative ways to walk revisions and use them in `git replay --advance`, e.g. at * https://lore.kernel.org/git/xmqqy1mqo6kv.fsf@gitster.g/ * https://lore.kernel.org/git/xmqq8rb3is8c.fsf@gitster.g/ * https://lore.kernel.org/git/xmqqtsydj2zk.fsf@gitster.g/ (item (2)) If/when we introduce some new revision walking flag that implements one of these alternate types of revision walks, we can update the --advance option and this manual appropriately. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27xdiff: optimize patience diff's LCS searchYee Cheng Chin1-1/+4
The find_longest_common_sequence() function in patience diff is inefficient as it calls binary_search() for every unique line it encounters when deciding where to put it in the sequence. From instrumentation (using xctrace) on popular repositories, binary_search() takes up 50-60% of the run time within patience_diff() when performing a diff. To optimize this, add a boundary condition check before binary_search() is called to see if the encountered unique line is located after the entire currently tracked longest subsequence. If so, skip the unnecessary binary search and simply append the entry to the end of sequence. Given that most files compared in a diff are usually quite similar to each other, this condition is very common, and should be hit much more frequently than the binary search. Below are some end-to-end performance results by timing `git log --shortstat --oneline -500 --patience` on different repositories with the old and new code. Generally speaking this seems to give at least 8-10% speed up. The "binary search hit %" column describes how often the algorithm enters the binary search path instead of the new faster path. Even in the WebKit case we can see that it's quite rare (1.46%). | Repo | Speed difference | binary search hit % | |----------|------------------|---------------------| | vim | 1.27x | 0.01% | | pytorch | 1.16x | 0.02% | | cpython | 1.14x | 0.06% | | ripgrep | 1.14x | 0.03% | | git | 1.13x | 0.12% | | vscode | 1.09x | 0.10% | | WebKit | 1.08x | 1.46% | The benchmarks were done using hyperfine, on an Apple M1 Max laptop, with git compiled with `-O3 -flto`. Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27t5564: fix test hang under zsh's sh modebrian m. carlson1-2/+2
This test starts a SOCKS server in Perl in the background and then kills it after the tests are done. However, when using zsh (in sh mode) in the tests, the start_socks function hangs until the background process is killed. Note that this does not reproduce in a simple shell script, so there is likely some interaction between job handling, our heavy use of eval in the test framework, and possibly other complexities of our test framework. What is clear, however, is that switching from a compound statement to a subshell fixes the problem entirely and the test passes with no problem, so do that. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27t0614: use numerical comparison with test_line_countbrian m. carlson1-1/+1
In this comparison, we want to know whether the number of lines is greater than 1. Our test_line_count function passes the first argument as the comparison operator to test, so what we want is a numerical comparison, not a string comparison. While this does not produce a functional problem now, it could very well if we expected two or more items, in which case the value "10" would not match when it should. Furthermore, the "<" and ">" comparisons are new in POSIX 1003.1-2024 and we don't want to require such a new version of POSIX since many popular and supported operating systems were released before that version of POSIX was released. Finally, zsh's builtin test operator does not like the greater-than sign in "test", since it is only supported in the double-bracket extension. This has been reported and will be addressed in a future version, but since our code is also technically incorrect, as well as not very compatible, let's fix it by using a numeric comparison. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26The fourth batchJunio C Hamano1-0/+39
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26Merge branch 'gf/win32-pthread-cond-wait-err'Junio C Hamano2-1/+9
Emulation code clean-up. * gf/win32-pthread-cond-wait-err: win32: return error if SleepConditionVariableCS fails
2025-11-26Merge branch 'jk/ci-windows-meson-test-fix'Junio C Hamano3-1/+25
"Windows+meson" job at the GitHub Actions CI was hard to debug, as it did not show and save failed test artifacts, which has been corrected. * jk/ci-windows-meson-test-fix: ci(windows-meson-test): handle options and output like other test jobs unit-test: ignore --no-chain-lint
2025-11-26Merge branch 'pw/worktree-list-display-width-fix'Junio C Hamano2-25/+53
"git worktree list" attempts to show paths to worktrees while aligning them, but miscounted display columns for the paths when non-ASCII characters were involved, which has been corrected. * pw/worktree-list-display-width-fix: worktree list: quote paths worktree list: fix column spacing
2025-11-26Merge branch 'js/wincred-get-credential-alloc-fix'Junio C Hamano1-1/+1
Under-allocation fix. * js/wincred-get-credential-alloc-fix: wincred: avoid memory corruption
2025-11-26Merge branch 'js/cmake-libgit-fix'Junio C Hamano1-13/+1
Makefile based build have recently been updated to build a libgit.a that also has reftable and xdiff objects; CMake based build procedure has been updated to match. * js/cmake-libgit-fix: cmake: stop trying to build the reftable and xdiff libraries
2025-11-26Merge branch 'js/mingw-assign-comma-fix'Junio C Hamano1-20/+28
The "return errno = EFOO, -1" construct, which is heavily used in compat/mingw.c and triggers warnings under "-Wcomma", has been rewritten to avoid the warnings. * js/mingw-assign-comma-fix: mingw: avoid the comma operator
2025-11-26Merge branch 'js/ci-github-setup-go-update'Junio C Hamano1-1/+1
Update a version of action used at the GitHub Actrions CI. * js/ci-github-setup-go-update: ci: bump actions/setup-go from 5 to 6
2025-11-26Merge branch 'jk/test-mktemp-leakfix'Junio C Hamano1-1/+7
Test leakfix. * jk/test-mktemp-leakfix: test-mktemp: plug memory and descriptor leaks
2025-11-26Merge branch 'rs/xmkstemp-simplify'Junio C Hamano1-18/+1
Code simplification. * rs/xmkstemp-simplify: wrapper: simplify xmkstemp()
2025-11-26Merge branch 'ad/blame-diff-algorithm'Junio C Hamano9-24/+279
"git blame" learns "--diff-algorithm=<algo>" option. * ad/blame-diff-algorithm: blame: make diff algorithm configurable xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-26Merge branch 'en/ort-rename-another-fix'Junio C Hamano2-10/+114
Yet another corner case fix around renames in the "ort" merge strategy. * en/ort-rename-another-fix: merge-ort: fix failing merges in special corner case merge-ort: remove debugging crud t6429: update comment to mention correct tool
2025-11-26ci(dockerized): do show the result of failing tests againJohannes Schindelin1-1/+1
The quality of tests and test suites is most apparent not when everything passes, but in how quickly bugs can be identified, analyzed, and resolved after test failures occur. As such, it is an unfortunate side effect of 2a21098b98a (github: adapt containerized jobs to be rootless, 2025-01-10) that the output of failed test cases, which was shown before that change directly in the build logs, is now no longer shown at all. The reason is a side effect of trying to run the build and the tests with permissions other than the `root` user, but without providing the prerequisite permissions to signal what tests failed and whose output hence needs to be included in the logs. The way this signaling works is for the workflow to write into special-purpose files whose path is specific to the current workflow step and which can be accessed via the `$GITHUB_ENV` environment variable, which differs between workflow steps. It is this file that is missing write permission for the `builder` user that was introduced in above-mentioned commit. The solution is simple: make the file world-writable. Technically, this write permission should be removed after the step has completed, if proper security practices were to be upheld, but since nothing uses that file again, it does not matter, and the fix is more succinct this way. This commit is best viewed with `--color-words`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> [jc: squashed Elijah's rewrite of the first paragraph of the log message] [jc: updated chmod to match "world-writable" in the log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26Merge branch 'master' of https://github.com/j6t/gitkJunio C Hamano1-18/+69
* 'master' of https://github.com/j6t/gitk: gitk: add external diff file rename detection gitk: show unescaped file names on 'rename' and 'copy' lines gitk: fix a 'continue' statement outside a loop to 'return' gitk: persist position and size of the Tags and Heads window Revert "gitk: Only restore window size from ~/.gitk, not position"
2025-11-26replay: do not copy "gpgsign-sha256" headerPhillip Wood1-1/+1
When "git replay" replays a commit it copies the extended headers across from the original commit. However, if the original commit was signed, we do not want to copy the header associated with the signature is it wont be valid for the new commit. The code already knows to avoid coping the "gpgsig" header but does not know to avoid copying the "gpgsig-sha256" header. Add that header to the list of exclusions to match what "git commit --amend" does. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>Christian Couder6-24/+174
Tools like `git filter-repo`[1] use `git fast-export` and `git fast-import` to rewrite repository history. When rewriting history using one such tool though, commit signatures might become invalid because the commits they sign changed due to the changes in the repository history made by the tool between the fast-export and the fast-import steps. Note that as far as signature handling goes: * Since fast-export doesn't know what changes filter-repo may make to the stream, it can't know whether the signatures will still be valid. * Since filter-repo doesn't know what history canonicalizations fast-export performed (and it performs a few), it can't know whether the signatures will still be valid. * Therefore, fast-import is the only process in the pipeline that can know whether a specified signature remains valid. Having invalid signatures in a rewritten repository could be confusing, so users rewritting history might prefer to simply discard signatures that are invalid at the fast-import step. For example a common use case is to rewrite only "recent" history. While specifying commit ranges corresponding to "recent" commits could work, users worry about getting it wrong and want to just automatically rewrite everything, expecting older commit signatures to be untouched. To let them do that, let's add a new 'strip-if-invalid' mode to the `--signed-commits=<mode>` option of `git fast-import`. It would be interesting for the `--signed-tags=<mode>` option to have this mode too, but we leave that for a future improvement. It might also be possible for `git fast-export` to have such a mode in its `--signed-commits=<mode>` and `--signed-tags=<mode>` options, but the use cases for it are much less clear, so we also leave that for possible future improvements. For now let's just die() if 'strip-if-invalid' is passed to these options where it hasn't been implemented yet. [1]: https://github.com/newren/git-filter-repo Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26Merge branch 'tb/external-diff-renamed'Johannes Sixt1-2/+38
* tb/external-diff-renamed: gitk: add external diff file rename detection
2025-11-26Merge branch 'js/persist-ref-window-geometry'Johannes Sixt1-15/+22
* js/persist-ref-window-geometry: gitk: persist position and size of the Tags and Heads window Revert "gitk: Only restore window size from ~/.gitk, not position"
2025-11-25odb: handle recreation of quarantine directoriesPatrick Steinhardt2-5/+7
In the preceding commit we have moved the logic that reparents object database sources on chdir(3p) from "setup.c" into "odb.c". Let's also do the same for any temporary quarantine directories so that the complete reparenting logic is self-contained in "odb.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: handle changing a repository's commondirPatrick Steinhardt5-56/+83
The function `repo_set_gitdir()` is called in two situations: - To initialize the repository with its discovered location. As part of this we also set up the new object database. - To update the repository's discovered location in case the process changes its working directory so that we update relative paths. This means we also have to update any relative paths that are potentially used in the object database. In the context of the object database we ideally wouldn't ever have to worry about the second case: if all paths used by our object database sources were absolute, then we wouldn't have to update them. But unfortunately, the paths aren't only used to locate files owned by the given source, but we also use them for reporting purposes. One such example is `repo_get_object_directory()`, where we cannot just change semantics to always return absolute paths, as that is likely to break tooling out there. One solution to this would be to have both a "display path" and an "internal path". This would allow us to use internal paths for all internal matters, but continue to use the potentially-relative display paths so that we don't break compatibility. But converting the codebase to honor this split is quite a messy endeavour, and it wouldn't even help us with the goal to get rid of the need to update the display path on chdir(3p). Another solution would be to rework "setup.c" so that we never have to update paths in the first place. In that case, we'd only initialize the repository once we have figured out final locations for all directories. This would be a significant simplification of that subsystem indeed, but the current logic is so messy that it would take significant investments to get there. Meanwhile though, while object sources may still use relative paths, the best thing we can do is to handle the reparenting of the object source paths in the object database itself. This can be done by registering one callback for each object database so that we get notified whenever the current working directory changes, and we then perform the reparenting ourselves. Ideally, this wouldn't even happen on the object database level, but instead handled by each object database source. But we don't yet have proper pluggable object database sources, so this will need to be handled at a later point in time. The logic itself is rather simple: - We register the callback when creating the object database. - We unregister the callback when releasing it again. - We split up `set_git_dir_1()` so that it becomes possible to skip recreating the object database. This is required because the function is called both when the current working directory changes, but also when we set up the repository. Calling this function without skipping creation of the ODB will result in a bug in case it's already created. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25chdir-notify: add function to unregister listenersPatrick Steinhardt2-0/+20
While we (obviously) have a way to register new listeners that get called whenever we chdir(3p), we don't have an equivalent that can be used to unregister such a listener again. Add one, as it will be required in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: handle initialization of sources in `odb_new()`Patrick Steinhardt3-14/+35
The logic to set up a new object database is currently distributed across two functions in "repository.c": - In `initialize_repository()` we initialize an empty object database. This object database is not fully initialized and doesn't have any sources attached to it. - The primary object database source is then created in `repo_set_gitdir()`. Ideally though, the logic should be entirely self-contained so that we can iterate more readily on how exactly the sources themselves get set up. Refactor `odb_new()` to handle both allocation and setup of the object database. This ensures that the object database is always initialized and ready for use, and it allows us to change how the sources get set up eventually. Note that `repo_set_gitdir()` still reaches into the sources when the function gets called with an already-initialized object database. This will be fixed in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25http-push: stop setting up `the_repository` for each referencePatrick Steinhardt1-2/+3
When pushing references via HTTP we call `repo_init_revisions()` in a loop for each reference that we're about to push. As third argument we pass the result of `setup_git_directory()`, which causes us to reinitialize the repository every single time. This is an obvious waste of compute, as the repository that we're working in will never change across any of the initializations. The only reason that we do this is to retrieve the directory of the repository. Furthermore, this is about to create issues in a subsequent commit, where reinitializing the repository will cause a `BUG()`. Address this by storing the Git directory in a variable instead so that we don't have to call the function repeatedly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25t/helper: stop setting up `the_repository` repeatedlyPatrick Steinhardt1-14/+2
The "repository" test helper sets up `the_repository` twice. In fact though, we don't even have to set it up even once: all we need is to set up its hash algorithm, because we still depend on some subsystems that aren't free of `the_repository`. Refactor the code accordingly. This prepares for a subsequent change, where setting up the repository repeatedly will lead to a `BUG()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25builtin/index-pack: fix deferred fsck outside reposPatrick Steinhardt4-3/+47
When asked to perform object consistency checks via the `--fsck-objects` flag we verify that each object part of the pack is valid. In general, this check can even be performed outside of a Git repository: we don't need an initialized object database as we simply read the object from the packfile directly. But there's one exception: a subset of the object checks may be deferred to a later point in time. For now, this only concerns ".gitmodules" and ".gitattributes" files: whenever we see a tree referencing these files we queue them for a deferred check. This is done because we need to do some extra checks for those files to ensure that they are well-formed, and these checks need to be done regardless of whether the corresponding blobs are part of the packfile or not. This works inside a repository, but unfortunately the logic leads to a segfault when running outside of one. This is because we eventually call `odb_read_object()`, which will crash because the object database has not been initialized. There's multiple options here: - We could in theory create a purely in-memory database with only a packfile store that contains the single packfile. We don't really have the infrastructure for this yet though, and it would end up being quite hacky. - We could refuse to perform consistency checks outside of a repository. But most of the checks work alright, so this would be a regression. - We can skip the finalizing consistency checks when running outside of a repository. This is not as invasive as skipping all checks, but it's not great to randomly skip a subset of tests, either. None of these options really feel perfect. The first one would be the obvious choice if easily possible. There's another option though: instead of skipping the final object checks, we can die if there are any queued object checks. With this change we now die exactly if and only if we would have previously segfaulted. Like this we ensure that objects that _may_ fail the consistency checks won't be silently skipped, and at the same time we give users a much better error message. Refactor the code accordingly and add a test that would have triggered the segfault. Note that we also move down the logic to add the packfile to the store. There is no point doing this any earlier than right before we execute `fsck_finish()`, and it ensures that the logic to set up and perform the consistency check is self-contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25oidset: introduce `oidset_equal()`Patrick Steinhardt2-2/+23
Introduce a new function that allows the caller to verify whether two oidsets contain the exact same object IDs. Note that this change requires us to change `oidset_iter_init()` to accept a `const struct oidset`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: move logic to disable ref updates into repoPatrick Steinhardt6-12/+13
Our object database sources have a field `disable_ref_updates`. This field can obviously be set to disable reference updates, but it is somewhat curious that this logic is hosted by the object database. The reason for this is that it was primarily added to keep us from accidentally updating references while an ODB transaction is ongoing. Any objects part of the transaction have not yet been committed to disk, so new references that point to them might get corrupted in case we never end up committing the transaction. As such, whenever we create a new transaction we set up a new temporary ODB source and mark it as disabling reference updates. This has one (and only one?) upside: once we have committed the transaction, the temporary source will be dropped and thus we clean up the disabled reference updates automatically. But other than that, it's somewhat misdesigned: - We can have multiple ODB sources, but only the currently active source inhibits reference updates. - We're mixing concerns of the refdb with the ODB. Arguably, the decision of whether we can update references or not should be handled by the refdb. But that wouldn't be a great fit either, as there can be one refdb per worktree. So we'd again have the same problem that a "global" intent becomes localized to a specific instance. Instead, move the setting into the repository. While at it, convert it into a boolean. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24config: really treat missing optional path as not configuredJunio C Hamano6-12/+25
These callers expect that git_config_pathname() that returns 0 is a signal that the variable they passed has a string they need to act on. But with the introduction of ":(optional)path" earlier, that is no longer the case. If the path specified by the configuration variable is missing, their variable will get a NULL in it, and they need to act on it (often, just refraining from copying it elsewhere). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24config: really pretend missing :(optional) value is not thereJunio C Hamano4-9/+76
Earlier we added support for a value spelled as ":(optional)path" for configuration variables whose values are of type "path", with the documented semantics "if the path is missing, behave as if such a variable definition is not even there." This has worked OK for code paths that reads configuration files and stores the configured value as a string, where NULL in such a string is treated as if the setting is not there, left as the default. However, there are other code paths that do not _ignore_ such NULL values and misbehave. "git config get --path" is one of them. When git_config_pathname() helper function finds that the value of the variable is an optional path *and* the path is missing, it leaves the destination pointer intact (which usually is left to NULL) and returns 0 to signal a success. format_config() helper however assumed that the destination pointer always gets a string, which no longer is the case, and segfaulted. Make sure that git_config_pathname() clears the destination pointer in such a case, and teach format_config() to react to the condition by returning 1 (which is different from 0 that is a normal success and negative that is an error) to its callers. Adjust the callers to react to this new return value that tells them to pretend as if they did not even see this partcular <key, value> pair. Reported-by: Han Jiang <jhcarl0814@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24The third batchJunio C Hamano1-1/+32
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24Merge branch 'jx/repo-struct-utf8width-fix'Junio C Hamano4-4/+153
The "git repo structure" subcommand tried to align its output but mixed up byte count and display column width, which has been corrected. * jx/repo-struct-utf8width-fix: builtin/repo: fix table alignment for UTF-8 characters t/unit-tests: add UTF-8 width tests for CJK chars
2025-11-24Merge branch 'kn/osxkeychain-idempotent-store-fix'Junio C Hamano3-30/+132
An earlier check added to osx keychain credential helper to avoid storing the credential itself supplied was overeager and rejected credential material supplied by other helper backends that it would have wanted to store, which has been corrected. * kn/osxkeychain-idempotent-store-fix: osxkeychain: avoid incorrectly skipping store operation
2025-11-24Merge branch 'kh/doc-commit-extra-references'Junio C Hamano1-4/+6
Doc update. * kh/doc-commit-extra-references: doc: commit: link to git-status(1) on all format options
2025-11-24Merge branch 'ps/object-source-loose'Junio C Hamano12-207/+287
A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-24Merge branch 'qj/doc-http-bad-want-response'Junio C Hamano1-1/+2
Doc update. * qj/doc-http-bad-want-response: doc: clarify server behavior for invalid 'want' lines in HTTP protocol
2025-11-24Merge branch 'sa/replay-atomic-ref-updates'Junio C Hamano4-43/+277
"git replay" (experimental) learned to perform ref updates itself in a transaction by default, instead of emitting where each refs should point at and leaving the actual update to another command. * sa/replay-atomic-ref-updates: replay: add replay.refAction config option replay: make atomic ref updates the default behavior replay: use die_for_incompatible_opt2() for option validation
2025-11-24Merge branch 'bc/submodule-force-same-hash'Junio C Hamano4-4/+56
Adding a repository that uses a different hash function is a no-no, but "git submodule add" did nt prevent it, which has been corrected. * bc/submodule-force-same-hash: read-cache: drop submodule check from add_to_cache() object-file: disallow adding submodules of different hash algo
2025-11-24Merge branch 'jk/attr-macroexpand-wo-recursion'Junio C Hamano2-16/+54
The code to expand attribute macros has been rewritten to avoid recursion to avoid running out of stack space in an uncontrolled way. * jk/attr-macroexpand-wo-recursion: attr: avoid recursion when expanding attribute macros
2025-11-24config: fix short help of unset flagsRené Scharfe1-2/+2
The flags --all and --value of "git config unset" don't make the command "replace" or "show" anything, they are about selecting what to unset. Change their help text accordingly. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24config: fix suggestion for failed set of multi-valued optionRené Scharfe1-1/+1
The command "git config set <name> <value>" fails for an option that has multiple values. List the "git config set" flags that can be used, instead of old-style "git config" actions. Reported-by: Paul Wintz <pwintz@ucsc.edu> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24doc: pull-fetch-param typofixJean-Noël Avila via GitGitGadget1-1/+1
An earier patch had a typo discovered after it has been merged to 'next'. Fix it. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: drop redundant type and size pointersPatrick Steinhardt7-30/+15
In the preceding commits we have turned `struct odb_read_stream` into a publicly visible structure. Furthermore, this structure now contains the type and size of the object that we are about to stream. Consequently, the out-pointers that we used before to propagate the type and size of the streamed object are now somewhat redundant with the data contained in the structure itself. Drop these out-pointers and adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: move into object database subsystemPatrick Steinhardt15-14/+14
The "streaming" terminology is somewhat generic, so it may not be immediately obvious that "streaming.{c,h}" is specific to the object database. Rectify this by moving it into the "odb/" directory so that it can be immediately attributed to the object subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: refactor interface to be object-database-centricPatrick Steinhardt7-51/+71
Refactor the streaming interface to be centered around object databases instead of centered around the repository. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: move logic to read packed objects streams into backendPatrick Steinhardt3-135/+134
Move the logic to read packed object streams into the respective subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: move logic to read loose objects streams into backendPatrick Steinhardt3-178/+164
Move the logic to read loose object streams into the respective subsystem. This allows us to make a couple of function declarations private. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: make the `odb_read_stream` definition publicPatrick Steinhardt2-12/+14
Subsequent commits will move the backend-specific logic of setting up an object read stream into the specific subsystems. As the backends are now the ones that are responsible for allocating the stream they'll need to have the stream definition available to them. Make the stream definition public to prepare for this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: get rid of `the_repository`Patrick Steinhardt7-13/+32
Subsequent commits will move the backend-specific logic of object streaming into their respective subsystems. These subsystems have gotten rid of `the_repository` already, but we still use it in two locations in the streaming subsystem. Prepare for the move by fixing those two cases. Converting the logic in `open_istream_pack_non_delta()` is trivial as we already got the object database as input. But for `stream_blob_to_fd()` we have to add a new parameter to make it accessible. So, as we already have to adjust all callers anyway, rename the function to `odb_stream_blob_to_fd()` to indicate it's part of the object subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: rely on object sources to create object streamPatrick Steinhardt1-41/+24
When creating an object stream we first look up the object info and, if it's present, we call into the respective backend that contains the object to create a new stream for it. This has the consequence that, for loose object source, we basically iterate through the object sources twice: we first discover that the file exists as a loose object in the first place by iterating through all sources. And, once we have discovered it, we again walk through all sources to try and map the object. The same issue will eventually also surface once the packfile store becomes per-object-source. Furthermore, it feels rather pointless to first look up the object only to then try and read it. Refactor the logic to be centered around sources instead. Instead of first reading the object, we immediately ask the source to create the object stream for us. If the object exists we get stream, otherwise we'll try the next source. Like this we only have to iterate through sources once. But even more importantly, this change also helps us to make the whole logic pluggable. The object read stream subsystem does not need to be aware of the different source backends anymore, but eventually it'll only have to call the source's callback function. Note that at the current point in time we aren't fully there yet: - The packfile store still sits on the object database level and is thus agnostic of the sources. - We still have to call into both the packfile store and the loose object source. But both of these issues will soon be addressed. This refactoring results in a slight change to semantics: previously, it was `odb_read_object_info_extended()` that picked the source for us, and it would have favored packed (non-deltified) objects over loose objects. And while we still favor packed over loose objects for a single source with the new logic, we'll now favor a loose object from an earlier source over a packed object from a later source. Ultimately this shouldn't matter though: the stream doesn't indicate to the caller which source it is from and whether it was created from a packed or loose object, so such details are opaque to the caller. And other than that we should be able to assume that two objects with the same object ID should refer to the same content, so the streamed data would be the same, too. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23packfile: introduce function to read object info from a storePatrick Steinhardt3-43/+69
Extract the logic to read object info for a packed object from `do_oid_object_into_extended()` into a standalone function that operates on the packfile store. This function will be used in a subsequent commit. Note that this change allows us to make `find_pack_entry()` an internal implementation detail. As a consequence though we have to move around `packfile_store_freshen_object()` so that it is defined after that function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: move zlib stream into backendsPatrick Steinhardt1-52/+52
While all backend-specific data is now contained in a backend-specific structure, we still share the zlib stream across the loose and packed objects. Refactor the code and move it into the specific structures so that we fully detangle the different backends from one another. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: create structure for filtered object streamsPatrick Steinhardt1-29/+25
As explained in a preceding commit, we want to get rid of the union of stream-type specific data in `struct odb_read_stream`. Create a new structure for filtered object streams to move towards this design. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: create structure for packed object streamsPatrick Steinhardt1-35/+40
As explained in a preceding commit, we want to get rid of the union of stream-type specific data in `struct odb_read_stream`. Create a new structure for packed object streams to move towards this design. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: create structure for loose object streamsPatrick Steinhardt1-41/+44
As explained in a preceding commit, we want to get rid of the union of stream-type specific data in `struct odb_read_stream`. Create a new structure for loose object streams to move towards this design. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: create structure for in-core object streamsPatrick Steinhardt1-19/+25
As explained in a preceding commit, we want to get rid of the union of stream-type specific data in `struct odb_read_stream`. Create a new structure for in-core object streams to move towards this design. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: allocate stream inside the backend-specific logicPatrick Steinhardt1-38/+65
When creating a new stream we first allocate it and then call into backend-specific logic to populate the stream. This design requires that the stream itself contains a `union` with backend-specific members that then ultimately get populated by the backend-specific logic. This works, but it's awkward in the context of pluggable object databases. Each backend will need its own member in that union, and as the structure itself is completely opaque (it's only defined in "streaming.c") it also has the consequence that we must have the logic that is specific to backends in "streaming.c". Ideally though, the infrastructure would be reversed: we have a generic `struct odb_read_stream` and some helper functions in "streaming.c", whereas the backend-specific logic sits in the backend's subsystem itself. This can be realized by using a design that is similar to how we handle reference databases: instead of having a union of members, we instead have backend-specific structures with a `struct odb_read_stream base` as its first member. The backends would thus hand out the pointer to the base, but internally they know to cast back to the backend-specific type. This means though that we need to allocate different structures depending on the backend. To prepare for this, move allocation of the structure into the backend-specific functions that open a new stream. Subsequent commits will then create those new backend-specific structs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: explicitly pass packfile info when streaming a packed objectPatrick Steinhardt1-10/+10
When streaming a packed object we first populate the stream with information about the pack that contains the object before calling `open_istream_pack_non_delta()`. This is done because we have already looked up both the pack and the object's offset, so it would be a waste of time to look up this information again. But the way this is done makes for a somewhat awkward calling interface, as the caller now needs to be aware of how exactly the function itself behaves. Refactor the code so that we instead explicitly pass the packfile info into `open_istream_pack_non_delta()`. This makes the calling convention explicit, but more importantly this allows us to refactor the function so that it becomes its responsibility to allocate the stream itself in a subsequent patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: propagate final object type via the streamPatrick Steinhardt1-15/+15
When opening the read stream for a specific object the caller is also expected to pass in a pointer to the object type. This type is passed down via multiple levels and will eventually be populated with the type of the looked-up object. The way we propagate down the pointer though is somewhat non-obvious. While `istream_source()` still expects the pointer and looks it up via `odb_read_object_info_extended()`, we also pass it down even further into the format-specific callbacks that perform another lookup. This is quite confusing overall. Refactor the code so that the responsibility to populate the object type rests solely with the format-specific callbacks. This will allow us to drop the call to `odb_read_object_info_extended()` in `istream_source()` entirely in a subsequent patch. Furthermore, instead of propagating the type via an in-pointer, we now propagate the type via a new field in the object stream. It already has a `size` field, so it's only natural to have a second field that contains the object type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: drop the `open()` callback functionPatrick Steinhardt1-22/+15
When creating a read stream we first populate the structure with the open callback function and then subsequently call the function. This layout is somewhat weird though: - The structure needs to be allocated and partially populated with the open function before we can properly initialize it. - We only ever call the `open()` callback function right after having populated the `struct odb_read_stream::open` member, and it's never called thereafter again. So it is somewhat pointless to store the callback in the first place. Especially the first point creates a problem for us. In subsequent commits we'll want to fully move construction of the read source into the respective object sources. E.g., the loose object source will be the one that is responsible for creating the structure. But this creates a problem: if we first need to create the structure so that we can call the source-specific callback we cannot fully handle creation of the structure in the source itself. We could of course work around that and have the loose object source create the structure and populate its `open()` callback, only. But this doesn't really buy us anything due to the second bullet point above. Instead, drop the callback entirely and refactor `istream_source()` so that we open the streams immediately. This unblocks a subsequent step, where we'll also start to allocate the structure in the source-specific logic. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-23streaming: rename `git_istream` into `odb_read_stream`Patrick Steinhardt7-43/+43
In the following patches we are about to make the `git_istream` more generic so that it becomes fully controlled by the specific object source that wants to create it. As part of these refactorings we'll fully move the structure into the object database subsystem. Prepare for this change by renaming the structure from `git_istream` to `odb_read_stream`. This mirrors the `odb_write_stream` structure that we already have. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-21The second batchJunio C Hamano1-0/+19
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-21Merge branch 'jc/gitattributes-whitespace-no-indent-fix'Junio C Hamano1-1/+1
Ever since we added whitespace rules for this project, we misspelt an entry, which has been corrected. * jc/gitattributes-whitespace-no-indent-fix: .gitattributes: remove misspelled no-op whitespace attribute
2025-11-21Merge branch 'kn/maintenance-is-needed'Junio C Hamano14-43/+284
"git maintenance" command learned "is-needed" subcommand to tell if it is necessary to perform various maintenance tasks. * kn/maintenance-is-needed: maintenance: add 'is-needed' subcommand maintenance: add checking logic in `pack_refs_condition()` refs: add a `optimize_required` field to `struct ref_storage_be` reftable/stack: add function to check if optimization is required reftable/stack: return stack segments directly
2025-11-21Merge branch 'rs/diff-quiet-no-rename'Junio C Hamano2-0/+12
As "git diff --quiet" only cares about the existence of any changes, disable rename/copy detection to skip more expensive processing whose result will be discarded anyway. * rs/diff-quiet-no-rename: diff: disable rename detection with --quiet
2025-11-20config: mark otherwise unused function as file-scope staticJunio C Hamano2-2/+1
git_configset_get_pathname() is only used once inside config.c; we do not have to expose it as a public function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-20win32: pthread_cond_init should return a valueGreg Funni1-1/+1
This value is not checked, but it must return to match POSIX Signed-off-by: Greg Funni <gfunni234@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-20win32: return error if SleepConditionVariableCS failsGreg Funni2-1/+9
If it fails, return an error. Signed-off-by: Greg Funni <gfunni234@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-20doc: warn against --committer-date-is-author-dateKristoffer Haugsbakk2-0/+14
This option could create a commit history which violates the assumption that commits have non-decreasing commit timestamps. Warn against that in both git-am(1) and git-rebase(1). The genesis of this option is from git-am(1) and was added in 3f01ad66 (am: Add --committer-date-is-author-date option, 2009-01-22). The commit message doesn’t give us an example of a use case, but the thread starter does:[1] I've a big set of patches in a mbox file: there's sufficient info inside for git-am to work. Yet, each time I do import these, my sha1sums are changing because of different commit dates. I'd like to force the commit date to match the info/date from the time I received the email (and therefore always get back the right sha1sums). [1]: https://lore.kernel.org/git/46d6db660901221441q60eb90bdge601a7a250c3a247@mail.gmail.com/ So the motivation was to treat git-am(1) as an import command that creates the same commit IDs. Putting aside the question of whether you should be using git-am(1) for importing commits, this approach is problematic: • you still need to apply the commits to the same base if you want the same hashes; and • you need the same committer. And if you expect the same committer, why is this person applying the same patches multiple times with the goal of making *identical* commits? That was all for git-am(1). It was added to git-rebase(1) in 570ccad3 (rebase: add options passed to git-am, 2009-03-18)[2] in order to plug options that could not be sent on to git-am(1). At this point the utility of the option graduated to making no sense; a use case for `git rebase --committer-date-is-author- date` is still yet to be found. Just warn against using this option on both commands and remind the user to consider whether they really need it. † 2: See also 7573cec5 (rebase -i: support --committer-date-is-author-date, 2020-08-17) for the commit for the merge backend Suggested-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19odb: refactor `odb_clear()` to `odb_free()`Patrick Steinhardt3-14/+13
The function `odb_clear()` releases all resources allocated to an object database and ensures that all fields become zero'd out. Despite its naming though it doesn't really clear the object database so that it becomes ready for reuse afterwards again -- the caller would first have to reinitialize it, and that contradicts the terminology of "clearing" as we have defined it in our coding guidelines. There isn't really only a reason to have "clearing" semantics, either. There's only a single caller of `odb_clear()`, and that caller also ends up freeing the object database structure itself. Refactor the function to have "freeing" semantics instead, so that the structure itself is also freed, which allows us to drop some useless boilerplate to zero out the structure's members. This refactoring reveals that we're trying to close the commit graph multiple times: once directly via `free_commit_graph()`, and once via `odb_close()`. Drop the former call. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19odb: adopt logic to close object databasesPatrick Steinhardt10-23/+30
The logic to close an object database is currently contained in the packfile subsystem. That choice is somewhat relatable, as most of the logic really is to close resources associated with the packfile store itself. But we also end up handling object sources and commit graphs, which certainly is not related to packfiles. Move the function into the object database subsystem and rename it to `odb_close()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19setup: convert `set_git_dir()` to have file scopePatrick Steinhardt2-41/+40
We don't have any external callers of `set_git_dir()` anymore now that `enter_repo()` has been moved into "setup.c". Remove the declaration and mark the function as static. Note that this change requires us to move the implementation around so that we can avoid adding any new forward declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19path: move `enter_repo()` into "setup.c"Patrick Steinhardt8-118/+123
The function `enter_repo()` is used to enter a repository at a given path. As such it sits way closer to setting up a repository than it does with handling paths, but regardless of that it's located in "path.c" instead of in "setup.c". Move the function into "setup.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19Merge branch 'ps/object-source-loose' into ps/object-source-managementJunio C Hamano12-207/+287
A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-19Merge branch 'ps/object-source-loose' into ps/object-read-streamJunio C Hamano12-207/+287
A part of code paths that deals with loose objects has been cleaned up. * ps/object-source-loose: object-file: refactor writing objects via a stream object-file: rename `write_object_file()` object-file: refactor freshening of objects object-file: rename `has_loose_object()` object-file: read objects via the loose object source object-file: move loose object map into loose source object-file: hide internals when we need to reprepare loose sources object-file: move loose object cache into loose source object-file: introduce `struct odb_source_loose` object-file: move `fetch_if_missing` odb: adjust naming to free object sources odb: introduce `odb_source_new()` odb: fix subtle logic to check whether an alternate is usable
2025-11-19doc: convert git push to synopsis styleJean-Noël Avila2-179/+201
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19doc: convert git pull to synopsis styleJean-Noël Avila4-39/+38
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19doc: convert git fetch to synopsis styleJean-Noël Avila6-189/+190
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19Start 2.53 cycleJunio C Hamano3-2/+14
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19Merge branch 'ps/ref-peeled-tags-fixes'Junio C Hamano5-11/+11
Another fix-up to "peeled-tags" topic. * ps/ref-peeled-tags-fixes: object: fix performance regression when peeling tags
2025-11-19Merge branch 'kn/refs-optim-cleanup'Junio C Hamano11-72/+42
Code clean-up. * kn/refs-optim-cleanup: t/pack-refs-tests: move the 'test_done' to callees refs: rename 'pack_refs_opts' to 'refs_optimize_opts' refs: move to using the '.optimize' functions
2025-11-19Merge branch 'ps/ref-peeled-tags'Junio C Hamano67-852/+825
Some ref backend storage can hold not just the object name of an annotated tag, but the object name of the object the tag points at. The code to handle this information has been streamlined. * ps/ref-peeled-tags: t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn`
2025-11-19Merge branch 'ps/packed-git-in-object-store'Junio C Hamano9-172/+223
The list of packfiles used in a running Git process is moved from the packed_git structure into the packfile store. * ps/packed-git-in-object-store: packfile: track packs via the MRU list exclusively packfile: always add packfiles to MRU when adding a pack packfile: move list of packs into the packfile store builtin/pack-objects: simplify logic to find kept or nonlocal objects packfile: fix approximation of object counts http: refactor subsystem to use `packfile_list`s packfile: move the MRU list into the packfile store packfile: use a `strmap` to store packs by name
2025-11-18xdiff: rename rindex -> reference_indexEzekiel Newren3-9/+9
The classic diff adds only the lines that it's going to consider, during the diff, to an array. A mapping between the compacted array, and the lines of the file that they reference, is facilitated by this array. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: change rindex from long to size_t in xdfile_tEzekiel Newren1-1/+1
The field rindex describes an index offset for other arrays. Change it to size_t. Changing the type of rindex from long to size_t has no cascading refactor impact because it is only ever used to directly index other arrays. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: make xdfile_t.nreff a size_t instead of longEzekiel Newren2-8/+8
size_t is used because nreff describes the number of elements in memory for rindex. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: make xdfile_t.nrec a size_t instead of longEzekiel Newren6-26/+26
size_t is used because nrec describes the number of elements for both recs, and for 'changed' + 2. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hashEzekiel Newren5-20/+21
The ha field is serving two different purposes, which makes the code harder to read. At first glance, it looks like many places assume there could never be hash collisions between lines of the two input files. In reality, line_hash is used together with xdl_recmatch() to ensure correct comparisons of lines, even when collisions occur. To make this clearer, the old ha field has been split: * line_hash: a straightforward hash of a line, independent of any external context. Its type is uint64_t, as it comes from a fixed width hash function. * minimal_perfect_hash: Not a new concept, but now a separate field. It comes from the classifier's general-purpose hash table, which assigns each line a unique and minimal hash across the two files. A size_t is used here because it's meant to be used to index an array. This also avoids ` as usize` casts on the Rust side when using it to index a slice. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: use unambiguous types in xdl_hash_record()Ezekiel Newren4-21/+21
Convert the function signature and body to use unambiguous types. char is changed to uint8_t because this function processes bytes in memory. unsigned long to uint64_t so that the hash output is consistent across platforms. `flags` was changed from long to uint64_t to ensure the high order bits are not dropped on platforms that treat long as 32 bits. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: use size_t for xrecord_t.sizeEzekiel Newren5-20/+19
size_t is the appropriate type because size is describing the number of elements, bytes in this case, in memory. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: make xrecord_t.ptr a uint8_t instead of charEzekiel Newren7-21/+21
Make xrecord_t.ptr uint8_t because it's referring to bytes in memory. In order to avoid a refactor avalanche, many uses of this field were cast to char* or similar. Places where casting was unnecessary: xemit.c:156 xmerge.c:124 xmerge.c:127 xmerge.c:164 xmerge.c:169 xmerge.c:172 xmerge.c:178 Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18xdiff: use ptrdiff_t for dstart/dendEzekiel Newren1-1/+1
ptrdiff_t is appropriate for dstart and dend because they both describe positive or negative offsets relative to a pointer. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18doc: define unambiguous type mappings across C and RustEzekiel Newren3-0/+226
Document other nuances when crossing the FFI boundary. Other language mappings may be added in the future. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18repo: add --all to git-repo-infoLucas Seiki Oshiro3-5/+51
Add a new flag `--all` to git-repo-info for requesting values for all the available keys. By using this flag, the user can retrieve all the values instead of searching what are the desired keys for what they wants. Helped-by: Karthik Nayak <karthik.188@gmail.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18repo: factor out field printing to dedicated functionLucas Seiki Oshiro1-16/+18
Move the field printing in git-repo-info to a new function called `print_field`, allowing it to be called by functions other than `print_fields`. Also change its use of quote_c_style() helper to output directly to the standard output stream, instead of taking a result in a strbuf and then printing it outselves. Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18worktree list: quote pathsPhillip Wood2-3/+22
If a worktree path contains newlines or other control characters it messes up the output of "git worktree list". Fix this by using quote_path() to display the worktree path. The output of "git worktree list" is designed for human consumption, scripts should be using the "--porcelain" option so this change should not break them. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18worktree list: fix column spacingPhillip Wood2-24/+33
The output of "git worktree list" displays a table containing the worktree path, HEAD OID and branch name for each worktree. The code aligns the columns by measuring the visual width of the worktree path when it is printed. Unfortunately it fails to use the visual width when calculating the width of the column so, if any of the paths contain a multibyte character, we can end up with excess padding between columns. The simplest fix would be to replace strlen() with utf8_strwidth() in measure_widths(). However that leaves us measuring the visual width twice and the byte length once. By caching the visual width and printing the padding separately to the worktree path, we only need to calculate the visual width once and do not need the byte length at all. The visual widths are stored in an arrays of structs rather than an array of ints as the next commit will add more struct members. Even if there are no multibyte characters in any of the paths we still print an extra space between the path and the object id as the field width is calculated as one plus the length of the path and we print an explicit space as well. This is fixed by not printing the extra space. The tests are updated to include multibyte characters in one of the worktree paths and to check the spacing of the columns. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18test-mktemp: plug memory and descriptor leaksJeff King1-1/+7
We test xmkstemp() in our helper by just calling: xmkstemp(xstrdup(argv[1])); This leaks both the copied string as well as the descriptor returned by the function. In practice this isn't a big deal, since we immediately exit the program, but: 1. LSan will complain about the memory leak. The only reason we did not notice this in our leak-checking builds is that both of the callers in the test suite (both in t0070) pass a broken template (and expect failure). So the function calls die() before we can actually leak. But it's an accident waiting to happen if anybody adds a call which succeeds. 2. Coverity complains about the descriptor leak. There's a long list of uninteresting or false positives in Coverity's results, but since we're here we might as well fix it, too. I didn't bother adding a new test that triggers the leak. It's not even in real production code, but just in the test-helper itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18ci(windows-meson-test): handle options and output like other test jobsJeff King2-1/+24
The GitHub windows-meson-test jobs directly run "meson test" with the --slice option. This means they skip all of the ci/lib.sh infrastructure, and in particular: 1. They do not actually set any GIT_TEST_OPTS like --verbose-log or -x. 2. They do not do the usual handle_failed_tests() magic to print test failures or tar up failed directories. As a result, you get almost no feedback at all when a test fails in this job, making debugging rather tricky. Let's try to make this behave more like the other CI jobs. Because we're on Windows, we can't just use the normal run-build-and-tests.sh script. Our build runs as a separate job (like the non-meson Windows job), and then we parallelize the tests across several job slices. So we need something like the run-test-slice.sh script that the "windows-test" job uses. In theory we could just swap out the "make" invocation there for "meson". But it doesn't quite work, because "make" knows how to pull GIT_TEST_OPTS out of GIT-BUILD-OPTIONS automatically. But for meson, we have to extract them into the --test-args option ourselves. I tried making the logic in run-test-slice.sh conditional, but there ended up being hardly any common code at all (and there are some tricky ordering constraints). So I added up with a new meson-specific test-slice runner. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18unit-test: ignore --no-chain-lintJeff King1-0/+1
In the same spirit as 9faf3963b6 (t: introduce compatibility options to clar-based tests, 2024-12-13), we should ignore --no-chain-lint passed to our clar tests, since it may appear in GIT_TEST_OPTS to be used with other tests. This is particularly important on Windows CI, where --no-chain-lint is added to the test options by default, and the meson build will pass all options to the unit tests. The only reason our meson Windows CI job does not run into this currently is that it is not respecting GIT_TEST_OPTS at all! So ignoring this option is a prerequisite to fixing that situation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18t: enable ASan's strict_string_checks optionJeff King1-0/+1
ASan has an option to enable strict string checking, where any pointer passed to a function that expects a NUL-terminated string will be checked for that NUL termination. This can sometimes produce false positives. E.g., it is not wrong to pass a buffer with { '1', '2', '\n' } into strtoul(). Even though it is not NUL-terminated, it will stop at the newline. But in trying it out, it identified two problematic spots in our test suite (which have now been adjusted): 1. The strtol() parsing in cache-tree.c was a real potential problem, which would have been very hard to find otherwise (since it required constructing a very specific broken index file). 2. The use of string functions in fsck_ident() were false positives, because we knew that there was always a trailing newline which would stop the functions from reading off the end of the buffer. But the reasoning behind that is somewhat fragile, and silencing those complaints made the code easier to reason about. So even though this did not find any earth-shattering bugs, and even had a few false positives, I'm sufficiently convinced that its complaints are more helpful than hurtful. Let's turn it on by default (since the test suite now runs cleanly with it) and see if it ever turns up any other instances. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18fsck: avoid parse_timestamp() on buffer that isn't NUL-terminatedJeff King1-4/+19
In fsck_ident(), we parse the timestamp with parse_timestamp(), which is really an alias for strtoumax(). But since our buffer may not be NUL-terminated, this can trigger a complaint from ASan's strict_string_checks mode. This is a false positive, since we know that the buffer contains a trailing newline (which we checked earlier in the function), and that strtoumax() would stop there. But it is worth working around ASan's complaint. One is because that will let us turn on strict_string_checks by default, which has helped catch other real problems. And two is that the safety of the current code is very hard to reason about (it subtly depends on distant code which could change). One option here is to just parse the number left-to-right ourselves. But we care about the size of a timestamp_t and detecting overflow, since that's part of the point of these checks. And doing that correctly is tricky. So we'll instead just pull the digits into a separate, NUL-terminated buffer, and use that to call parse_timestamp(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18fsck: remove redundant date timestamp checkJeff King1-1/+1
After calling "parse_timestamp(p, &end, 10)", we complain if "p == end", which would imply that we did not see any digits at all. But we know this cannot be the case, since we would have bailed already if we did not see any digits, courtesy of extra checks added by 8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19). Since then, checking "p == end" is redundant and we can drop it. This will make our lives a little easier as we refactor further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18fsck: avoid strcspn() in fsck_ident()Jeff King1-10/+22
We may be operating on a buffer that is not NUL-terminated, but we use strcspn() to parse it. This is OK in practice, as discussed in 8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19), because we know there is at least a trailing newline in our buffer, and we always pass "\n" to strcspn(). So we know it will stop before running off the end of the buffer. But this is a subtle point to hang our memory safety hat on. And it confuses ASan's strict_string_checks mode, even though it is technically a false positive (that mode complains that we have no NUL, which is true, but it does not know that we have verified the presence of the newline already). Let's instead open-code the loop. As a bonus, this makes the logic more obvious (to my mind, anyway). The current code skips forward with strcspn until it hits "<", ">", or "\n". But then it must check which it saw to decide if that was what we expected or not, duplicating some logic between what's in the strcspn() and what's in the domain logic. Instead, we can just check each character as we loop and act on it immediately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18fsck: assert newline presence in fsck_ident()Jeff King1-7/+9
The fsck code purports to handle buffers that are not NUL-terminated, but fsck_ident() uses some string functions. This works OK in practice, as explained in 8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19). Before calling fsck_ident() we'll have called verify_headers(), which makes sure we have at least a trailing newline. And none of our string-like functions will walk past that newline. However, that makes this code at the top of fsck_ident() very confusing: *ident = strchrnul(*ident, '\n'); if (**ident == '\n') (*ident)++; We should always see that newline, or our memory safety assumptions have been violated! Further, using strchrnul() is weird, since the whole point is that if the newline is not there, we don't necessarily have a NUL at all, and might read off the end of the buffer. So let's have callers pass in the boundary of our buffer, which lets us safely find the newline with memchr(). And if it is not there, this is a BUG(), because it means our caller did not validate the input with verify_headers() as it was supposed to (and we are better off bailing rather than having memory-safety problems). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18cache-tree: avoid strtol() on non-string bufferJeff King1-13/+37
A cache-tree extension entry in the index looks like this: <name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid> where the "_nr" items are human-readable base-10 ASCII. We parse them with strtol(), even though we do not have a NUL-terminated string (we'd generally have an mmap() of the on-disk index file). For a well-formed entry, this is not a problem; strtol() will stop when it sees the newline. But there are two problems: 1. A corrupted entry could omit the newline, causing us to read further. You'd mostly get stopped by seeing non-digits in the oid field (and if it is likewise truncated, there will still be 20 or more bytes of the index checksum). So it's possible, though unlikely, to read off the end of the mmap'd buffer. Of course a malicious index file can fake the oid and the index checksum to all (ASCII) 0's. This is further complicated by the fact that mmap'd buffers tend to be zero-padded up to the page boundary. So to run off the end, the index size also has to be a multiple of the page size. This is also unlikely, though you can construct a malicious index file that matches this. The security implications aren't too interesting. The index file is a local file anyway (so you can't attack somebody by cloning, but only if you convince them to operate in a .git directory you made, at which point attacking .git/config is much easier). And it's just a read overflow via strtol(), which is unlikely to buy you much beyond a crash. 2. ASan has a strict_string_checks option, which tells it to make sure that options to string functions (like strtol) have some eventual NUL, without regard to what the function would actually do (like stopping at a newline here). This option sometimes has false positives, but it can point to sketchy areas (like this one) where the input we use doesn't exhibit a problem, but different input _could_ cause us to misbehave. Let's fix it by just parsing the values ourselves with a helper function that is careful not to go past the end of the buffer. There are a few behavior changes here that should not matter: - We do not consider overflow, as strtol() would. But nor did the original code. However, we don't trust the value we get from the on-disk file, and if it says to read 2^30 entries, we would notice that we do not have that many and bail before reading off the end of the buffer. - Our helper does not skip past extra leading whitespace as strtol() would, but according to gitformat-index(5) there should not be any. - The original quit parsing at a newline or a NUL byte, but now we insist on a newline (which is what the documentation says, and what Git has always produced). Since we are providing our own helper function, we can tweak the interface a bit to make our lives easier. The original code does not use strtol's "end" pointer to find the end of the parsed data, but rather uses a separate loop to advance our "buf" pointer to the trailing newline. We can instead provide a helper that advances "buf" as it parses, letting us read strictly left-to-right through the buffer. I didn't add a new test here. It's surprisingly difficult to construct an index of exactly the right size due to the way we pad entries. But it is easy to trigger the problem in existing tests when using ASan's strict string checking, coupled with a recent change to use NO_MMAP with ASan builds. So: make SANITIZE=address cd t ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh triggers it reliably. Technically it is not deterministic because there is ~8% chance (it's 1-(255/256)^20, or ^32 for sha256) that the trailing checksum hash has a NUL byte in it. But we compute enough cache-trees in the course of that script that we are very likely to hit the problem in one of them. We can look at making strict_string_checks the default for ASan builds, but there are some other cases we'd want to fix first. Reported-by: correctmost <cmlists@sent.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18Makefile: turn on NO_MMAP when building with ASanJeff King2-1/+8
Git often uses mmap() to access on-disk files. This leaves a blind spot in our SANITIZE=address builds, since ASan does not seem to handle mmap at all. Nor does the OS notice most out-of-bounds access, since it tends to round up to the nearest page size (so depending on how big the map is, you might have to overrun it by up to 4095 bytes to trigger a segfault). The previous commit demonstrates a memory bug that we missed. We could have made a new test where the out-of-bounds access was much larger, or where the mapped file ended closer to a page boundary. But the point of running the test suite with sanitizers is to catch these problems without having to construct specific tests. Let's enable NO_MMAP for our ASan builds by default, which should give us better coverage. This does increase the memory usage of Git, since we're copying from the filesystem into heap. But the repositories in the test suite tend to be small, so the overhead isn't really noticeable (and ASan already has quite a performance penalty). There are a few other known bugs that this patch will help flush out. However, they aren't directly triggered in the test suite (yet). So it's safe to turn this on now without breaking the test suite, which will help us add new tests to demonstrate those other bugs as we fix them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18pack-bitmap: handle name-hash lookups in incremental bitmapsJeff King1-4/+25
If a bitmap has a name-hash cache, it is an array of 32-bit integers, one per entry in the bitmap, which we've mmap'd from the .bitmap file. We access it directly like this: if (bitmap_git->hashes) hash = get_be32(bitmap_git->hashes + index_pos); That works for both regular pack bitmaps and for non-incremental midx bitmaps. There is one bitmap_index with one "hashes" array, and index_pos is within its bounds (we do the bounds-checking when we load the bitmap). But for an incremental midx bitmap, we have a linked list of bitmap_index structs, and each one has only its own small slice of the name-hash array. If index_pos refers to an object that is not in the first bitmap_git of the chain, then we'll access memory outside of the bounds of its "hashes" array, and often outside of the mmap. Instead, we should walk through the list until we find the bitmap_index which serves our index_pos, and use its hash (after adjusting index_pos to make it relative to the slice we found). This is exactly what we do elsewhere for incremental midx lookups (like the pack_pos_to_midx() call a few lines above). But we can't use existing helpers like midx_for_object() here, because we're walking through the chain of bitmap_index structs (each of which refers to a midx), not the chain of incremental multi_pack_index structs themselves. The problem is triggered in the test suite, but we don't get a segfault because the out-of-bounds index is too small. The OS typically rounds our mmap up to the nearest page size, so we just end up accessing some extra zero'd memory. Nor do we catch it with ASan, since it doesn't seem to instrument mmaps at all. But if we build with NO_MMAP, then our maps are replaced with heap allocations, which ASan does check. And so: make NO_MMAP=1 SANITIZE=address cd t ./t5334-incremental-multi-pack-index.sh does show the problem (and this patch makes it go away). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18compat/mmap: mark unused argument in git_munmap()Jeff King1-1/+1
Our mmap compat code emulates mapping by using malloc/free. Our git_munmap() must take a "length" parameter to match the interface of munmap(), but we don't use it (it is up to the allocator to know how big the block is in free()). Let's mark it as UNUSED to avoid complaints from -Wunused-parameter. Otherwise you cannot build with "make DEVELOPER=1 NO_MMAP=1". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18ci: bump actions/setup-go from 5 to 6Johannes Schindelin1-1/+1
Bumps actions/setup-go from 5 to 6. This upgrade includes dependency updates that incorporate a fix for a critical vulnerability. [Originally opened at https://github.com/git-for-windows/git/pull/5811] - [Release notes](https://github.com/actions/setup-go/releases) - [Commits](https://github.com/actions/setup-go/compare/v5...v6) Originally-authored-by: dependabot[bot] <support@github.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17mingw: avoid the comma operatorJohannes Schindelin1-20/+28
The pattern `return errno = ..., -1;` is observed several times in `compat/mingw.c`. It has served us well over the years, but now clang starts complaining: compat/mingw.c:723:24: error: possible misuse of comma operator here [-Werror,-Wcomma] 723 | return errno = ENOSYS, -1; | ^ See for example this failing workflow run: https://github.com/git-for-windows/git-sdk-arm64/actions/runs/15457893907/job/43513458823#step:8:201 Let's appease clang (and also reduce the use of the no longer common comma operator). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17cmake: stop trying to build the reftable and xdiff librariesJohannes Schindelin1-13/+1
In the `en/make-libgit-a` topic branch, more precisely in the commits f3b4c89d59f1 (make: delete REFTABLE_LIB, add reftable to LIB_OBJS, 2025-10-02) and cf680cdb9543 (make: delete XDIFF_LIB, add xdiff to LIB_OBJS, 2025-10-02), the strategy to build three static libraries was rethought, and instead only one static library is now built. This is good. However, the CMake definition was not changed accordingly, and now CMake-based builds fail thusly: [...] Generating hook-list.h CMake Error at CMakeLists.txt:122 (string): string sub-command REPLACE requires at least four arguments. Call Stack (most recent call first): CMakeLists.txt:711 (parse_makefile_for_sources) CMake Error at CMakeLists.txt:122 (string): string sub-command REPLACE requires at least four arguments. Call Stack (most recent call first): CMakeLists.txt:717 (parse_makefile_for_sources) -- Configuring incomplete, errors occurred! Fix that by removing the parts that expect the reftable and xdiff objects to be defined separately in the Makefile, still. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17wincred: avoid memory corruptionDavid Macek1-1/+1
`wcsncpy_s()` wants to write the terminating null character so we need to allocate one more space for it in the target memory block. This should fix crashes when trying to read passwords. When this happened, the password/token wouldn't print out and Git would therefore ask for a new password every time. Signed-off-by: David Macek <david.macek.0@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17merge-ort: fix failing merges in special corner caseElijah Newren2-1/+106
At GitHub, we had a repository that was triggering git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed. during git replay. This sounds similar to the somewhat recent f6ecb603ff8a (merge-ort: fix directory rename on top of source of other rename/delete, 2025-08-06), but the cause is different. Unlike that case, there are no rename-to-self situations arising in this case, and new to this case it can only be triggered during a replay operation on the 2nd or later commit being replayed, never on the first merge in the sequence. To trigger, the repository needs: * an upstream which: * renames a file to a different directory, e.g. old/file -> new/file * leaves other files remaining in the original directory (so that e.g. "old/" still exists upstream even though file has been removed from it and placed elsewhere) * a topic branch being rebased where: * a commit in the sequence: * modifies old/file * a subsequent commit in the sequence being replayed: * does NOT touch *anything* under new/ * does NOT touch old/file * DOES modify other paths under old/ * does NOT have any relevant renames that we need to detect _anywhere_ elsewhere in the tree (meaning this interacts interestingly with both directory renames and cached renames) In such a case, the assertion will trigger. The fix turns out to be surprisingly simple. I have a very vague recollection that I actually considered whether to add such an if-check years ago when I added the very similar one for oldinfo in 1b6b902d95a5 (merge-ort: process_renames() now needs more defensiveness, 2021-01-19), but I think I couldn't figure out a possible way to trigger it and was worried at the time that if I didn't know how to trigger it then I wasn't so sure that simply skipping it was correct. Waiting did give me a chance to put more thorough tests and checks into place for the rename-to-self cases a few months back, which I might not have found as easily otherwise. Anyway, put the check in place now and add a test that demonstrates the fix. Note that this bug, as demonstrated by the conditions listed above, runs at the intersection of relevant renames, trivial directory resolutions, and cached renames. All three of those optimizations are ones that unfortunately make the code (and testcases!) a bit more complex, and threading all three makes it a bit more so. However, the testcase isn't crazy enough that I'd expect no one to ever hit it in practice, and was confused why we didn't see it before. After some digging, I discovered that merge.directoryRenames=false is a workaround to this bug, and GitHub used that setting until recently (it was a "temporary" match-what-libgit2-does piece of code that lasted years longer than intended). Since the conditions I gave above for triggering this bug rule out the possibility of there being directory renames, one might assume that it shouldn't matter whether you try to detect such renames if there aren't any. However, due to commit a16e8efe5c2b (merge-ort: fix merge.directoryRenames=false, 2025-03-13), the heavy hammer used there means that merge.directoryRenames=false ALSO turns off rename caching, which is critical to triggering the bug. This becomes a bit more than an aside since... Re-reading that old commit, a16e8efe5c2b (merge-ort: fix merge.directoryRenames=false, 2025-03-13), it appears that the solution to this latest bug might have been at least a partial alternative solution to that old commit. And it may have been an improved alternative (or at least help implement one), since it may be able to avoid the heavy-handed disabling of rename cache. That might be an interesting future thing to investigate, but is not critical for the current fix. However, since I spent time digging it all up, at least leave a small comment tweak breadcrumb to help some future reader (myself or others) who wants to dig further to connect the dots a little quicker. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17merge-ort: remove debugging crudElijah Newren1-1/+1
While developing commit a16e8efe5c2b (merge-ort: fix merge.directoryRenames=false, 2025-03-13), I was testing things out and had an extra condition on one of the if-blocks that I occasionally swapped between '&& 0' and '&& 1' to see the effects of the changes. I forgot to remove it before submitting and it wasn't caught in review. Remove it now. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17t6429: update comment to mention correct toolElijah Newren1-8/+7
A comment at the top of t6429 mentions why the test doesn't exercise git rebase or git cherry-pick. However, it claims that it uses `test-tool fast-rebase`. That was true when the comment was written, but commit f920b0289ba3 (replay: introduce new builtin, 2023-11-24) changed it to use git replay without updating this comment. We could potentially just strike this second comment, since git replay is a bona fide built-in, but perhaps the explanation about why it focuses on git replay is still useful. Update the comment to make it accurate again. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17make strip: include `scalar`Johannes Schindelin1-1/+1
When Scalar was made a canonical part of Git in 7b5c93c6c68 (scalar: include in standard Git build & installation, 2022-09-02), it was added to all relevant Makefile targets except for the `strip` target. Let's correct that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17wrapper: simplify xmkstemp()René Scharfe1-18/+1
Call xmkstemp_mode() instead of duplicating its error handling code. This switches the implementation from the system's mkstemp(3) to our own git_mkstemp_mode(), which works just as well. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17blame: make diff algorithm configurableAntonin Delpeuch6-21/+278
The diff algorithm used in 'git-blame(1)' is set to 'myers', without the possibility to change it aside from the `--minimal` option. There has been long-standing interest in changing the default diff algorithm to "histogram", and Git 3.0 was floated as a possible occasion for taking some steps towards that: https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/ As a preparation for this move, it is worth making sure that the diff algorithm is configurable where useful. Make it configurable in the `git-blame(1)` command by introducing the `--diff-algorithm` option and make honor the `diff.algorithm` config variable. Keep Myers diff as the default. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASKAntonin Delpeuch3-3/+1
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience and histogram diffs, not for the minimal one. This means that when reseting the diff algorithm to the default one, one needs to separately clear the bit for the minimal diff. There are places in the code that fail to do that: merge-ort.c and builtin/merge-file.c. Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate clearing of this bit in the places where it hasn't been forgotten. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17Git 2.52v2.52.0maintJunio C Hamano2-4/+7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17Merge branch 'jc/ci-use-arm64-p4-on-macos'Junio C Hamano1-1/+1
We replaced deprecated macos-13 with macos-14 image in GitHub Actions CI, but we forgot that the image is for arm64. We have been seeing a lot of test failures ever since. Switch to arm64 binary for Perforce tests. * jc/ci-use-arm64-p4-on-macos: Use Perforce arm64 binary on macOS CI jobs
2025-11-16commit: refactor verify_commit_buffer()Christian Couder2-2/+22
In a following commit, we are going to check commit signatures, but we won't have a commit yet, only a commit buffer, and we are going to discard this commit buffer if the signature is invalid. So it would be wasteful to create a commit that we might discard, just to be able to check a commit signature. It would be simpler instead to be able to check commit signatures using only a commit buffer instead of a commit. To be able to do that, let's extract some code from the check_commit_signature() function into a new verify_commit_buffer() function, and then let's make check_commit_signature() call verify_commit_buffer(). Note that this doesn't fundamentally change how check_commit_signature() works. It used to call parse_signed_commit() which calls repo_get_commit_buffer(), parse_buffer_signed_by_header() and repo_unuse_commit_buffer(). Now these 3 functions are called directly by verify_commit_buffer(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16fast-import: refactor finalize_commit_buffer()Christian Couder1-4/+13
In a following commit we are going to finalize commit buffers with or without signatures in order to check the signatures and possibly drop them. To do so easily and without duplication, let's refactor the current code that finalizes commit buffers into a new finalize_commit_buffer() function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16builtin/repo: fix table alignment for UTF-8 charactersJiang Xin2-4/+54
The output table from "git repo structure" is misaligned when displaying UTF-8 characters (e.g., non-ASCII glyphs). E.g.: | 仓库结构 | 值 | | -------------- | ---- | | * 引用 | | | * 计数 | 67 | The previous implementation used simple width formatting with printf() which didn't properly handle multi-byte UTF-8 characters, causing misaligned table columns when displaying repository structure information. This change modifies the stats_table_print_structure function to use strbuf_utf8_align() instead of basic printf width specifiers. This ensures proper column alignment regardless of the character encoding of the content being displayed. Also add test cases for strbuf_utf8_align(), a function newly introduced in "builtin/repo.c". Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16t/unit-tests: add UTF-8 width tests for CJK charsJiang Xin3-0/+99
The file "builtin/repo.c" uses utf8_strwidth() to calculate the display width of UTF-8 characters in a table, but the resulting output is still misaligned. Add test cases for both utf8_strwidth and utf8_strnwidth to verify that they correctly compute the display width for UTF-8 characters. Also updated the build configuration in Makefile and meson.build to include the new test suite in the build process. Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16Use Perforce arm64 binary on macOS CI jobsJunio C Hamano1-1/+1
The previous step replaced deprecated macos-13 image with macos-14 image on GitHub Actions CI. While x86-64 binaries can work there, because macos-14 images are arm64 based (we could replace it with macos-14-large that is x86-64), it makes more sense to use arm64 binary there. Without this change, we have been getting unusually higher rate of failures from random macOS CI jobs railing to run t98xx series of tests. Helped-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16Merge tag 'l10n-2.52.0-v1' of https://github.com/git-l10n/git-poJunio C Hamano10-8985/+13309
l10n-2.52.0-v1 * tag 'l10n-2.52.0-v1' of https://github.com/git-l10n/git-po: l10n: zh_CN: updated translation for 2.52 l10n: uk: add 2.52 translation l10n: zh_TW.po: update Git 2.52 translation l10n: Updated translation for vi-2.52 l10n: tr: Update Turkish translations l10n: po-id for 2.52 l10n: ga.po: Update Irish translation for Git 2.52 l10n: bg.po: Updated Bulgarian translation (6065t) l10n: fr: version 2.52 l10n: sv.po: Update Swedish translation
2025-11-16l10n: zh_CN: updated translation for 2.52Teng Long1-297/+1401
Reviewed-by: 依云 <lilydjwg@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2025-11-15read-cache: drop submodule check from add_to_cache()Jeff King3-4/+2
In add_to_cache(), we treat any directories as submodules, and complain if we can't resolve their HEAD. This call to resolve_gitlink_ref() was added by f937bc2f86 (add: error appropriately on repository with no commits, 2019-04-09), with the goal of improving the error message for empty repositories. But we already resolve the submodule HEAD in index_path(), which is where we find the actual oid we're going to use. Resolving it again here introduces some downsides: 1. It's more work, since we have to open up the submodule repository's files twice. 2. There are call paths that get to index_path() without going through add_to_cache(). For instance, we'd want a similar informative message if "git diff empty" finds that it can't resolve the submodule's HEAD. (In theory we can also get there through update-index, but AFAICT it refuses to consider directories as submodules at all, and just complains about them). 3. The resolution in index_path() catches more errors that we don't handle here. In particular, it will validate that the object format for the submodule matches that of the superproject. This isn't a bug, since our call in add_to_cache() throws away the oid it gets without looking at it. But it certainly caused confusion for me when looking at where the object-format check should go. So instead of resolving the submodule HEAD in add_to_cache(), let's just teach the call in index_path() to actually produce an error message (which it already does for other cases). That's probably what f937bc2f86 should have done in the first place, and it gives us a single point of resolution when adding a submodule to the index. The resulting output is slightly more verbose, as we propagate the error up the call stack, but I think that's OK (and again, matches many other errors we get when indexing fails). I've left the text of the error message as-is, though it is perhaps overly specific. There are many reasons that resolving the submodule HEAD might fail, though outside of corruption or system errors it is probably most likely that the submodule HEAD is simply on an unborn branch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16Merge branch '2.52-uk' of github.com:arkid15r/git-ukrainian-l10nJiang Xin1-231/+1161
* '2.52-uk' of github.com:arkid15r/git-ukrainian-l10n: l10n: uk: add 2.52 translation
2025-11-15object-file: disallow adding submodules of different hash algobrian m. carlson3-1/+55
The design of the hash algorithm transition plan is that objects stored must be entirely in one algorithm since we lack any way to indicate a mix of algorithms. This also includes submodules, but we have traditionally not enforced this, which leads to various problems when trying to clone or check out the the submodule from the remote. Since this cannot work in the general case, restrict adding a submodule of a different algorithm to the index. Add tests for git add and git submodule add that these are rejected. Note that we cannot check this in git fsck because the malformed submodule is stored in the tree as an object ID which is either truncated (when a SHA-256 submodule is added to a SHA-1 repository) or padded with zeros (when a SHA-1 submodule is added to a SHA-256 repository). We cannot detect even the latter case because someone could have an actual submodule that actually ends in 24 zeros, which would be a false positive. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-15l10n: uk: add 2.52 translationArkadii Yakovets1-231/+1161
Co-authored-by: Kate Golovanova <kate@kgthreads.com> Signed-off-by: Arkadii Yakovets <ark@cho.red> Signed-off-by: Kate Golovanova <kate@kgthreads.com>
2025-11-15Merge branch 'vi-2.52' of github.com:Nekosha/git-poJiang Xin1-242/+1140
* 'vi-2.52' of github.com:Nekosha/git-po: l10n: Updated translation for vi-2.52
2025-11-15Merge branch 'l10n/zh-TW/git-2-52' of github.com:l10n-tw/git-poJiang Xin1-415/+1568
* 'l10n/zh-TW/git-2-52' of github.com:l10n-tw/git-po: l10n: zh_TW.po: update Git 2.52 translation
2025-11-15Merge branch 'po-id' of github.com:bagasme/git-poJiang Xin1-283/+1420
* 'po-id' of github.com:bagasme/git-po: l10n: po-id for 2.52
2025-11-15Merge branch 'master' of github.com:alshopov/git-poJiang Xin1-230/+1177
* 'master' of github.com:alshopov/git-po: l10n: bg.po: Updated Bulgarian translation (6065t)