aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2025-05-23git-gui: sanitize 'exec' arguments: simple casesJohannes Sixt4-15/+46
Tcl 'exec' assigns special meaning to its argument when they begin with redirection, pipe or background operator. There are a number of invocations of 'exec' which construct arguments that are taken from the Git repository or a user input. However, when file names or ref names are taken from the repository, it is possible to find names that have these special forms. They must not be interpreted by 'exec' lest it redirects input or output, or attempts to build a pipeline using a command name controlled by the repository. Introduce a helper function that identifies such arguments and prepends "./" to force such a name to be regarded as a relative file name. Convert those 'exec' calls where the arguments can simply be packed into a list. Note that most commands containing the word 'exec' route through console::exec or console::chain, which we will treat in another commit. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: avoid auto_execok for git-bash menu itemMark Levedahl1-1/+1
On Windows, git-gui offers to open a git-bash session for the current repository from the menu, but uses [auto_execok start] to get the command to actually run that shell. The code for auto_execok, in /usr/share/tcl8.6/tcl.init, has 'start' in the 'shellBuiltins' list for cmd.exe on Windows: as a result, auto_execok does not actually search for start, meaning this usage is technically ok with auto_execok now. However, leaving this use of auto_execok in place will just induce confusion about why a known unsafe function is being used on Windows. Instead, let's switch to using our known safe _which function that looks only in $PATH, excluding the current working directory. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: treat file names beginning with "|" as relative pathsJohannes Sixt11-32/+44
The Tcl 'open' function has a very wide interface. It can open files as well as pipes to external processes. The difference is made only by the first character of the file name: if it is "|", a process is spawned. We have a number of calls of Tcl 'open' that take a file name from the environment in which Git GUI is running. Be prepared that insane values are injected. In particular, when we intend to open a file, do not take a file name that happens to begin with "|" as a request to run a process. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: remove unused proc is_shellscriptMark Levedahl1-10/+0
Commit 7d076d56757c (git-gui: handle shell script text filters when loading for blame, 2011-12-09) added is_shellscript to test if a file is executable by the shell, used only when searching for textconv filters. The previous commit rearranged the tests for finding such filters, and removed the only user of is_shellscript. Remove this function. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: remove git config --list handling for git < 1.5.3Johannes Sixt1-46/+23
git-gui uses `git config --null --list` to parse configuration. Git versions prior to 1.5.3 do not have --null and need different treatment. Nobody should be using such an old version anymore. (Moreover, since 0730a5a3a, git-gui requires git v2.36 or later). Keep only the code for modern Git. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: remove special treatment of Windows from open_cmd_pipeJohannes Sixt1-14/+5
Commit 7d076d56757c (git-gui: handle shell script text filters when loading for blame, 2011-12-09) added open_cmd_pipe to run text conversion in support of blame, with special handling for shell scripts on Windows. To determine whether the command is a shell script, 'lindex' is used to pick off the first token from the command. However, cmd is actually a command string taken from .gitconfig literally and is not necessarily a syntactically correct Tcl list. Hence, it cannot be processed by 'lindex' and 'lrange' reliably. Pass the command string to the shell just like on non-Windows platforms to avoid the potentially incorrect treatment. A use of 'auto_execok' is removed by this change. This function is dangerous on Windows, because it searches programs in the current directory. Delegating the path lookup to the shell is safe, because /bin/sh and /bin/bash follow POSIX on all platforms, including the Git for Windows port. A possible regression is that the old code, given filter command of 'foo', could find 'foo.bat' as a script, and not just bare 'foo', or 'foo.exe'. This rewrite requires explicitly giving the suffix if it is not .exe. This part of Git GUI can be exercised using git gui blame -- some.file while some.file has a textconv filter configured and has unstaged modifications. Helped-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: remove HEAD detachment implementation for git < 1.5.3Mark Levedahl1-12/+2
git-gui provides an implementation to detach HEAD on Git versions prior to 1.5.3. Nobody should be using such an old version anymore. (Moreover, since 0730a5a3a, git-gui requires git v2.36 or later). Keep only the code for modern Git. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> [j6t: message tweaked] Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: use only the configured shellMark Levedahl2-3/+4
git-gui has a few places where a bare "sh" is passed to exec, meaning that the first instance of "sh" on $PATH will be used rather than the shell configured. This violates expectations that the configured shell is being used. Let's use [shellpath] everywhere. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: remove Tcl 8.4 workaround on 2>@1 redirectionMark Levedahl1-18/+3
Since b792230 ("git-gui: Show a progress meter for checking out files", 2007-07-08), git-gui includes a workaround for Tcl that does not support using 2>@1 to redirect stderr to stdout. Tcl added such support in 8.4.7, released in 2004, and this is fully supported in all 8.5 releases. As git-gui has a hard-coded requirement for Tcl >= 8.5, the workaround is no longer needed. Delete it. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: make _shellpath usable on startupMark Levedahl1-8/+30
Since commit d5257fb3c1de (git-gui: handle textconv filter on Windows and in development, 2010-08-07), git-gui will search for a usable shell if _shellpath is not configured, and on Windows may resort to using auto_execok to find 'sh'. While this was intended for development use, checks are insufficient to assure a proper configuration when deployed where _shellpath is always set, but might not give a usable shell. Let's make this more robust by only searching if _shellpath was not defined, and then using only our restricted search functions. Furthermore, we should convert to a Windows path on Windows. Always check for a valid shell on startup, meaning an absolute path to an executable, aborting if these conditions are not met. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23Merge branch 'ml/git-gui-exec-path-fix'Johannes Sixt1-26/+3
* ml/git-gui-exec-path-fix: git-gui - use git-hook, honor core.hooksPath git-gui - re-enable use of hook scripts
2025-05-23git-gui: use [is_Windows], not bad _shellpathMark Levedahl1-1/+1
Commit 7d076d56757c (git-gui: handle shell script text filters when loading for blame, 2011-12-09) added open_cmd_pipe, with special handling for Windows detected by seeing that _shellpath does not point to an executable shell. That is bad practice, and is broken by the next commit that assures _shellpath is valid on all platforms. Fix this by using [is_Windows] as done for all Windows specific code. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23git-gui: _which, only add .exe suffix if not presentMark Levedahl1-0/+3
The _which function finds executables on $PATH, and adds .exe on Windows unless -script was given. However, win32.tcl executes "wscript.exe" and "cscript.exe", both of which fail as _which adds .exe to both. This is already fixed in git-gui released by Git for Windows. Do so here. Signed-off-by: Mark Levedahl <mlevedahl@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23Merge branch 'js/fix-open-exec-2.40.0' into js/fix-open-execTaylor Blau1-85/+163
Branch js/fix-open-exec-2.40.0 converts `open` and `exec` calls to call wrappers that sanitze the command arguments. This side branch updates three `open` calls that are in conflict with the fix in the preceding commit. To keep the intended operation of the 'open' calls, this merge does not try to merge and resolve the conflicts, but ignores the conversions that are brought in by the side branch, taking "ours" side of the code in these three cases. New fixes are the topic of the next commit. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: encode arguments correctly with "open"Avi Halachmi (:avih)1-16/+3
While "exec" uses a normal arguments list which is applied as command + arguments (and redirections, etc), "open" uses a single argument which is this command+arguments, where the command and arguments are a list inside this one argument to "open". Commit bb5cb23 (gitk: prevent overly long command lines 2023-05-08) changed several values from individual arguments in that list (hashes and file names), to a single value which is fed to git via redirection to its stdin using "open" [1]. However, it didn't ensure correctly that this aggregate value in this string is interpreted as a single element in this command+args list. It did just enough so that newlines (which is how these elements are concatenated) don't split this single list element. A followup commit at the same patchset: 7dd272e (gitk: escape file paths before piping to git log 2023-05-08) added a bit more, by escaping backslahes and spaces at the file names, so that at least it doesn't break when such file names get used there. But these are not enough. At the very least tab is missing, and more, and trying to manually escape every possible thing which can affect how this string is interpreted in a list is a sub-par approach. The solution is simply to tell tcl "this is a single list element". which we can do by aggregating this value completely normally (hashes and files separated by newlines), and then do [list $value]. So this is what this commit does, for all 3 places where bb5cb23 changed individual elements into an aggregate value. [1] That was not a fully accurate description. The accurate version is that this string originally included two lists: hashes and files. When used with "open" these lists correctly become the individual elements of these lists, even if they contain spaces etc, so the arguments which were used at this "git" commands were correct. Commit bb5cb23 couldn't use these two lists as-is, because it needed to process the individual elements in them (one element per line of the aggregate value), and the issue is that ensuring this aggregate is indeed interpreted as a single list element was sub-par. Note: all the (double) quotes before/after the modification are not required and with zero effect, even for \n. But this commit preserves the original quoting form intentionally. It can be cleaned up later. Signed-off-by: Avi Halachmi (:avih) <avihpit@yahoo.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'open' arguments: command pipelineJohannes Sixt1-4/+15
As in the earlier commits, introduce a function that constructs a pipeline of commands after sanitizing the arguments. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: collect construction of blameargs into a single conditionalJohannes Sixt1-8/+6
The command line to invoke 'git blame' for a single line is constructed using several if-conditionals, each with the same condition {$from_index new {}}. Merge all of them into a single conditional. This requires to duplicate significant parts of the command, but it helps the next change, where we will have to deal with a nested list structure. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'open' arguments: simple commands, readable and writableJohannes Sixt1-2/+9
As in the previous commits, introduce a function that sanitizes arguments and also keeps the returned file handle writable to pass data to stdin. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'open' arguments: simple commands with redirectionsJohannes Sixt1-5/+13
As in the previous commits, introduce a function that sanitizes arguments intended for the process and in addition allows to pass redirections, which are passed to Tcl's 'open' verbatim. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'open' arguments: simple commandsJohannes Sixt1-26/+33
Tcl 'open' treats the second argument as a command when it begins with |. The remainder of the argument is a list comprising the command and its arguments. It assigns special meaning to these arguments when they begin with a redirection, pipe or background operator. There are a number of invocations of 'open' which construct arguments that are taken from the Git repository or a user input. However, when file names or ref names are taken from the repository, it is possible to find names which have these special forms. They must not be interpreted by 'open' lest it redirects input or output, or attempts to build a pipeline using a command name controlled by the repository. Introduce a helper function that identifies such arguments and prepends "./" to force such a name to be regarded as a relative file name. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'exec' arguments: redirect to processJohannes Sixt1-2/+2
Convert one 'exec' call that sends output to a process (pipeline). Fortunately, the command does not contain any variables. For this reason, just treat it as a "redirection". Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'exec' arguments: redirections and backgroundJohannes Sixt1-3/+2
Convert 'exec' calls that both redirect output to a file and run the process in the background. 'safe_exec_redirect' can take both these "redirections" in the second argument simultaneously. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'exec' arguments: redirectionsJohannes Sixt1-8/+17
As in the previous commits, introduce a function that sanitizes arguments intended for the process and in addition allows to pass redirections verbatim, which are interpreted by Tcl's 'exec'. Redirections can include the background operator '&'. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'exec' arguments: 'eval exec'Johannes Sixt1-6/+6
Convert calls of 'exec' where the arguments are already available in a list and 'eval' is used to unpack the list. Use 'concat' to unite the arguments into a single list before passing them to 'safe_exec'. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: sanitize 'exec' arguments: simple casesJohannes Sixt1-18/+48
Tcl 'exec' assigns special meaning to its argument when they begin with redirection, pipe or background operator. There are a number of invocations of 'exec' which construct arguments that are taken from the Git repository or a user input. However, when file names or ref names are taken from the repository, it is possible to find names with have these special forms. They must not be interpreted by 'exec' lest it redirects input or output, or attempts to build a pipeline using a command name controlled by the repository. Introduce a helper function that identifies such arguments and prepends "./" to force such a name to be regarded as a relative file name. Convert those 'exec' calls where the arguments can simply be packed into a list. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: have callers of diffcmd supply pipe symbol when necessaryJohannes Sixt1-10/+6
Function 'diffcmd' derives which of git diff-files, git diff-index, or git diff-tree must be invoked depending on the ids provided. It puts the pipe symbol as the first element of the returned command list. Note though that of the four callers only two use the command with Tcl 'open' and need the pipe symbol. The other two callers pass the command to Tcl 'exec' and must remove the pipe symbol. Do not include the pipe symbol in the constructed command list, but let the call sites decide whether to add it or not. Note that Tcl 'open' inspects only the first character of the command list, which is also the first character of the first element in the list. For this reason, it is valid to just tack on the pipe symbol with |$cmd and it is not necessary to use [concat | $cmd]. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-23gitk: treat file names beginning with "|" as relative pathsJohannes Sixt1-5/+18
The Tcl 'open' function has a vary wide interface. It can open files as well as pipes to external processes. The difference is made only by the first character of the file name: if it is "|", an process is spawned. We have a number of calls of Tcl 'open' that take a file name from the environment in which Gitk is running. Be prepared that insane values are injected. In particular, when we intend to open a file, do not mistake a file name that happens to begin with "|" as a request to run a process. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2025-05-22midx docs: clarify tie breakingPhillip Wood1-4/+7
Clarify what happens when an object exists in more than one pack, but not in the preferred pack. "git multi-pack-index repack" relies on ties for objects that are not in the preferred pack being resolved in favor of the newest pack that contains a copy of the object. If ties were resolved in favor of the oldest pack as the current documentation suggests the multi-pack index would not reference any of the objects in the pack created by "git multi-pack-index repack". Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22midx: avoid negative array indexPhillip Wood1-2/+2
nth_midxed_pack_int_id() returns the index of the pack file in the multi pack index's list of packfiles that the specified object. The index is returned as a uint32_t. Storing this in an int will make the index negative if the most significant bit is set. Fix this by using uint32_t as the rest of the code does. This is unlikely to be a practical problem as it requires the multipack index to reference 2^31 packfiles. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22midx repack: avoid potential integer overflow on 64 bit systemsPhillip Wood1-2/+8
On a 64 bit system the calculation p->pack_size * pack_info[i].referenced_objects could overflow. If a pack file contains 2^28 objects with an average compressed size of 1KB then the pack size will be 2^38B. If all of the objects are referenced by the multi-pack index the sum above will overflow. Avoid this by using shifted integer arithmetic and changing the order of the calculation so that the pack size is divided by the total number of objects in the pack before multiplying by the number of objects referenced by the multi-pack index. Using a shift of 14 bits should give reasonable accuracy while avoiding overflow for pack sizes less that 1PB. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22midx repack: avoid integer overflow on 32 bit systemsPhillip Wood2-4/+24
On a 32 bit system "git multi-pack-index --repack --batch-size=120M" failed with fatal: size_t overflow: 6038786 * 1289 The calculation to estimated size of the objects in the pack referenced by the multi-pack-index uses st_mult() to multiply the pack size by the number of referenced objects before dividing by the total number of objects in the pack. As size_t is 32 bits on 32 bit systems this calculation easily overflows. Fix this by using 64bit arithmetic instead. Also fix a potential overflow when caluculating the total size of the objects referenced by the multipack index with a batch size larger than SIZE_MAX / 2. In that case total_size += estimated_size can overflow as both total_size and estimated_size can be greater that SIZE_MAX / 2. This is addressed by using saturating arithmetic for the addition. Although estimated_size is of type uint64_t by the time we reach this sum it is bounded by the batch size which is of type size_t and so casting estimated_size to size_t does not truncate the value. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22diff --no-index: support limiting by pathspecJacob Keller4-20/+150
The --no-index option of git-diff enables using the diff machinery from git while operating outside of a repository. This mode of git diff is able to compare directories and produce a diff of their contents. When operating git diff in a repository, git has the notion of "pathspecs" which can specify which files to compare. In particular, when using git to diff two trees, you might invoke: $ git diff-tree -r <treeish1> <treeish2>. where the treeish could point to a subdirectory of the repository. When invoked this way, users can limit the selected paths of the tree by using a pathspec. Either by providing some list of paths to accept, or by removing paths via a negative refspec. The git diff --no-index mode does not support pathspecs, and cannot limit the diff output in this way. Other diff programs such as GNU difftools have options for excluding paths based on a pattern match. However, using git diff as a diff replacement has several advantages over many popular diff tools, including coloring moved lines, rename detections, and similar. Teach git diff --no-index how to handle pathspecs to limit the comparisons. This will only be supported if both provided paths are directories. For comparisons where one path isn't a directory, the --no-index mode already has some DWIM shortcuts implemented in the fixup_paths() function. Modify the fixup_paths function to return 1 if both paths are directories. If this is the case, interpret any extra arguments to git diff as pathspecs via parse_pathspec. Use parse_pathspec to load the remaining arguments (if any) to git diff --no-index as pathspec items. Disable PATHSPEC_ATTR support since we do not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP since we do not have a repository root. All pathspecs are treated as rooted at the provided comparison paths. After loading the pathspec data, calculate skip offsets for skipping past the root portion of the paths. This is required to ensure that pathspecs start matching from the provided path, rather than matching from the absolute path. We could instead pass the paths as prefix values to parse_pathspec. This is slightly problematic because the paths come from the command line and don't necessarily have the proper trailing slash. Additionally, that would require parsing pathspecs multiple times. Pass the pathspec object and the skip offsets into queue_diff, which in-turn must pass them along to read_directory_contents. Modify read_directory_contents to check against the pathspecs when scanning the directory. Use the skip offset to skip past the initial root of the path, and only match against portions that are below the intended directory structure being compared. The search algorithm for finding paths is recursive with read_dir. To make pathspec matching work properly, we must set both DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC. Without DO_MATCH_DIRECTORY, paths like "a/b/c/d" will not match against pathspecs like "a/b/c". This is usually achieved by setting the is_dir parameter of match_pathspec. Without DO_MATCH_LEADING_PATHSPEC, paths like "a/b/c" would not match against pathspecs like "a/b/c/d". This is crucial because we recursively iterate down the directories. We could simply avoid checking pathspecs at subdirectories, but this would force recursion down directories which would simply be skipped. If we always passed DO_MATCH_LEADING_PATHSPEC, then we will incorrectly match in certain cases such as matching 'a/c' against ':(glob)**/d'. The match logic will see that a matches the leading part of the **/ and accept this even tho c doesn't match. To avoid this, use the match_leading_pathspec() variant recently introduced. This sets both flags when is_dir is set, but leaves them both cleared when is_dir is 0. Add test cases and documentation covering the new functionality. Note for the documentation I opted not to move the placement of '--' which is sometimes used to disambiguate arguments. The diff --no-index mode requires exactly 2 arguments determining what to compare. Any additional arguments are interpreted as pathspecs and must come afterwards. Use of '--' would not actually disambiguate anything, since there will never be ambiguity over which arguments represent paths or pathspecs. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22pathspec: add flag to indicate operation without repositoryJacob Keller3-4/+16
A following change will add support for pathspecs to the git diff --no-index command. This mode of git diff does not load any repository. Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing pathspecs without a repository. Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to function. Thus, verify that both of these are set in magic_mask to ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set. Check PATHSPEC_NO_REPOSITORY when warning about paths outside the directory tree. When the flag is set, do not look for a git repository when generating the warning message. Finally, add a BUG in match_pathspec_item if the istate is NULL but the pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR should always pass a valid istate, and callers which don't pass a valid istate should have set PATHSPEC_ATTR in the magic_mask field to disable support for attribute-based pathspecs. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22pathspec: add match_leading_pathspec variantJacob Keller2-0/+16
The do_match_pathspec() function has the DO_MATCH_LEADING_PATHSPEC option to allow pathspecs to match when matching "src" against a pathspec like "src/path/...". This support is not exposed by match_pathspec, and the internal flags to do_match_pathspec are not exposed outside of dir.c The upcoming support for pathspecs in git diff --no-index need the LEADING matching behavior when iterating down through a directory with readdir. We could try to expose the match_pathspec_with_flags to the public API. However, DO_MATCH_EXCLUDES really shouldn't be public, and its a bit weird to only have a few of the flags become public. Instead, add match_leading_pathspec() as a function which sets both DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC when is_dir is true. This will be used in a following change to support pathspec matching in git diff --no-index. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22Merge branch 'top-panel-search-highlight' of github.com:bnfour/gitkJohannes Sixt1-2/+2
* 'top-panel-search-highlight' of github.com:bnfour/gitk: gitk: do not hard-code color of search results in commit list Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-21name-hash: don't add sparse directories in threaded lazy initAlex Mironov1-2/+4
Ensure that logic added in 5f11669586 (name-hash: don't add directories to name_hash, 2021-04-12) also applies in multithreaded hashtable init path. As per the original single-threaded change above: sparse directory entries represent a directory that is outside the sparse-checkout definition. These are not paths to blobs, so should not be added to the name_hash table. Instead, they should be added to the directory hashtable when 'ignore_case' is true. Add a condition to avoid placing sparse directories into the name_hash hashtable. This avoids filling the table with extra entries that will never be queried. Signed-off-by: Alex Mironov <alexandrfox@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-20t: remove unexpected SANITIZE_LEAK variablesKarthik Nayak3-8/+0
As of 1fc7ddf35b (test-lib: unconditionally enable leak checking, 2024-11-20), both the `GIT_TEST_PASSING_SANITIZE_LEAK` and `TEST_PASSES_SANITIZE_LEAK` variables no longer have any meaning, the leak checks are enabled by default. However, some newly added tests include them by mistake. Let's clean this up. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-20builtin/receive-pack: add option to skip connectivity checkJustin Tobler3-18/+56
During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and walking the objects, which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity computation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of newly written objects can be checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. These boundary objects can be checked in the canonical repository to ensure the new objects connect as expected and thus avoid walking the rest of the object graph. Git itself cannot make the guarantees required for such an optimization as it is possible for a repository to contain an unreachable object that references a missing object without the repository being considered corrupt. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to the server-side. Note that without proper server-side validation of newly received objects handled outside of Git, usage of this option risks corrupting a repository. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-20t5410: test receive-pack connectivity checkJustin Tobler2-2/+23
As part of git-recieve-pack(1), the connectivity of objects is checked. Add a test validating that git-receive-pack(1) fails due to an incoming packfile that would leave the repository with missing objects. Instead of creating a new test file, "t5410" is generalized for receive-pack testing. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-20Merge branch 'yh/fix-non-themed-combobox'Johannes Sixt1-4/+2
* yh/fix-non-themed-combobox: gitk: Legacy widgets doesn't have combobox
2025-05-19The sixteenth batchJunio C Hamano1-0/+15
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19Merge branch 'ps/reftable-read-block-perffix'Junio C Hamano4-17/+19
Performance regression in not-yet-released code has been corrected. * ps/reftable-read-block-perffix: reftable: fix perf regression when reading blocks of unwanted type
2025-05-19Merge branch 'ly/reftable-writer-leakfix'Junio C Hamano1-2/+6
Leakfix. * ly/reftable-writer-leakfix: reftable/writer: fix memory leak when `writer_index_hash()` fails reftable/writer: fix memory leak when `padded_write()` fails
2025-05-19Merge branch 'jk/oidmap-cleanup'Junio C Hamano11-18/+21
Code cleanup. * jk/oidmap-cleanup: raw_object_store: drop extra pointer to replace_map oidmap: add size function oidmap: rename oidmap_free() to oidmap_clear()
2025-05-19Merge branch 'rc/t1001-test-path-is-file'Junio C Hamano1-1/+1
Test update. * rc/t1001-test-path-is-file: t1001: replace 'test -f' with 'test_path_is_file'
2025-05-19Merge branch 'ly/am-split-stgit-leakfix'Junio C Hamano1-1/+3
Leakfix. * ly/am-split-stgit-leakfix: builtin/am: fix memory leak in `split_mail_stgit_series`
2025-05-19Merge branch 'bc/make-avoid-unneeded-rebuild-with-compdb-dir'Junio C Hamano1-1/+1
Build performance fix. * bc/make-avoid-unneeded-rebuild-with-compdb-dir: Makefile: avoid constant rebuilds with compilation database
2025-05-19Merge branch 'ag/doc-send-email'Junio C Hamano3-9/+66
The `send-email` documentation has been updated with OAuth2.0 related examples. * ag/doc-send-email: docs: add credential helper for outlook and gmail in OAuth list of helpers docs: improve send-email documentation send-mail: improve checks for valid_fqdn
2025-05-19Merge branch 'sc/bundle-uri-use-all-refs-in-bundle'Junio C Hamano3-94/+124
Bundle-URI feature did not use refs recorded in the bundle other than normal branches as anchoring points to optimize the follow-up fetch during "git clone"; now it is told to utilize all. * sc/bundle-uri-use-all-refs-in-bundle: bundle-uri: add test for bundle-uri clones with tags bundle-uri: copy all bundle references ino the refs/bundle space
2025-05-19Merge branch 'pw/sequencer-reflog-use-after-free'Junio C Hamano2-60/+67
Use-after-free fix in the sequencer. * pw/sequencer-reflog-use-after-free: sequencer: rework reflog message handling sequencer: move reflog message functions
2025-05-19configure.ac: upgrade to a compilation check for sysinfoRamsay Jones1-3/+22
Commit f5e3c6c57d ("meson: do a full usage-based compile check for sysinfo", 2025-04-25) updated the 'sysinfo()' check, as part of the meson build, due to the failure of the check on Solaris. Prior to that commit, the meson build only checked the availability of the '<sys/sysinfo.h>' header file. On Solaris, both the header and the 'sysinfo()' function exist, but are completely unrelated to the same function on Linux (and cygwin). Commit 50dec7c566 ("config.mak.uname: add sysinfo() configuration for cygwin", 2025-04-17) added a similar 'sysinfo()' check to the autoconf build. This check looked for the 'sysinfo()' function itself, rather than just the header, but it will fail (incorrectly set HAVE_SYSINFO) for the same reason. In order to correctly identify the 'sysinfo()' function we require as part of 'git-gc' (used in the 'total_ram() function), we also upgrade to a compilation check, in a similar way to the meson commit. Note that since commit c9a51775a3 ("builtin/gc.c: correct RAM calculation when using sysinfo", 2025-04-17) both the 'totalram' and 'mem_unit' fields of the 'struct sysinfo' are used, so the new check includes both of those fields in the compile check. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19meson.build: correct setting of GIT_EXEC_PATHRamsay Jones1-2/+10
For the non-'runtime prefix' case, the meson build sets the GIT_EXEC_PATH build variable to an absolute path equivalent to <prefix>/libexec/git-core. In comparison, the default make build sets it to a relative path equivalent to 'libexec/git-core'. Indeed, the make build requires the use of some means outside of the Makefile (eg. config.mak[.*] or the command-line) to set GIT_EXEC_PATH to anything other than 'libexec/git-core'. For example, the make invocation: $ make gitexecdir=/some/other/bin all install will build git with GIT_EXEC_PATH set to '/some/other/bin' and install the 'library' executables to that location. However, without setting the 'gitexecdir' make variable, irrespective of the 'runtime prefix' setting, the GIT_EXEC_PATH is always set to 'libexec/git-core'. The meson built-in 'libexecdir' option can be used to provide a similar configurability. The default value for the option is 'libexec'. Attempting to set the option to '' on the command-line, will reset it to the '.' string, presumably to ensure a relative path value. This commit allows the meson build, similar to the above, to configure the project like: $ meson setup --buildtype=debugoptimized -Dprefix=$HOME -Dpcre2=disabled \ -Dlibexecdir=/some/other/bin build so that the GIT_EXEC_PATH is set to '/some/other/bin'. Absent the -Dlibexecdir argument, the GIT_EXEC_PATH is set to 'libexec/git-core'. In order to correct the value of GIT_EXEC_PATH, default the value to the static string value 'libexec/git-core', and only override if the value of the 'libexecdir' option has a value different to 'libexec' or '.'. Also, like the Makefile, add a check for an absolute path when the runtime prefix option is true (and if so, error out). Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19meson: correct path to system config/attribute filesRamsay Jones2-6/+18
The path to the system-wide config and attributes files are not being set correctly in the meson build. Unless explicitly overridden on the command line during setup, the 'gitconfig' and 'gitattributes' options are defaulting to absolute paths in the '/etc' system directory. This is only appropriate if the <prefix> is set specifically to '/usr'. The directory in which these files are placed is generally referred to as the 'system configuration directory' or 'sysconfdir' for short. When the prefix is '/usr' then the sysconfdir is usually set to '/etc', but any other value for prefix results in the relative directory value 'etc' instead. (eg if prefix is '/usr/local', then the 'etc' relative value results in a system configuration directory of '/usr/local/etc'). When setting the 'sysconfdir' builtin option value, the meson system uses exactly this algorithm, so we can use get_option('sysconfdir') directly when setting the (non-overridden) build variables. In order to allow for overriding from the command line, remove the default values specified for the 'gitconfig' and 'gitattributes' options in the 'meson_options.txt' file. This allows the user to specify any pathname for those options, while being able to test for the unset (empty) value. An absolute pathname will be used unchanged and a relative pathname will be appended to '<prefix>/'. These values are then used to set the 'ETC_GITCONFIG' and 'ETC_GITATTRIBUTES' build variables which are, in turn, passed to the compiler as '-D' arguments. When the 'gitconfig' or 'gitattributes' options are not used, then use the built-in 'sysconfdir' and set the ETC_GITCONFIG build variable to the string "<sysconfdir>/gitconfig". Similarly, set ETC_ATTRIBUTES to "<sysconfdir>/gitattributes". Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19meson: correct install location of YAML.pmRamsay Jones1-1/+1
When executing an 'meson install' the YAML.pm file is incorrectly placed in the <prefix>/share/perl5/Git/SVN directory. The YAML.pm file should be placed in a 'Memoize' subdirectory instead. In order to correct the location, update the 'install_dir' of the relevant target in the 'perl/Git/SVN/Memoize/meson.build' file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19meson.build: quote the GITWEBDIR build configurationRamsay Jones1-1/+1
The build configuration options with (non-empty) values, for example filesystem paths potentially containing spaces, have been set using the '.set_quoted()' method. However, the GITWEBDIR value has been set using the '.set()' method instead. In order to correctly quote the GITWEBDIR value, replace the '.set()' method with '.set_quoted()'. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19meson: reformat default options to workaround bug in `meson configure`Eli Schwartz1-8/+6
Since 13cb20fc46 ("meson: fix compilation with Visual Studio", 2025-01-22) it has not been possible to list build options via `meson configure`. This is due to Meson's static analysis of build options failing to handle constant folding, and thinking we set a totally invalid default `-std=`. This is reported upstream but we anyways need to work with existing versions. It turns out there is a simple solution: turn the entire default option into a conditional branch, which means Meson sees either nothing, or everything. As a result, Git users can once again see pretty-printed options before building. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Bug: https://github.com/mesonbuild/meson/issues/14623 Signed-off-by: Eli Schwartz <eschwartz@gentoo.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19receive-pack: use batched reference updatesKarthik Nayak3-23/+56
The reference updates performed as a part of 'git-receive-pack(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an 'atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-receive-pack(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. With the reftable backend there is a 18x performance improvement, when performing receive-pack with 10000 refs: Benchmark 1: receive: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 4.276 s ± 0.078 s [User: 0.796 s, System: 3.318 s] Range (min … max): 4.185 s … 4.430 s 10 runs Benchmark 2: receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 235.4 ms ± 6.9 ms [User: 75.4 ms, System: 157.3 ms] Range (min … max): 228.5 ms … 254.2 ms 11 runs Summary receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 18.16 ± 0.63 times faster than receive: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.21x performance improvement: Benchmark 1: receive: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 1.121 s ± 0.021 s [User: 0.128 s, System: 0.975 s] Range (min … max): 1.097 s … 1.156 s 10 runs Benchmark 2: receive: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 927.9 ms ± 22.6 ms [User: 99.0 ms, System: 815.2 ms] Range (min … max): 903.1 ms … 978.0 ms 10 runs Summary receive: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.21 ± 0.04 times faster than receive: many refs (refformat = files, refcount = 10000, revision = master) As using batched updates requires the error handling to be moved to the end of the flow, create and use a 'struct strset' to track the failed refs and attribute the correct errors to them. This change also uncovers an issue when a client provides multiple updates to the same reference. For example: $ git send-pack remote.git A:foo B:foo Enumerating objects: 3, done. Counting objects: 100% (3/3), done. Delta compression using up to 20 threads Compressing objects: 100% (2/2), done. Writing objects: 100% (3/3), 226 bytes | 226.00 KiB/s, done. Total 3 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0) remote: error: cannot lock ref 'refs/heads/foo': reference already exists To remote.git ! [remote rejected] A -> foo (failed to update ref) ! [remote failure] B -> foo (remote failed to report status) As you can see, the remote runs into an error because it cannot lock the target reference for the second update. Furthermore, the remote complains that the first update has been rejected whereas the second update didn't receive any status update because we failed to lock it. Reading this status message alone a user would probably expect that `foo` has not been updated at all. But that's not the case: while we claim that the ref wasn't updated, it surprisingly points to `A` now. One could argue that this is merely an error in how we report the result of this push. But ultimately, the user's request itself is already broken and doesn't make any sense in the first place and cannot ever lead to a sensible outcome that honors the full request. The conversion to batched transactions fixes the issue because we now try to queue both updates in the same transaction. As such, the transaction itself will notice this conflict and refuse the update altogether before we commit any of the values. Note that this requires changes to a couple of tests in t5408 that happened to exercise this behaviour. Given that the generated output is misleading and given that the user request cannot ever be fully honored this really feels more like a bug than properly designed behaviour. As such, changing the behaviour feels like the right thing to do. Since now reference updates are batched, the 'reference-transaction' hook will be invoked with all updates together. Currently git will 'die' when the hook returns with a non-zero exit status in the 'prepared' stage. For 'git-receive-pack(1)', this allowed users to reject an individual reference update, git would have applied previous updates but immediately abort further execution. This is definitely an incorrect usage of this hook, since the right place to do this would be the 'update' hook. This patch retains the latter behavior, but 'reference-transaction' hook now changes to a all-or-nothing behavior when a non-zero exit status is returned in the 'prepared' stage, since batch updates use a transaction under the hood. This explains the change in 't1416'. Helped-by: Jeff King <peff@peff.net> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19send-pack: fix memory leak around duplicate refsKarthik Nayak2-0/+13
The 'git-send-pack(1)' allows users to push objects to a remote repository and explicitly list the references to be pushed. The status of each reference pushed is captured into a list mapped by refname. If a reference fails to be updated, its error message is captured in the `ref->remote_status` field. While the command allows duplicate ref inputs, the list doesn't accommodate this behavior as a particular refname is linked to a single `struct ref*` element. So if the user inputs a reference twice like: git send-pack remote.git A:foo B:foo where the user is trying to update the same reference 'foo' twice and the reference fails to be updated, we first fill `ref->remote_status` with error message for the input 'A:foo' then we override the same field with the error message for 'B:foo'. This override happens without first free'ing the previous value. Fix this leak. The current tests already incorporate the above example, but in the test 'A:foo' succeeds while 'B:foo' fails, meaning that the memory leak isn't triggered. Add a new test with multiple duplicates. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19fetch: use batched reference updatesKarthik Nayak1-54/+73
The reference updates performed as a part of 'git-fetch(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an '--atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-fetch(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. Adding support for batched updates is simply modifying the flow to also create a batch update transaction in the non-atomic flow. With the reftable backend there is a 22x performance improvement, when performing 'git-fetch(1)' with 10000 refs: Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 3.403 s ± 0.775 s [User: 1.875 s, System: 1.417 s] Range (min … max): 2.454 s … 4.529 s 10 runs Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 154.3 ms ± 17.6 ms [User: 102.5 ms, System: 56.1 ms] Range (min … max): 145.2 ms … 220.5 ms 18 runs Summary fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.25x performance improvement: Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 605.5 ms ± 9.4 ms [User: 117.8 ms, System: 483.3 ms] Range (min … max): 595.6 ms … 621.5 ms 10 runs Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 485.8 ms ± 4.3 ms [User: 91.1 ms, System: 396.7 ms] Range (min … max): 477.6 ms … 494.3 ms 10 runs Summary fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master) With this we'll either be using a regular transaction or a batch update transaction. This helps cleanup some code which is no longer needed as we'll now always have some type of 'ref_transaction' object being propagated. One big change is that earlier, each individual update would propagate a failure. Whereas now, the `ref_transaction_for_each_rejected_update` function is called at the end of the flow to capture the exit status for 'git-fetch(1)' and also to print F/D conflict errors. This does change the order of the errors being printed, but the behavior stays the same. Since transaction errors are now explicitly defined as part of 76e760b999 (refs: introduce enum-based transaction error types, 2025-04-08), utilize them and get rid of custom errors defined within 'builtin/fetch.c'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19refs: add function to translate errors to stringsKarthik Nayak3-24/+26
The commit 76e760b999 (refs: introduce enum-based transaction error types, 2025-04-08) introduced enum-based transaction error types. The refs transaction logic was also modified to propagate these errors. For clients of the ref transaction system, it would be beneficial to provide human readable messages for these errors. There is already an existing mapping in 'builtin/update-ref.c', move it to 'refs.c' as `ref_transaction_error_msg()` and use the same within the 'builtin/update-ref.c'. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19docs: replace git_config to repo_configK Jayatheerth1-9/+10
Since this document was written, the built-in API has been updated a few times, but the document was left stale. Adjust to the current best practices by calling repo_config() on the repository instance the subcommand implementation receives as a parameter, instead of calling git_config() that used to be the common practice. Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19docs: clarify cmd_psuh signature and explain UNUSED macroK Jayatheerth1-5/+23
The sample program, as written, would no longer build for at least two reasons: - Since this document was first written, the convention to call a subcommand implementation has changed, and cmd_psuh() now needs to accept the fourth parameter, repository. - These days, compiler warning options for developers include one that detects and complains about unused parameters, so ones that are deliberately unused have to be marked as such. Update the old-style examples to adjust to the current practices, with explanations as needed. Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19docs: remove unused mentoring mailing list referenceK Jayatheerth1-8/+0
The git-mentoring group was initially created to help newcomers with their development itches. However, in practice, most of their questions were already being addressed directly on the mailing list, and contributors consistently received helpful responses there. Remove the mentoring group details from the Documentation. Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16merge-tree: add a new --quiet flagElijah Newren3-0/+62
Git Forges may be interested in whether two branches can be merged while not being interested in what the resulting merge tree is nor which files conflicted. For such cases, add a new --quiet flag which will make use of the new mergeability_only flag added to merge-ort in the previous commit. This option allows the merge machinery to, in the outer layer of the merge: * exit early when a conflict is detected * avoid writing (most) merged blobs/trees to the object store Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16merge-ort: add a new mergeability_only optionElijah Newren2-7/+32
Git Forges may be interested in whether two branches can be merged while not being interested in what the resulting merge tree is nor which files conflicted. For such cases, add a new mergeability_only option. This option allows the merge machinery to, in the "outer layer" of the merge: * exit upon first[-ish] conflict * avoid (not prevent) writing merged blobs/trees to the object store I have a number of qualifiers there, so let me explain each: "outer layer": Note that since the recursive merge of merge bases (corresponding to call_depth > 0) can conflict without the outer final merge (corresponding to call_depth == 0) conflicting, we can't short-circuit nor avoid writing merged blobs/trees to the object store during those inner merges. "first-ish conflict": The current patch only exits early from process_entries() on the first conflict it detects, but conflicts could have been detected in a previous function call, namely detect_and_process_renames(). However: * conflicts detected by detect_and_process_renames() are quite rare conflict types * the detection would still come after regular rename detection (which is the expensive part of detect_and_process_renames()), so it is not saving us much in computation time given that process_entries() directly follows detect_and_process_renames() * [this overlaps with the next bullet point] process_entries() is the place where virtually all object writing occurs (object writing is sometimes more of a concern for Forges than computation time), so exiting early here isn't saving us much in object writes either * the code changes needed to handle an earlier exit are slightly more invasive in detect_and_process_renames() than for process_entries(). Given the rareness of the even earlier conflicts, the limited savings we'd get from exiting even earlier, and in an attempt to keep this patch simpler, we don't guarantee that we actually exit on the first conflict detected. We can always revisit this decision later if we decide that a further micro-optimization to exit slightly earlier in rare cases is worthwhile. "avoid (not prevent) writing objects": The detect_and_process_renames() call can also write objects to the object store, when rename/rename conflicts involve one (or more) files that have also been modified on both sides. Because of this alternate call path leading to handle_content_merges(), our "early exit" does not prevent writing objects entirely, even within the "outer layer" (i.e. even within call_depth == 0). I figure that's fine though, since we're already writing objects for the inner merges (i.e. for call_depth > 0), which are likely going to represent vastly more objects than files involved in rename/rename+modify/modify cases in the outer merge, on average. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16sequencer: make it clearer that commit descriptions are just commentsElijah Newren6-88/+94
Every once in a while, users report that editing the commit summaries in the todo list does not get reflected in the rebase operation, suggesting that users are (a) only using one-line commit messages, and (b) not understanding that the commit summaries are merely helpful comments to help them find the right hashes. It may be difficult to correct users' poor commit messages, but we can at least try to make it clearer that the commit summaries are not directives of some sort by inserting a comment character. Hopefully that leads to them looking a little further and noticing the hints at the bottom to use 'reword' or 'edit' directives. Yes, this change may look funny at first since it hardcodes '#' rather than using comment_line_str. However: * comment_line_str exists to allow disambiguation between lines in a commit message and lines that are instructions to users editing the commit message. No such disambiguation is needed for these comments that occur on the same line after existing directives * the exact "comment" character(s) on regular pick lines used aren't actually important; I could have used anything, including completely random variable length text for each line and it'd work because we ignore everything after 'pick' and the hash. * The whole point of this change is to signal to users that they should NOT be editing any part of the line after the hash (and if they do so, their edits will be ignored), while the whole point of comment_line_str is to allow highly flexible editing. So making it more general by using comment_line_str actually feels counterproductive. * The character for merge directives absolutely must be '#'; that has been deeply hardcoded for a long time (see below), and will break if some other comment character is used instead. In a desire to have pick and merge directives be similar, I use the same comment character for both. * Perhaps merge directives could be fixed to not be inflexible about the comment character used, if someone feels highly motivated, but I think that should be done in a separate follow-on patch. Here are (some of?) the locations where '#' has already been hardcoded for a long time for merges: 1) In check_label_or_ref_arg(): case TODO_LABEL: /* * '#' is not a valid label as the merge command uses it to * separate merge parents from the commit subject. */ 2) In do_merge(): /* * For octopus merges, the arg starts with the list of revisions to be * merged. The list is optionally followed by '#' and the oneline. */ merge_arg_len = oneline_offset = arg_len; for (p = arg; p - arg < arg_len; p += strspn(p, " \t\n")) { if (!*p) break; if (*p == '#' && (!p[1] || isspace(p[1]))) { 3) In label_oid(): if ((buf->len == the_hash_algo->hexsz && !get_oid_hex(label, &dummy)) || (buf->len == 1 && *label == '#') || hashmap_get_from_hash(&state->labels, strihash(label), label)) { /* * If the label already exists, or if the label is a * valid full OID, or the label is a '#' (which we use * as a separator between merge heads and oneline), we * append a dash and a number to make it unique. */ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: allow --shallow and --path-walkDerrick Stolee2-4/+11
There does not appear to be anything particularly incompatible about the --shallow and --path-walk options of 'git pack-objects'. If shallow commits are to be handled differently, then it is by the revision walk that defines the commit set and which are interesting or uninteresting. However, before the previous change, a trivial removal of the warning would cause a failure in t5500-fetch-pack.sh when GIT_TEST_PACK_PATH_WALK is enabled. The shallow fetch would provide more objects than we desired, due to some incorrect behavior of the path-walk API, especially around walking uninteresting objects. The recently-added tests in t5538-push-shallow.sh help to confirm this behavior is working with the --path-walk option if GIT_TEST_PACK_PATH_WALK is enabled. These tests passed previously due to the --path-walk feature being disabled in the presence of a shallow clone. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16path-walk: add new 'edge_aggressive' optionDerrick Stolee5-1/+42
In preparation for allowing both the --shallow and --path-walk options in the 'git pack-objects' builtin, create a new 'edge_aggressive' option in the path-walk API. This option will help walk the boundary more thoroughly and help avoid sending extra objects during fetches and pushes. The only use of the 'edge_hint_aggressive' option in the revision API is within mark_edges_uninteresting(), which is usually called before between prepare_revision_walk() and before visiting commits with get_revision(). In prepare_revision_walk(), the UNINTERESTING commits are walked until a boundary is found. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: thread the path-based compressionDerrick Stolee1-2/+164
Adapting the implementation of ll_find_deltas(), create a threaded version of the --path-walk compression step in 'git pack-objects'. This involves adding a 'regions' member to the thread_params struct, allowing each thread to own a section of paths. We can simplify the way jobs are split because there is no value in extending the batch based on name-hash the way sections of the object entry array are attempted to be grouped. We re-use the 'list_size' and 'remaining' items for the purpose of borrowing work in progress from other "victim" threads when a thread has finished its batch of work more quickly. Using the Git repository as a test repo, the p5313 performance test shows that the resulting size of the repo is the same, but the threaded implementation gives gains of varying degrees depending on the number of objects being packed. (This was tested on a 16-core machine.) Test HEAD~1 HEAD --------------------------------------------------- 5313.20: big pack 2.38 1.99 -16.4% 5313.21: big pack size 16.1M 16.0M -0.2% 5313.24: repack 107.32 45.41 -57.7% 5313.25: repack size 213.3M 213.2M -0.0% (Test output is formatted to better fit in message.) This ~60% reduction in 'git repack --path-walk' time is typical across all repos I used for testing. What is interesting is to compare when the overall time improves enough to outperform the --name-hash-version=1 case. These time improvements correlate with repositories with data shapes that significantly improve their data size as well. The --path-walk feature frequently takes longer than --name-hash-version=2, trading some extra computation for some additional compression. The natural place where this additional computation comes from is the two compression passes that --path-walk takes, though the first pass is naturally faster due to the path boundaries avoiding a number of delta compression attempts. For example, the microsoft/fluentui repo has significant size reduction from --name-hash-version=1 to --name-hash-version=2 followed by further improvements with --path-walk. The threaded computation makes --path-walk more competitive in time compared to --name-hash-version=2, though still ~31% more expensive in that metric. Repack Method Pack Size Time ------------------------------------------ Hash v1 439.4M 87.24s Hash v2 161.7M 21.51s Path Walk (Before) 142.5M 81.29s Path Walk (After) 142.5M 28.16s Similar results hold for the Git repository: Repack Method Pack Size Time ------------------------------------------ Hash v1 248.8M 30.44s Hash v2 249.0M 30.15s Path Walk (Before) 213.2M 142.50s Path Walk (After) 213.3M 45.41s ...as well as the nodejs/node repository: Repack Method Pack Size Time ------------------------------------------ Hash v1 739.9M 71.18s Hash v2 764.6M 67.82s Path Walk (Before) 698.1M 208.10s Path Walk (After) 698.0M 75.10s Finally, the Linux kernel repository is a good test for this repacking time change, even though the space savings is more subtle: Repack Method Pack Size Time ------------------------------------------ Hash v1 2.5G 554.41s Hash v2 2.5G 549.62s Path Walk (before) 2.2G 1562.36s Path Walk (before) 2.2G 559.00s Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: refactor path-walk delta phaseDerrick Stolee4-34/+75
Previously, the --path-walk option to 'git pack-objects' would compute deltas inline with the path-walk logic. This would make the progress indicator look like it is taking a long time to enumerate objects, and then very quickly computed deltas. Instead of computing deltas on each region of objects organized by tree, store a list of regions corresponding to these groups. These can later be pulled from the list for delta compression before doing the "global" delta search. This presents a new progress indicator that can be used in tests to verify that this stage is happening. The current implementation is not integrated with threads, but we are setting it up to arrive in the next change. Since we do not attempt to sort objects by size until after exploring all trees, we can remove the previous change to t5530 due to a different error message appearing first. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16scalar: enable path-walk during push via configDerrick Stolee1-0/+1
Repositories registered with Scalar are expected to be client-only repositories that are rather large. This means that they are more likely to be good candidates for using the --path-walk option when running 'git pack-objects', especially under the hood of 'git push'. Enable this config in Scalar repositories. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: enable --path-walk via configDerrick Stolee6-1/+32
Users may want to enable the --path-walk option for 'git pack-objects' by default, especially underneath commands like 'git push' or 'git repack'. This should be limited to client repositories, since the --path-walk option disables bitmap walks, so would be bad to include in Git servers when serving fetches and clones. There is potential that it may be helpful to consider when repacking the repository, to take advantage of improved deltas across historical versions of the same files. Much like how "pack.useSparse" was introduced and included in "feature.experimental" before being enabled by default, use the repository settings infrastructure to make the new "pack.usePathWalk" config enabled by "feature.experimental" and "feature.manyFiles". In order to test that this config works, add a new trace2 region around the path walk code that can be checked by a 'git push' command. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16repack: add --path-walk optionDerrick Stolee3-12/+18
Since 'git pack-objects' supports a --path-walk option, allow passing it through in 'git repack'. This presents interesting testing opportunities for comparing the different repacking strategies against each other. Add the --path-walk option to the performance tests in p5313. For the microsoft/fluentui repo [1] checked out at a specific commit [2], the --path-walk tests in p5313 look like this: Test this tree ------------------------------------------------------------------------- 5313.18: thin pack with --path-walk 0.08(0.06+0.02) 5313.19: thin pack size with --path-walk 18.4K 5313.20: big pack with --path-walk 2.10(7.80+0.26) 5313.21: big pack size with --path-walk 19.8M 5313.22: shallow fetch pack with --path-walk 1.62(3.38+0.17) 5313.23: shallow pack size with --path-walk 33.6M 5313.24: repack with --path-walk 81.29(96.08+0.71) 5313.25: repack size with --path-walk 142.5M [1] https://github.com/microsoft/fluentui [2] e70848ebac1cd720875bccaa3026f4a9ed700e08 Along with the earlier tests in p5313, I'll instead reformat the comparison as follows: Repack Method Pack Size Time --------------------------------------- Hash v1 439.4M 87.24s Hash v2 161.7M 21.51s Path Walk 142.5M 81.29s There are a few things to notice here: 1. The benefits of --name-hash-version=2 over --name-hash-version=1 are significant, but --path-walk still compresses better than that option. 2. The --path-walk command is still using --name-hash-version=1 for the second pass of delta computation, using the increased name hash collisions as a potential method for opportunistic compression on top of the path-focused compression. 3. The --path-walk algorithm is currently sequential and does not use multiple threads for delta compression. Threading will be implemented in a future change so the computation time will improve to better compete in this metric. There are small benefits in size for my copy of the Git repository: Repack Method Pack Size Time --------------------------------------- Hash v1 248.8M 30.44s Hash v2 249.0M 30.15s Path Walk 213.2M 142.50s As well as in the nodejs/node repository [3]: Repack Method Pack Size Time --------------------------------------- Hash v1 739.9M 71.18s Hash v2 764.6M 67.82s Path Walk 698.1M 208.10s [3] https://github.com/nodejs/node This benefit also repeats in my copy of the Linux kernel repository: Repack Method Pack Size Time --------------------------------------- Hash v1 2.5G 554.41s Hash v2 2.5G 549.62s Path Walk 2.2G 1562.36s It is important to see that even when the repository shape does not have many name-hash collisions, there is a slight space boost to be found using this method. As this repacking strategy was released in Git for Windows 2.47.0, some users have reported cases where the --path-walk compression is slightly worse than the --name-hash-version=2 option. In those cases, it may be beneficial to combine the two options. However, there has not been a released version of Git that has both options and I don't have access to these repos for testing. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16t5538: add tests to confirm deltas in shallow pushesDerrick Stolee1-0/+33
It can be notoriously difficult to detect if delta bases are being computed properly during 'git push'. Construct an example where it will make a kilobyte worth of difference when a delta base is not found. We can then use the progress indicators to distinguish between bytes and KiB depending on whether the delta base is found and used. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: introduce GIT_TEST_PACK_PATH_WALKDerrick Stolee9-7/+58
There are many tests that validate whether 'git pack-objects' works as expected. Instead of duplicating these tests, add a new test environment variable, GIT_TEST_PACK_PATH_WALK, that implies --path-walk by default when specified. This was useful in testing the implementation of the --path-walk implementation, helping to find tests that are overly specific to the default object walk. These include: - t0411-clone-from-partial.sh : One test fetches from a repo that does not have the boundary objects. This causes the path-based walk to fail. Disable the variable for this test. - t5306-pack-nobase.sh : Similar to t0411, one test fetches from a repo without a boundary object. - t5310-pack-bitmaps.sh : One test compares the case when packing with bitmaps to the case when packing without them. Since we disable the test variable when writing bitmaps, this causes a difference in the object list (the --path-walk option adds an extra object). Specify --no-path-walk in both processes for the comparison. Another test checks for a specific delta base, but when computing dynamically without using bitmaps, the base object it too small to be considered in the delta calculations so no base is used. - t5316-pack-delta-depth.sh : This script cares about certain delta choices and their chain lengths. The --path-walk option changes how these chains are selected, and thus changes the results of this test. - t5322-pack-objects-sparse.sh : This demonstrates the effectiveness of the --sparse option and how it combines with --path-walk. - t5332-multi-pack-reuse.sh : This test verifies that the preferred pack is used for delta reuse when possible. The --path-walk option is not currently aware of the preferred pack at all, so finds a different delta base. - t7406-submodule-update.sh : When using the variable, the --depth option collides with the --path-walk feature, resulting in a warning message. Disable the variable so this warning does not appear. I want to call out one specific test change that is only temporary: - t5530-upload-pack-error.sh : One test cares specifically about an "unable to read" error message. Since the current implementation performs delta calculations within the path-walk API callback, a different "unable to get size" error message appears. When this is changed in a future refactoring, this test change can be reverted. Similar to GIT_TEST_NAME_HASH_VERSION, we do not add this option to the linux-TEST-vars CI build as that's already an overloaded build. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16p5313: add performance tests for --path-walkDerrick Stolee1-14/+23
The previous change added a --path-walk option to 'git pack-objects'. Create a performance test that demonstrates the time and space benefits of the feature. In order to get an appropriate comparison, we need to avoid reusing deltas and recompute them from scratch. Compare the creation of a thin pack representing a small push and the creation of a relatively large non-thin pack. Running on my copy of the Git repository results in this data (removing the repack tests for --name-hash-version): Test this tree ------------------------------------------------------------------------ 5313.2: thin pack with --name-hash-version=1 0.02(0.01+0.01) 5313.3: thin pack size with --name-hash-version=1 1.6K 5313.4: big pack with --name-hash-version=1 2.55(4.20+0.26) 5313.5: big pack size with --name-hash-version=1 16.4M 5313.6: shallow fetch pack with --name-hash-version=1 1.24(2.03+0.08) 5313.7: shallow pack size with --name-hash-version=1 12.2M 5313.10: thin pack with --name-hash-version=2 0.03(0.01+0.01) 5313.11: thin pack size with --name-hash-version=2 1.6K 5313.12: big pack with --name-hash-version=2 1.91(3.23+0.20) 5313.13: big pack size with --name-hash-version=2 16.4M 5313.14: shallow fetch pack with --name-hash-version=2 1.06(1.57+0.10) 5313.15: shallow pack size with --name-hash-version=2 12.5M 5313.18: thin pack with --path-walk 0.03(0.01+0.01) 5313.19: thin pack size with --path-walk 1.6K 5313.20: big pack with --path-walk 2.05(3.24+0.27) 5313.21: big pack size with --path-walk 16.3M 5313.22: shallow fetch pack with --path-walk 1.08(1.66+0.07) 5313.23: shallow pack size with --path-walk 12.4M This can be reformatted as follows: Pack Type Hash v1 Hash v2 Path Walk --------------------------------------------------- thin pack (time) 0.02s 0.03s 0.03s (size) 1.6K 1.6K 1.6K big pack (time) 2.55s 1.91s 2.05s (size) 16.4M 16.4M 16.3M shallow pack (time) 1.24s 1.06s 1.08s (size) 12.2M 12.5M 12.4M Note that the timing is slower because there is no threading in the --path-walk case (yet). Also, the shallow pack cases are really not using the --path-walk logic right now because it is disabled until some additions are made to the path walk API. The cases where the --path-walk option really shines is when the default name-hash is overwhelmed with unhelpful collisions. An open source example can be found in the microsoft/fluentui repo [1] at a certain commit [2]. [1] https://github.com/microsoft/fluentui [2] e70848ebac1cd720875bccaa3026f4a9ed700e08 Running the tests on this repo results in the following comparison table: Pack Type Hash v1 Hash v2 Path Walk --------------------------------------------------- thin pack (time) 0.36s 0.12s 0.08s (size) 1.2M 22.0K 18.4K big pack (time) 2.00s 2.90s 2.21s (size) 20.4M 25.9M 19.5M shallow pack (time) 1.41s 1.80s 1.65s (size) 34.4M 33.7M 33.6M Notice in particular that in the small thin pack, the time performance has improved from 0.36s for --name-hash-version=1 to 0.08s and this is likely due to the improved size of the resulting pack: 18.4K instead of 1.2M. The relatively new --name-hash-version=2 is competitive with --path-walk (0.12s and 22.0K) but not quite as successful. Finally, running this on a copy of the Linux kernel repository results in these data points: Pack Type Hash v1 Hash v2 Path Walk --------------------------------------------------- thin pack (time) 0.03s 0.13s 0.03s (size) 4.6K 4.6K 4.6K big pack (time) 15.29s 12.32s 13.92s (size) 201.1M 159.1M 158.5M shallow pack (time) 10.88s 22.93s 22.74s (size) 269.2M 273.8M 267.7M Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: update usage to match docsDerrick Stolee3-10/+15
The t0450 test script verifies that builtin usage matches the synopsis in the documentation. Adjust the builtin to match and then remove 'git pack-objects' from the exception list. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: add --path-walk optionDerrick Stolee4-9/+168
In order to more easily compute delta bases among objects that appear at the exact same path, add a --path-walk option to 'git pack-objects'. This option will use the path-walk API instead of the object walk given by the revision machinery. Since objects will be provided in batches representing a common path, those objects can be tested for delta bases immediately instead of waiting for a sort of the full object list by name-hash. This has multiple benefits, including avoiding collisions by name-hash. The objects marked as UNINTERESTING are included in these batches, so we are guaranteeing some locality to find good delta bases. After the individual passes are done on a per-path basis, the default name-hash is used to find other opportunistic delta bases that did not match exactly by the full path name. The current implementation performs delta calculations while walking objects, which is not ideal for a few reasons. First, this will cause the "Enumerating objects" phase to be much longer than usual. Second, it does not take advantage of threading during the path-scoped delta calculations. Even with this lack of threading, the path-walk option is sometimes faster than the usual approach. Future changes will refactor this code to allow for threading, but that complexity is deferred until later to keep this patch as simple as possible. This new walk is incompatible with some features and is ignored by others: * Object filters are not currently integrated with the path-walk API, such as sparse-checkout or tree depth. A blobless packfile could be integrated easily, but that is deferred for later. * Server-focused features such as delta islands, shallow packs, and using a bitmap index are incompatible with the path-walk API. * The path walk API is only compatible with the --revs option, not taking object lists or pack lists over stdin. These alternative ways to specify the objects currently ignores the --path-walk option without even a warning. Future changes will create performance tests that demonstrate the power of this approach. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16pack-objects: extract should_attempt_deltas()Derrick Stolee1-24/+32
This will be helpful in a future change, which will reuse this logic. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16p2000: add performance test for patch-mode commandsDerrick Stolee1-0/+3
The previous three changes contributed performance improvements to 'git apply', 'git add -p', and 'git reset -p' when using a sparse index. The improvement to 'git apply' also improved 'git checkout -p'. Add performance tests to demonstrate this (and to help validate that performance remains good in the future). In the truncated test output below, we see that the full checkout performance changes within noise expectations, but the sparse index cases improve 33% and then 96% for 'git add -p' and 41% and then 95% for 'git reset -p'. 'git checkout -p' improves immediatley by 91% because it does not need any change to its builtin. Test HEAD~4 HEAD~3 HEAD~2 HEAD~1 ------------------------------------------------------------------------------------- 2000.118: ... git add -p (full-v3) 0.79 0.79 +0.0% 0.82 +3.8% 0.82 +3.8% 2000.119: ... git add -p (full-v4) 0.74 0.76 +2.7% 0.74 +0.0% 0.76 +2.7% 2000.120: ... git add -p (sparse-v3) 1.94 1.28 -34.0% 0.07 -96.4% 0.07 -96.4% 2000.121: ... git add -p (sparse-v4) 1.93 1.28 -33.7% 0.06 -96.9% 0.06 -96.9% 2000.122: ... git checkout -p (full-v3) 1.18 1.18 +0.0% 1.18 +0.0% 1.19 +0.8% 2000.123: ... git checkout -p (full-v4) 1.10 1.12 +1.8% 1.11 +0.9% 1.11 +0.9% 2000.124: ... git checkout -p (sparse-v3) 1.31 0.11 -91.6% 0.11 -91.6% 0.11 -91.6% 2000.125: ... git checkout -p (sparse-v4) 1.29 0.11 -91.5% 0.11 -91.5% 0.11 -91.5% 2000.126: ... git reset -p (full-v3) 0.81 0.80 -1.2% 0.83 +2.5% 0.83 +2.5% 2000.127: ... git reset -p (full-v4) 0.78 0.77 -1.3% 0.77 -1.3% 0.78 +0.0% 2000.128: ... git reset -p (sparse-v3) 1.58 0.92 -41.8% 0.91 -42.4% 0.07 -95.6% 2000.129: ... git reset -p (sparse-v4) 1.58 0.92 -41.8% 0.92 -41.8% 0.07 -95.6% It is worth noting that if our test was more involved and had multiple hunks to evaluate, then the time spent in 'git apply' would dominate due to multiple index loads and writes. As it stands, we need the sparse index improvement in 'git add -p' itself to confirm this performance improvement. Since the change for 'git add -i' is identical, we avoid a second test case for that similar operation. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16reset: integrate sparse index with --patchDerrick Stolee2-5/+43
Similar to the previous change for 'git add -p', the reset builtin checked for integration with the sparse index after possibly redirecting its logic toward the interactive logic. This means that the builtin would expand the sparse index to a full one upon read. Move this check earlier within cmd_reset() to improve performance here. Add tests to guarantee that we are not universally expanding the index. Add behavior tests to check that we are doing the same operations as a full index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16git add: make -p/-i aware of sparse indexDerrick Stolee2-3/+64
It is slow to expand a sparse index in-memory due to parsing of trees. We aim to minimize that performance cost when possible. 'git add -p' uses 'git apply' child processes to modify the index, but still there are some expansions that occur. It turns out that control flows out of cmd_add() in the interactive cases before the lines that confirm that the builtin is integrated with the sparse index. Moving that integration point earlier in cmd_add() allows 'git add -i' and 'git add -p' to operate without expanding a sparse index to a full one. Add test cases that confirm that these interactive add options work with the sparse index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16apply: integrate with the sparse indexDerrick Stolee2-1/+59
The sparse index allows storing directory entries in the index, marked with the skip-wortkree bit and pointing to a tree object. This may be an unexpected data shape for some implementation areas, so we are rolling it out incrementally on a builtin-per-builtin basis. This change enables the sparse index for 'git apply'. The main motivation for this change is that 'git apply' is used as a child process of 'git add -p' and expanding the sparse index for each of those child processes can lead to significant performance issues. The good news is that the actual index manipulation code used by 'git apply' is already integrated with the sparse index, so the only product change is to mark the builtin as allowing the sparse index so it isn't inflated on read. The more involved part of this change is around adding tests that verify how 'git apply' behaves in a sparse-checkout environment and whether or not the index expands in certain operations. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16userdiff: extend Bash pattern to cover more shell function formsMoumita Dhar8-8/+128
The previous function regex required explicit matching of function bodies using `{`, `(`, `((`, or `[[`, which caused several issues: - It failed to capture valid functions where `{` was on the next line due to line continuation (`\`). - It did not recognize functions with single command body, such as `x () echo hello`. Replacing the function body matching logic with `.*$`, ensures that everything on the function definition line is captured. Additionally, the word regex is refined to better recognize shell syntax, including additional parameter expansion operators and command-line options. Signed-off-by: Moumita Dhar <dhar61595@gmail.com> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16object-file: drop support for writing objects with unknown typesJeff King2-80/+6
Since "hash-object --literally" no longer supports objects with unknown types, there are now no callers of write_object_file_literally() and its helpers. Let's drop them to simplify the code. In particular, this gets rid of some ugly copy-and-paste code from write_object_file_literally(), which is a parallel implementation of write_object_file(). When the split was originally made, the two weren't that long, but commits like 63a6745a07 (object-file: update the loose object map when writing loose objects, 2023-10-01) ended up having to duplicate some tricky code. This patch drops all of that duplication and should make things less error-prone going forward. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16hash-object: handle --literally with OPT_NEGBITJeff King1-16/+11
Since we recently removed the hash_literally() function, the hash-object --literally option has been simplified to just removing the INDEX_FORMAT_CHECK flag. Rather than pass it around as a separate bool, we can just have the option parser remove the bit from the set of flags directly. This simplifies the helper functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16hash-object: merge HASH_* and INDEX_* flagsJeff King1-17/+6
The hash-object command has its own custom flag bits that it sets based on command-line options. But since we dropped hash_literally() in the previous commit, the only thing we do with those flag bits is convert them directly into "index_flags" to pass to index_fd(). This extra layer of indirection makes the code harder to read and reason about. Let's just use the INDEX_* flags directly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16hash-object: stop allowing unknown typesJeff King2-33/+7
When passed the "--literally" option, hash-object will allow any arbitrary string for its "-t" type option. Such objects are only useful for testing or debugging, as they cannot be used in the normal way (e.g., you cannot fetch their contents!). Let's drop this feature, which will eventually let us simplify the object-writing code. This is technically backwards incompatible, but since such objects were never really functional, it seems unlikely that anybody will notice. We will retain the --literally flag, as it also instructs hash-object not to worry about other format issues (e.g., type-specific things that fsck would complain about). The documentation does not need to be updated, as it was always vague about which checks we're loosening (it uses only the phrase "any garbage"). The code change is a bit hard to verify from just the patch text. We can drop our local hash_literally() helper, but it was really just wrapping write_object_file_literally(). We now replace that with calling index_fd(), as we do for the non-literal code path, but dropping the INDEX_FORMAT_CHECK flag. This ends up being the same semantically as what the _literally() code path was doing (modulo handling unknown types, which is our goal). We'll be able to clean up these code paths a bit more in subsequent patches. The existing test is flipped to show that we now reject the unknown type. The additional "extra-long type" test is now redundant, as we bail early upon seeing a bogus type. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16t: add lib-loose.shJeff King4-5/+38
This commit adds a shell library for writing raw loose objects into the object database. Normally this is done with hash-object, but the specific intent here is to allow broken objects that hash-object may not support. We'll convert several cases that use "hash-object --literally" to write objects with invalid types. That works currently, but dropping this dependency will allow us to remove that feature and simplify the object-writing code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16t/helper: add zlib test-toolJeff King5-0/+66
It's occasionally useful when testing or debugging to be able to do raw zlib inflate/deflate operations (e.g., to check the bytes of a specific loose or packed object). Even though zlib's deflate algorithm is used by many other programs, this is surprisingly hard to do in a portable way. E.g., gzip can do this if you manually munge some header bytes. But the result is somewhat arcane, and we don't assume gzip is available anyway. Likewise, pigz will handle raw zlib, but we can't assume it is available. So let's introduce a short test helper for just doing zlib operations. We'll use it in subsequent patches to add some new tests, but it would also have come in handy a few times in the past: - The hard-coded pack data from 3b910d0c5e (add tests for indexing packs with delta cycles, 2013-08-23) could probably be generated on the fly. - Likewise we could avoid the hard-coded data from 0b1493c2d4 (git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT, 2025-02-25). Though note this would require support for more zlib options. - It would have helped with the debugging documented in 41dfbb2dbe (howto: add article on recovering a corrupted object, 2013-10-25). I'll leave refactoring existing tests for another day, but I hope the examples above show the general utility. I aimed for simplicity in the code. In particular, it will read all input into a memory buffer, rather than streaming. That makes the zlib loops harder to get wrong (which has been a source of subtle bugs in the past). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16oid_object_info(): drop type_name strbufJeff King4-12/+2
We provide a mechanism for callers to get the object type as a raw string, rather than an object_type enum. This was in theory useful for returning types that are not representable in the enum, but we consider any such type to be an error, and there are no callers that use the strbuf anymore. Let's drop support to simplify the code a bit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16fsck: stop using object_info->type_name strbufJeff King3-40/+14
When fsck-ing a loose object, we use object_info's type_name strbuf to record the parsed object type as a string. For most objects this is redundant with the object_type enum, but it does let us report the string when we encounter an object with an unknown type (for which there is no matching enum value). There are a few downsides, though: 1. The code to report these cases is not actually robust. Since we did not pass a strbuf to unpack_loose_header(), we only retrieved types from headers up to 32 bytes. In longer cases, we'd simply say "object corrupt or missing". 2. This is the last caller that uses object_info's type_name strbuf support. It would be nice to refactor it so that we can simplify that code. 3. Likewise, we'll check the hash of the object using its unknown type (again, as long as that type is short enough). That depends on the hash_object_file_literally() code, which we'd eventually like to get rid of. So we can simplify things by bailing immediately in read_loose_object() when we encounter an unknown type. This has a few user-visible effects: a. Instead of producing a single line of error output like this: error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object is of unknown type 'bogus': .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 we'll now issue two lines (the first from read_loose_object() when we see the unparsable header, and the second from the fsck code, since we couldn't read the object): error: unable to parse type from header 'bogus 4' of .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object corrupt or missing: .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 This is a little more verbose, but this sort of error should be rare (such objects are almost impossible to work with, and cannot be transferred between repositories as they are not representable in packfiles). And as a bonus, reporting the broken header in full could help with debugging other cases (e.g., a header like "blob xyzzy\0" would fail in parsing the size, but previously we'd not have showed the offending bytes). b. An object with an unknown type will be reported as corrupt, without actually doing a hash check. Again, I think this is unlikely to matter in practice since such objects are totally unusable. We'll update one fsck test to match the new error strings. And we can remove another test that covered the case of an object with an unknown type _and_ a hash corruption. Since we'll skip the hash check now in this case, the test is no longer interesting. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16oid_object_info_convert(): stop using string for object typeJeff King1-11/+4
In oid_object_info_convert(), we convert objects between their sha1 and sha256 variants. To do this, we naturally need to know the type, which we get from oid_object_info_extended() using its type_name strbuf option. But getting the value as a string (versus an object_type enum) is not helpful. Since we do not allow unknown types, the regular enum is sufficient. And the resulting code is a bit simpler, as we no longer have to manage the extra allocation nor convert the string to an enum ourselves. Note that at first glance, it might seem like we should retain the error check for "type == -1" to catch bogus types found by the underlying parser. But we don't need it, as an unknown type would have yielded an error from the call to oid_object_info_extended(), which would already have caused us to return an error. In fact, I suspect this was always impossible to trigger. Even when we were converting the string to a type enum ourselves, an invalid type would never have escaped oid_object_info_extended(), since we never passed the (now removed) OBJECT_INFO_ALLOW_UNKNOWN_TYPE option. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16cat-file: use type enum instead of buffer for -t optionJeff King1-9/+4
Now that we no longer support OBJECT_INFO_ALLOW_UNKNOWN_TYPE, there is no need to pass a strbuf into oid_object_info_extended() to record the type. The regular object_type enum is sufficient to capture all of the types we will allow. This simplifies the code a bit, and will eventually let us drop object_info's type_name strbuf support. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flagJeff King4-49/+10
Since cat-file dropped its "--allow-unknown-type" option in the previous commit, there are no more uses of the internal flag that implemented it. Let's drop it. That in turn lets us drop the strbuf parameter of unpack_loose_header(), which now is always NULL. And without that, we can drop all of the additional code to inflate larger headers into the strbuf. Arguably we could drop ULHR_TOO_LONG, as no callers really care about the distinction from ULHR_BAD. But it's easy enough to retain, and it does let us produce a slightly more specific message in one instance. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16cat-file: make --allow-unknown-type a noopJeff King3-179/+56
The cat-file command has some minor support for handling objects with "unknown" types. I.e., strings that are not "blob", "commit", "tree", or "tag". In theory this could be used for debugging or experimenting with extensions to Git. But in practice this support is not very useful: 1. You can get the type and size of such objects, but nothing else. Not even the contents! 2. Only loose objects are supported, since packfiles use numeric ids for the types, rather than strings. 3. Likewise you cannot ever transfer objects between repositories, because they cannot be represented in the packfiles used for the on-the-wire protocol. The support for these unknown types complicates the object-parsing code, and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix infinite loop on broken zlib input, 2025-02-25). So let's drop it. The first step is to remove the user-facing parts, which are accessible only via cat-file. This is technically backwards-incompatible, but given the limitations listed above, these objects couldn't possibly be useful in any workflow. However, we can't just rip out the option entirely. That would hurt a caller who ran: git cat-file -t --allow-unknown-object <oid> and fed it normal, well-formed objects. There --allow-unknown-type was doing nothing, but we wouldn't want to start bailing with an error. So to protect any such callers, we'll retain --allow-unknown-type as a noop. The code change is fairly small (but we'll able to clean up more code in follow-on patches). The test updates drop any use of the option. We still retain tests that feed the broken objects to cat-file without --allow-unknown-type, as we should continue to confirm that those objects are rejected. Note that in one spot we can drop a layer of loop, re-indenting the body; viewing the diff with "-w" helps there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16object-file.h: fix typo in variable declarationJeff King1-1/+1
This should be "compat", not "comapt". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16json-writer: describe the usage of jw_* functionsLucas Seiki Oshiro1-0/+28
Provide an overview of the set of functions used for manipulating `json_writer`s, by describing what functions should be used for each JSON-related task. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16json-writer: add docstrings to jw_* functionsLucas Seiki Oshiro2-4/+143
Add a docstring for each function that manipulates json_writers. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15The fifteenth batchJunio C Hamano1-0/+8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15Merge branch 'tb/macos-false-but-the-compiler-does-not-know-it-fix'Junio C Hamano1-1/+1
Workaround for older macOS ld. * tb/macos-false-but-the-compiler-does-not-know-it-fix: intialize false_but_the_compiler_does_not_know_it_
2025-05-15Merge branch 'jc/t6011-mv-ro-fix'Junio C Hamano1-0/+1
Test fix. * jc/t6011-mv-ro-fix: t6011: fix misconversion from perl to sed
2025-05-15Merge branch 'dd/meson-perl-custom-path'Junio C Hamano10-10/+19
Meson-based build framework update. * dd/meson-perl-custom-path: meson: allow customize perl installation path
2025-05-15Merge branch 'ps/maintenance-missing-tasks'Junio C Hamano4-31/+257
Make repository clean-up tasks "gc" can do available to "git maintenance" front-end. * ps/maintenance-missing-tasks: builtin/maintenance: introduce "rerere-gc" task builtin/gc: move rerere garbage collection into separate function builtin/maintenance: introduce "worktree-prune" task builtin/gc: move pruning of worktrees into a separate function builtin/gc: remove global variables where it is trivial to do builtin/gc: fix indentation of `cmd_gc()` parameters
2025-05-15Merge branch 'cf/wrapper-bsd-eloop'Junio C Hamano1-1/+20
The fallback implementation of open_nofollow() depended on open("symlink", O_NOFOLLOW) to set errno to ELOOP, but a few BSD derived systems use different errno, which has been worked around. * cf/wrapper-bsd-eloop: wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-15commit-graph: fix memory leak when `fill_oids_from_packs()` failsLidong Yan1-0/+2
In commit-graph.c:fill_oids_from_packs, if open_pack_index failed, memory allocated and returned by add_packed_git will leak. Simply add close_pack and free(p) will solve this problem. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15sequencer: fix memory leak if `todo_list_rearrange_squash()` failedLidong Yan1-3/+5
In sequencer.c:todo_list_rearrange_squash, if it fails, memory allocated in `next`, `tail`, `subjects` and `subject2item` will leak. Jump to cleanup label before return could fix this leak problem. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15mailinfo: fix pointential memory leak if `decode_header` failedLidong Yan1-21/+21
In mailinfo.c:decode_header, if convert_to_utf8 failed, the strbuf stored in dec will leak. Simply add strbuf_release and free(dec) will solve this problem. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15sequencer: stop pretending that an assignment is a conditionJohannes Schindelin1-3/+6
In 3e81bccdf3 (sequencer: factor out todo command name parsing, 2019-06-27), a `return` statement was introduced that basically was a long sequence of conditions, combined with `&&`, except for the last condition which is not really a condition but an assignment. The point of this construct was to return 1 (i.e. `true`) from the function if all of those conditions held true, and also assign the `bol` pointer to the end of the parsed command. Some static analyzers are really unhappy about such constructs. And human readers are at least puzzled, if not confused, by seeing a single `=` inside a chain of conditions where they would have expected to see `==` instead and, based on experience, immediately suspect a typo. Let's help all of this by turning this into the more verbose, more readable form of an `if` construct that both assigns the pointer as well as returns 1 if all of the conditions hold true. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15bundle-uri: avoid using undefined output of `sscanf()`Johannes Schindelin1-5/+7
In c429bed102 (bundle-uri: store fetch.bundleCreationToken, 2023-01-31) code was introduced that assumes that an `sscanf()` call leaves its output variables unchanged unless the return value indicates success. However, the POSIX documentation makes no such guarantee: https://pubs.opengroup.org/onlinepubs/9699919799/functions/sscanf.html So let's make sure that the output variable `maxCreationToken` is always well-defined. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15commit-graph: avoid using stale stack addressesJohannes Schindelin1-0/+9
The code is a bit too hard to reason about to fully assess whether the `fill_commit_graph_info()` function is called at all after `write_commit_graph()` returns (and hence the stack variable `topo_levels` goes out of context). Let's simply make sure that the stack address is no longer used at that stage, thereby making the code quite a bit easier to reason about. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15trace2: avoid "futile conditional"Johannes Schindelin1-19/+5
CodeQL reports empty `if` blocks that only contain a comment as "futile conditional". The comment talks about potential plans to turn this into a warning, but that seems not to have been necessary. Replace the entire construct with a concise comment. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15Avoid redundant conditionsJohannes Schindelin2-2/+2
While `if (i <= 0) ... else if (i > 0) ...` is technically equivalent to `if (i <= 0) ... else ...`, the latter is vastly easier to read because it avoids writing out a condition that is unnecessary. Let's drop such unnecessary conditions. Pointed out by CodeQL. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: avoid unnecessary work when there is no current branchJohannes Schindelin1-1/+1
As pointed out by CodeQL, `branch_get()` may return `NULL`, in which case `branch_has_merge_config()` would return early, but we can even avoid enumerating the refs prefixes in that case, saving even more CPU cycles. Technically, we should enclose these two statements in an `if (branch) {...}` block, but the indentation is already quite deep, therefore I refrained from doing that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15has_dir_name(): make code more obviousJohannes Schindelin1-42/+13
One thing that might be non-obvious to readers (or to analyzers like CodeQL) is that the function essentially does nothing when the Git index is empty, and in particular that it does not look at the value of `len_eq_last` (which would be uninitialized at that point). Let's make this much easier to understand, by returning early if the Git index is empty, and by avoiding empty `else` blocks. This commit changes indentation and is hence best viewed using `--ignore-space-change`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15upload-pack: rename `enum` to reflect the operationJohannes Schindelin1-17/+17
While 3145ea957d (upload-pack: introduce fetch server command, 2018-03-15) added support for the `fetch` command, from the server's point of view it is an upload, and hence the `enum` should really be called `upload_state` instead of `fetch_state`. Likewise, rename its values. This also helps unconfuse CodeQL which would otherwise be at sixes or sevens about having _two_ non-local definitions of the same `enum` with the same values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15commit-graph: avoid malloc'ing a local variableJohannes Schindelin1-72/+69
We do need a context to write the commit graph, but that context is only needed during the life time of `commit_graph_write()`, therefore it can easily be a stack variable. This also helps CodeQL recognize that it is safe to assign the address of other local variables to the context's fields. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: carefully clear local variable's address after useJohannes Schindelin1-0/+1
As pointed out by CodeQL, it is a potentially dangerous practice to store local variables' addresses in non-local structs. Yet this is exactly what happens with the `acked_commits` attribute that is used in `cmd_fetch()`: The pointer to a local variable is assigned to it. Now, it is Git's convention that `cmd_*()` functions are essentially only returning just before exiting the process, therefore there is little danger that this attribute is used after the code flow returns from that function. However, code in `cmd_*()` function is often so useful that it gets lifted into a library function, at which point this issue could become a real problem. Let's make sure to clear the `acked_commits` attribute out after it was used, and before the function returns (at which point the address would go stale). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15commit: simplify codeJohannes Schindelin1-1/+1
The difference of two unsigned integers is defined to be unsigned, and therefore it is misleading to check whether it is greater than zero (instead, the more natural way would be to check whether the difference is zero or not). Let's instead avoid the subtraction altogether, and compare the two operands directly, which makes the code more obvious as a side effect. Pointed out by CodeQL's rule with the ID `cpp/unsigned-difference-expression-compared-zero`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15git-gui: do not end the commit message with an empty lineJohannes Sixt1-4/+2
The commit message is processed to remove unnecessary empty lines. In particular, it is ensured that the text ends with at most one LF character. This one is always present, because the Tk text widget ensures that is present. However, did not consider that the processed text is written to the commit message file using `puts`, which also appends a LF character, so that the final commit message ends with two LF. Trim all trailing LF characters, and while we are here, use `string trim`, which lets us remove the leading LF in the same command. Reported-by: Gareth Fenn <garethfenn@gmail.com> Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-15gitk: do not hard-code color of search results in commit listAlexander Ogorodov1-2/+2
A global variable exists that holds the color name used to highlight search results everywhere, except that in the commit list the color is still hard-coded to "yellow". Use the global variable there as well. Signed-off-by: Alexander Ogorodov <bnfour@bnfour.net>
2025-05-14replay: replace the_repository with repo parameter passed to cmd_replay ()Elijah Newren1-30/+35
Replace the_repository everywhere with repo, feed repo from cmd_replay() to all the other functions in the file that need it, and remove the UNUSED annotation on repo. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-14packed-backend: mmap large "packed-refs" file during fsckshejialuo1-12/+7
During fsck, we use "strbuf_read" to read the content of "packed-refs" without using mmap mechanism. This is a bad practice which would consume more memory than using mmap mechanism. Besides, as all code paths in "packed-backend.c" use this way, we should make "fsck" align with the current codebase. As we have introduced the helper function "allocate_snapshot_buffer", we can simply use this function to use mmap mechanism. Suggested-by: Jeff King <peff@peff.net> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-14packed-backend: extract snapshot allocation in `load_contents`shejialuo1-22/+31
"load_contents" would choose which way to load the content of the "packed-refs". However, we cannot directly use this function when checking the consistency due to we don't want to open the file. And we also need to reuse the logic to avoid causing repetition. Let's create a new helper function "allocate_snapshot_buffer" to extract the snapshot allocation logic in "load_contents" and update the "load_contents" to align with the behavior. Suggested-by: Jeff King <peff@peff.net> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-14packed-backend: fsck should warn when "packed-refs" file is emptyshejialuo4-0/+33
We assume the "packed-refs" won't be empty and instead has at least one line in it (even when there are no refs packed, there is the file header line). Because there is no terminating LF in the empty file, we will report "packedRefEntryNotTerminated(ERROR)" to the user. However, the runtime code paths would accept an empty "packed-refs" file, for example, "create_snapshot" would simply return the "snapshot" without checking the content of "packed-refs". So, we should skip checking the content of "packed-refs" when it is empty during fsck. After 694b7a1999 (repack_without_ref(): write peeled refs in the rewritten file, 2013-04-22), we would always write a header into the "packed-refs" file. So, versions of Git that are not too ancient never write such an empty "packed-refs" file. As an empty file often indicates a sign of a filesystem-level issue, the way we want to resolve this inconsistency is not make everybody totally silent but notice and report the anomaly. Let's create a "FSCK_INFO" message id "EMPTY_PACKED_REFS_FILE" to report to the users that "packed-refs" is empty. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-14scalar reconfigure: improve --maintenance docsDerrick Stolee2-9/+8
The --maintenance option for 'scalar reconfigure' has three possible values. Improve the documentation by specifying the option in the -h help menu and usage information. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-14gitk: place file name arguments after options in msgfmt callJohannes Sixt1-1/+1
The build process fails in POSIXLY_CORRECT mode: $ gitk@master:1005> POSIXLY_CORRECT=1 make * new Tcl/Tk interpreter location GEN gitk-wish Generating catalog po/zh_cn.msg msgfmt --statistics --tcl po/zh_cn.po -l zh_cn -d po/ msgfmt: --tcl requires a "-l locale" specification Try 'msgfmt --help' for more information. make: *** [Makefile:76: po/zh_cn.msg] Error 1 The reason is that option arguments cannot occur after the first non-option argument. Move the file name last. Reported-by: Nathan Royce <nroycea+kernel@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-13send-email: try to get fqdn by running hostname -f on Linux and macOSAditya Garg1-1/+15
`hostname` is a popular command available on both Linux and macOS. As per the man-page[1], `hostname -f` command returns the fully qualified domain name (FQDN) of the system. The current Net::Domain perl module being used in the script for the same has been quite unrealiable in many cases. Thankfully, we now have a better check for valid_fqdn, which does reject the invalid FQDNs given by this module properly, but at the same time, it will result in a fallback to 'localhost.localdomain' being used. `hostname -f` has been quite reliable (probably even more reliable than the Net::Domain module) and before falling back to 'localhost.localdomain', we should try to use it. Interestingly, the `hostname` command is actually used by perl modules like Net::Domain[2] and Sys::Hostname[3] to get the hostname. So, lets give `hostname -f` a chance as well! [1]: https://man7.org/linux/man-pages/man1/hostname.1.html [2]: https://github.com/Perl/perl5/blob/blead/cpan/libnet/lib/Net/Domain.pm#L88 [3]: https://github.com/Perl/perl5/blob/blead/ext/Sys-Hostname/Hostname.pm#L93 Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-13The fourteenth batchJunio C Hamano1-0/+11
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-13Merge branch 'kj/glob-path-with-special-char'Junio C Hamano3-1/+432
"git add 'f?o'" did not add 'foo' if 'f?o', an unusual pathname, also existed on the working tree, which has been corrected. * kj/glob-path-with-special-char: dir.c: literal match with wildcard in pathspec should still glob
2025-05-13Merge branch 'kh/docfixes'Junio C Hamano2-2/+2
Docfixes. * kh/docfixes: doc: branch: fix inline-verbatim doc: reflog: fix `drop` subheading
2025-05-13Merge branch 'js/ci-buildsystems-cleanup'Junio C Hamano10-1950/+0
Code clean-up around stale CI elements and building with Visual Studio. * js/ci-buildsystems-cleanup: config.mak.uname: drop the `vcxproj` target contrib/buildsystems: drop support for building . vcproj/.vcxproj files ci: stop linking the `prove` cache
2025-05-13Merge branch 'ps/ci-test-aggreg-fix-for-meson'Junio C Hamano1-0/+1
Test result aggregation did not work in Meson based CI jobs. * ps/ci-test-aggreg-fix-for-meson: ci: fix aggregation of test results with Meson
2025-05-13Merge branch 'en/get-tree-entry-doc'Junio C Hamano1-5/+8
Doc update. * en/get-tree-entry-doc: tree-walk.h: fix incorrect API comment
2025-05-13gitlab-ci: always run MSVC-based Meson jobPatrick Steinhardt1-1/+0
With 7304bd2bc39 (ci: wire up Visual Studio build with Meson, 2025-01-22) we have introduced a CI job that builds and tests Git with Microsoft Visual Studio via Meson. This job is only being executed by default on GitHub Workflows though -- on GitLab CI it is marked as a "manual" job, so the developer has to actively trigger these jobs. The consequence of this split is that any breakage specific to this job is only noticed by developers who mainly work with GitHub. Let's improve this situation by also running the job by default on GitLab CI. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-13git-gui: wire up support for the Meson build systemPatrick Steinhardt4-0/+261
The Git project has started to wire up Meson as a build system in Git v2.48.0. Wire up support for Meson in "git-gui" so that we can trivially include it as a subproject in Git. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: stop including GIT-VERSION-FILE filePatrick Steinhardt1-5/+2
The "GITGUI_VERSION" variable is made available by generating and including the "GIT-VERSION-FILE" file. Its value has been used in various build steps, but in the preceding commits we have refactored those to instead source the "GIT-VERSION-FILE" directly. As a result, the variable is now only used in a single recipe, and this use can be trivially replaced with sed(1). Refactor the recipe to do so and stop including "GIT-VERSION-FILE" to simplify the build process. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: extract script to generate macOS appPatrick Steinhardt3-20/+34
Extract script to generate the macOS app. This change allows us to reuse the build logic with the Meson build system. Note that as part of this change we also modify the TKEXECUTABLE variable to track its full path. Like this we don't have to propagate both the TKEXECUTABLE and TKFRAMEWORK variables into the script, and the basename can be trivially computed from TKEXECUTABLE anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: extract script to generate macOS wrapperPatrick Steinhardt3-14/+40
Extract script to generate the macOS wrapper for git-gui. This change allows us to reuse the build logic with the Meson build system. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: extract script to generate "tclIndex"Patrick Steinhardt2-14/+34
Extract script to generate "tclIndex". This change allows us to reuse the build logic with the Meson build system. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: extract script to generate "git-gui"Patrick Steinhardt2-11/+30
Extract script to generate "git-gui". This change allows us to reuse the build logic with the Meson build system. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: drop no-op GITGUI_SCRIPT replacementPatrick Steinhardt1-2/+0
The value of the GITGUI_SCRIPT variable is only used in a single place as part of an sed(1) script that massages the "git-gui.sh" script. Interestingly, this specific replacement does seem to be a no-op: we replace "^ argv0=$$0" with " argv=$(GITGUI_SCRIPT)", which has a value of "$$0". The result would thus be completely unchanged. Drop the replacement and its variable. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: make output of GIT-VERSION-GEN source'ablePatrick Steinhardt1-3/+3
The output of GIT-VERSION-GEN can be sourced by our Makefile to make the version available there. The output has a couple of spaces around the equals sign, which is perfectly valid for parsing it in our Makefile. But in subsequent steps we'll also want to source the file in a couple of newly-introduced shell scripts, but having spaces around variable assignments is invalid there. Prepare for this step by dropping the spaces surrounding the equals sign. Like this, we can easily use the same file both in our Makefile and in shell scripts. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: prepare GIT-VERSION-GEN for out-of-tree buildsPatrick Steinhardt2-15/+29
The GIT-VERSION-GEN unconditionally writes version information into the source directory in the form of the "GIT-VERSION-FILE". We are about to introduce the Meson build system though, which enforces out-of-tree builds by default, and in that context we cannot continue to write version information into the source tree. Prepare the script for out-of-tree builds by treating the source directory different from the output file. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-13git-gui: replace GIT-GUI-VARS with GIT-GUI-BUILD-OPTIONSPatrick Steinhardt3-21/+23
The GIT-GUI-VARS file is used to track whether any of our build options has changed. Unfortunately, the format of that file does not allow us to propagate those build options to other scripts. But as we are about to introduce support for the Meson build system, we will extract a couple of scripts to deduplicate core build logic across Makefiles and Meson. With this refactoring, it will become necessary to make build options more widely accessible. Replace GIT-GUI-VARS with a new GIT-GUI-BUILD-OPTIONS file that is being populated from a template. This file can easily be sourced from build scripts in subsequent steps. Signed-off-by: Patrick Steinhardt <ps@pks.im>
2025-05-12whatschanged: list it in BreakingChanges documentJunio C Hamano2-2/+17
This can be squashed into the previous step. That is how our "git pack-redundant" conversion did. Theoretically, however, those who want to gauge the need to keep the command by exposing their users to patches before this one may want to wait until their experiment finishes before they formally say "this will go away". This change is made into a separate patch from the previous step precisely to help those folks. While at it, update the documentation page to use the new [synopsis] facility to mark-up the SYNOPSIS part. Helped-by: Elijah Newren <newren@gmail.com> [en: typofix] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12whatchanged: remove when built with WITH_BREAKING_CHANGESJunio C Hamano9-15/+66
As we made "git whatchanged" require "--i-still-use-this" and asked the users to report if they still want to use it, the logical next step is to allow us build Git without "whatchanged" to prepare for its eventual removal. If we were to follow the pattern established in 8ccc75c2 (remote: announce removal of "branches/" and "remotes/", 2025-01-22), we can do this together with the documentation update to officially list that the command will be removed in the BreakingChanges document, but let's just keep the changes separate just in case we want to proceed a bit slower. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12whatchanged: require --i-still-use-thisJunio C Hamano3-7/+37
The documentation of "git whatchanged" is pretty explicit that the command was retained for historical reasons to help those whose fingers cannot be retrained. Let's see if they still are finding it hard to type "git log --raw" instead of "git whatchanged" by marking the command as "nominated for removal", and require "--i-still-use-this" on the command line. Adjust the tests so that the option is passed when we invoke the command. In addition, we test that the command fails when "--i-still-use-this" is not given. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12tests: prepare for a world without whatchangedJunio C Hamano2-7/+7
Some tests on fast-import run "git whatchanged" without even checking the output from the command. It is tempting to remove the calls altogether since they are not doing anything useful, but they presumably were added there while the tests were developed to manually sanity check which paths were touched. Replace these calls with "git log --raw", which is a rough equivalent in the more modern Git. This does not remove "git whatchanged", but we no longer have to worry about adjusting these places when we eventually do. Helped-by: Elijah Newren <newren@gmail.com> [en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12The thirteenth batchJunio C Hamano1-0/+15
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12Merge branch 'ps/meson-bin-sh'Junio C Hamano1-1/+11
Meson-based build framework update. * ps/meson-bin-sh: meson: prefer shell at "/bin/sh" meson: report detected runtime executable paths
2025-05-12Merge branch 'ng/xdiff-truly-minimal'Junio C Hamano3-2/+18
"git diff --minimal" used to give non-minimal output when its optimization kicked in, which has been disabled. * ng/xdiff-truly-minimal: xdiff: disable cleanup_records heuristic with --minimal
2025-05-12Merge branch 'ds/fix-thin-fix'Junio C Hamano7-28/+216
"git index-pack --fix-thin" used to abort to prevent a cycle in delta chains from forming in a corner case even when there is no such cycle. * ds/fix-thin-fix: index-pack: allow revisiting REF_DELTA chains t5309: create failing test for 'git index-pack' test-tool: add pack-deltas helper
2025-05-12Merge branch 'jc/ci-skip-unavailable-external-software'Junio C Hamano1-13/+8
Further refinement on CI messages when an optional external software is unavailable (e.g. due to third-party service outage). * jc/ci-skip-unavailable-external-software: ci: download JGit from maven, not eclipse.org ci: update the message for unavailble third-party software
2025-05-12Merge branch 'ps/object-store-cleanup'Junio C Hamano41-289/+278
Further code clean-up in the object-store layer. * ps/object-store-cleanup: object-store: drop `repo_has_object_file()` treewide: convert users of `repo_has_object_file()` to `has_object()` object-store: allow fetching objects via `has_object()` object-store: move function declarations to their respective subsystems object-store: move and rename `odb_pack_keep()` object-store: drop `loose_object_path()` object-store: move `struct packed_git` into "packfile.h"
2025-05-12Merge branch 'ag/send-email-outlook'Junio C Hamano2-1/+45
Update send-email to work better with Outlook's smtp server. * ag/send-email-outlook: send-email: add --[no-]outlook-id-fix option send-email: retrieve Message-ID from outlook SMTP server
2025-05-12doc: prepare for a world without whatchangedJunio C Hamano2-3/+3
Some documentation examples reference "whatchanged", either as a placeholder command or an example of source structure. To reduce the need for future edits when `whatchanged` is removed, replace these references with alternatives: - In `MyFirstObjectWalk.adoc`, use `version` as the nearby anchor point for `walken`, instead of `whatchanged`. - In `user-manual.adoc`, cite `show` instead of `whatchanged` as a command whose source lives in the same file as `log`. Helped-by: Elijah Newren <newren@gmail.com> [en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12you-still-use-that??: help deprecating commands for removalJunio C Hamano4-8/+21
Commands slated for removal like "git pack-redundant" now require an explicit "--i-still-use-this" option to run. This is to discourage casual use and surface their pending deprecation to users. The warning message is long, so factor it into a helper function you_still_use_that() to simplify reuse by other commands. Also add a missing test to ensure this enforcement works for "pack-redundant". Helped-by: Elijah Newren <newren@gmail.com> [en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12raw_object_store: drop extra pointer to replace_mapJeff King5-10/+8
We store the replacement data in an oidmap, which is itself a pointer in the raw_object_store struct. But there's no need for an extra pointer indirection here. It is always allocated and initialized along with the containing struct, and we never check it for NULL-ness. Let's embed the map directly in the struct, which is simpler and avoids extra pointer chasing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12oidmap: add size functionJeff King3-2/+6
Callers which want to know how many items are in an oidmap have to look at the underlying hashmap struct, leaking an implementation detail. Let's provide a type-appropriate wrapper and use it. Note in the call from lookup_replace_object(), the caller was actually looking at the hashmap's tablesize parameter (the allocated size of the table) rather than hashmap_get_size(), the number of items in the table. This probably should have been checking the number of items all along, but the two are functionally equivalent here since we only add to the map and never remove anything. Thus if there was any allocation, it was because there is at least one item. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12oidmap: rename oidmap_free() to oidmap_clear()Jeff King7-9/+10
This function does not free the oidmap struct itself; it just drops all items from the map (using hashmap_clear_() internally). It should be called oidmap_clear(), per CodingGuidelines. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12pack-bitmap: fix memory leak if `load_bitmap_entries_v1` failedLidong Yan1-4/+4
In pack-bitmap.c:load_bitmap_entries_v1, the function `read_bitmap_1` allocates a bitmap and reads index data into it. However, if any of the validation checks following the allocation fail, the allocated bitmap is not freed, resulting in a memory leak. To avoid this, the validation checks should be performed before the bitmap is allocated. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove some scripts in "stats" directoryPatrick Steinhardt2-96/+0
The "stats" directory contains a couple of scripts to do some statistics on a repository: - "git-common-hash" shows the longest common hash prefixes and can be used to determine the minimum prefix length to use for object names to be unique. The script has last been touched in 53474eb92ff (contrib: update stats/mailmap script, 2012-12-12) and searching for it on the internet doesn't really surface any potential use cases or even mentions of it. Modern Git also shouldn't really need this tool as it knows to automatically scale printed prefixes via some heuristics. - "mailmap.pl" performs some statistics on the number of mailmapped commits in a repository. It has last been modified in 53474eb92ff (contrib: update stats/mailmap script, 2012-12-12) and has since been bitrotting. It doesn't even compile nowadays anymore: $ perl contrib/stats/mailmap.pl Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 57. Type of arg 1 to keys must be hash or array (not hash element) at contrib/stats/mailmap.pl line 57, near "}) " Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 57. Type of arg 1 to keys must be hash or array (not private variable) at contrib/stats/mailmap.pl line 57, near "$h)" Experimental keys on scalar is now forbidden at contrib/stats/mailmap.pl line 64. Type of arg 1 to keys must be hash or array (not private variable) at contrib/stats/mailmap.pl line 64, near "$h)" Execution of contrib/stats/mailmap.pl aborted due to compilation errors. This should be good-enough signal to indicate that nobody is using this script at all anymore. - "packinfo.pl" takes the output from git-verify-pack(1) and performs some pretty printing thereof. On the one hand it reformats the output to be easier to read and provide some summaries. On the other hand it may also print filenames of blobs. We don't have any replacement for this tool. Ideally, we should move its functionality into git-verify-pack(1) itself. Remove the first two scripts, but retain "packinfo.pl". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "git-new-workdir"Patrick Steinhardt5-184/+0
The "git-new-workdir" command has been introduced to make it possible to have a separate working directory in a different place. The command thus predates git-worktree(1), which is what people use nowadays to create any such working directory. As such, the script doesn't really have much of a reason to exist nowadays anymore. It also doesn't seem like the script is still in use: the last time it has received an update was in e32afab7b03 (git-new-workdir: don't fail if the target directory is empty, 2014-11-26), more than a decade ago. Remove it as well as the tests that depend on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "emacs" directoryPatrick Steinhardt3-45/+0
While the "emacs/" directory still exists, all of its code has been replaced with stubs in 6d5ed4836db (git{,-blame}.el: remove old bitrotting Emacs code, 2018-04-11). Instead, the recommendation is to use Emacs' own vc-annotate mode. Remove the code altogether. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "git-resurrect.sh"Patrick Steinhardt1-181/+0
The "git-resurrect.sh" script can be used to find traces of a branch tip in the reflog and resurrect that branch. Despite a couple of global cleanups, the script hasn't seen any activity since it was introduced in e1ff064e1bf (contrib git-resurrect: find traces of a branch name and resurrect it, 2009-02-04). Furthermore, the tool does not work with the "reftable" backend at all as it directly reads ".git/logs/HEAD". As reflogs are stored as part of the individual tables though that file wouldn't exist in a "reftable"- enabled repository. Last but not least, the tool doesn't even work unless it is explicitly invoked via `git resurrect` as it sources "git-sh-setup". As none of our build systems know to install this script, users thus have to go out of their way to really make it work, which is highly unlikely. Another source that indicates that this tool can be removed is a question for how to restore deleted branches on StackOverflow [1]. The top-voted answer uses git-reflog(1) directly and has received more than 3000 votes to date. While "git-resurrect.sh" is also mentioned, it only got 16 upvotes, and comments mention the above caveat that users have to do some manual setup to make it work. It's thus rather clear that the tool doesn't have a lot or even any users. Remove it. [1]: https://stackoverflow.com/questions/3640764/can-i-recover-a-branch-after-its-deletion-in-git Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "persistent-https" remote helperPatrick Steinhardt7-875/+0
The "persistent-https" remote helper supposedly speeds up SSL operations by running a daemon that keeps a connection open to a remote server. It is effectively unmaintained nowadays: the last time it received an update was in accb613afd2 (contrib/persistent-https: use Git version for build label, 2016-07-20) and its parent commits to make it compile with Go 1.7+. This Go toolchain is somewhat dated by now though and unsupported. The oldest still-supported toolchain is Go 1.23, which was released in August 2024. It is not possible to compile the remote helper with that Go version anymore: $ go version go version go1.23.8 linux/amd64 $ make case $(go version) in \ "go version go"1.[0-5].*) EQ=" " ;; *) EQ="=" ;; esac && \ go build -o git-remote-persistent-https \ -ldflags "-X main._BUILD_EMBED_LABEL${EQ}GIT_VERSION=2.49.0.943.g965a70ebf62" go: cannot find main module, but found .git/config in /home/pks/Development/git to create a module there, run: cd ../.. && go mod init make: *** [Makefile:31: git-remote-persistent-https] Error 1 The problem is that modern Go toolchains require a "go.mod" file, but we don't have any such files. This requirement exists since quite a while already, so it's clear that nobody has tried to use this remote helper anytime recent. Remove the remote helper. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "mw-to-git"Patrick Steinhardt22-3911/+0
The "mw-to-git" directory contains tools for accessing MediaWiki via Git. The scripts are essentially unmaintained in Git: despite a couple of global cleanups, the last changes were a couple of security-related issues part of 9a8606465e8 (remote-mediawiki: use "sh" to eliminate unquoted commands, 2020-09-21) and its parents. We don't ever run any of the tests so it is more likely than not that many of the tests have been bitrotting, like e.g. documented in f8ab018dafc (remote-mediawiki tests: annotate failing tests, 2020-09-21). According to Matthieu Moy [1], one of the original developers of this tool, it didn't receive any attention recently and there is no motivation to keep maintaining it anymore in the community. The project has been spun out of Git [2] and thus has a new official home, but did not receive much attention over there, either. As such, it seems like the MediaWiki transport helper is slowly fading away. But given that there is a new home, it doesn't make sense to have it as part of Git anymore only to let it rot. Remove the directory. [1]: <108f297a-b415-4742-80e4-51ea02af18e9@matthieu-moy.fr> [2]: https://github.com/Git-Mediawiki/Git-Mediawiki Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "hooks" directoryPatrick Steinhardt5-1443/+0
The "hooks" directory contains a handful of example hooks. Most of these hooks are highly specific and haven't really received any updates over the last couple of years, except for some global cleanups. The multimail hook has also been removed in f74d11471fa (multimail: stop shipping a copy, 2021-06-10) in favor of its upstream project [1]. Remove those hooks. If we want to provide examples for how to use Git hooks we should do that as part of our documentation, for example in githooks(5). [1]: https://github.com/git-multimail/git-multimail Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "thunderbird-patch-inline"Patrick Steinhardt2-75/+0
The "thunderbird-patch-inline" directory in "contrib/" contains a script to send patch files via Thunderbird. This script depends on the ExternalEditor extension [1], which seems to be effectively unmaintained with the last update being in 2008. While the extension has eventually been maintained in [2], that fork hasn't received any updates since 2020, either. As such, the ExternalEditor extension does not work with modern versions of Thunderbird anymore, and as the "thunderbird-patch-inline" script depends on the ExternalEditor extension it likely doesn't work anymore, either. The fact that this script hasn't been touched for the last 10 years outside of some global cleanup supports the idea that it is not useful anymore. Remove it. [1]: https://globs.org/articles.php?lng=en&pg=2 [2]: https://github.com/exteditor/exteditor/releases Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove remote-helper stubsPatrick Steinhardt3-37/+0
The "remote-helpers" directory contains two remote helper scripts for Mercurial and Bazaar. These scripts have since been converted into stubs in b2c851a8e67 (Revert "Merge branch 'jc/graduate-remote-hg-bzr' (early part)", 2014-05-20) as the helpers have been moved into their own upstream projects [1][2]. Given that these stubs have been created more than a decade ago it is very unlikely that anybody still tries to use them. Remove them. [1]: https://github.com/felipec/git-remote-bzr [1]: https://github.com/felipec/git-remote-hg Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "examples" directoryPatrick Steinhardt1-20/+0
The "examples" directory used to contain scripted versions of some of our builtins. These have all been removed in 49eb8d39c78 (Remove contrib/examples/*, 2018-03-25), but we left a note in the directory to make it discoverable that there used to be examples. It is unlikely that anybody still looks at these examples more than 7 years after they have been removed. Remove the note and its directory. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12contrib: remove "remotes2config.sh"Patrick Steinhardt1-33/+0
Remotes can be configured either via a repository's config or by using the ".git/branches/" or ".git/remotes/" directories. Back when the new config-based mechanism has been introduced we also introduced a helper script that migrates from the old-style remote configuration to the new config-based mechanism. With the recent removal announcement for the two directories we also started to instruct users to migrate repositories that still use these mechanism to use config-based remotes. Notably though, the migration path doesn't even use the migration script. Instead, git-remote(1) itself knows how to migrate any such remote via `git remote rename`. In fact, a full migration _cannot_ use the script as it only knows to migrate remotes from ".git/remotes/", but not ".git/branches/". As such, the migration path via `git remote rename` is the only feasible way to fully migrate repositories over to the new format. Last but not least, the script doesn't even work as-is as it sources "git-sh-setup". For this to work it would need to be invoked either via Git so that this script is in our PATH, users would have to manually call it with an adjusted PATH, or distributions need to install the script into "$prefix/libexec/git-core" with a "git-" prefix. All of these steps are unlikely enough to underpin the claim that this script is not used at all. So given that: - The script cannot perform a full migration of all deprecated remote types. - We don't advertise it anywhere. - It has been basically untouched since 2007. - It doesn't even work unless users do manual steps. It should be safe enough to just remove it. Do so. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12reftable: fix perf regression when reading blocks of unwanted typePatrick Steinhardt4-17/+19
In fd888311fbc (reftable/table: move reading block into block reader, 2025-04-07), we have refactored how reftable blocks are read so that most of the logic is contained in the "block.c" subsystem itself. Most importantly, the whole logic to read the data itself is now contained in that subsystem. This change caused a significant performance regression though when reading blocks that aren't of the specific type one is searching for: Benchmark 1: update-ref: create 100k refs (revision = fd888311fbc~) Time (mean ± σ): 2.171 s ± 0.028 s [User: 1.189 s, System: 0.977 s] Range (min … max): 2.117 s … 2.206 s 10 runs Benchmark 2: update-ref: create 100k refs (revision = fd888311fbc) Time (mean ± σ): 3.418 s ± 0.030 s [User: 2.371 s, System: 1.037 s] Range (min … max): 3.377 s … 3.473 s 10 runs Summary update-ref: create 100k refs (revision = fd888311fbc~) ran 1.57 ± 0.02 times faster than update-ref: create 100k refs (revision = fd888311fbc) The root caute of the performance regression is that we changed when exactly blocks of an uninteresting type are being discarded. Previous to the refactoring in the mentioned commit we'd load the block data, read its type, notice that it's not the wanted type and discard the block. After the commit though we don't discard the block immediately, but we fully decode it only to realize that it's not the desired type. We then discard the block again, but have already performed a bunch of pointless work. Fix the regression by making `reftable_block_init()` return early in case the block is not of the desired type. This fixes the performance hit: Benchmark 1: update-ref: create 100k refs (revision = HEAD~) Time (mean ± σ): 2.712 s ± 0.018 s [User: 1.990 s, System: 0.716 s] Range (min … max): 2.682 s … 2.741 s 10 runs Benchmark 2: update-ref: create 100k refs (revision = HEAD) Time (mean ± σ): 1.670 s ± 0.012 s [User: 0.991 s, System: 0.676 s] Range (min … max): 1.652 s … 1.693 s 10 runs Summary update-ref: create 100k refs (revision = HEAD) ran 1.62 ± 0.02 times faster than update-ref: create 100k refs (revision = HEAD~) Note that the baseline performance is lower than in the original due to a couple of unrelated performance improvements that have landed since the original commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12builtin/am: fix memory leak in `split_mail_stgit_series`Lidong Yan1-1/+3
In builtin/am.c:split_mail_stgit_series, if `fopen` failed, `series_dir_buf` allocated by `xstrdup` will leak. Add `free` in `!fp` if branch will prevent the leak. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12t1001: replace 'test -f' with 'test_path_is_file'Rodrigo Carvalho1-1/+1
'test_path_is_file' is a modern path checking method in Git's development. Replace the basic shell command 'test -f' with this approach. Signed-off-by: Rodrigo Carvalho <rodrigorsdc@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12git-var doc: fix usage of $ENV_VAR vs ENV_VARJean-Noël Avila1-21/+19
When refering to environment variables in the documentation, use the ENV_VARIABLE format instead of $ENV_VARIABLE. The latter is used in the documentation to refer to the actual value of the variable, not the name of the variable. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12git-verify-* doc: update mark-up of synopsis option descriptionsJunio C Hamano3-34/+26
To unify mark-up used in our documentation to a newer convention, started by 22293895 (doc: apply synopsis simplification on git-clone and git-init, 2024-09-24), update the documentation pages for 'git verify-commit', 'git verify-tag', and 'git verify-pack' to * use [synopsis], not [verse] in the SYNOPSIS section * enclose `--option=<value>` in backquotes * do not describe non-option arguments in the OPTIONS section Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12git-{var,write-tree} docs: update mark-up of synopsis option descriptionsJunio C Hamano2-12/+12
To unify mark-up used in our documentation to a newer convention, started by 22293895 (doc: apply synopsis simplification on git-clone and git-init, 2024-09-24), update the documentation for 'git var' and 'git write-tree' to * use [synopsis], not [verse] in the SYNOPSIS section * enclose `--option=<value>` in backquotes Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12git-daemon doc: update mark-up of synopsis option descriptionsJunio C Hamano1-90/+91
To unify mark-up used in our documentation to a newer convention, started by 22293895 (doc: apply synopsis simplification on git-clone and git-init, 2024-09-24), update the documentation of 'git daemon' to * use [synopsis], not [verse] in the SYNOPSIS section * enclose `--option=<value>` in backquotes Also, split '--[no-]option' into '--option' and '--no-option' to make it easier to grep for them. Signed-off-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12reftable/writer: fix memory leak when `writer_index_hash()` failsLidong Yan1-1/+3
In reftable/writer.c:writer_index_hash(), if `reftable_buf_add` failed, key allocated by `reftable_malloc` will not be insert into `obj_index_tree` thus leaks. Simple add reftable_free(key) will solve this problem. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12reftable/writer: fix memory leak when `padded_write()` failsLidong Yan1-1/+3
In reftable/writer.c:padded_write(), if w->writer failed, zeroed allocated in `reftable_calloc` will leak. w->writer could be `reftable_write_data` in reftable/stack.c, and could fail due to some write error. Simply add reftable_free(zeroed) will solve this problem. Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-10gitk: Legacy widgets doesn't have comboboxYOKOTA Hiroshi1-4/+2
Use "proc makedroplist" function to support combobox on legacy widgets mode. "proc makedroplist" uses "ttk::combobox" for themed mode, and uses "tk_optionMenu" for legacy mode to get rid of the problem. Signed-off-by: YOKOTA Hiroshi <yokota.hgml@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-09Makefile: avoid constant rebuilds with compilation databasebrian m. carlson1-1/+1
Many contributors to software use a Language Server Protocol implementation to allow their editor to learn structural information about the code they write and provide additional features, such as jumping to the declaration or definition of a function or type. In C, the usual implementation is clangd, which requires compiling with clang. Because C and C++ projects lack a standard file system layout and build system, unlike languages such as Rust and Go, clangd requires a compilation database to be generated by the clang compiler in order to pass the proper compilation flags and discover all of the files necessary to make the LSP work. This is done by setting GENERATE_COMPILATION_DATABASE to "yes". However, when that's enabled and the user runs "make" a second time, all of the files are re-compiled, which is inconvenient for contributors to Git, since it makes small changes or rebases recompile the entirety of the codebase. This happens because the directory holding the compilation database is updated anytime an object is built, so its modification date will always be newer than the first object built. To solve this, use the same trick we do just above for the .depend directory and filter the compilation database directory out if it already exists, which avoids making it a target to be built. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-09sequencer: rework reflog message handlingPhillip Wood2-27/+34
It has been reported that "git rebase --rebase-merges" can create corrupted reflog entries like e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441) This is due to a use-after-free bug that happens because reflog_message() uses a static `struct strbuf` and is not called to update the current reflog message stored in `ctx->reflog_message` when creating the merge. This means `ctx->reflog_message` points to a stale reflog message that has been freed by subsequent call to reflog_message() by a command such as `reset` that used the return value directly rather than storing the result in `ctx->reflog_message`. Fix this by creating the reflog message nearer to where the commit is created and storing it in a local variable which is passed as an additional parameter to run_git_commit() rather than storing the message in `struct replay_ctx`. This makes it harder to forget to call `reflog_message()` before creating a commit and using a variable with a narrower scope means that a stale value cannot carried across a from one iteration of the loop to the next which should prevent any similar use-after-free bugs in the future. A existing test is modified to demonstrate that merges are now created with the correct reflog message. Reported-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-09sequencer: move reflog message functionsPhillip Wood1-33/+33
In the next commit these functions will be called from pick_one_commit() so move them above that function to avoid a forward declaration. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-09Merge branch 'master' of https://github.com/j6t/gitkJunio C Hamano3-126/+1553
* 'master' of https://github.com/j6t/gitk: gitk: add Tamil translation gitk: limit PATH search to bare executable names gitk: _search_exe is no longer needed gitk: override $PATH search only on Windows gitk: adjust indentation to match the style used in this script
2025-05-09Merge branch 'master' of https://github.com/j6t/git-guiJunio C Hamano6-2732/+23
* 'master' of https://github.com/j6t/git-gui: git-gui: treat the message template file as a built file git-gui: heed core.commentChar/commentString git-gui: po/README: update repository location and maintainer
2025-05-09Merge branch 'js/po-update-workflow'Johannes Sixt4-2731/+12
* js/po-update-workflow: git-gui: treat the message template file as a built file git-gui: po/README: update repository location and maintainer Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-09Merge branch 'at/translation-tamil'Johannes Sixt2-0/+1458
* at/translation-tamil: gitk: add Tamil translation Signed-off-by: Johannes Sixt <j6t@kdbg.org>
2025-05-08The twelfth batchJunio C Hamano1-0/+10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-08Merge branch 'js/diff-codeql-false-positive-workaround'Junio C Hamano1-1/+1
Work around false positive given by CodeQL. * js/diff-codeql-false-positive-workaround: diff: check range before dereferencing an array element
2025-05-08Merge branch 'ps/mv-contradiction-fix'Junio C Hamano2-7/+81
"git mv a a/b dst" would ask to move the directory 'a' itself, as well as its contents, in a single destination directory, which is a contradicting request that is impossible to satisfy. This case is now detected and the command errors out. * ps/mv-contradiction-fix: builtin/mv: convert assert(3p) into `BUG()` builtin/mv: bail out when trying to move child and its parent
2025-05-08Merge branch 'en/hashmap-clear-fix'Junio C Hamano1-2/+3
hashmap API clean-up to ensure hashmap_clear() leaves a cleared map in a reusable state. * en/hashmap-clear-fix: hashmap: ensure hashmaps are reusable after hashmap_clear()
2025-05-08docs: add credential helper for outlook and gmail in OAuth list of helpersAditya Garg1-0/+4
This commit adds the `git-credential-outlook` and `git-credential-gmail` helpers to the list of OAuth helpers. Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-08docs: improve send-email documentationAditya Garg1-8/+59
OAuth2.0 is a new authentication method that is being used by many email providers, including Outlook and Gmail. Recently, the Authen::SASL perl module has been updated to support OAuth2.0 authentication, thus making the git-send-email script be able to use this authentication method as well. So lets improve the documentation to reflect this change. I also had a hard time finding a reliable OAuth2.0 access token generator for Outlook and Gmail. So I added a link to the such generators which I developed myself after seaching through lots of code and API documentation to make things easier for others. Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-08send-mail: improve checks for valid_fqdnAditya Garg1-1/+3
The current implementation of a valid Fully Qualified Domain Name is not that strict. It just checks whether it has a dot (.) and if using macOS, it should not end with .local. As per RFC1035[1], from what I understood, the following checks need to be done: - The domain must contain atleast one dot - Each label (separated by dots) must be 1-63 characters long - Labels must start and end with an alphanumeric character - Labels can contain alphanumeric characters and hyphens Here are some examples of valid and invalid labels: 'example.com', # Valid 'sub.example.com', # Valid 'my-domain.org', # Valid 'localhost', # Invalid (no dot) 'MacBook..', # Invalid (double dots) '-example.com', # Invalid (starts with a hyphen) 'example-.com', # Invalid (ends with a hyphen) 'example..com', # Invalid (double dots) 'example', # Invalid (no TLD) 'example.local', # Invalid on macOS 'valid-domain.co.uk', # Valid '123.example.com', # Valid 'example.com.', # Invalid (trailing dot) 'toolonglabeltoolonglabeltoolonglabeltoolonglabeltoolonglabeltoolonglabel.com', # Invalid (label > 63 chars) Due to current implementation, I was not able to send emails from Ubuntu. Upon debugging, I found that the SMTP domain being passed to Outlook's servers was not valid. Net::SMTP=GLOB(0x5db4351225f8)>>> EHLO MacBook.. Net::SMTP=GLOB(0x5db4351225f8)<<< 501 5.5.4 Invalid domain name Net::SMTP=GLOB(0x5db4351225f8)>>> HELO MacBook.. Notice that an invalid domain name "MacBook.." is sent by git-send-email. We have a fallback code that checks output from Net::Domain::domainname() or asking domain method of an Net::SMTP instance to detect a misconfigured hostname and replace it with fallback "localhost.localdomain", but the valid_fqdn apparently is failing to say "MacBook.." is not a valid fqdn. With this patch, the rule used in valid_fqdn is tightened, the beginning part of the SMTP exchange looked like this: Net::SMTP=GLOB(0x58c8af71e930)>>> EHLO localhost.localdomain Net::SMTP=GLOB(0x58c8af71e930)<<< 250-PN4P287CA0064.outlook.office365.com Hello [1]: https://datatracker.ietf.org/doc/html/rfc1035 Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-08meson: allow customize perl installation pathĐoàn Trần Công Danh10-10/+19
Some distros, notably Fedora, want to install non-core Perl libraries into specific directory, namely /usr/share/perl5/vendor_perl. The Makefile build system allows this by overriding perllibdir variable, let's make meson works on par with our Makefile. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07scalar reconfigure: add --maintenance=<mode> optionDerrick Stolee3-6/+47
When users want to enable the latest and greatest configuration options recommended by Scalar after a Git upgrade, 'scalar reconfigure --all' is a great option that iterates over all repos in the multi-valued 'scalar.repos' config key. However, this feature previously forced users to enable background maintenance. In some environments this is not preferred. Add a new --maintenance=<mode> option to 'scalar reconfigure' that provides options for enabling (default), disabling, or leaving background maintenance config as-is. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-07scalar clone: add --no-maintenance optionDerrick Stolee3-5/+22
When creating a new enlistment via 'scalar clone', the default is to set up situations that work for most user scenarios. Background maintenance is one of those highly-recommended options for most users. However, when using 'scalar clone' to create an enlistment in a different situation, such as prepping a VM image, it may be valuable to disable background maintenance so the manual maintenance steps do not get blocked by concurrent background maintenance activities. Add a new --no-maintenance option to 'scalar clone'. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>