aboutsummaryrefslogtreecommitdiffstats
path: root/t/helper
AgeCommit message (Collapse)AuthorFilesLines
2022-09-01Merge branch 'sg/parse-options-subcommand'Junio C Hamano4-1/+132
Introduce the "subcommand" mode to parse-options API and update the command line parser of Git commands with subcommands. * sg/parse-options-subcommand: (23 commits) remote: run "remote rm" argv through parse_options() maintenance: add parse-options boilerplate for subcommands pass subcommand "prefix" arguments to parse_options() builtin/worktree.c: let parse-options parse subcommands builtin/stash.c: let parse-options parse subcommands builtin/sparse-checkout.c: let parse-options parse subcommands builtin/remote.c: let parse-options parse subcommands builtin/reflog.c: let parse-options parse subcommands builtin/notes.c: let parse-options parse subcommands builtin/multi-pack-index.c: let parse-options parse subcommands builtin/hook.c: let parse-options parse subcommands builtin/gc.c: let parse-options parse 'git maintenance's subcommands builtin/commit-graph.c: let parse-options parse subcommands builtin/bundle.c: let parse-options parse subcommands parse-options: add support for parsing subcommands parse-options: drop leading space from '--git-completion-helper' output parse-options: clarify the limitations of PARSE_OPT_NODASH parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options api-parse-options.txt: fix description of OPT_CMDMODE t0040-parse-options: test parse_options() with various 'parse_opt_flags' ...
2022-08-29Merge branch 'mt/rot13-in-c'Junio C Hamano3-0/+384
Test portability improvements. * mt/rot13-in-c: tests: use the new C rot13-filter helper to avoid PERL prereq t0021: implementation the rot13-filter.pl script in C t0021: avoid grepping for a Perl-specific string at filter output
2022-08-19parse-options: add support for parsing subcommandsSZEDER Gábor3-0/+63
Several Git commands have subcommands to implement mutually exclusive "operation modes", and they usually parse their subcommand argument with a bunch of if-else if statements. Teach parse-options to handle subcommands as well, which will result in shorter and simpler code with consistent error handling and error messages on unknown or missing subcommand, and it will also make possible for our Bash completion script to handle subcommands programmatically. The approach is guided by the following observations: - Most subcommands [1] are implemented in dedicated functions, and most of those functions [2] either have a signature matching the 'int cmd_foo(int argc, const char **argc, const char *prefix)' signature of builtin commands or can be trivially converted to that signature, because they miss only that last prefix parameter or have no parameters at all. - Subcommand arguments only have long form, and they have no double dash prefix, no negated form, and no description, and they don't take any arguments, and can't be abbreviated. - There must be exactly one subcommand among the arguments, or zero if the command has a default operation mode. - All arguments following the subcommand are considered to be arguments of the subcommand, and, conversely, arguments meant for the subcommand may not preceed the subcommand. So in the end subcommand declaration and parsing would look something like this: parse_opt_subcommand_fn *fn = NULL; struct option builtin_commit_graph_options[] = { OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"), N_("the object directory to store the graph")), OPT_SUBCOMMAND("verify", &fn, graph_verify), OPT_SUBCOMMAND("write", &fn, graph_write), OPT_END(), }; argc = parse_options(argc, argv, prefix, options, builtin_commit_graph_usage, 0); return fn(argc, argv, prefix); Here each OPT_SUBCOMMAND specifies the name of the subcommand and the function implementing it, and the address of the same 'fn' subcommand function pointer. parse_options() then processes the arguments until it finds the first argument matching one of the subcommands, sets 'fn' to the function associated with that subcommand, and returns, leaving the rest of the arguments unprocessed. If none of the listed subcommands is found among the arguments, parse_options() will show usage and abort. If a command has a default operation mode, 'fn' should be initialized to the function implementing that mode, and parse_options() should be invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag. In this case parse_options() won't error out when not finding any subcommands, but will return leaving 'fn' unchanged. Note that if that default operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT flag is necessary as well (otherwise parse_options() would error out upon seeing the unknown option meant to the default operation mode). Some thoughts about the implementation: - The same pointer to 'fn' must be specified as 'value' for each OPT_SUBCOMMAND, because there can be only one set of mutually exclusive subcommands; parse_options() will BUG() otherwise. There are other ways to tell parse_options() where to put the function associated with the subcommand given on the command line, but I didn't like them: - Change parse_options()'s signature by adding a pointer to subcommand function to be set to the function associated with the given subcommand, affecting all callsites, even those that don't have subcommands. - Introduce a specific parse_options_and_subcommand() variant with that extra funcion parameter. - I decided against automatically calling the subcommand function from within parse_options(), because: - There are commands that have to perform additional actions after option parsing but before calling the function implementing the specified subcommand. - The return code of the subcommand is usually the return code of the git command, but preserving the return code of the automatically called subcommand function would have made the API awkward. - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an option flag: we have two subcommands that are purposefully excluded from completion ('git remote rm' and 'git stash save'), so they'll have to be specified with the PARSE_OPT_NOCOMPLETE flag. - Some of the 'parse_opt_flags' don't make sense with subcommands, and using them is probably just an oversight or misunderstanding. Therefore parse_options() will BUG() when invoked with any of the following flags while the options array contains at least one OPT_SUBCOMMAND: - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing arguments when encountering a "--" argument, so it doesn't make sense to expect and keep one before a subcommand, because it would prevent the parsing of the subcommand. However, this flag is allowed in combination with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash might be meaningful for the command's default operation mode, e.g. to disambiguate refs and pathspecs. - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag tells parse_options() to stop as soon as it encouners a non-option argument, but subcommands are by definition not options... so how could they be parsed, then?! - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any unknown --options and then pass them to a different command or subsystem. Surely if a command has subcommands, then this functionality should rather be delegated to one of those subcommands, and not performed by the command itself. However, this flag is allowed in combination with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass --options to the default operation mode. - If the command with subcommands has a default operation mode, then all arguments to the command must preceed the arguments of the subcommand. AFAICT we don't have any commands where this makes a difference, because in those commands either only the command accepts any arguments ('notes' and 'remote'), or only the default subcommand ('reflog' and 'stash'), but never both. - The 'argv' array passed to subcommand functions currently starts with the name of the subcommand. Keep this behavior. AFAICT no subcommand functions depend on the actual content of 'argv[0]', but the parse_options() call handling their options expects that the options start at argv[1]. - To support handling subcommands programmatically in our Bash completion script, 'git cmd --git-completion-helper' will now list both subcommands and regular --options, if any. This means that the completion script will have to separate subcommands (i.e. words without a double dash prefix) from --options on its own, but that's rather easy to do, and it's not much work either, because the number of subcommands a command might have is rather low, and those commands accept only a single --option or none at all. An alternative would be to introduce a separate option that lists only subcommands, but then the completion script would need not one but two git invocations and command substitutions for commands with subcommands. Note that this change doesn't affect the behavior of our Bash completion script, because when completing the --option of a command with subcommands, e.g. for 'git notes --<TAB>', then all subcommands will be filtered out anyway, as none of them will match the word to be completed starting with that double dash prefix. [1] Except 'git rerere', because many of its subcommands are implemented in the bodies of the if-else if statements parsing the command's subcommand argument. [2] Except 'credential', 'credential-store' and 'fsmonitor--daemon', because some of the functions implementing their subcommands take special parameters. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --optionsSZEDER Gábor2-4/+4
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown arguments instead of erroring out". This is a bit misleading, as this flag only applies to unknown --options, while non-option arguments are kept even without this flag. Update the description to clarify this, and rename the flag to PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at the flag name. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19t0040-parse-options: test parse_options() with various 'parse_opt_flags'SZEDER Gábor3-0/+68
In 't0040-parse-options.sh' we thoroughly test the parsing of all types and forms of options, but in all those tests parse_options() is always invoked with a 0 flags parameter. Add a few tests to demonstrate how various 'enum parse_opt_flags' values are supposed to influence option parsing. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-14t0021: implementation the rot13-filter.pl script in CMatheus Tavares3-0/+384
This script is currently used by three test files: t0021-conversion.sh, t2080-parallel-checkout-basics.sh, and t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL dependency at these tests, let's convert the script to a C test-tool command. The following commit will take care of actually modifying the said tests to use the new C helper and removing the Perl script. The Perl script flushes the log file handler after each write. As commented in [1], this seems to be an early design decision that was later reconsidered, but possibly ended up being left in the code by accident: >> +$debug->flush(); > > Isn't $debug flushed automatically? Maybe, but autoflush is not explicitly enabled. I will enable it again (I disabled it because of Eric's comment but I re-read the comment and he is only talking about pipes). Anyways, this behavior is not really needed for the tests and the flush() calls make the code slightly larger, so let's avoid them altogether in the new C version. [1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/ Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12Merge branch 'ab/plug-revisions-leak'Junio C Hamano1-2/+0
Plug a bit more leaks in the revisions API. * ab/plug-revisions-leak: revisions API: don't leak memory on argv elements that need free()-ing bisect.c: partially fix bisect_rev_setup() memory leak log: refactor "rev.pending" code in cmd_show() log: fix a memory leak in "git show <revision>..." test-fast-rebase helper: use release_revisions() (again) bisect.c: add missing "goto" for release_revisions()
2022-08-03Merge branch 'rs/mergesort'Junio C Hamano1-27/+7
Make our mergesort implementation type-safe. * rs/mergesort: mergesort: remove llist_mergesort() packfile: use DEFINE_LIST_SORT fetch-pack: use DEFINE_LIST_SORT commit: use DEFINE_LIST_SORT blame: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT_DEBUG mergesort: add macros for typed sort of linked lists mergesort: tighten merge loop mergesort: unify ranks loops
2022-08-03test-fast-rebase helper: use release_revisions() (again)Ævar Arnfjörð Bjarmason1-2/+0
Fix a bug in 0139c58ab95 (revisions API users: add "goto cleanup" for release_revisions(), 2022-04-13), in that commit a release_revisions() call was added to this function, but it never did anything due to this TODO memset() added in fe1a21d5267 (fast-rebase: demonstrate merge-ort's API via new test-tool command, 2020-10-29). Simply removing the memset() will fix the "cmdline" which can be seen when running t5520-pull.sh. This sort of thing could be detected automatically with a rule similar to the unused.cocci merged in 7fa60d2a5b6 (Merge branch 'ab/cocci-unused' into next, 2022-07-11). The following rule on top would catch the case being fixed here: @@ type T; identifier I; identifier REL1 =~ "^[a-z_]*_(release|reset|clear|free)$"; identifier REL2 =~ "^(release|clear|free)_[a-z_]*$"; @@ - memset(\( I \| &I \), 0, ...); ... when != \( I \| &I \) ( \( REL1 \| REL2 \)( \( I \| &I \), ...); | \( REL1 \| REL2 \)( \( &I \| I \) ); ) ... when != \( I \| &I \) That rule should arguably use only &I, not I (as we might be passed a pointer). The distinction would matter if anyone cared about the side-effects of a memset() followed by release() of a pointer to a variable passed into the function. As such a pattern would be at best very confusing, and most likely point to buggy code as in this case, the above rule is probably fine as-is. But as this rule only found one such bug in the entire codebase let's not add it to contrib/coccinelle/unused.cocci for now, we can always dig it up in the future if it's deemed useful. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17test-mergesort: use DEFINE_LIST_SORTRené Scharfe1-12/+3
Build a typed sort function for the mergesort performance test tool using DEFINE_LIST_SORT instead of calling llist_mergesort(). This gets rid of the next pointer accessor functions and improves the performance at the cost of a slightly higher object text size. Before: 0071.12: llist_mergesort() unsorted 0.24(0.22+0.01) 0071.14: llist_mergesort() sorted 0.12(0.10+0.01) 0071.16: llist_mergesort() reversed 0.12(0.10+0.01) __TEXT __DATA __OBJC others dec hex 6407 276 0 24701 31384 7a98 t/helper/test-mergesort.o With this patch: 0071.12: DEFINE_LIST_SORT unsorted 0.22(0.21+0.01) 0071.14: DEFINE_LIST_SORT sorted 0.11(0.10+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.11(0.10+0.01) __TEXT __DATA __OBJC others dec hex 6615 276 0 25832 32723 7fd3 t/helper/test-mergesort.o Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-17test-mergesort: use DEFINE_LIST_SORT_DEBUGRené Scharfe1-15/+4
Define a typed sort function using DEFINE_LIST_SORT_DEBUG for the mergesort sanity check instead of using llist_mergesort(). This gets rid of the next pointer accessor functions and improves the performance at the cost of slightly bigger object text. Before: Benchmark 1: t/helper/test-tool mergesort test Time (mean ± σ): 108.4 ms ± 0.2 ms [User: 106.7 ms, System: 1.2 ms] Range (min … max): 108.0 ms … 108.8 ms 27 runs __TEXT __DATA __OBJC others dec hex 6251 276 0 23172 29699 7403 t/helper/test-mergesort.o With this patch: Benchmark 1: t/helper/test-tool mergesort test Time (mean ± σ): 94.0 ms ± 0.2 ms [User: 92.4 ms, System: 1.1 ms] Range (min … max): 93.7 ms … 94.5 ms 31 runs __TEXT __DATA __OBJC others dec hex 6407 276 0 24701 31384 7a98 t/helper/test-mergesort.o Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool delta: fix a memory leakÆvar Arnfjörð Bjarmason1-7/+14
Fix a memory leak introduced in a310d434946 ([PATCH] Deltification library work by Nicolas Pitre., 2005-05-19), as a result we can mark another test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool ref-store: fix a memory leakÆvar Arnfjörð Bjarmason1-0/+1
Fix a memory leak introduced in fa099d23227 (worktree.c: kill parse_ref() in favor of refs_resolve_ref_unsafe(), 2017-04-24), as a result we can mark another test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool bloom: fix memory leaksÆvar Arnfjörð Bjarmason1-0/+2
Fix memory leaks introduced with these tests in f1294eaf7fb (bloom.c: introduce core Bloom filter constructs, 2020-03-30), as a result we can mark almost the entirety of t0095-bloom.sh as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true", there's still an unrelated memory leak in "git commit" in one of the tests, let's skip that one under SANITIZE_LEAK for now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool json-writer: fix memory leaksÆvar Arnfjörð Bjarmason1-4/+12
Fix memory leaks introduced with these tests in 75459410edd (json_writer: new routines to create JSON data, 2018-07-13), as a result we can mark a test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool regex: call regfree(), fix memory leaksÆvar Arnfjörð Bjarmason1-3/+6
Fix memory leaks in "test-tool regex" which have been there since c91841594c2 (test-regex: Add a test to check for a bug in the regex routines, 2012-09-01), as a result we can mark a test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". We could regfree() on the die() paths here, which would make some invocations of valgrind(1) happy, but let's just target SANITIZE=leak for now. Variables that are still reachable when we die() are not reported as leaks. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool urlmatch-normalization: fix a memory leakÆvar Arnfjörð Bjarmason1-3/+8
Fix a memory leak in "test-tool urlmatch-normalization", as a result we can mark the corresponding test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool {dump,scrap}-cache-tree: fix memory leaksÆvar Arnfjörð Bjarmason2-1/+7
Fix memory leaks in two test-tools used by t0090-cache-tree.sh. As a result we can mark the test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool path-utils: fix a memory leakÆvar Arnfjörð Bjarmason1-4/+7
Fix a memory leak in "test-tool path-utils", as a result we can mark the corresponding test as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01test-tool test-hash: fix a memory leakÆvar Arnfjörð Bjarmason1-0/+1
Fix a memory leak in "test-tool test-hash" which has been there since b57cbbf8a86 (test-sha1: test hashing large buffer, 2006-06-24), as a result we can mark more tests as passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-17Merge branch 'jk/bug-fl-va-list-fix'Junio C Hamano1-2/+2
Fix buggy va_list usage in recent code. * jk/bug-fl-va-list-fix: bug_fl(): correctly initialize trace2 va_list
2022-06-16bug_fl(): correctly initialize trace2 va_listJeff King1-2/+2
The code added 0cc05b044f (usage.c: add a non-fatal bug() function to go with BUG(), 2022-06-02) sets up two va_list variables: one to output to stderr, and one to trace2. But the order of initialization is wrong: va_list ap, cp; va_copy(cp, ap); va_start(ap, fmt); We copy the contents of "ap" into "cp" before it is initialized, meaning it is full of garbage. The two should be swapped. However, there's another bug, noticed by Johannes Schindelin: we forget to call va_end() for the copy. So instead of just fixing the copy's initialization, let's do two separate start/end pairs. This is allowed by the standard, and we don't need to use copy here since we have access to the original varargs. Matching the pairs with the calls makes it more obvious that everything is being done correctly. Note that we do call bug_fl() in the tests, but it didn't trigger this problem because our format string doesn't have any placeholders. So even though we were passing a garbage va_list through the stack, nobody ever needed to look at it. We can easily adjust one of the trace2 tests to trigger this, both for bug() and for BUG(). The latter isn't broken, but it's nice to exercise both a bit more. Without the fix in this patch (but with the test change), the bug() case causes a segfault. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13Merge branch 'ab/hooks-regression-fix'Junio C Hamano1-3/+19
In Git 2.36 we revamped the way how hooks are invoked. One change that is end-user visible is that the output of a hook is no longer directly connected to the standard output of "git" that spawns the hook, which was noticed post release. This is getting corrected. * ab/hooks-regression-fix: hook API: fix v2.36.0 regression: hooks should be connected to a TTY run-command: add an "ungroup" option to run_process_parallel()
2022-06-10Merge branch 'ab/bug-if-bug'Junio C Hamano1-2/+27
A new bug() and BUG_if_bug() API is introduced to make it easier to uniformly log "detect multiple bugs and abort in the end" pattern. * ab/bug-if-bug: cache-tree.c: use bug() and BUG_if_bug() receive-pack: use bug() and BUG_if_bug() parse-options.c: use optbug() instead of BUG() "opts" check parse-options.c: use new bug() API for optbug() usage.c: add a non-fatal bug() function to go with BUG() common-main.c: move non-trace2 exit() behavior out of trace2.c
2022-06-10Merge branch 'jh/builtin-fsmonitor-part3'Junio C Hamano4-0/+138
More fsmonitor--daemon. * jh/builtin-fsmonitor-part3: (30 commits) t7527: improve implicit shutdown testing in fsmonitor--daemon fsmonitor--daemon: allow --super-prefix argument t7527: test Unicode NFC/NFD handling on MacOS t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd t/helper/hexdump: add helper to print hexdump of stdin fsmonitor: on macOS also emit NFC spelling for NFD pathname t7527: test FSMonitor on case insensitive+preserving file system fsmonitor: never set CE_FSMONITOR_VALID on submodules t/perf/p7527: add perf test for builtin FSMonitor t7527: FSMonitor tests for directory moves fsmonitor: optimize processing of directory events fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed fsm-health-win32: force shutdown daemon if worktree root moves fsm-health-win32: add polling framework to monitor daemon health fsmonitor--daemon: stub in health thread fsmonitor--daemon: rename listener thread related variables fsmonitor--daemon: prepare for adding health thread fsmonitor--daemon: cd out of worktree root fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS unpack-trees: initialize fsmonitor_has_run_once in o->result ...
2022-06-10Merge branch 'ab/env-array'Junio C Hamano1-1/+1
Rename .env_array member to .env in the child_process structure. * ab/env-array: run-command API users: use "env" not "env_array" in comments & names run-command API: rename "env_array" to "env"
2022-06-07Merge branch 'ab/plug-leak-in-revisions'Junio C Hamano2-7/+17
Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...
2022-06-07run-command: add an "ungroup" option to run_process_parallel()Ævar Arnfjörð Bjarmason1-3/+19
Extend the parallel execution API added in c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15) to support a mode where the stdout and stderr of the processes isn't captured and output in a deterministic order, instead we'll leave it to the kernel and stdio to sort it out. This gives the API same functionality as GNU parallel's --ungroup option. As we'll see in a subsequent commit the main reason to want this is to support stdout and stderr being connected to the TTY in the case of jobs=1, demonstrated here with GNU parallel: $ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2 TTY TTY $ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2 NTTY NTTY Another is as GNU parallel's documentation notes a potential for optimization. As demonstrated in next commit our results with "git hook run" will be similar, but generally speaking this shows that if you want to run processes in parallel where the exact order isn't important this can be a lot faster: $ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null ' Benchmark 1: parallel seq ::: 10000000 >/dev/null Time (mean ± σ): 220.2 ms ± 9.3 ms [User: 124.9 ms, System: 96.1 ms] Range (min … max): 212.3 ms … 230.5 ms 3 runs Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null Time (mean ± σ): 154.7 ms ± 0.9 ms [User: 136.2 ms, System: 25.1 ms] Range (min … max): 153.9 ms … 155.7 ms 3 runs Summary 'parallel --ungroup seq ::: 10000000 >/dev/null ' ran 1.42 ± 0.06 times faster than 'parallel seq ::: 10000000 >/dev/null ' A large part of the juggling in the API is to make the API safer for its maintenance and consumers alike. For the maintenance of the API we e.g. avoid malloc()-ing the "pp->pfd", ensuring that SANITIZE=address and other similar tools will catch any unexpected misuse. For API consumers we take pains to never pass the non-NULL "out" buffer to an API user that provided the "ungroup" option. The resulting code in t/helper/test-run-command.c isn't typical of such a user, i.e. they'd typically use one mode or the other, and would know whether they'd provided "ungroup" or not. We could also avoid the strbuf_init() for "buffered_output" by having "struct parallel_processes" use a static PARALLEL_PROCESSES_INIT initializer, but let's leave that cleanup for later. Using a global "run_processes_parallel_ungroup" variable to enable this option is rather nasty, but is being done here to produce as minimal of a change as possible for a subsequent regression fix. This change is extracted from a larger initial version[1] which ends up with a better end-state for the API, but in doing so needed to modify all existing callers of the API. Let's defer that for now, and narrowly focus on what we need for fixing the regression in the subsequent commit. It's safe to do this with a global variable because: A) hook.c is the only user of it that sets it to non-zero, and before we'll get any other API users we'll refactor away this method of passing in the option, i.e. re-roll [1]. B) Even if hook.c wasn't the only user we don't have callers of this API that concurrently invoke this parallel process starting API itself in parallel. As noted above "A" && "B" are rather nasty, and we don't want to live with those caveats long-term, but for now they should be an acceptable compromise. 1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-02run-command API: rename "env_array" to "env"Ævar Arnfjörð Bjarmason1-1/+1
Start following-up on the rename mentioned in c7c4bdeccf3 (run-command API: remove "env" member, always use "env_array", 2021-11-25) of "env_array" to "env". The "env_array" name was picked in 19a583dc39e (run-command: add env_array, an optional argv_array for env, 2014-10-19) because "env" was taken. Let's not forever keep the oddity of "*_array" for this "struct strvec", but not for its "args" sibling. This commit is almost entirely made with a coccinelle rule[1]. The only manual change here is in run-command.h to rename the struct member itself and to change "env_array" to "env" in the CHILD_PROCESS_INIT initializer. The rest of this is all a result of applying [1]: * make contrib/coccinelle/run_command.cocci.patch * patch -p1 <contrib/coccinelle/run_command.cocci.patch * git add -u 1. cat contrib/coccinelle/run_command.pending.cocci @@ struct child_process E; @@ - E.env_array + E.env @@ struct child_process *E; @@ - E->env_array + E->env I've avoided changing any comments and derived variable names here, that will all be done in the next commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-02usage.c: add a non-fatal bug() function to go with BUG()Ævar Arnfjörð Bjarmason1-2/+27
Add a bug() function to use in cases where we'd like to indicate a runtime BUG(), but would like to defer the BUG() call because we're possibly accumulating more bug() callers to exhaustively indicate what went wrong. We already have this sort of facility in various parts of the codebase, just in the form of ad-hoc re-inventions of the functionality that this new API provides. E.g. this will be used to replace optbug() in parse-options.c, and the 'error("BUG:[...]' we do in a loop in builtin/receive-pack.c. Unlike the code this replaces we'll log to trace2 with this new bug() function (as with other usage.c functions, including BUG()), we'll also be able to avoid calls to xstrfmt() in some cases, as the bug() function itself accepts variadic sprintf()-like arguments. Any caller to bug() can follow up such calls with BUG_if_bug(), which will BUG() out (i.e. abort()) if there were any preceding calls to bug(), callers can also decide not to call BUG_if_bug() and leave the resulting BUG() invocation until exit() time. There are currently no bug() API users that don't call BUG_if_bug() themselves after a for-loop, but allowing for not calling BUG_if_bug() keeps the API flexible. As the tests and documentation here show we'll catch missing BUG_if_bug() invocations in our exit() wrapper. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26t/helper/hexdump: add helper to print hexdump of stdinJeff Hostetler3-0/+32
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26t/helper/fsmonitor-client: create stress testJeff Hostetler1-0/+106
Create a stress test to hammer on the fsmonitor daemon. Create a client-side thread pool of n threads and have each of them make m requests as fast as they can. We do not currently inspect the contents of the response. We're only interested in placing a heavy request load on the daemon. This test is useful for interactive testing and various experimentation. For example, to place additional load on the daemon while another test is running. We currently do not have a test script that actually uses this helper. We might add such a test in the future. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26t/helper: add 'pack-mtimes' test-toolTaylor Blau3-0/+58
In the next patch, we will implement and test support for writing a cruft pack via a special mode of `git pack-objects`. To make sure that objects are written with the correct timestamps, and a new test-tool that can dump the object names and corresponding timestamps from a given `.mtimes` file. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13revisions API users: add "goto cleanup" for release_revisions()Ævar Arnfjörð Bjarmason1-5/+13
Add a release_revisions() to various users of "struct rev_info" which requires a minor refactoring to a "goto cleanup" pattern to use that function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13revisions API users: add straightforward release_revisions()Ævar Arnfjörð Bjarmason1-0/+1
Add a release_revisions() to various users of "struct rev_list" in those straightforward cases where we only need to add the release_revisions() call to the end of a block, and don't need to e.g. refactor anything to use a "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13t/helper/test-fast-rebase.c: don't leak "struct strbuf"Ævar Arnfjörð Bjarmason1-2/+3
Fix a memory leak that's been with us since f9500261e0a (fast-rebase: write conflict state to working tree, index, and HEAD, 2021-05-20) changed this code to move these strbuf_release() into an if/else block. We'll also add to "reflog_msg" in the "else" arm of the "if" block being modified here, and we'll append to "branch_msg" in both cases. But after f9500261e0a only the "if" block would free these two "struct strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-04Merge branch 'jh/builtin-fsmonitor-part2'Junio C Hamano4-0/+133
Built-in fsmonitor (part 2). * jh/builtin-fsmonitor-part2: (30 commits) t7527: test status with untracked-cache and fsmonitor--daemon fsmonitor: force update index after large responses fsmonitor--daemon: use a cookie file to sync with file system fsmonitor--daemon: periodically truncate list of modified files t/perf/p7519: add fsmonitor--daemon test cases t/perf/p7519: speed up test on Windows t/perf/p7519: fix coding style t/helper/test-chmtime: skip directories on Windows t/perf: avoid copying builtin fsmonitor files into test repo t7527: create test for fsmonitor--daemon t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon help: include fsmonitor--daemon feature flag in version info fsmonitor--daemon: implement handle_client callback compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows fsmonitor--daemon: create token-based changed path cache fsmonitor--daemon: define token-ids fsmonitor--daemon: add pathname classification fsmonitor--daemon: implement 'start' command ...
2022-03-25t/helper/test-chmtime: skip directories on WindowsJeff Hostetler1-0/+15
Teach `test-tool.exe chmtime` to ignore errors when setting the mtime on a directory on Windows. NEEDSWORK: The Windows version of `utime()` (aka `mingw_utime()`) does not properly handle directories because it uses `_wopen()`. It should be converted to using `CreateFileW()` and backup semantics at a minimum. Since I'm already in the middle of a large patch series, I did not want to destabilize other callers of `utime()` right now. The problem has only been observed in the t/perf/p7519 test when the test repo contains an empty directory on disk. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25t/helper/fsmonitor-client: create IPC client to talk to FSMonitor DaemonJeff Hostetler3-0/+118
Create an IPC client to send query and flush commands to the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-23Merge branch 'ep/remove-duplicated-includes'Junio C Hamano1-1/+0
Code clean-up. * ep/remove-duplicated-includes: attr.h: remove duplicate struct definition t/helper/test-run-command.c: delete duplicate include builtin/stash.c: delete duplicate include builtin/sparse-checkout.c: delete duplicate include builtin/gc.c: delete duplicate include attr.c: delete duplicate include
2022-03-16Merge branch 'ab/string-list-count-in-size-t'Junio C Hamano1-3/+4
Count string_list items in size_t, not "unsigned int". * ab/string-list-count-in-size-t: string-list API: change "nr" and "alloc" to "size_t" gettext API users: don't explicitly cast ngettext()'s "n"
2022-03-16Merge branch 'ds/commit-graph-gen-v2-fixes'Junio C Hamano1-0/+13
Fixes to the way generation number v2 in the commit-graph files are (not) handled. * ds/commit-graph-gen-v2-fixes: commit-graph: declare bankruptcy on GDAT chunks commit-graph: fix generation number v2 overflow values commit-graph: start parsing generation v2 (again) commit-graph: fix ordering bug in generation numbers t5318: extract helpers to lib-commit-graph.sh test-read-graph: include extra post-parse info
2022-03-13t/helper/test-run-command.c: delete duplicate includeElia Pinto1-1/+0
parse-options.h is included more than once. Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-07string-list API: change "nr" and "alloc" to "size_t"Ævar Arnfjörð Bjarmason1-3/+4
Change the "nr" and "alloc" members of "struct string_list" to use "size_t" instead of "nr". On some platforms the size of an "unsigned int" will be smaller than a "size_t", e.g. a 32 bit unsigned v.s. 64 bit unsigned. As "struct string_list" is a generic API we use in a lot of places this might cause overflows. As one example: code in "refs.c" keeps track of the number of refs with a "size_t", and auxiliary code in builtin/remote.c in get_ref_states() appends those to a "struct string_list". While we're at it split the "nr" and "alloc" in string-list.h across two lines, which is the case for most such struct member declarations (e.g. in "strbuf.h" and "strvec.h"). Changing e.g. "int i" to "size_t i" in run_and_feed_hook() isn't strictly necessary, and there are a lot more cases where we'll use a local "int", "unsigned int" etc. variable derived from the "nr" in the "struct string_list". But in that case as well as add_wrapped_shortlog_msg() in builtin/shortlog.c we need to adjust the printf format referring to "nr" anyway, so let's also change the other variables referring to it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-06Merge branch 'ac/usage-string-fixups'Junio C Hamano1-3/+3
Usage-string normalization. * ac/usage-string-fixups: amend remaining usage strings according to style guide
2022-03-01test-read-graph: include extra post-parse infoDerrick Stolee1-0/+13
It can be helpful to verify that the 'struct commit_graph' that results from parsing a commit-graph is correctly structured. The existence of different chunks is not enough to verify that all of the optional features are correctly enabled. Update 'test-tool read-graph' to output an "options:" line that includes information for different parts of the struct commit_graph. In particular, this change demonstrates that the read_generation_data option is never being enabled, which will be fixed in a later change. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-25Merge branch 'ab/date-mode-release'Junio C Hamano1-1/+4
Plug (some) memory leaks around parse_date_format(). * ab/date-mode-release: date API: add and use a date_mode_release() date API: add basic API docs date API: provide and use a DATE_MODE_INIT date API: create a date.h, split from cache.h cache.h: remove always unused show_date_human() declaration
2022-02-25Merge branch 'ab/only-single-progress-at-once'Junio C Hamano1-13/+37
Further tweaks on progress API. * ab/only-single-progress-at-once: pack-bitmap-write.c: don't return without stop_progress() progress API: unify stop_progress{,_msg}(), fix trace2 bug progress.c: refactor stop_progress{,_msg}() to use helpers progress.c: use dereferenced "progress" variable, not "(*p_progress)" progress.h: format and be consistent with progress.c naming progress.c tests: test some invalid usage progress.c tests: make start/stop commands on stdin progress.c test helper: add missing braces leak tests: fix a memory leak in "test-progress" helper
2022-02-23amend remaining usage strings according to style guideAbhradeep Chakraborty1-3/+3
Usage strings for git (sub)command flags has a style guide that suggests - first letter should not capitalized (unless required) and it should skip full-stop at the end of line. But there are some files where usage-strings do not follow the above mentioned guide. Amend the usage strings that don't follow the style convention/guide. Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16Merge branch 'hn/reftable-coverity-fixes'Junio C Hamano1-4/+5
Problems identified by Coverity in the reftable code have been corrected. * hn/reftable-coverity-fixes: reftable: add print functions to the record types reftable: make reftable_record a tagged union reftable: remove outdated file reftable.c reftable: implement record equality generically reftable: make reftable-record.h function signatures const correct reftable: handle null refnames in reftable_ref_record_equal reftable: drop stray printf in readwrite_test reftable: order unittests by complexity reftable: all xxx_free() functions accept NULL arguments reftable: fix resource warning reftable: ignore remove() return value in stack_test.c reftable: check reftable_stack_auto_compact() return value reftable: fix resource leak blocksource.c reftable: fix resource leak in block.c error path reftable: fix OOB stack write in print functions
2022-02-16date API: add and use a date_mode_release()Ævar Arnfjörð Bjarmason1-0/+2
Fix a memory leak in the parse_date_format() function by providing a new date_mode_release() companion function. By using this in "t/helper/test-date.c" we can mark the "t0006-date.sh" test as passing when git is compiled with SANITIZE=leak, and whitelist it to run under "GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding "TEST_PASSES_SANITIZE_LEAK=true" to the test itself. The other tests that expose this memory leak (i.e. take the "mode->type == DATE_STRFTIME" branch in parse_date_format()) are "t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an easily fixed leak in "ref-filter.c", and brings the failures in "t6300-for-each-ref.sh" down from 51 to 48. Fixing the remaining leaks will have to wait until there's a release_revisions() in "revision.c", as they have to do with leaks via "struct rev_info". There is also a leak in "builtin/blame.c" due to its call to parse_date_format() to parse the "blame.date" configuration. However as it declares a file-level "static struct date_mode blame_date_mode" to track the data, LSAN will not report it as a leak. It's possible to get valgrind(1) to complain about it with e.g.: valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md But let's focus on things LSAN complains about, and are thus observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to fixing memory leaks in "builtin/blame.c", but as doing so would require some re-arrangement of cmd_blame() let's leave it for some other time. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16date API: provide and use a DATE_MODE_INITÆvar Arnfjörð Bjarmason1-1/+1
Provide and use a DATE_MODE_INIT macro. Most of the users of struct date_mode" use it via pretty.h's "struct pretty_print_context" which doesn't have an initialization macro, so we're still bound to being initialized to "{ 0 }" by default. But we can change the couple of callers that directly declared a variable on the stack to instead use the initializer, and thus do away with the "mode.local = 0" added in add00ba2de9 (date: make "local" orthogonal to date format, 2015-09-03). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16date API: create a date.h, split from cache.hÆvar Arnfjörð Bjarmason1-0/+1
Move the declaration of the date.c functions from cache.h, and adjust the relevant users to include the new date.h header. The show_ident_date() function belonged in pretty.h (it's defined in pretty.c), its two users outside of pretty.c didn't strictly need to include pretty.h, as they get it indirectly, but let's add it to them anyway. Similarly, the change to "builtin/{fast-import,show-branch,tag}.c" isn't needed as far as the compiler is concerned, but since they all use the "DATE_MODE()" macro we now define in date.h, let's have them include it. We could simply include this new header in "cache.h", but as this change shows these functions weren't common enough to warrant including in it in the first place. By moving them out of cache.h changes to this API will no longer cause a (mostly) full re-build of the project when "make" is run. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-11Merge branch 'ab/no-errno-from-resolve-ref-unsafe'Junio C Hamano1-2/+1
Remaining code-clean-up. * ab/no-errno-from-resolve-ref-unsafe: refs API: remove "failure_errno" from refs_resolve_ref_unsafe() sequencer: don't use die_errno() on refs_resolve_ref_unsafe() failure
2022-02-11Merge branch 'bc/csprng-mktemps'Junio C Hamano3-0/+31
Pick a better random number generator and use it when we prepare temporary filenames. * bc/csprng-mktemps: wrapper: use a CSPRNG to generate random file names wrapper: add a helper to generate numbers from a CSPRNG
2022-02-03progress.c tests: make start/stop commands on stdinÆvar Arnfjörð Bjarmason1-11/+33
Change the usage of the "test-tool progress" introduced in 2bb74b53a49 (Test the progress display, 2019-09-16) to take command like "start" and "stop" on stdin, instead of running them implicitly. This makes for tests that are easier to read, since the recipe will mirror the API usage, and allows for easily testing invalid usage that would yield (or should yield) a BUG(), e.g. providing two "start" calls in a row. A subsequent commit will add such tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-03progress.c test helper: add missing bracesÆvar Arnfjörð Bjarmason1-2/+3
If we have braces on one arm of an if/else all of them should have it, per the CodingGuidelines's "When there are multiple arms to a conditional[...]" advice. This formatting change makes a subsequent commit smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-03leak tests: fix a memory leak in "test-progress" helperÆvar Arnfjörð Bjarmason1-0/+1
Fix a memory leak in the test-progress helper, and mark the corresponding "t0500-progress-display.sh" test as being leak-free under SANITIZE=leak. This fixes a leak added in 2bb74b53a4 (Test the progress display, 2019-09-16). My 48f68715b14 (tr2: stop leaking "thread_name" memory, 2021-08-27) had fixed another memory leak in this test (as it did some trace2 testing). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-26refs API: remove "failure_errno" from refs_resolve_ref_unsafe()Ævar Arnfjörð Bjarmason1-2/+1
Remove the now-unused "failure_errno" parameter from the refs_resolve_ref_unsafe() signature. In my recent 96f6623ada0 (Merge branch 'ab/refs-errno-cleanup', 2021-11-29) series we made all of its callers explicitly request the errno via an output parameter. As that series shows all but one caller ended up passing in a boilerplate "ignore_errno", since they only cared about whether the return value was NULL or not, i.e. if the ref could be resolved. There was one small issue with that series fixed with a follow-up in 31e39123695 (Merge branch 'ab/refs-errno-cleanup', 2022-01-14) a small bug in that series was fixed. After those two there was one caller left in sequencer.c that used the "failure_errno', but as of the preceding commit it uses a boilerplate "ignore_errno" instead. This leaves the public refs API without any use of "failure_errno" at all. We could still do with a bit of cleanup and generalization between refs.c and refs/files-backend.c before the "reftable" integration lands, but that's all internal to the reference code itself. So let's remove this output parameter. Not only isn't it used now, but it's unlikely that we'll want it again in the future. We'd like to slowly move the refs API to a more file-backend independent way of communicating error codes, having it use a "failure_errno" was only the first step in that direction. If this or any other function needs to communicate what specifically is wrong with the requested "refname" it'll be better to have the function set some output enum of well-defined error states than piggy-backend on "errno". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-20reftable: order unittests by complexityHan-Wen Nienhuys1-4/+5
This is a more practical ordering when working on refactorings of the reftable code. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-17wrapper: add a helper to generate numbers from a CSPRNGbrian m. carlson3-0/+31
There are many situations in which having access to a cryptographically secure pseudorandom number generator (CSPRNG) is helpful. In the future, we'll encounter one of these when dealing with temporary files. To make this possible, let's add a function which reads from a system CSPRNG and returns some bytes. We know that all systems will have such an interface. A CSPRNG is required for a secure TLS or SSH implementation and a Git implementation which provided neither would be of little practical use. In addition, POSIX is set to standardize getentropy(2) in the next version, so in the (potentially distant) future we can rely on that. For systems which lack one of the other interfaces, we provide the ability to use OpenSSL's CSPRNG. OpenSSL is highly portable and functions on practically every known OS, and we know it will have access to some source of cryptographically secure randomness. We also provide support for the arc4random in libbsd for folks who would prefer to use that. Because this is a security sensitive interface, we take some precautions. We either succeed by filling the buffer completely as we requested, or we fail. We don't return partial data because the caller will almost never find that to be a useful behavior. Specify a makefile knob which users can use to specify one or more suitable CSPRNGs, and turn the multiple string options into a set of defines, since we cannot match on strings in the preprocessor. We allow multiple options to make the job of handling this in autoconf easier. The order of options is important here. On systems with arc4random, which is most of the BSDs, we use that, since, except on MirBSD and macOS, it uses ChaCha20, which is extremely fast, and sits entirely in userspace, avoiding a system call. We then prefer getrandom over getentropy, because the former has been available longer on Linux, and then OpenSSL. Finally, if none of those are available, we use /dev/urandom, because most Unix-like operating systems provide that API. We prefer options that don't involve device files when possible because those work in some restricted environments where device files may not be available. Set the configuration variables appropriately for Linux and the BSDs, including macOS, as well as Windows and NonStop. We specifically only consider versions which receive publicly available security support here. For the same reason, we don't specify getrandom(2) on Linux, because CentOS 7 doesn't support it in glibc (although its kernel does) and we don't want to resort to making syscalls. Finally, add a test helper to allow this to be tested by hand and in tests. We don't add any tests, since invoking the CSPRNG is not likely to produce interesting, reproducible results. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-12Merge branch 'ma/windows-dynload-fix'Junio C Hamano1-1/+3
Fix calling dynamically loaded functions on Windows. * ma/windows-dynload-fix: lazyload: use correct calling conventions
2022-01-10Merge branch 'ds/fetch-pull-with-sparse-index'Junio C Hamano1-54/+10
"git fetch" and "git pull" are now declared sparse-index clean. Also "git ls-files" learns the "--sparse" option to help debugging. * ds/fetch-pull-with-sparse-index: test-read-cache: remove --table, --expand options t1091/t3705: remove 'test-tool read-cache --table' t1092: replace 'read-cache --table' with 'ls-files --sparse' ls-files: add --sparse option fetch/pull: use the sparse index
2022-01-10Merge branch 'hn/test-ref-store-show-hash-algo'Junio C Hamano1-4/+5
Debugging support for refs API. * hn/test-ref-store-show-hash-algo: test-ref-store: print hash algorithm
2022-01-09lazyload: use correct calling conventionsMatthias Aßhauer1-1/+3
Christoph Reiter reported on the Git for Windows issue tracker[1], that mingw_strftime() imports strftime() from ucrtbase.dll with the wrong calling convention. It should be __cdecl instead of WINAPI, which we always use in DECLARE_PROC_ADDR(). The MSYS2 project encountered cmake sefaults on x86 Windows caused by the same issue in the cmake source. [2] There are no known git crashes that where caused by this, yet, but we should try to prevent them. We import two other non-WINAPI functions via DECLARE_PROC_ADDR(), too. * NtSetSystemInformation() (NTAPI) * GetUserNameExW() (SEC_ENTRY) NTAPI, SEC_ENTRY and WINAPI are all ususally defined as __stdcall, but there are circumstances where they're defined differently. Teach DECLARE_PROC_ADDR() about calling conventions and be explicit about when we want to use which calling convention. Import winnt.h for the definition of NTAPI and sspi.h for SEC_ENTRY near their respective only users. [1] https://github.com/git-for-windows/git/issues/3560 [2] https://github.com/msys2/MINGW-packages/issues/10152 Reported-By: Christoph Reiter <reiter.christoph@gmail.com> Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-22Merge branch 'ab/common-main-cleanup'Junio C Hamano1-2/+3
Code clean-up. * ab/common-main-cleanup: common-main.c: call exit(), don't return
2021-12-22test-read-cache: remove --table, --expand optionsDerrick Stolee1-54/+10
This commit effectively reverts 2782db3 (test-tool: don't force full index, 2021-03-30) and e2df6c3 (test-read-cache: print cache entries with --table, 2021-03-30) to remove the --table and --expand options from 'test-tool read-cache'. The previous changes already removed these options from the test suite in favor of 'git ls-files --sparse'. The initial thought of creating these options was to allow for tests to see additional information with every cache entry. In particular, the object type is still not mirrored in 'git ls-files'. Since sparse directory entries always end with a slash, the object type is not critical to verify the sparse index is enabled. It was thought that it would be helpful to have additional information, such as flags, but that was not needed for the FS Monitor integration and hasn't been needed since. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-21test-ref-store: print hash algorithmHan-Wen Nienhuys1-4/+5
This provides a better error message in case SHA256 was inadvertently switched on through the environment. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15Merge branch 'hn/allow-bogus-oid-in-ref-tests'Junio C Hamano1-8/+65
The test helper for refs subsystem learned to write bogus and/or nonexistent object name to refs to simulate error situations we want to test Git in. * hn/allow-bogus-oid-in-ref-tests: t1430: create valid symrefs using test-helper t1430: remove refs using test-tool refs: introduce REF_SKIP_REFNAME_VERIFICATION flag refs: introduce REF_SKIP_OID_VERIFICATION flag refs: update comment. test-ref-store: plug memory leak in cmd_delete_refs test-ref-store: parse symbolic flag constants test-ref-store: remove force-create argument for create-reflog
2021-12-15Merge branch 'hn/reflog-tests'Junio C Hamano1-3/+3
Prepare tests on ref API to help testing reftable backends. * hn/reflog-tests: refs/debug: trim trailing LF from reflog message test-ref-store: tweaks to for-each-reflog-ent format t1405: check for_each_reflog_ent_reverse() more thoroughly test-ref-store: don't add newline to reflog message show-branch: show reflog message
2021-12-15Merge branch 'ab/run-command'Junio C Hamano2-5/+6
API clean-up. * ab/run-command: run-command API: remove "env" member, always use "env_array" difftool: use "env_array" to simplify memory management run-command API: remove "argv" member, always use "args" run-command API users: use strvec_push(), not argv construction run-command API users: use strvec_pushl(), not argv construction run-command tests: use strvec_pushv(), not argv assignment run-command API users: use strvec_pushv(), not argv assignment upload-archive: use regular "struct child_process" pattern worktree: stop being overly intimate with run_command() internals
2021-12-15Merge branch 'hn/reftable'Junio C Hamano3-1/+26
The "reftable" backend for the refs API, without integrating into the refs subsystem, has been added. * hn/reftable: Add "test-tool dump-reftable" command. reftable: add dump utility reftable: implement stack, a mutable database of reftable files. reftable: implement refname validation reftable: add merged table view reftable: add a heap-based priority queue for reftable records reftable: reftable file level tests reftable: read reftable files reftable: generic interface to tables reftable: write reftable files reftable: a generic binary tree implementation reftable: reading/writing blocks Provide zlib's uncompress2 from compat/zlib-compat.c reftable: (de)serialization for the polymorphic record type. reftable: add blocksource, an abstraction for random access reads reftable: utility functions reftable: add error related functionality reftable: add LICENSE hash.h: provide constants for the hash IDs
2021-12-10Merge branch 'hn/create-reflog-simplify'Junio C Hamano1-2/+1
A small simplification of API. * hn/create-reflog-simplify: refs: drop force_create argument of create_reflog API
2021-12-10Merge branch 'vd/sparse-sparsity-fix-on-read'Junio C Hamano1-2/+3
Ensure that the sparseness of the in-core index matches the index.sparse configuration specified by the repository immediately after the on-disk index file is read. * vd/sparse-sparsity-fix-on-read: sparse-index: update do_read_index to ensure correct sparsity sparse-index: add ensure_correct_sparsity function sparse-index: avoid unnecessary cache tree clearing test-read-cache.c: prepare_repo_settings after config init
2021-12-07refs: introduce REF_SKIP_REFNAME_VERIFICATION flagHan-Wen Nienhuys1-0/+1
Use this flag with the test-helper in t1430, to avoid direct writes to the ref database. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07refs: introduce REF_SKIP_OID_VERIFICATION flagHan-Wen Nienhuys1-0/+1
This lets the ref-store test helper write non-existent or unparsable objects into the ref storage. Use this to make t1006 and t3800 independent of the files storage backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07test-ref-store: plug memory leak in cmd_delete_refsHan-Wen Nienhuys1-1/+4
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07test-ref-store: parse symbolic flag constantsHan-Wen Nienhuys1-7/+59
This lets tests use REF_XXXX constants instead of hardcoded integers. The flag names should be separated by a ','. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07test-ref-store: remove force-create argument for create-reflogHan-Wen Nienhuys1-1/+1
Nobody uses force_create=0, so this flag is unnecessary. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07Merge branch 'hn/create-reflog-simplify' into hn/reftable-coverity-fixesJunio C Hamano1-2/+1
* hn/create-reflog-simplify: refs: drop force_create argument of create_reflog API
2021-12-07Merge branch 'hn/reftable' into hn/reftable-coverity-fixesJunio C Hamano3-1/+26
* hn/reftable: Add "test-tool dump-reftable" command. reftable: add dump utility reftable: implement stack, a mutable database of reftable files. reftable: implement refname validation reftable: add merged table view reftable: add a heap-based priority queue for reftable records reftable: reftable file level tests reftable: read reftable files reftable: generic interface to tables reftable: write reftable files reftable: a generic binary tree implementation reftable: reading/writing blocks Provide zlib's uncompress2 from compat/zlib-compat.c reftable: (de)serialization for the polymorphic record type. reftable: add blocksource, an abstraction for random access reads reftable: utility functions reftable: add error related functionality reftable: add LICENSE hash.h: provide constants for the hash IDs
2021-12-07common-main.c: call exit(), don't returnÆvar Arnfjörð Bjarmason1-2/+3
Change the main() function to call "exit()" instead of ending with a "return" statement. The "exit()" function is our own wrapper that calls trace2_cmd_exit_fl() for us, from git-compat-util.h: #define exit(code) exit(trace2_cmd_exit_fl(__FILE__, __LINE__, (code))) That "exit()" wrapper has been in use ever since ee4512ed481 (trace2: create new combined trace facility, 2019-02-22). This changes nothing about how we "exit()", as we'd invoke "trace2_cmd_exit_fl()" in both cases due to the wrapper, this change makes it easier to reason about this code, as we're now always obviously relying on our "exit()" wrapper. There is already code immediately downstream of our "main()" which has a hard reliance on that, e.g. the various "exit()" calls downstream of "cmd_main()" in "git.c". We even had a comment in "t/helper/test-trace2.c" that seemed to be confused about how the "exit()" wrapper interacted with uses of "return", even though it was introduced in the same trace2 series in a15860dca3f (trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh, 2019-02-22), after the aforementioned ee4512ed481. Perhaps it pre-dated the "exit()" wrapper? This change makes the "trace2_cmd_exit()" macro orphaned, we now always use "trace2_cmd_exit_fl()" directly, but let's keep that simpler example in place. Even if we're unlikely to get another "main()" other than the one in our "common-main.c", there's some value in having the API documentation and example discuss a simpler version that doesn't require an "exit()" wrapper macro. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-02test-ref-store: tweaks to for-each-reflog-ent formatHan-Wen Nienhuys1-2/+3
We have some tests that read from files in .git/logs/ hierarchy when checking if correct reflog entries are created, but that is too specific to the files backend. Other backends like reftable may not store its reflog entries in such a "one line per entry" format. Update for-each-reflog-ent test helper to produce output that is identical to lines in a reflog file files backend uses. That way, (1) the current tests can be updated to use the test helper to read the reflog entries instead of (parts of) reflog files, and perform the same inspection for correctness, and (2) when the ref backend is swapped to another backend, the updated test can be used as-is to check the correctness. Adapt t1400 to use the for-each-reflog-ent test helper. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-02test-ref-store: don't add newline to reflog messageHan-Wen Nienhuys1-3/+2
By convention, reflog messages always end in '\n', so before we would print blank lines between entries. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-29Merge branch 'mc/clean-smudge-with-llp64'Junio C Hamano1-4/+17
The clean/smudge conversion code path has been prepared to better work on platforms where ulong is narrower than size_t. * mc/clean-smudge-with-llp64: clean/smudge: allow clean filters to process extremely large files odb: guard against data loss checking out a huge file git-compat-util: introduce more size_t helpers odb: teach read_blob_entry to use size_t t1051: introduce a smudge filter test for extremely large files test-lib: add prerequisite for 64-bit platforms test-tool genzeros: generate large amounts of data more efficiently test-genzeros: allow more than 2G zeros in Windows
2021-11-29Merge branch 'tb/plug-pack-bitmap-leaks'Junio C Hamano1-1/+2
Leakfix. * tb/plug-pack-bitmap-leaks: pack-bitmap.c: more aggressively free in free_bitmap_index() pack-bitmap.c: don't leak type-level bitmaps midx.c: write MIDX filenames to strbuf builtin/multi-pack-index.c: don't leak concatenated options builtin/repack.c: avoid leaking child arguments builtin/pack-objects.c: don't leak memory via arguments t/helper/test-read-midx.c: free MIDX within read_midx_file() midx.c: don't leak MIDX from verify_midx_file midx.c: clean up chunkfile after reading the MIDX
2021-11-29Merge branch 'ab/refs-errno-cleanup'Junio C Hamano1-1/+2
The "remainder" of hn/refs-errno-cleanup topic. * ab/refs-errno-cleanup: (21 commits) refs API: post-migration API renaming [2/2] refs API: post-migration API renaming [1/2] refs API: don't expose "errno" in run_transaction_hook() refs API: make expand_ref() & repo_dwim_log() not set errno refs API: make resolve_ref_unsafe() not set errno refs API: make refs_ref_exists() not set errno refs API: make refs_resolve_refdup() not set errno refs tests: ignore ignore errno in test-ref-store helper refs API: ignore errno in worktree.c's find_shared_symref() refs API: ignore errno in worktree.c's add_head_info() refs API: make files_copy_or_rename_ref() et al not set errno refs API: make loose_fill_ref_dir() not set errno refs API: make resolve_gitlink_ref() not set errno refs API: remove refs_read_ref_full() wrapper refs/files: remove "name exist?" check in lock_ref_oid_basic() reflog tests: add --updateref tests refs API: make refs_rename_ref_available() static refs API: make parse_loose_ref_contents() not set errno refs API: make refs_read_raw_ref() not set errno refs API: add a version of refs_resolve_ref_unsafe() with "errno" ...
2021-11-25run-command tests: use strvec_pushv(), not argv assignmentÆvar Arnfjörð Bjarmason1-4/+5
As in the preceding commit change this API user to use strvec_pushv() instead of assigning to the "argv" member directly. This leaves us without test coverage of how the "argv" assignment in this API works, but we'll be removing it in a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25run-command API users: use strvec_pushv(), not argv assignmentÆvar Arnfjörð Bjarmason1-1/+1
Migrate those run-command API users that assign directly to the "argv" member to use a strvec_pushv() of "args" instead. In these cases it did not make sense to further refactor these callers, e.g. daemon.c could be made to construct the arguments closer to handle(), but that would require moving the construction from its cmd_main() and pass "argv" through two intermediate functions. It would be possible for a change like this to introduce a regression if we were doing: cp.argv = argv; argv[1] = "foo"; And changed the code, as is being done here, to: strvec_pushv(&cp.args, argv); argv[1] = "foo"; But as viewing this change with the "-W" flag reveals none of these functions modify variable that's being pushed afterwards in a way that would introduce such a logic error. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-24test-read-cache.c: prepare_repo_settings after config initVictoria Dye1-2/+3
Move `prepare_repo_settings` after the git directory has been set up in `test-read-cache.c`. The git directory settings must be initialized to properly assign repo settings using the worktree-level git config. Signed-off-by: Victoria Dye <vdye@github.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-22refs: drop force_create argument of create_reflog APIHan-Wen Nienhuys1-2/+1
There is only one caller, builtin/checkout.c, and it hardcodes force_create=1. This argument was introduced in abd0cd3a301 (refs: new public ref function: safe_create_reflog, 2015-07-21), which promised to immediately use it in a follow-on commit, but that never happened. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03test-tool genzeros: generate large amounts of data more efficientlyJohannes Schindelin1-2/+15
In this developer's tests, producing one gigabyte worth of NULs in a busy loop that writes out individual bytes, unbuffered, took ~27sec. Writing chunked 256kB buffers instead only took ~0.6sec This matters because we are about to introduce a pair of test cases that want to be able to produce 5GB of NULs, and we cannot use `/dev/zero` because of the HP NonStop platform's lack of support for that device. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03test-genzeros: allow more than 2G zeros in WindowsCarlo Marcelo Arenas Belón1-2/+2
d5cfd142ec (tests: teach the test-tool to generate NUL bytes and use it, 2019-02-14), add a way to generate zeroes in a portable way without using /dev/zero (needed by HP NonStop), but uses a long variable that is limited to 2^31 in Windows. Use instead a (POSIX/C99) intmax_t that is at least 64bit wide in 64-bit Windows to use in a future test. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27t/helper/test-read-midx.c: free MIDX within read_midx_file()Taylor Blau1-1/+2
When calling `read_midx_file()` to show information about a MIDX or list the objects contained within it we fail to call `close_midx()`, leaking the memory allocated to store that MIDX. Fix this by calling `close_midx()` before exiting the function. We can drop the "early" return when `show_objects` is non-zero, since the next instruction is also a return. (We could just as easily put a `cleanup` label here as with previous patches. But the only other time we terminate the function early is when we fail to load a MIDX in the first place. `close_midx()` does handle a NULL argument, but the extra complexity is likely not warranted). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-25Merge branch 'ab/mark-leak-free-tests-more'Junio C Hamano4-1/+15
Bunch of tests are marked as "passing leak check". * ab/mark-leak-free-tests-more: merge: add missing strbuf_release() ls-files: add missing string_list_clear() ls-files: fix a trivial dir_clear() leak tests: fix test-oid-array leak, test in SANITIZE=leak tests: fix a memory leak in test-oidtree.c tests: fix a memory leak in test-parse-options.c tests: fix a memory leak in test-prio-queue.c
2021-10-18Merge branch 'tb/repack-write-midx'Junio C Hamano1-1/+28
"git repack" has been taught to generate multi-pack reachability bitmaps. * tb/repack-write-midx: test-read-midx: fix leak of bitmap_index struct builtin/repack.c: pass `--refs-snapshot` when writing bitmaps builtin/repack.c: make largest pack preferred builtin/repack.c: support writing a MIDX while repacking builtin/repack.c: extract showing progress to a variable builtin/repack.c: rename variables that deal with non-kept packs builtin/repack.c: keep track of existing packs unconditionally midx: preliminary support for `--refs-snapshot` builtin/multi-pack-index.c: support `--stdin-packs` mode midx: expose `write_midx_file_only()` publicly
2021-10-18Merge branch 'rs/mergesort'Junio C Hamano1-5/+363
The mergesort implementation used to sort linked list has been optimized. * rs/mergesort: test-mergesort: use repeatable random numbers mergesort: use ranks stack p0071: test performance of llist_mergesort() p0071: measure sorting of already sorted and reversed files test-mergesort: add unriffle_skewed mode test-mergesort: add unriffle mode test-mergesort: add generate subcommand test-mergesort: add test subcommand test-mergesort: add sort subcommand test-mergesort: use strbuf_getline()
2021-10-16refs API: post-migration API renaming [2/2]Ævar Arnfjörð Bjarmason1-1/+1
Rename the transitory refs_werrres_ref_unsafe() function to refs_resolve_ref_unsafe(), now that all callers of the old function have learned to pass in a "failure_errno" parameter. The coccinelle semantic patch added in the preceding commit works, but I couldn't figure out how to get spatch(1) to re-flow these argument lists (and sometimes make lines way too long), so this rename was done with: perl -pi -e 's/refs_werrres_ref_unsafe/refs_resolve_ref_unsafe/g' \ $(git grep -l refs_werrres_ref_unsafe -- '*.c') But after that "make contrib/coccinelle/refs.cocci.patch" comes up empty, so the result would have been the same. Let's remove that transitory semantic patch file, we won't need to retain it for any other in-flight changes, refs_werrres_ref_unsafe() only existed within this patch series. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-16refs tests: ignore ignore errno in test-ref-store helperÆvar Arnfjörð Bjarmason1-2/+3
The cmd_resolve_ref() function has always ignored errno on failure, but let's do so explicitly when using the refs_resolve_ref_unsafe() function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13Merge branch 'jh/builtin-fsmonitor-part1'Junio C Hamano1-167/+66
Built-in fsmonitor (part 1). * jh/builtin-fsmonitor-part1: t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command run-command: create start_bg_command simple-ipc/ipc-win32: add Windows ACL to named pipe simple-ipc/ipc-win32: add trace2 debugging simple-ipc: move definition of ipc_active_state outside of ifdef simple-ipc: preparations for supporting binary messages. trace2: add trace2_child_ready() to report on background children
2021-10-11Merge branch 'ab/designated-initializers'Junio C Hamano1-2/+4
Code clean-up. * ab/designated-initializers: cbtree.h: define cb_init() in terms of CBTREE_INIT *.h: move some *_INIT to designated initializers *.h _INIT macros: don't specify fields equal to 0 *.[ch] *_INIT macros: use { 0 } for a "zero out" idiom submodule-config.h: remove unused SUBMODULE_INIT macro
2021-10-11Merge branch 'tb/midx-write-propagate-namehash'Junio C Hamano1-1/+9
"git multi-pack-index write --bitmap" learns to propagate the hashcache from original bitmap to resulting bitmap. * tb/midx-write-propagate-namehash: t5326: test propagating hashcache values p5326: generate pack bitmaps before writing the MIDX bitmap p5326: don't set core.multiPackIndex unnecessarily p5326: create missing 'perf-tag' tag midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps pack-bitmap.c: propagate namehash values from existing bitmaps t/helper/test-bitmap.c: add 'dump-hashes' mode
2021-10-08Add "test-tool dump-reftable" command.Han-Wen Nienhuys3-0/+7
This command dumps individual tables or a stack of of tables. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: implement stack, a mutable database of reftable files.Han-Wen Nienhuys1-0/+1
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: implement refname validationHan-Wen Nienhuys1-0/+1
The packed/loose format has restrictions on refnames: a and a/b cannot coexist. This limitation does not apply to reftable per se, but must be maintained for interoperability. This code adds validation routines to abort transactions that are trying to add invalid names. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: add merged table viewHan-Wen Nienhuys1-0/+1
This adds an abstract, read-only interface to the ref database. This primitive is used to construct the read view of the ref database (the read view is constructed by merging several *.ref files). It also provides the mechanism to provide a unified view of the refs in the main repository and the per-worktree refs. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: add a heap-based priority queue for reftable recordsHan-Wen Nienhuys1-0/+1
This is needed to create a merged view multiple reftables Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: reftable file level testsHan-Wen Nienhuys1-0/+1
With support for reading and writing files in place, we can construct files (in memory) and attempt to read them back. Because some sections of the format are optional (eg. indices, log entries), we have to exercise this code using multiple sizes of input data Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: a generic binary tree implementationHan-Wen Nienhuys1-0/+1
The reftable format includes support for an (OID => ref) map. This map can speed up visibility and reachability checks. In particular, various operations along the fetch/push path within Gerrit have ben sped up by using this structure. The map is constructed with help of a binary tree. Object IDs are hashes, so they are uniformly distributed. Hence, the tree does not attempt forced rebalancing. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: reading/writing blocksHan-Wen Nienhuys1-0/+1
The reftable format is structured as a sequence of block. Within a block, records are prefix compressed, with an index of offsets for fully expand keys to enable binary search within blocks. This commit provides the logic to read and write these blocks. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: (de)serialization for the polymorphic record type.Han-Wen Nienhuys1-1/+1
The reftable format is structured as a sequence of blocks, and each block contains a sequence of prefix-compressed key-value records. There are 4 types of records, and they have similarities in how they must be handled. This is achieved by introducing a polymorphic 'record' type that encapsulates ref, log, index and object records. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: utility functionsHan-Wen Nienhuys3-1/+12
This commit provides basic utility classes for the reftable library. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08test-mergesort: use repeatable random numbersRené Scharfe1-2/+10
Use MINSTD to generate pseudo-random numbers consistently instead of using rand(3), whose output can vary from system to system, and reset its seed before filling in the test values. This gives repeatable results across versions and systems, which simplifies sharing and comparing of results between developers. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07tests: fix test-oid-array leak, test in SANITIZE=leakÆvar Arnfjörð Bjarmason1-0/+4
Fix a trivial memory leak present ever since 38d905bf585 (sha1-array: add test-sha1-array and basic tests, 2014-10-01), now that that's fixed we can test this under GIT_TEST_PASSING_SANITIZE_LEAK=true. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07tests: fix a memory leak in test-oidtree.cÆvar Arnfjörð Bjarmason1-0/+3
Fix a memory leak in t/helper/test-oidtree.c, we were not freeing the "struct strbuf" we used for the stdin input we parsed. This leak has been here ever since 92d8ed8ac10 (oidtree: a crit-bit tree for odb_loose_cache, 2021-07-07). Now that it's fixed we can declare that t0069-oidtree.sh will pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07tests: fix a memory leak in test-parse-options.cÆvar Arnfjörð Bjarmason1-1/+6
Fix a memory leak in t/helper/test-parse-options.c, we were not freeing the allocated "struct string_list" or its items. Let's move the declaration of the "list" variable into the cmd__parse_options() and release it at the end. In c8ba1639165 (parse-options: add OPT_STRING_LIST helper, 2011-06-09) the "list" variable was added, and later on in c8ba1639165 (parse-options: add OPT_STRING_LIST helper, 2011-06-09) the "expect" was added. The "list" variable was last touched in 2721ce21e43 (use string_list initializer consistently, 2016-06-13), but it was still left at the static scope, it's better to move it to the function for consistency. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07tests: fix a memory leak in test-prio-queue.cÆvar Arnfjörð Bjarmason1-0/+2
Fix a memory leak in t/helper/test-prio-queue.c, the lack of freeing the memory with clear_prio_queue() has been there ever since this code was originally added in b4b594a3154 (prio-queue: priority queue of pointers to structs, 2013-06-06). By fixing this leak we can cleanly run t0009-prio-queue.sh under SANITIZE=leak, so annotate it as such with TEST_PASSES_SANITIZE_LEAK=true. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-07test-read-midx: fix leak of bitmap_index structJeff King1-2/+6
In read_midx_preferred_pack(), we open the bitmap index but never free it. This isn't a big deal since this is just a test helper, and we exit immediately after, but since we're trying to keep our leak-checking tidy now, it's worth fixing. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06Merge branch 'ab/repo-settings-cleanup'Junio C Hamano1-2/+4
Code cleanup. * ab/repo-settings-cleanup: repository.h: don't use a mix of int and bitfields repo-settings.c: simplify the setup read-cache & fetch-negotiator: check "enum" values in switch() environment.c: remove test-specific "ignore_untracked..." variable wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
2021-10-06Merge branch 'jt/add-submodule-odb-clean-up'Junio C Hamano1-3/+1
More code paths that use the hack to add submodule's object database to the set of alternate object store have been cleaned up. * jt/add-submodule-odb-clean-up: revision: remove "submodule" from opt struct repository: support unabsorbed in repo_submodule_init submodule: remove unnecessary unabsorbed fallback
2021-10-01test-mergesort: add unriffle_skewed modeRené Scharfe1-0/+28
Add a mode that turns a sorted list into adversarial input for a bottom-up mergesort implementation that doubles the length of sorted sublists at each level -- like our llist_mergesort(). While unriffle mode splits the list in half at each recursion step, unriffle_skewed splits it into 2^l items and the rest, with 2^l being the highest power of two smaller than the number of items and thus 2^l >= rest. The rest is unriffled with the tail of the first half to require a merge to compare the maximum number of elements. It complements the unriffle mode, which targets balanced merges. If the number of elements is a power of two then both actually produce the same result, as 2^l == rest == n/2 at each recursion step in that case. Here are the results: $ t/helper/test-tool mergesort test | awk ' $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict sawtooth unriffle_skewed 100 128 1184 700 589 OK sawtooth unriffle_skewed 1023 1024 16373 10230 9207 OK sawtooth unriffle 1024 1024 16384 10240 9217 OK sawtooth unriffle_skewed 1025 2048 18454 11275 10241 OK The sawtooth distribution with m>=n produces a sorted list and unriffle_skewed mode turns it into adversarial input for unbalanced merges, which it wins in all cases except for n=1024 -- the resulting list is the same, but unriffle is tested before unriffle_skewed, so its result is selected by the AWK script. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01test-mergesort: add unriffle modeRené Scharfe1-0/+29
Add a mode that turns sorted items into adversarial input for mergesort. Do that by running mergesort in reverse and rearranging the items in such a way that each merge needs the maximum number of operations to undo it. To riffle is a card shuffling technique and involves splitting a deck into two and then to interleave them. A perfect riffle takes one card from each half in turn. That's similar to the most expensive merge, which has to take one item from each sublist in turn, which requires the maximum number of comparisons (n-1). So unriffle does that in reverse, i.e. it generates the first sublist out of the items at even indexes and the second sublist out of the items at odd indexes, without changing their order in any other way. Done recursively until we reach the trivial sublist length of one, this twists the list into an order that requires the maximum effort for mergesort to untangle. As a baseline, here are the rand distributions with the highest number of comparisons from "test-tool mergesort test": $ t/helper/test-tool mergesort test | awk ' NR > 1 && $1 != "rand" {next} $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict rand copy 100 32 1184 700 569 OK rand reverse_1st_half 1023 256 16373 10230 8976 OK rand reverse_1st_half 1024 512 16384 10240 8993 OK rand dither 1025 64 18454 11275 9970 OK And here are the most expensive ones overall: $ t/helper/test-tool mergesort test | awk ' $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict stagger reverse 100 64 1184 700 580 OK sawtooth unriffle 1023 1024 16373 10230 9179 OK sawtooth unriffle 1024 1024 16384 10240 9217 OK stagger unriffle 1025 2048 18454 11275 10241 OK The sawtooth distribution with m>=n generates a sorted list. The unriffle mode is designed to turn that into adversarial input for mergesort, and that checks out for n=1023 and n=1024, where it produces the list that requires the most comparisons. Item counts that are not powers of two have other winners, and that's because unriffle recursively splits lists into equal-sized halves, while llist_mergesort() splits them into the biggest power of two smaller than n and the rest, e.g. for n=1025 it sorts the first 1024 separately and finally merges them to the last item. So unriffle mode works as designed for the intended use case, but to consistently generate adversarial input for unbalanced merges we need something else. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01test-mergesort: add generate subcommandRené Scharfe1-1/+59
Add a subcommand for printing test data. It can be used to generate special test cases and feed them into the sort subcommand or sort(1) for performance measurements. It may also be useful to illustrate the effect of distributions, modes and their parameters. It generates n integers with the specified distribution and its distribution-specific parameter m. E.g. m is the maximum value for the plateau distribution and the length and height of individual teeth of the sawtooth distribution. The generated values are printed as zero-padded eight-digit hexadecimal numbers to make sure alphabetic and numeric order are the same. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01test-mergesort: add test subcommandRené Scharfe1-1/+231
Adapt the qsort certification program from "Engineering a Sort Function" by Bentley and McIlroy for testing our linked list sort function. It generates several lists with various distribution patterns and counts the number of operations llist_mergesort() needs to order them. It compares the result to the output of a trusted sort function (qsort(1)) and also checks if the sort is stable. Also add a test script that makes use of the new subcommand. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01test-mergesort: add sort subcommandRené Scharfe1-1/+8
Give the code for sorting a text file its own sub-command. This allows extending the helper, which we'll do in the following patches. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01test-mergesort: use strbuf_getline()René Scharfe1-4/+2
Strip line ending characters to make sure empty lines are sorted like sort(1) does. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: make largest pack preferredTaylor Blau1-1/+24
When repacking into a geometric series and writing a multi-pack bitmap, it is beneficial to have the largest resulting pack be the preferred object source in the bitmap's MIDX, since selecting the large packs can lead to fewer broken delta chains and better compression. Teach 'git repack' to identify this pack and pass it to the MIDX write machinery in order to mark it as preferred. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27*.h: move some *_INIT to designated initializersÆvar Arnfjörð Bjarmason1-2/+4
Move various *_INIT macros to use designated initializers. This helps readability. I've only picked those leftover macros that were not touched by another in-flight series of mine which changed others, but also how initialization was done. In the case of SUBMODULE_ALTERNATE_SETUP_INIT I've left an explicit initialization of "error_mode", even though SUBMODULE_ALTERNATE_ERROR_IGNORE itself is defined as "0". Let's not peek under the hood and assume that enum fields we know the value of will stay at "0". The change to "TESTSUITE_INIT" in "t/helper/test-run-command.c" was part of an earlier on-list version[1] of c90be786da9 (test-tool run-command: fix flip-flop init pattern, 2021-09-11). 1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27*.h _INIT macros: don't specify fields equal to 0Ævar Arnfjörð Bjarmason1-1/+1
Change the initialization of "struct strbuf" changed in cbc0f81d96f (strbuf: use designated initializers in STRBUF_INIT, 2017-07-10) to omit specifying "alloc" and "len", as we do with other "alloc" and "len" (or "nr") in similar structs. Let's likewise omit the explicit initialization of all fields in the "struct ipc_client_connect_option" struct added in 59c7b88198a (simple-ipc: add win32 implementation, 2021-03-15). Do the same for a few other initializers, e.g. STRVEC_INIT and CACHE_DEF_INIT. Finally, start incrementally changing the same pattern in "t/helper/test-run-command.c". This change was part of an earlier on-list version[1] of c90be786da9 (test-tool run-command: fix flip-flop init pattern, 2021-09-11). 1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23Merge branch 'ab/retire-option-argument'Junio C Hamano1-1/+0
An oddball OPTION_ARGUMENT feature has been removed from the parse-options API. * ab/retire-option-argument: parse-options API: remove OPTION_ARGUMENT feature difftool: use run_command() API in run_file_diff() difftool: prepare "diff" cmdline in cmd_difftool() difftool: prepare "struct child_process" in cmd_difftool()
2021-09-23Merge branch 'ab/test-tool-run-command-cleanup'Junio C Hamano1-4/+1
Code clean-up. * ab/test-tool-run-command-cleanup: test-tool run-command: fix flip-flop init pattern
2021-09-22environment.c: remove test-specific "ignore_untracked..." variableÆvar Arnfjörð Bjarmason1-2/+4
Instead of the global ignore_untracked_cache_config variable added in dae6c322fa1 (test-dump-untracked-cache: don't modify the untracked cache, 2016-01-27) we can make use of the new facility to set config via environment variables added in d8d77153eaf (config: allow specifying config entries via envvar pairs, 2021-01-12). It's arguably a bit hacky to use setenv() and getenv() to pass messages between the same program, but since the test helpers are not the main intended audience of repo-settings.c I think it's better than hardcoding the test-only special-case in prepare_repo_settings(). This uses the xsetenv() wrapper added in the preceding commit, if we don't set these in the environment we'll fail in t7063-status-untracked-cache.sh, but let's fail earlier anyway if that were to happen. This breaks any parent process that's potentially using the GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot config setting down to a git subprocess, but in this case we don't care about the general case of such potential parents. This process neither spawns other "git" processes, nor is it interested in other configuration. We might want to pick up other test modes here, but those will be passed via GIT_TEST_* environment variables. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20Merge branch 'ab/serve-cleanup'Junio C Hamano1-5/+9
Code clean-up around "git serve". * ab/serve-cleanup: upload-pack: document and rename --advertise-refs serve.[ch]: remove "serve_options", split up --advertise-refs code {upload,receive}-pack tests: add --advertise-refs tests serve.c: move version line to advertise_capabilities() serve: move transfer.advertiseSID check into session_id_advertise() serve.[ch]: don't pass "struct strvec *keys" to commands serve: use designated initializers transport: use designated initializers transport: rename "fetch" in transport_vtable to "fetch_refs" serve: mark has_capability() as static
2021-09-20t/helper/simple-ipc: convert test-simple-ipc to use start_bg_commandJeff Hostetler1-156/+43
Convert test helper to use `start_bg_command()` when spawning a server daemon in the background rather than blocks of platform-specific code. Also, while here, remove _() translation around error messages since this is a test helper and not Git code. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20simple-ipc: preparations for supporting binary messages.Jeff Hostetler1-11/+23
Add `command_len` argument to the Simple IPC API. In my original Simple IPC API, I assumed that the request would always be a null-terminated string of text characters. The `command` argument was just a `const char *`. I found a caller that would like to pass a binary command to the daemon, so I am amending the Simple IPC API to receive `const char *command, size_t command_len` arguments. I considered changing the `command` argument to be a `void *`, but the IPC layer simply passes it to the pkt-line layer which takes a `const char *`, so to avoid confusion I left it as is. Note, the response side has always been a `struct strbuf` which includes the buffer and length, so we already support returning a binary answer. (Yes, it feels a little weird returning a binary buffer in a `strbuf`, but it works.) Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14t/helper/test-bitmap.c: add 'dump-hashes' modeTaylor Blau1-1/+9
The pack-bitmap writer code is about to learn how to propagate values from an existing hash-cache. To prepare, teach the test-bitmap helper to dump the values from a bitmap's hash-cache extension in order to test those changes. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12parse-options API: remove OPTION_ARGUMENT featureÆvar Arnfjörð Bjarmason1-1/+0
As was noted in 1a85b49b87a (parse-options: make OPT_ARGUMENT() more useful, 2019-03-14) there's only ever been one user of the OPT_ARGUMENT(), that user was added in 20de316e334 (difftool: allow running outside Git worktrees with --no-index, 2019-03-14). The OPT_ARGUMENT() feature itself was added way back in 580d5bffdea (parse-options: new option type to treat an option-like parameter as an argument., 2008-03-02), but as discussed in 1a85b49b87a wasn't used until 20de316e334 in 2019. Now that the preceding commit has migrated this code over to using "struct strvec" to manage the "args" member of a "struct child_process", we can just use that directly instead of relying on OPT_ARGUMENT. This has a minor change in behavior in that if we'll pass --no-index we'll now always pass it as the first argument, before we'd pass it in whatever position the caller did. Preserving this was the real value of OPT_ARGUMENT(), but as it turns out we didn't need that either. We can always inject it as the first argument, the other end will parse it just the same. Note that we cannot remove the "out" and "cpidx" members of "struct parse_opt_ctx_t" added in 580d5bffdea, while they were introduced with OPT_ARGUMENT() we since used them for other things. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12test-tool run-command: fix flip-flop init patternÆvar Arnfjörð Bjarmason1-4/+1
In be5d88e1128 (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04) an init pattern was added that would use TESTSUITE_INIT, but then promptly memset() everything back to 0. We'd then set the "dup" on the two string lists. Our setting of "next" to "-1" thus did nothing, we'd reset it to "0" before using it. Let's set it to "0" instead, and trust the "STRING_LIST_INIT_DUP" to set "strdup_strings" appropriately for us. Note that while we compile this code, there's no in-tree user for the "testsuite" target being modified here anymore, see the discussion at and around <nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet>[1]. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09repository: support unabsorbed in repo_submodule_initJonathan Tan1-3/+1
In preparation for a subsequent commit that migrates code using add_submodule_odb() to repo_submodule_init(), teach repo_submodule_init() to support submodules with unabsorbed gitdirs. (See the documentation for "git submodule absorbgitdirs" for more information about absorbed and unabsorbed gitdirs.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01t/helper/test-read-midx.c: add --checksum modeTaylor Blau1-1/+15
Subsequent tests will want to check for the existence of a multi-pack bitmap which matches the multi-pack-index stored in the pack directory. The multi-pack bitmap includes the hex checksum of the MIDX it corresponds to in its filename (for example, '$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests want a way to learn what '<checksum>' is. This helper addresses that need by printing the checksum of the repository's multi-pack-index. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05serve.[ch]: remove "serve_options", split up --advertise-refs codeÆvar Arnfjörð Bjarmason1-5/+9
The "advertise capabilities" mode of serve.c added in ed10cb952d3 (serve: introduce git-serve, 2018-03-15) is only used by the http-backend.c to call {upload,receive}-pack with the --advertise-refs parameter. See 42526b478e3 (Add stateless RPC options to upload-pack, receive-pack, 2009-10-30). Let's just make cmd_upload_pack() take the two (v2) or three (v2) parameters the the v2/v1 servicing functions need directly, and pass those in via the function signature. The logic of whether daemon mode is implied by the timeout belongs in the v1 function (only used there). Once we split up the "advertise v2 refs" from "serve v2 request" it becomes clear that v2 never cared about those in combination. The only time it mattered was for v1 to emit its ref advertisement, in that case we wanted to emit the smart-http-only "no-done" capability. Since we only do that in the --advertise-refs codepath let's just have it set "do_done" itself in v1's upload_pack() just before send_ref(), at that point --advertise-refs and --stateless-rpc in combination are redundant (the only user is get_info_refs() in http-backend.c), so we can just pass in --advertise-refs only. Since we need to touch all the serve() and advertise_capabilities() codepaths let's rename them to less clever and obvious names, it's been suggested numerous times, the latest of which is [1]'s suggestion for protocol_v2_serve_loop(). Let's go with that. 1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04Merge branch 'ab/getcwd-test'Junio C Hamano3-0/+28
Portability test update. * ab/getcwd-test: t0001: fix broken not-quite getcwd(3) test in bed67874e2
2021-07-30t0001: fix broken not-quite getcwd(3) test in bed67874e2Ævar Arnfjörð Bjarmason3-0/+28
With a54e938e5b (strbuf: support long paths w/o read rights in strbuf_getcwd() on FreeBSD, 2017-03-26) we had t0001 break on systems like OpenBSD and AIX whose getcwd(3) has standard (but not like glibc et al) behavior. This was partially fixed in bed67874e2 (t0001: skip test with restrictive permissions if getpwd(3) respects them, 2017-08-07). The problem with that fix is that while its analysis of the problem is correct, it doesn't actually call getcwd(3), instead it invokes "pwd -P". There is no guarantee that "pwd -P" is going to call getcwd(3), as opposed to e.g. being a shell built-in. On AIX under both bash and ksh this test breaks because "pwd -P" will happily display the current working directory, but getcwd(3) called by the "git init" we're testing here will fail to get it. I checked whether clobbering the $PWD environment variable would affect it, and it didn't. Presumably these shells keep track of their working directory internally. There's possible follow-up work here in teaching strbuf_getcwd() to get the working directory with whatever method "pwd" uses on these platforms. See [1] for a discussion of that, but let's take the easy way out here and just skip these tests by fixing the GETCWD_IGNORES_PERMS prerequisite to match the limitations of strbuf_getcwd(). 1. https://lore.kernel.org/git/b650bef5-d739-d98d-e9f1-fa292b6ce982@web.de/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28Merge branch 'ab/pkt-line-tests'Junio C Hamano1-0/+12
Tests that cover protocol bits have been updated and helpers used there have been consolidated. * ab/pkt-line-tests: test-lib-functions: use test-tool for [de]packetize()
2021-07-28Merge branch 'ab/attribute-format'Junio C Hamano1-1/+1
Many "printf"-like helper functions we have have been annotated with __attribute__() to catch placeholder/parameter mismatches. * ab/attribute-format: advice.h: add missing __attribute__((format)) & fix usage *.h: add a few missing __attribute__((format)) *.c static functions: add missing __attribute__((format)) sequencer.c: move static function to avoid forward decl *.c static functions: don't forward-declare __attribute__
2021-07-28Merge branch 'ew/many-alternate-optim'Junio C Hamano3-0/+51
Optimization for repositories with many alternate object store. * ew/many-alternate-optim: oidtree: a crit-bit tree for odb_loose_cache oidcpy_with_padding: constify `src' arg make object_directory.loose_objects_subdir_seen a bitmap avoid strlen via strbuf_addstr in link_alt_odb_entry speed up alt_odb_usable() with many alternates
2021-07-19test-lib-functions: use test-tool for [de]packetize()Ævar Arnfjörð Bjarmason1-0/+12
The shell+perl "[de]packetize()" helper functions were added in 4414a150025 (t/lib-git-daemon: add network-protocol helpers, 2018-01-24), and around the same time we added the "pkt-line" helper in 74e70029615 (test-pkt-line: introduce a packet-line test helper, 2018-03-14). For some reason it seems we've mostly used the shell+perl version instead of the helper since then. There were discussions around 88124ab2636 (test-lib-functions: make packetize() more efficient, 2020-03-27) and cacae4329fa (test-lib-functions: simplify packetize() stdin code, 2020-03-29) to improve them and make them more efficient. There was one good reason to do so, we needed an equivalent of "test-tool pkt-line pack", but that command wasn't capable of handling input with "\n" (a feature) or "\0" (just because it happens to be printf-based under the hood). Let's add a "pkt-line-raw" helper for that, and expose is at a packetize_raw() to go with the existing packetize() on the shell level, this gives us the smallest amount of change to the tests themselves. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16Merge branch 'jt/partial-clone-submodule-1'Junio C Hamano3-0/+45
Prepare the internals for lazily fetching objects in submodules from their promisor remotes. * jt/partial-clone-submodule-1: promisor-remote: teach lazy-fetch in any repo run-command: refactor subprocess env preparation submodule: refrain from filtering GIT_CONFIG_COUNT promisor-remote: support per-repository config repository: move global r_f_p_c to repo struct
2021-07-13Merge branch 'hn/prep-tests-for-reftable'Junio C Hamano1-1/+1
Preliminary clean-up of tests before the main reftable changes hits the codebase. * hn/prep-tests-for-reftable: (22 commits) t1415: set REFFILES for test specific to storage format t4202: mark bogus head hash test with REFFILES t7003: check reflog existence only for REFFILES t7900: stop checking for loose refs t1404: mark tests that muck with .git directly as REFFILES. t2017: mark --orphan/logAllRefUpdates=false test as REFFILES t1414: mark corruption test with REFFILES t1407: require REFFILES for for_each_reflog test test-lib: provide test prereq REFFILES t5304: use "reflog expire --all" to clear the reflog t5304: restyle: trim empty lines, drop ':' before > t7003: use rev-parse rather than FS inspection t5000: inspect HEAD using git-rev-parse t5000: reformat indentation to the latest fashion t1301: fix typo in error message t1413: use tar to save and restore entire .git directory t1401-symbolic-ref: avoid direct filesystem access t1401: use tar to snapshot and restore repo state t5601: read HEAD using rev-parse t9300: check ref existence using test-helper rather than a file system check ...
2021-07-13advice.h: add missing __attribute__((format)) & fix usageÆvar Arnfjörð Bjarmason1-1/+1
Add the missing __attribute__((format)) checking to advise_if_enabled(). This revealed a trivial issue introduced in b3b18d16213 (advice: revamp advise API, 2020-03-02). We treated the argv[1] as a format string, but did not intend to do so. Let's use "%s" and pass argv[1] as an argument instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-08Merge branch 'ab/cmd-foo-should-return'Junio C Hamano4-4/+4
Code clean-up. * ab/cmd-foo-should-return: builtins + test helpers: use return instead of exit() in cmd_*
2021-07-07oidtree: a crit-bit tree for odb_loose_cacheEric Wong3-0/+51
This saves 8K per `struct object_directory', meaning it saves around 800MB in my case involving 100K alternates (half or more of those alternates are unlikely to hold loose objects). This is implemented in two parts: a generic, allocation-free `cbtree' and the `oidtree' wrapper on top of it. The latter provides allocation using alloc_state as a memory pool to improve locality and reduce free(3) overhead. Unlike oid-array, the crit-bit tree does not require sorting. Performance is bound by the key length, for oidtree that is fixed at sizeof(struct object_id). There's no need to have 256 oidtrees to mitigate the O(n log n) overhead like we did with oid-array. Being a prefix trie, it is natively suited for expanding short object IDs via prefix-limited iteration in `find_short_object_filename'. On my busy workstation, p4205 performance seems to be roughly unchanged (+/-8%). Startup with 100K total alternates with no loose objects seems around 10-20% faster on a hot cache. (800MB in memory savings means more memory for the kernel FS cache). The generic cbtree implementation does impose some extra overhead for oidtree in that it uses memcmp(3) on "struct object_id" so it wastes cycles comparing 12 extra bytes on SHA-1 repositories. I've not yet explored reducing this overhead, but I expect there are many places in our code base where we'd want to investigate this. More information on crit-bit trees: https://cr.yp.to/critbit.html Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28promisor-remote: teach lazy-fetch in any repoJonathan Tan3-0/+45
This is one step towards supporting partial clone submodules. Even after this patch, we will still lack partial clone submodules support, primarily because a lot of Git code that accesses submodule objects does so by adding their object stores as alternates, meaning that any lazy fetches that would occur in the submodule would be done based on the config of the superproject, not of the submodule. This also prevents testing of the functionality in this patch by user-facing commands. So for now, test this mechanism using a test helper. Besides that, there is some code that uses the wrapper functions like has_promisor_remote(). Those will need to be checked to see if they could support the non-wrapper functions instead (and thus support any repository, not just the_repository). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14Merge branch 'en/ort-perf-batch-11'Junio C Hamano1-20/+34
Optimize out repeated rename detection in a sequence of mergy operations. * en/ort-perf-batch-11: merge-ort, diffcore-rename: employ cached renames when possible merge-ort: handle interactions of caching and rename/rename(1to1) cases merge-ort: add helper functions for using cached renames merge-ort: preserve cached renames for the appropriate side merge-ort: avoid accidental API mis-use merge-ort: add code to check for whether cached renames can be reused merge-ort: populate caches of rename detection results merge-ort: add data structures for in-memory caching of rename detection t6429: testcases for remembering renames fast-rebase: write conflict state to working tree, index, and HEAD fast-rebase: change assert() to BUG() Documentation/technical: describe remembering renames optimization t6423: rename file within directory that other side renamed
2021-06-09builtins + test helpers: use return instead of exit() in cmd_*Ævar Arnfjörð Bjarmason4-4/+4
Change various cmd_* functions that claim to return an "int" to use "return" instead of exit() to indicate an exit code. These were not marked with NORETURN, and by directly exit()-ing we'll skip the cleanup git.c would otherwise do (e.g. closing fd's, erroring if we can't). See run_builtin() in git.c. In the case of shell.c and sh-i18n--envsubst.c this was the result of an incomplete migration to using a cmd_main() in 3f2e2297b9 (add an extra level of indirection to main(), 2016-07-01). This was spotted by SunCC 12.5 on Solaris 10 (gcc210 on the gccfarm). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02t/helper/ref-store: initialize oid in resolve-refHan-Wen Nienhuys1-1/+1
This will print $ZERO_OID when asking for a non-existent ref from the test-helper. Since resolve-ref provides direct access to refs_resolve_ref_unsafe(), it provides a reliable mechanism for accessing REFNAME, while avoiding the implicit resolution to refs/heads/REFNAME. Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20fast-rebase: write conflict state to working tree, index, and HEADElijah Newren1-19/+32
Previously, when fast-rebase hit a conflict, it simply aborted and left HEAD, the index, and the working tree where they were before the operation started. While fast-rebase does not support restarting from a conflicted state, write the conflicted state out anyway as it gives us a way to see what the conflicts are and write tests that check for them. This will be important in the upcoming commits, because sequencer.c is only superficially integrated with merge-ort.c; in particular, it calls merge_switch_to_result() after EACH merge instead of only calling it at the end of all the sequence of merges (or when a conflict is hit). This not only causes needless updates to the working copy and index, but also causes all intermediate data to be freed and tossed, preventing caching information from one merge to the next. However, integrating sequencer.c more deeply with merge-ort.c is a big task, and making this small extension to fast-rebase.c provides us with a simple way to test the edge and corner cases that we want to make sure continue working. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20fast-rebase: change assert() to BUG()Elijah Newren1-1/+2
assert() can succinctly document expectations for the code, and do so in a way that may be useful to future folks trying to refactor the code and change basic assumptions; it allows them to more quickly find some places where their violations of previous assumptions trips things up. Unfortunately, assert() can surround a function call with important side-effects, which is a huge mistake since some users will compile with assertions disabled. I've had to debug such mistakes before in other codebases, so I should know better. Luckily, this was only in test code, but it's still very embarrassing. Change an assert() to an if (...) BUG (...). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11Merge branch 'jk/symlinked-dotgitx-cleanup'Junio C Hamano1-13/+33
Various test and documentation updates about .gitsomething paths that are symlinks. * jk/symlinked-dotgitx-cleanup: docs: document symlink restrictions for dot-files fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW t0060: test ntfs/hfs-obscured dotfiles t7450: test .gitmodules symlink matching against obscured names t7450: test verify_path() handling of gitmodules t7415: rename to expand scope fsck_tree(): wrap some long lines fsck_tree(): fix shadowed variable t7415: remove out-dated comment about translation
2021-05-10Merge branch 'bc/hash-transition-interop-part-1'Junio C Hamano1-1/+1
SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id
2021-05-04t0060: test ntfs/hfs-obscured dotfilesJeff King1-13/+33
We have tests that cover various filesystem-specific spellings of ".gitmodules", because we need to reliably identify that path for some security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules: add tests, 2018-05-12), with the actual code coming from e7cb0b4455 (is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20 (is_hfs_dotgit: match other .git files, 2018-05-02). Those latter two commits also added similar matching functions for .gitattributes and .gitignore. These ended up not being used in the final series, and are currently dead code. But in preparation for them being used in some fsck checks, let's make sure they actually work by throwing a few basic tests at them. Likewise, let's cover .mailmap (which does need matching code added). I didn't bother with the whole battery of tests that we cover for .gitmodules. These functions are all based on the same generic matcher, so it's sufficient to test most of the corner cases just once. Note that the ntfs magic prefix names in the tests come from the algorithm described in e7cb0b4455 (and are different for each file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30Merge branch 'ds/sparse-index-protections'Junio C Hamano1-10/+56
Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...
2021-04-30Merge branch 'jk/promisor-optim'Junio C Hamano1-3/+3
Handling of "promisor packs" that allows certain objects to be missing and lazily retrievable has been optimized (a bit). * jk/promisor-optim: revision: avoid parsing with --exclude-promisor-objects lookup_unknown_object(): take a repository argument is_promisor_object(): free tree buffer after parsing
2021-04-27hash: provide per-algorithm null OIDsbrian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20Merge branch 'ab/userdiff-tests'Junio C Hamano3-0/+48
A bit of code clean-up and a lot of test clean-up around userdiff area. * ab/userdiff-tests: blame tests: simplify userdiff driver test blame tests: don't rely on t/t4018/ directory userdiff: remove support for "broken" tests userdiff tests: list builtin drivers via test-tool userdiff tests: explicitly test "default" pattern userdiff: add and use for_each_userdiff_driver() userdiff style: normalize pascal regex declaration userdiff style: declare patterns with consistent style userdiff style: re-order drivers in alphabetical order
2021-04-13Merge branch 'cc/test-helper-bloom-usage-fix'Junio C Hamano1-1/+1
Usage message fix for a test helper. * cc/test-helper-bloom-usage-fix: test-bloom: fix missing 'bloom' from usage string
2021-04-13Merge branch 'tb/pack-preferred-tips-to-give-bitmap'Junio C Hamano3-0/+26
A configuration variable has been added to force tips of certain refs to be given a reachability bitmap. * tb/pack-preferred-tips-to-give-bitmap: builtin/pack-objects.c: respect 'pack.preferBitmapTips' t/helper/test-bitmap.c: initial commit pack-bitmap: add 'test_bitmap_commits()' helper
2021-04-13lookup_unknown_object(): take a repository argumentJeff King1-3/+3
All of the other lookup_foo() functions take a repository argument, but lookup_unknown_object() was never converted, and it uses the_repository internally. Let's fix that. We could leave a wrapper that uses the_repository, but there aren't that many calls, so we'll just convert them all. I looked briefly at each site to see if we had a repository struct (besides the_repository) we could pass, but none of them do (so this conversion to pass the_repository is a pure noop in each case, though it does take us one step closer to eventually getting rid of the_repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08Merge branch 'tb/reverse-midx'Junio C Hamano1-4/+20
An on-disk reverse-index to map the in-pack location of an object back to its object name across multiple packfiles is introduced. * tb/reverse-midx: midx.c: improve cache locality in midx_pack_order_cmp() pack-revindex: write multi-pack reverse indexes pack-write.c: extract 'write_rev_file_order' pack-revindex: read multi-pack reverse indexes Documentation/technical: describe multi-pack reverse indexes midx: make some functions non-static midx: keep track of the checksum midx: don't free midx_name early midx: allow marking a pack as preferred t/helper/test-read-midx.c: add '--show-objects' builtin/multi-pack-index.c: display usage on unrecognized command builtin/multi-pack-index.c: don't enter bogus cmd_mode builtin/multi-pack-index.c: split sub-commands builtin/multi-pack-index.c: define common usage with a macro builtin/multi-pack-index.c: don't handle 'progress' separately builtin/multi-pack-index.c: inline 'flags' with options
2021-04-08userdiff tests: list builtin drivers via test-toolÆvar Arnfjörð Bjarmason3-0/+48
Change the userdiff test to list the builtin drivers via the test-tool, using the new for_each_userdiff_driver() API function. This gets rid of the need to modify this part of the test every time a new pattern is added, see 2ff6c34612 (userdiff: support Bash, 2020-10-22) and 09dad9256a (userdiff: support Markdown, 2020-05-02) for two recent examples. I only need the "list-builtin-drivers "argument here, but let's add "list-custom-drivers" and "list-drivers" too, just because it's easy. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05test-bloom: fix missing 'bloom' from usage stringChristian Couder1-1/+1
Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a subcommand of `test-tool bloom` not of `test-tool` itself. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02Merge branch 'jh/simple-ipc'Junio C Hamano3-0/+789
A simple IPC interface gets introduced to build services like fsmonitor on top. * jh/simple-ipc: t0052: add simple-ipc tests and t/helper/test-simple-ipc tool simple-ipc: add Unix domain socket implementation unix-stream-server: create unix domain socket under lock unix-socket: disallow chdir() when creating unix domain sockets unix-socket: add backlog size option to unix_stream_listen() unix-socket: eliminate static unix_stream_socket() helper function simple-ipc: add win32 implementation simple-ipc: design documentation for new IPC mechanism pkt-line: add options argument to read_packetized_to_strbuf() pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option pkt-line: do not issue flush packets in write_packetized_*() pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-03-31t/helper/test-bitmap.c: initial commitTaylor Blau3-0/+26
Add a new 'bitmap' test-tool which can be used to list the commits that have received bitmaps. In theory, a determined tester could run 'git rev-list --test-bitmap <commit>' to check if '<commit>' received a bitmap or not, since '--test-bitmap' exits with a non-zero code when it can't find the requested commit. But this is a dubious behavior to rely on, since arguably 'git rev-list' could continue its object walk outside of which commits are covered by bitmaps. This will be used to test the behavior of 'pack.preferBitmapTips', which will be added in the following patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30test-tool: don't force full indexDerrick Stolee1-1/+12
We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30test-read-cache: print cache entries with --tableDerrick Stolee1-10/+45
This table is helpful for discovering data in the index to ensure it is being written correctly, especially as we build and test the sparse-index. This table includes an output format similar to 'git ls-tree', but should not be compared to that directly. The biggest reasons are that 'git ls-tree' includes a tree entry for every subdirectory, even those that would not appear as a sparse directory in a sparse-index. Further, 'git ls-tree' does not use a trailing directory separator for its tree rows. This does not print the stat() information for the blobs. That will be added in a future change with another option. The tests that are added in the next few changes care only about the object types and IDs. However, this future need for full index information justifies the need for this test helper over extending a user-facing feature, such as 'git ls-files'. To make the option parsing slightly more robust, wrap the string comparisons in a loop adapted from test-dir-iterator.c. Care must be taken with the final check for the 'cnt' variable. We continue the expectation that the numerical value is the final argument. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30t/helper/test-read-midx.c: add '--show-objects'Taylor Blau1-4/+20
The 'read-midx' helper is used in places like t5319 to display basic information about a multi-pack-index. In the next patch, the MIDX writing machinery will learn a new way to choose from which pack an object is selected when multiple copies of that object exist. To disambiguate which pack introduces an object so that this feature can be tested, add a '--show-objects' option which displays additional information about each object in the MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24Merge branch 'nk/diff-index-fsmonitor'Junio C Hamano1-2/+2
"git diff-index" codepath has been taught to trust fsmonitor status to reduce number of lstat() calls. * nk/diff-index-fsmonitor: fsmonitor: add perf test for git diff HEAD fsmonitor: add assertion that fsmonitor is valid to check_removed fsmonitor: skip lstat deletion check during git diff-index
2021-03-22t0052: add simple-ipc tests and t/helper/test-simple-ipc toolJeff Hostetler3-0/+789
Create t0052-simple-ipc.sh with unit tests for the "simple-ipc" mechanism. Create t/helper/test-simple-ipc test tool to exercise the "simple-ipc" functions. When the tool is invoked with "run-daemon", it runs a server to listen for "simple-ipc" connections on a test socket or named pipe and responds to a set of commands to exercise/stress the communication setup. When the tool is invoked with "start-daemon", it spawns a "run-daemon" command in the background and waits for the server to become ready before exiting. (This helps make unit tests in t0052 more predictable and avoids the need for arbitrary sleeps in the test script.) The tool also has a series of client "send" commands to send commands and data to a server instance. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19Merge branch 'jc/calloc-fix'Junio C Hamano1-1/+1
Code clean-up. * jc/calloc-fix: xcalloc: use CALLOC_ARRAY() when applicable
2021-03-18fsmonitor: add perf test for git diff HEADNipunn Koorapati1-2/+2
Update the xargs call so that if your large repo contains symlinks, test-tool chmtime failure does not end the script. On Linux Test this tree upstream/master --------------------------------------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.52(0.43+0.10) 0.53(0.49+0.05) +1.9% 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.21(0.15+0.07) 0.22(0.13+0.09) +4.8% 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.65(0.93+0.71) 1.69(1.03+0.65) +2.4% 7519.7: status (dirty) (fsmonitor=fsmonitor-watchman) 11.99(11.34+1.58) 11.95(11.02+1.79) -0.3% 7519.8: diff (fsmonitor=fsmonitor-watchman) 0.25(0.17+0.26) 0.25(0.18+0.26) +0.0% 7519.9: diff HEAD (fsmonitor=fsmonitor-watchman) 0.39(0.25+0.34) 0.89(0.35+0.74) +128.2% 7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.16(0.13+0.04) 0.16(0.12+0.05) +0.0% 7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.16(0.11+0.06) 0.16(0.12+0.05) +0.0% 7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.18(0.13+0.06) 0.17(0.10+0.08) -5.6% 7519.15: add (fsmonitor=fsmonitor-watchman) 2.25(1.53+0.68) 2.25(1.47+0.74) +0.0% 7519.18: status (fsmonitor=disabled) 0.88(0.73+1.03) 0.89(0.67+1.08) +1.1% 7519.19: status -uno (fsmonitor=disabled) 0.45(0.43+0.89) 0.45(0.34+0.98) +0.0% 7519.20: status -uall (fsmonitor=disabled) 1.88(1.16+1.58) 1.88(1.22+1.51) +0.0% 7519.21: status (dirty) (fsmonitor=disabled) 7.53(7.05+2.11) 7.53(6.98+2.04) +0.0% 7519.22: diff (fsmonitor=disabled) 0.42(0.37+0.92) 0.42(0.38+0.91) +0.0% 7519.23: diff HEAD (fsmonitor=disabled) 0.44(0.41+0.90) 0.44(0.40+0.91) +0.0% 7519.24: diff -- 0_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.09+0.05) +0.0% 7519.25: diff -- 10_files (fsmonitor=disabled) 0.13(0.10+0.04) 0.13(0.10+0.04) +0.0% 7519.26: diff -- 100_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.10+0.04) +0.0% 7519.27: diff -- 1000_files (fsmonitor=disabled) 0.13(0.09+0.06) 0.13(0.09+0.05) +0.0% 7519.28: diff -- 10000_files (fsmonitor=disabled) 0.14(0.11+0.05) 0.14(0.10+0.05) +0.0% 7519.29: add (fsmonitor=disabled) 2.43(1.61+1.64) 2.43(1.69+1.57) +0.0% On linux (2.29.2 vs w/ this patch): nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 | grep lstat 0.04 0.000063 3 20 6 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 | grep lstat 94.98 5.242262 10 523783 13 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 | grep lstat 0.38 0.000032 5 7 3 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep lstat 99.44 0.741892 9 81634 10 lstat On mac (2.29.2 vs w/ this patch): nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 | grep "^lstat64 " lstat64 8 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 | grep "^lstat64 " lstat64 120242 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 | grep "^lstat64 " lstat64 4 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep "^lstat64 " lstat64 4497 There are still a bunch of lstats - on directories, but not every file. Progress! Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15xcalloc: use CALLOC_ARRAY() when applicableJunio C Hamano1-1/+1
These are for codebase before Git 2.31 Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17Merge branch 'jt/trace2-BUG'Junio C Hamano1-0/+9
Even though invocations of "die()" were logged to the trace2 system, "BUG()"s were not, which has been corrected. * jt/trace2-BUG: usage: trace2 BUG() invocations
2021-02-17Merge branch 'ak/corrected-commit-date'Junio C Hamano1-0/+4
The commit-graph learned to use corrected commit dates instead of the generation number to help topological revision traversal. * ak/corrected-commit-date: doc: add corrected commit date info commit-reach: use corrected commit dates in paint_down_to_common() commit-graph: use generation v2 only if entire chain does commit-graph: implement generation data chunk commit-graph: implement corrected commit date commit-graph: return 64-bit generation number commit-graph: add a slab to store topological levels t6600-test-reach: generalize *_three_modes commit-graph: consolidate fill_commit_graph_info revision: parse parent in indegree_walk_step() commit-graph: fix regression when computing Bloom filters
2021-02-10Merge branch 'ab/grep-pcre-invalid-utf8'Junio C Hamano3-0/+14
Update support for invalid UTF-8 in PCRE2. * ab/grep-pcre-invalid-utf8: grep/pcre2: better support invalid UTF-8 haystacks grep/pcre2 tests: don't rely on invalid UTF-8 data test
2021-02-09usage: trace2 BUG() invocationsJonathan Tan1-0/+9
die() messages are traced in trace2, but BUG() messages are not. Anyone tracking die() messages would have even more reason to track BUG(). Therefore, write to trace2 when BUG() is invoked. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-24grep/pcre2: better support invalid UTF-8 haystacksÆvar Arnfjörð Bjarmason3-0/+14
Improve the support for invalid UTF-8 haystacks given a non-ASCII needle when using the PCREv2 backend. This is a more complete fix for a bug I started to fix in 870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching, 2019-07-26), now that PCREv2 has the PCRE2_MATCH_INVALID_UTF mode we can make use of it. This fixes the sort of case described in 8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26), i.e.: - The subject string is non-ASCII (e.g. "ævar") - We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C" - We are using --ignore-case, or we're a non-fixed pattern If those conditions were satisfied and we matched found non-valid UTF-8 data PCREv2 might bark on it, in practice this only happened under the JIT backend (turned on by default on most platforms). Ultimately this fixes a "regression" in b65abcafc7 ("grep: use PCRE v2 for optimized fixed-string search", 2019-07-01), I'm putting that in scare-quotes because before then we wouldn't properly support these complex case-folding, locale etc. cases either, it just broke in different ways. There was a bug related to this the PCRE2_NO_START_OPTIMIZE flag fixed in PCREv2 10.36. It can be worked around by setting the PCRE2_NO_START_OPTIMIZE flag. Let's do that in those cases, and add tests for the bug. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21refs: switch peel_ref() to peel_iterated_oid()Jeff King1-13/+0
The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char *refname, const struct object_id *oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18commit-graph: implement generation data chunkAbhishek Kumar1-0/+4
As discovered by Ævar, we cannot increment graph version to distinguish between generation numbers v1 and v2 [1]. Thus, one of pre-requistes before implementing generation number v2 was to distinguish between graph versions in a backwards compatible manner. We are going to introduce a new chunk called Generation DATa chunk (or GDAT). GDAT will store corrected committer date offsets whereas CDAT will still store topological level. Old Git does not understand GDAT chunk and would ignore it, reading topological levels from CDAT. New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when GDAT chunk is missing (as it would happen with a commit-graph written by old Git). We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces commit-graph file to be written without generation data chunk to emulate a commit-graph file written by old Git. To minimize the space required to store corrrected commit date, Git stores corrected commit date offsets into the commit-graph file, instea of corrected commit dates. This saves us 4 bytes per commit, decreasing the GDAT chunk size by half, but it's possible for the offset to overflow the 4-bytes allocated for storage. As such overflows are and should be exceedingly rare, we use the following overflow management scheme: We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV') to store corrected commit dates for commits with offsets greater than GENERATION_NUMBER_V2_OFFSET_MAX. If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set the MSB of the offset and the other bits store the position of corrected commit date in GDOV chunk, similar to how Extra Edge List is maintained. We test the overflow-related code with the following repo history: F - N - U / \ U - N - U N \ / N - F - N Where the commits denoted by U have committer date of zero seconds since Unix epoch, the commits denoted by N have committer date of 1112354055 (default committer date for the test suite) seconds since Unix epoch and the commits denoted by F have committer date of (2 ^ 31 - 2) seconds since Unix epoch. The largest offset observed is 2 ^ 31, just large enough to overflow. [1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25Merge branch 'jx/t5411-flake-fix'Junio C Hamano1-14/+40
The exchange between receive-pack and proc-receive hook did not carefully check for errors. * jx/t5411-flake-fix: receive-pack: use default version 0 for proc-receive receive-pack: gently write messages to proc-receive t5411: new helper filter_out_user_friendly_and_stable_output
2020-11-21Merge branch 'en/strmap'Junio C Hamano1-5/+4
A specialization of hashmap that uses a string as key has been introduced. Hopefully it will see wider use over time. * en/strmap: shortlog: use strset from strmap.h Use new HASHMAP_INIT macro to simplify hashmap initialization strmap: take advantage of FLEXPTR_ALLOC_STR when relevant strmap: enable allocations to come from a mem_pool strmap: add a strset sub-type strmap: split create_entry() out of strmap_put() strmap: add functions facilitating use as a string->int map strmap: enable faster clearing and reusing of strmaps strmap: add more utility functions strmap: new utility functions hashmap: provide deallocation function names hashmap: introduce a new hashmap_partial_clear() hashmap: allow re-use after hashmap_free() hashmap: adjust spacing to fix argument alignment hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-18Merge branch 'en/merge-ort-api-null-impl'Junio C Hamano3-0/+213
Preparation for a new merge strategy. * en/merge-ort-api-null-impl: merge,rebase,revert: select ort or recursive by config or environment fast-rebase: demonstrate merge-ort's API via new test-tool command merge-ort-wrappers: new convience wrappers to mimic the old merge API merge-ort: barebones API of new merge strategy with empty implementation
2020-11-18Merge branch 'ds/maintenance-part-3'Junio C Hamano3-0/+37
Parts of "git maintenance" to ease writing crontab entries (and other scheduling system configuration) for it. * ds/maintenance-part-3: maintenance: add troubleshooting guide to docs maintenance: use 'incremental' strategy by default maintenance: create maintenance.strategy config maintenance: add start/stop subcommands maintenance: add [un]register subcommands for-each-repo: run subcommands on configured repos maintenance: add --schedule option and config maintenance: optionally skip --auto process
2020-11-11Use new HASHMAP_INIT macro to simplify hashmap initializationElijah Newren1-2/+1
Now that hashamp has lazy initialization and a HASHMAP_INIT macro, hashmaps allocated on the stack can be initialized without a call to hashmap_init() and in some cases makes the code a bit shorter. Convert some callsites over to take advantage of this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11receive-pack: use default version 0 for proc-receiveJiang Xin1-6/+10
In the verison negotiation phase between "receive-pack" and "proc-receive", "proc-receive" can send an empty flush-pkt to end the negotiation and use default version 0. Capabilities (such as "push-options") are not supported in version 0. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11receive-pack: gently write messages to proc-receiveJiang Xin1-10/+32
Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the osx-clang job of the CI/PR builds, and ran into an issue when using the `--stress` option with the following error messages: fatal: unable to write flush packet: Broken pipe send-pack: unexpected disconnect while reading sideband packet fatal: the remote end hung up unexpectedly In this test case, the "proc-receive" hook sends an error message and dies earlier. While "receive-pack" on the other side of the pipe should forward the error message of the "proc-receive" hook to the client side, but it fails to do so. This is because "receive-pack" uses `packet_write_fmt()` and `packet_flush()` to write pkt-line message to "proc-receive" hook, and these functions die immediately when pipe is broken. Using "gently" forms for these functions will get more predicable output. Add more "--die-*" options to test helper to test different stages of the protocol between "receive-pack" and "proc-receive" hook. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02Merge branch 'js/avoid-split-sideband-message'Junio C Hamano1-0/+23
The side-band status report can be sent at the same time as the primary payload multiplexed, but the demultiplexer on the receiving end incorrectly split a single status report into two, which has been corrected. * js/avoid-split-sideband-message: test-pkt-line: drop colon from sideband identity sideband: report unhandled incomplete sideband messages as bugs sideband: avoid reporting incomplete sideband messages
2020-11-02hashmap: provide deallocation function namesElijah Newren1-3/+3
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-29fast-rebase: demonstrate merge-ort's API via new test-tool commandElijah Newren3-0/+213
Add a new test-tool command named 'fast-rebase', which is a super-slimmed down and nowhere near as capable version of 'git rebase'. 'test-tool fast-rebase' is not currently planned for usage in the testsuite, but is here for two purposes: 1) Demonstrate the desired API of merge-ort. In particular, fast-rebase takes advantage of the separation of the merging operation from the updating of the index and working tree, to allow it to pick N commits, but only update the index and working tree once at the end. Look for the calls to merge_incore_nonrecursive() and merge_switch_to_result(). 2) Provide a convenient benchmark that isn't polluted by the heavy disk writing and forking of unnecessary processes that comes from sequencer.c and merge-recursive.c. fast-rebase is not meant to replace sequencer.c, just give ideas on how sequencer.c can be changed. Updating sequencer.c with these goals is probably a large amount of work; writing a simple targeted command with no documentation, less-than-useful help messages, numerous limitations in terms of flags it can accept and situations it can handle, and which is flagged off from users is a much easier interim step. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27test-pkt-line: drop colon from sideband identityJeff King1-1/+1
We pass "sideband: " as our identity for errors to recv_sideband(). But it already adds the trailing colon and space. This doesn't invalidate any tests, but it looks funny when you examine the test output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20sideband: avoid reporting incomplete sideband messagesJohannes Schindelin1-0/+23
In 2b695ecd74d (t5500: count objects through stderr, not trace, 2020-05-06) we tried to ensure that the "Total 3" message could be grepped in Git's output, even if it sometimes got chopped up into multiple lines in the trace machinery. However, the first instance where this mattered now goes through the sideband machinery, where it is _still_ possible for messages to get chopped up: it *is* possible for the standard error stream to be sent byte-for-byte and hence it can be easily interrupted. Meaning: it is possible for the single line that we're looking for to be chopped up into multiple sideband packets, with a primary packet being delivered between them. This seems to happen occasionally in the `vs-test` part of our CI builds, i.e. with binaries built using Visual C, but not when building with GCC or clang; The symptom is that t5500.43 fails to find a line matching `remote: Total 3` in the `log` file, which ends in something along these lines: remote: Tota remote: l 3 (delta 0), reused 0 (delta 0), pack-reused 0 This should not happen, though: we have code in `demultiplex_sideband()` _specifically_ to stitch back together lines that were delivered in separate sideband packets. However, this stitching was broken in a subtle way in fbd76cd450 (sideband: reverse its dependency on pkt-line, 2019-01-16): before that change, incomplete sideband lines would not be flushed upon receiving a primary packet, but after that patch, they would be. The subtleness of this bug comes from the fact that it is easy to get confused by the ambiguous meaning of the `break` keyword: after writing the primary packet contents, the `break;` in the original version of `recv_sideband()` does _not_ break out of the `while` loop, but instead only ends the `switch` case: while (!retval) { [...] switch (band) { [...] case 1: /* Write the contents of the primary packet */ write_or_die(out, buf + 1, len); /* Here, we do *not* break out of the loop, `retval` is unchanged */ break; [...] } if (outbuf.len) { /* Write any remaining sideband messages lacking a trailing LF */ strbuf_addch(&outbuf, '\n'); xwrite(2, outbuf.buf, outbuf.len); } In contrast, after fbd76cd450 (sideband: reverse its dependency on pkt-line, 2019-01-16), the body of the `while` loop was extracted into `demultiplex_sideband()`, crucially _including_ the logic to write incomplete sideband messages: switch (band) { [...] case 1: *sideband_type = SIDEBAND_PRIMARY; /* This does not break out of the loop: the loop is in the caller */ break; [...] } cleanup: [...] /* This logic is now no longer _outside_ the loop but _inside_ */ if (scratch->len) { strbuf_addch(scratch, '\n'); xwrite(2, scratch->buf, scratch->len); } The correct way to fix this is to return from `demultiplex_sideband()` early. The caller will then write out the contents of the primary packet and continue looping. The `scratch` buffer for incomplete sideband messages is owned by that caller, and will continue to accumulate the remainder(s) of those messages. The loop will only end once `demultiplex_sideband()` returned non-zero _and_ did not indicate a primary packet, which is the case only when we hit the `cleanup:` path, in which we take care of flushing any unfinished sideband messages and release the `scratch` buffer. To ensure that this does not get broken again, we introduce a pair of subcommands of the `pkt-line` test helper that specifically chop up the sideband message and squeeze a primary packet into the middle. Final note: The other test case touched by 2b695ecd74d (t5500: count objects through stderr, not trace, 2020-05-06) is not affected by this issue because the sideband machinery is not involved there. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>