path: root/t/perf
Age | Commit message | Author | Files | Lines
2021-07-02 | perf: fix when running with TEST_OUTPUT_DIRECTORY | Patrick Steinhardt | 3 | -13/+24
When the TEST_OUTPUT_DIRECTORY is defined, then all test data will be written in that directory instead of the default directory located in "t/". While this works as expected for our normal tests, performance tests fail to locate and aggregate performance data because they don't know how to handle TEST_OUTPUT_DIRECTORY correctly and always look at the default location.

Fix the issue by adding a `--results-dir` parameter to "aggregate.perl" which identifies the directory where results are and by making the "run" script aware of the TEST_OUTPUT_DIRECTORY variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 | Merge branch 'rs/repack-without-loosening-promised-objects' | Junio C Hamano | 1 | -0/+4
"git repack -A -d" in a partial clone unnecessarily loosened objects in promisor pack.

* rs/repack-without-loosening-promised-objects:
  repack: avoid loosening promisor objects in partial clones
2021-04-30 | Merge branch 'ds/sparse-index-protections' | Junio C Hamano | 1 | -0/+101
Builds on top of the sparse-index infrastructure to mark operations that are not ready to work with the sparse index, causing them to fall back on the fully-populated index that they always have worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
2021-04-28 | repack: avoid loosening promisor objects in partial clones | Rafael Silva | 1 | -0/+4
When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects.

Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: in my copy of linux.git, the object directory peaked at 26GB more disk usage.

In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened.

For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process, which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit the count via trace2, which allows inspecting the debug events after the process is finished. This new event is used in the added regression test.

Lastly, add a new perf test to evaluate the performance impact made by these changes (tested on git.git):

    Test          HEAD^                 HEAD
    ----------------------------------------------------------
    5600.3: gc    134.38(41.93+90.95)   7.80(6.72+1.35) -94.2%

For a bigger repository, such as linux.git, the improvement is even bigger:

    Test          HEAD^                     HEAD
    -------------------------------------------------------------------
    5600.3: gc    6833.00(918.07+3162.74)   268.79(227.02+39.18) -96.1%

These improvements are particularly big because every object in the newly-cloned partial repository is a promisor object.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
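The percentage column in the tables above can be reproduced from the two timing cells. The following is a minimal sketch, assuming each cell has the form elapsed(user+sys) in seconds and that the final column is the relative change in elapsed time (an assumption that is consistent with the numbers shown). The helper names `parse_cell` and `percent_change` are made up for this illustration and are not part of the perf suite.

```python
import re

# A timing cell looks like "134.38(41.93+90.95)": elapsed(user+sys), in seconds.
CELL = re.compile(r"^(\d+\.\d+)\((\d+\.\d+)\+(\d+\.\d+)\)$")


def parse_cell(cell):
    """Split one timing cell into (elapsed, user, sys) floats."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    return tuple(float(part) for part in match.groups())


def percent_change(old_cell, new_cell):
    """Relative change in elapsed time from old to new, as a signed string."""
    old_elapsed = parse_cell(old_cell)[0]
    new_elapsed = parse_cell(new_cell)[0]
    change = (new_elapsed - old_elapsed) / old_elapsed * 100
    return f"{change:+.1f}%"


# The two rows quoted in the commit message above.
print(percent_change("134.38(41.93+90.95)", "7.80(6.72+1.35)"))          # -94.2%
print(percent_change("6833.00(918.07+3162.74)", "268.79(227.02+39.18)"))  # -96.1%
```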
2021-04-13 | revision: avoid parsing with --exclude-promisor-objects | Jeff King | 1 | -0/+8
When --exclude-promisor-objects is given, before traversing any objects we iterate over all of the objects in any promisor packs, marking them as UNINTERESTING and SEEN. We turn the oid we get from iterating the pack into an object with parse_object(), but this has two problems:

  - it's slow; we are zlib inflating (and reconstructing from deltas) every byte of every object in the packfile

  - it leaves the tree buffers attached to their structs, which means our heap usage will grow to store every uncompressed tree simultaneously. This can be gigabytes.

We can obviously fix the second by freeing the tree buffers after we've parsed them. But we can observe that the function doesn't look at the object contents at all! The only reason we call parse_object() is that we need a "struct object" on which to set the flags. There are two options here:

  - we can look up just the object type via oid_object_info(), and then call the appropriate lookup_foo() function

  - we can call lookup_unknown_object(), which gives us an OBJ_NONE struct (which will get auto-converted later by object_as_type() via calls to lookup_commit(), etc).

The first one is closer to the current code, but we do pay the price to look up the type for each object. The latter should be more efficient in CPU, though it wastes a little bit of memory (the "unknown" object structs are a union of all object types, so some of the structs are bigger than they need to be). It also runs the risk of triggering a latent bug in code that calls lookup_object() directly but isn't ready to handle OBJ_NONE (such code would already be buggy, but we use lookup_unknown_object() infrequently enough that it might be hiding).

I went with the second option here. I don't think the risk is high (and we'd want to find and fix any such bugs anyway), and it should be more efficient overall.

The new tests in p5600 show off the improvement (this is on git.git):

    Test                                 HEAD^               HEAD
    -------------------------------------------------------------------------------
    5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
    5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%

The improvement is particularly big in this script because _every_ object in the newly-cloned partial repo is a promisor object. So after marking them all, there's nothing left to traverse.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 | is_promisor_object(): free tree buffer after parsing | Jeff King | 1 | -0/+4
To get the list of all promisor objects, we not only include all objects in promisor packs, but also parse each of those objects to see which objects they reference. After parsing a tree object, the tree->buffer field will remain populated until we explicitly free it. So in a partial clone of blob:none, for example, we are essentially reading every tree in the repository (since they're all in the initial promisor pack), and keeping all of their uncompressed contents in memory at once.

This patch frees the tree buffers after we've finished marking all of their reachable objects. We shouldn't need to do this for any other object type. While we are using some extra memory to store the structs, no other object type stores the whole contents in its parsed form (we do sometimes hold on to commit buffers, but less so these days due to commit graphs, plus most commands which care about promisor objects turn off the save_commit_buffer global).

Even for a moderate-sized repository like git.git, this patch drops the peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB. Fsck is a good candidate for measuring here because it doesn't interact with the promisor code except to call is_promisor_object(), so we can isolate just this problem.

The added perf test shows only a tiny improvement on my machine for git.git, since 1.7GB isn't enough to cause any real memory pressure:

    Test           HEAD^               HEAD
    --------------------------------------------------------------------------------
    5600.4: fsck   21.26(20.90+0.35)   20.84(20.79+0.04) -2.0%

With linux.git the absolute change is a bit bigger, though still a small percentage:

    Test           HEAD^                  HEAD
    -----------------------------------------------------------------------------
    5600.4: fsck   262.26(259.13+3.12)    254.92(254.62+0.29) -2.8%

I didn't have the patience to run it under massif with linux.git, but it's probably on the order of about 14GB improvement, since that's the sum of the sizes of all of the uncompressed trees (but still isn't enough to create memory pressure on this particular machine, which has 64GB of RAM). Smaller machines would probably see a bigger effect on runtime (and sadly our perf suite does not measure peak heap).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 | Merge branch 'ps/pack-bitmap-optim' | Junio C Hamano | 1 | -0/+14
Optimize "rev-list --use-bitmap-index --objects" corner case that uses negative tags as the stopping points.

* ps/pack-bitmap-optim:
  pack-bitmap: avoid traversal of objects referenced by uninteresting tag
2021-03-30 | p2000: add sparse-index repos | Derrick Stolee | 1 | -1/+18
p2000-sparse-operations.sh compares different Git commands in repositories with many files at HEAD but using sparse-checkout to focus on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so we can track how that affects the performance of different commands.

At this point in time, the sparse-index is 100% overhead from the CPU front, and this is measurable in these tests:

    Test
    ---------------------------------------------------------------
    2000.2: git status (full-index-v3)              0.59(0.51+0.12)
    2000.3: git status (full-index-v4)              0.59(0.52+0.11)
    2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
    2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
    2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
    2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
    2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
    2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
    2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
    2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
    2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
    2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
    2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
    2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
    2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
    2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index formats when the sparse-index is enabled. This is primarily due to the fact that the relative file sizes are the same, and the command time is mostly taken up by parsing tree objects to expand the sparse index into a full one.

With the current file layout, the index file sizes are given by this table:

       | full index  | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the index is sparse.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 | t/perf: add performance test for sparse operations | Derrick Stolee | 1 | -0/+84
Create a test script that takes the default performance test (the Git codebase) and multiplies it by 256 using four layers of duplicated trees of width four. This results in nearly one million blob entries in the index. Then, we can clone this repository with sparse-checkout patterns that demonstrate four copies of the initial repository. Each clone will use a different index format or mode so performance can be tested across the different options.

Note that the initial repo is stripped of submodules before doing the copies. This preserves the expected data shape of the sparse index, because directories containing submodules are not collapsed to a sparse directory entry.

Run a few Git commands on these clones, especially those that use the index (status, add, commit).

Here are the results on my Linux machine:

    Test
    --------------------------------------------------------------
    2000.2: git status (full-index-v3)           0.37(0.30+0.09)
    2000.3: git status (full-index-v4)           0.39(0.32+0.10)
    2000.4: git add -A (full-index-v3)           1.42(1.06+0.20)
    2000.5: git add -A (full-index-v4)           1.26(0.98+0.16)
    2000.6: git add . (full-index-v3)            1.40(1.04+0.18)
    2000.7: git add . (full-index-v4)            1.26(0.98+0.17)
    2000.8: git commit -a -m A (full-index-v3)   1.42(1.11+0.16)
    2000.9: git commit -a -m A (full-index-v4)   1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index version 4. This is because the v3 index uses 108 MiB while the v4 index uses 80 MiB. Since the repeated portions of the directories are very short (f3/f1/f2, for example) this ratio is less pronounced than in similarly-sized real repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
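The factor of 256 in the message above follows from four layers of width four (4 to the power of 4). The sketch below only illustrates that counting argument; it is not the layout code of the test script. The directory names f1 through f4 are an assumption suggested by the "f3/f1/f2" example in the message, and `copy_prefixes` is a hypothetical helper.

```python
from itertools import product

WIDTH = 4   # duplicated trees per layer
DEPTH = 4   # number of layers
NAMES = [f"f{i}" for i in range(1, WIDTH + 1)]  # f1..f4, assumed naming


def copy_prefixes():
    """Every directory prefix under which one copy of the base tree would land."""
    return ["/".join(parts) for parts in product(NAMES, repeat=DEPTH)]


prefixes = copy_prefixes()
print(len(prefixes))   # 256, i.e. WIDTH ** DEPTH
print(prefixes[0])     # f1/f1/f1/f1
```

Multiplying the file count of the base repository by 256 is what pushes the index toward the "nearly one million blob entries" mentioned in the message.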
2021-03-24 | Merge branch 'nk/diff-index-fsmonitor' | Junio C Hamano | 1 | -0/+4
"git diff-index" codepath has been taught to trust fsmonitor status to reduce number of lstat() calls.

* nk/diff-index-fsmonitor:
  fsmonitor: add perf test for git diff HEAD
  fsmonitor: add assertion that fsmonitor is valid to check_removed
  fsmonitor: skip lstat deletion check during git diff-index
2021-03-24 | Merge branch 'tb/geometric-repack' | Junio C Hamano | 1 | -3/+33
"git repack" so far has been only capable of repacking everything under the sun into a single pack (or split by size). A cleverer strategy to reduce the cost of repacking a repository has been introduced.

* tb/geometric-repack:
  builtin/pack-objects.c: ignore missing links with --stdin-packs
  builtin/repack.c: reword comment around pack-objects flags
  builtin/repack.c: be more conservative with unsigned overflows
  builtin/repack.c: assign pack split later
  t7703: test --geometric repack with loose objects
  builtin/repack.c: do not repack single packs with --geometric
  builtin/repack.c: add '--geometric' option
  packfile: add kept-pack cache for find_kept_pack_entry()
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  p5303: measure time to repack with keep
  p5303: add missing &&-chains
  builtin/pack-objects.c: add '--stdin-packs' option
  revision: learn '--no-kept-objects'
  packfile: introduce 'find_kept_pack_entry()'
2021-03-22 | Merge branch 'jk/perf-in-worktrees' | Junio C Hamano | 1 | -9/+22
Perf test update to work better in secondary worktrees.

* jk/perf-in-worktrees:
  t/perf: avoid copying worktree files from test repo
  t/perf: handle worktrees as test repos
2021-03-22 | pack-bitmap: avoid traversal of objects referenced by uninteresting tag | Patrick Steinhardt | 1 | -0/+14
When preparing the bitmap walk, we first establish the set of have and want objects by iterating over the set of pending objects: if an object is marked as uninteresting, it's declared as an object we already have, otherwise as an object we want. These two sets are then used to compute which transitively referenced objects we need to obtain.

One special case here are tag objects: when a tag is requested, we resolve it to its first non-tag object and add both the resolved object and the tag itself into either the have or want set. Given that the uninteresting-property always propagates to referenced objects, it is clear that if the tag is uninteresting, so are its children and vice versa. But we fail to propagate the flag, which effectively means that referenced objects will always be interesting except for the case where they have already been marked as uninteresting explicitly.

This mislabeling does not impact correctness: we now have it in our "wants" set, and given that we later do an `AND NOT` of the bitmaps of "wants" and "haves" sets it is clear that the result must be the same. But we now start to needlessly traverse the tag's referenced objects in case it is uninteresting, even though we know that each referenced object will be uninteresting anyway. In the worst case, this can lead to a complete graph walk just to establish that we do not care for any object.

Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag objects and add a benchmark with negative revisions to p5310.

This shows some nice performance benefits, tested with linux.git:

    Test                                                          HEAD~                   HEAD
    ---------------------------------------------------------------------------------------------------------------
    5310.3: repack to disk                                        193.18(181.46+16.42)    194.61(183.41+15.83) +0.7%
    5310.4: simulated clone                                       25.93(24.88+1.05)       25.81(24.73+1.08) -0.5%
    5310.5: simulated fetch                                       2.64(5.30+0.69)         2.59(5.16+0.65) -1.9%
    5310.6: pack to file (bitmap)                                 58.75(57.56+6.30)       58.29(57.61+5.73) -0.8%
    5310.7: rev-list (commits)                                    1.45(1.18+0.26)         1.46(1.22+0.24) +0.7%
    5310.8: rev-list (objects)                                    15.35(14.22+1.13)       15.30(14.23+1.07) -0.3%
    5310.9: rev-list with tag negated via --not --all (objects)   22.49(20.93+1.56)       0.11(0.09+0.01) -99.5%
    5310.10: rev-list with negative tag (objects)                 0.61(0.44+0.16)         0.51(0.35+0.16) -16.4%
    5310.11: rev-list count with blob:none                        12.15(11.19+0.96)       12.18(11.19+0.99) +0.2%
    5310.12: rev-list count with blob:limit=1k                    17.77(15.71+2.06)       17.75(15.63+2.12) -0.1%
    5310.13: rev-list count with tree:0                           1.69(1.31+0.38)         1.68(1.28+0.39) -0.6%
    5310.14: simulated partial clone                              20.14(19.15+0.98)       19.98(18.93+1.05) -0.8%
    5310.16: clone (partial bitmap)                               12.78(13.89+1.07)       12.72(13.99+1.01) -0.5%
    5310.17: pack to file (partial bitmap)                        42.07(45.44+2.72)       41.44(44.66+2.80) -1.5%
    5310.18: rev-list with tree filter (partial bitmap)           0.44(0.29+0.15)         0.46(0.32+0.14) +4.5%

While most benchmarks are probably in the range of noise, the newly added 5310.9 and 5310.10 benchmarks consistently perform better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
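The correctness argument in the message above (the `AND NOT` of "wants" and "haves" gives the same answer whether or not the tag's pointee is mislabeled) can be checked on a toy example. This is a sketch with plain Python integers standing in for bitmaps; the object numbering and the `bitmap` helper are hypothetical and unrelated to git's on-disk bitmap format.

```python
def bitmap(*bits):
    """Build an integer bitmap with the given bit positions set."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    return value


# Hypothetical numbering: an uninteresting tag, the commit it points at,
# an older commit reachable from that commit, and one commit we really want.
TAG, TAGGED, OLDER, WANTED = range(4)

# "haves" is the full closure of the uninteresting tag.
haves = bitmap(TAG, TAGGED, OLDER)

# Correct labeling: only the wanted commit is in "wants".
wants_correct = bitmap(WANTED)
# Mislabeling: the tag's pointee (and what it reaches) also lands in "wants".
wants_mislabeled = bitmap(WANTED, TAGGED, OLDER)

result_correct = wants_correct & ~haves
result_mislabeled = wants_mislabeled & ~haves
print(result_correct == result_mislabeled)   # True
```

The result is identical, but building `wants_mislabeled` is what requires walking the pointee's history, which is the wasted work the commit removes.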
2021-03-18 | fsmonitor: add perf test for git diff HEAD | Nipunn Koorapati | 1 | -0/+4
Update the xargs call so that if your large repo contains symlinks, test-tool chmtime failure does not end the script.

On Linux

    Test                                                          this tree           upstream/master
    ---------------------------------------------------------------------------------------------------------
    7519.4: status (fsmonitor=fsmonitor-watchman)                 0.52(0.43+0.10)     0.53(0.49+0.05) +1.9%
    7519.5: status -uno (fsmonitor=fsmonitor-watchman)            0.21(0.15+0.07)     0.22(0.13+0.09) +4.8%
    7519.6: status -uall (fsmonitor=fsmonitor-watchman)           1.65(0.93+0.71)     1.69(1.03+0.65) +2.4%
    7519.7: status (dirty) (fsmonitor=fsmonitor-watchman)         11.99(11.34+1.58)   11.95(11.02+1.79) -0.3%
    7519.8: diff (fsmonitor=fsmonitor-watchman)                   0.25(0.17+0.26)     0.25(0.18+0.26) +0.0%
    7519.9: diff HEAD (fsmonitor=fsmonitor-watchman)              0.39(0.25+0.34)     0.89(0.35+0.74) +128.2%
    7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.16(0.13+0.04)     0.16(0.12+0.05) +0.0%
    7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
    7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman)     0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
    7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman)    0.16(0.11+0.06)     0.16(0.12+0.05) +0.0%
    7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman)   0.18(0.13+0.06)     0.17(0.10+0.08) -5.6%
    7519.15: add (fsmonitor=fsmonitor-watchman)                   2.25(1.53+0.68)     2.25(1.47+0.74) +0.0%
    7519.18: status (fsmonitor=disabled)                          0.88(0.73+1.03)     0.89(0.67+1.08) +1.1%
    7519.19: status -uno (fsmonitor=disabled)                     0.45(0.43+0.89)     0.45(0.34+0.98) +0.0%
    7519.20: status -uall (fsmonitor=disabled)                    1.88(1.16+1.58)     1.88(1.22+1.51) +0.0%
    7519.21: status (dirty) (fsmonitor=disabled)                  7.53(7.05+2.11)     7.53(6.98+2.04) +0.0%
    7519.22: diff (fsmonitor=disabled)                            0.42(0.37+0.92)     0.42(0.38+0.91) +0.0%
    7519.23: diff HEAD (fsmonitor=disabled)                       0.44(0.41+0.90)     0.44(0.40+0.91) +0.0%
    7519.24: diff -- 0_files (fsmonitor=disabled)                 0.13(0.09+0.05)     0.13(0.09+0.05) +0.0%
    7519.25: diff -- 10_files (fsmonitor=disabled)                0.13(0.10+0.04)     0.13(0.10+0.04) +0.0%
    7519.26: diff -- 100_files (fsmonitor=disabled)               0.13(0.09+0.05)     0.13(0.10+0.04) +0.0%
    7519.27: diff -- 1000_files (fsmonitor=disabled)              0.13(0.09+0.06)     0.13(0.09+0.05) +0.0%
    7519.28: diff -- 10000_files (fsmonitor=disabled)             0.14(0.11+0.05)     0.14(0.10+0.05) +0.0%
    7519.29: add (fsmonitor=disabled)                             2.43(1.61+1.64)     2.43(1.69+1.57) +0.0%

On linux (2.29.2 vs w/ this patch):

    nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 | grep lstat
      0.04    0.000063           3        20         6 lstat
    nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 | grep lstat
     94.98    5.242262          10    523783        13 lstat
    nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 | grep lstat
      0.38    0.000032           5         7         3 lstat
    nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep lstat
     99.44    0.741892           9     81634        10 lstat

On mac (2.29.2 vs w/ this patch):

    nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 | grep "^lstat64 "
    lstat64 8
    nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 | grep "^lstat64 "
    lstat64 120242
    nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 | grep "^lstat64 "
    lstat64 4
    nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep "^lstat64 "
    lstat64 4497

There are still a bunch of lstats - on directories, but not every file. Progress!

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 | Merge branch 'jh/fsmonitor-prework' | Junio C Hamano | 3 | -12/+64
Preliminary changes to fsmonitor integration.

* jh/fsmonitor-prework:
  fsmonitor: refactor initialization of fsmonitor_last_update token
  fsmonitor: allow all entries for a folder to be invalidated
  fsmonitor: log FSMN token when reading and writing the index
  fsmonitor: log invocation of FSMonitor hook to trace2
  read-cache: log the number of scanned files to trace2
  read-cache: log the number of lstat calls to trace2
  preload-index: log the number of lstat calls to trace2
  p7519: add trace logging during perf test
  p7519: move watchman cleanup earlier in the test
  p7519: fix watchman watch-list test on Windows
  p7519: do not rely on "xargs -d" in test
2021-02-26 | t/perf: avoid copying worktree files from test repo | Jeff King | 1 | -1/+1
When running the perf suite, we copy files from an existing $GIT_DIR to a scratch repository to give us a realistic setup on which to operate. Since the perf scripts themselves may modify the scratch repository, we want to make sure we've scrubbed any references back to the original.

One existing example is that we avoid copying the file "commondir" at the top-level of the repository. In a worktree git-dir (e.g., .git/worktrees/foo), that file contains the path to the parent repository; copying it could mean ref updates in the scratch repository affect the original.

But there are other files we should cover, too:

  - "gitdir" in a worktree git-dir contains the path to the actual .git file in the working tree. We _shouldn't_ end up looking at it at all, since the lack of a "commondir" file means Git won't consider this to be a worktree git-dir. But it's best to err on the safe side.

  - in a parent repository that contains worktrees, the "$GIT_DIR/worktrees" directory will contain the git dirs for the individual worktrees, which will themselves contain commondir and gitdir files that may reference the original repository. We should likewise remove them.

Note that this does mean that the perf suite's scratch repositories will never have any worktrees. That's OK; we don't have any perf tests that are influenced by their presence. If we add any, they'd probably want to create the worktrees themselves anyway.

This patch adds both paths to the set of omissions in test_perf_copy_repo_contents(). Note that we won't get confused here by matching arbitrary names like refs/heads/commondir. This list is always matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the actual recursion).

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
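The point about matching only top-level entries can be illustrated outside the shell. The following is a sketch in Python, not the actual test_perf_copy_repo_contents() helper: the `OMIT` set holds only the three names discussed in the message above (whatever else the real helper skips is not modeled), and `copy_git_dir` and `demo` are hypothetical names.

```python
import os
import shutil
import tempfile

# Top-level entries named in the commit message as unsafe to copy.
OMIT = {"commondir", "gitdir", "worktrees"}


def copy_git_dir(src, dst):
    """Copy src into dst, skipping OMIT names at the top level only."""
    os.makedirs(dst, exist_ok=True)
    for name in sorted(os.listdir(src)):
        if name in OMIT:
            continue
        source = os.path.join(src, name)
        target = os.path.join(dst, name)
        if os.path.isdir(source):
            # The recursive copy applies no filtering, like "cp -R".
            shutil.copytree(source, target)
        else:
            shutil.copy2(source, target)


def demo():
    """Build a fake git dir, copy it, and list the files that survived."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "refs", "heads"))
        os.makedirs(os.path.join(src, "worktrees", "foo"))
        for rel in ("HEAD", "commondir", "gitdir", "refs/heads/commondir"):
            with open(os.path.join(src, *rel.split("/")), "w") as fh:
                fh.write("placeholder\n")
        dst = os.path.join(tmp, "dst")
        copy_git_dir(src, dst)
        copied = []
        for root, _dirs, files in os.walk(dst):
            for name in files:
                full = os.path.join(root, name)
                copied.append(os.path.relpath(full, dst).replace(os.sep, "/"))
        return sorted(copied)


print(demo())   # ['HEAD', 'refs/heads/commondir']
```

A branch that happens to be named "commondir" survives the copy because the filter never sees anything below the top level.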
2021-02-26 | t/perf: handle worktrees as test repos | Jeff King | 1 | -9/+22
The perf suite gets confused when test_perf_default_repo is pointed at a worktree (which includes when it is run from within a worktree at all, since the default is to use the current repository). Here's an example:

    $ git worktree add ~/foo
    Preparing worktree (new branch 'foo')
    HEAD is now at 328c109303 The eighth batch
    $ cd ~/foo
    $ make
    [...build output...]
    $ cd t/perf
    $ ./p0000-perf-lib-sanity.sh -v -i
    [...]
    perf 1 - test_perf_default_repo works:
    running:
        foo=$(git rev-parse HEAD) &&
        test_export foo

    fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
    Use '--' to separate paths from revisions, like this:
    'git <command> [<revision>...] -- [<file>...]'

The problem is that we didn't copy all of the necessary files from the source repository (in this case we got HEAD, but we have no refs!). We discover the git-dir with "rev-parse --git-dir", but this points to the worktree's partial repository in .../.git/worktrees/foo.

That partial repository has a "commondir" file which points to the main repository, where the actual refs are stored, but we don't copy it. This is the correct thing to do, though! If we did copy it, then our scratch test repo would be pointing back to the original main repo, and any ref updates we made in the tests would impact that original repo.

Instead, we need to either:

  1. Make a scratch copy of the original main repo (in addition to the worktree repo), and point the scratch worktree repo's commondir at it. This preserves the original relationship, but it's doubtful any script really cares (if they are testing worktree performance, they'd probably make their own worktrees). And it's trickier to get right.

  2. Collapse the main and worktree repos into a single scratch repo. This can be done by copying everything from both, preferring any files from the worktree repo.

This patch does the second one. With this applied, the example above results in p0000 running successfully.

Reported-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
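The "collapse" approach chosen in the message above (copy everything from both git dirs, preferring files from the worktree) amounts to an overlay copy. Here is a minimal sketch of that idea, assuming Python 3.8 or later for `dirs_exist_ok`; the `collapse` function and the fake directory contents are hypothetical, and the omission of "commondir" and similar files is deliberately not modeled.

```python
import os
import shutil
import tempfile


def collapse(main_git_dir, worktree_git_dir, scratch):
    """Copy both git dirs into scratch; worktree files win on conflict."""
    shutil.copytree(main_git_dir, scratch)
    # Second pass overlays the worktree's files on top of the main copy.
    shutil.copytree(worktree_git_dir, scratch, dirs_exist_ok=True)


def write(path, text):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(text)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        main = os.path.join(tmp, "main")
        worktree = os.path.join(tmp, "worktree")
        # The main repo holds the refs; the worktree has its own HEAD only.
        write(os.path.join(main, "HEAD"), "ref: refs/heads/main\n")
        write(os.path.join(main, "refs", "heads", "foo"), "placeholder-oid\n")
        write(os.path.join(worktree, "HEAD"), "ref: refs/heads/foo\n")
        scratch = os.path.join(tmp, "scratch")
        collapse(main, worktree, scratch)
        with open(os.path.join(scratch, "HEAD")) as fh:
            head = fh.read()
        has_ref = os.path.exists(os.path.join(scratch, "refs", "heads", "foo"))
        return head, has_ref


print(demo())   # ('ref: refs/heads/foo\n', True)
```

The scratch copy ends up with the worktree's HEAD and the main repository's refs, which is the combination the failing example was missing.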
2021-02-22 | p5303: measure time to repack with keep | Jeff King | 1 | -2/+32
Add two new tests to measure repack performance. Both tests split the repository into synthetic "pushes", and then leave the remaining objects in a big base pack.

The first new test marks an empty pack as "kept" and then passes --honor-pack-keep to avoid including objects in it. That doesn't change the resulting pack, but it does let us compare to the normal repack case to see how much overhead we add to check whether objects are kept or not.

The other test is of --stdin-packs, which gives us a sense of how that number scales based on the number of packs we provide as input. In each of those tests, the empty pack isn't considered, but the residual pack (objects that were left over and not included in one of the synthetic push packs) is marked as kept.

(Note that in the single-pack case of the --stdin-packs test, there is nothing to do since there are no non-excluded packs).

Here are some timings on a recent clone of the kernel:

    5303.5: repack (1)                          57.26(54.59+10.84)
    5303.6: repack with kept (1)                57.33(54.80+10.51)

in the 50-pack case, things start to slow down:

    5303.11: repack (50)                        71.54(88.57+4.84)
    5303.12: repack with kept (50)              85.12(102.05+4.94)

and by the time we hit 1,000 packs, things are substantially worse, even though the resulting pack produced is the same:

    5303.17: repack (1000)                      216.87(490.79+14.57)
    5303.18: repack with kept (1000)            665.63(938.87+15.76)

That's because the code paths around handling .keep files are known to scale badly; they look in every single pack file to find each object. Our solution to that was to notice that most repos don't have keep files, and to make that case a fast path. But as soon as you add a single .keep, that part of pack-objects slows down again (even if we have fewer objects total to look at).

Likewise, the scaling is pretty extreme on --stdin-packs (but each subsequent test is also being asked to do more work):

    5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
    5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
    5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
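The scaling behavior described in the message above (a fast path when no pack is kept, and a probe of every pack per object as soon as one .keep exists) can be mimicked with a toy model. This is only a sketch of the cost argument, not git's pack-objects code; packs are modeled as hypothetical `(kept, oids)` pairs and a counter tallies pack probes.

```python
def is_in_kept_pack(oid, packs, counter):
    """Return True if oid lives in a pack marked as kept.

    packs is a list of (kept, oids) pairs; counter[0] counts pack probes.
    """
    # Fast path: with no kept packs at all, no pack needs to be probed.
    if not any(kept for kept, _oids in packs):
        return False
    # Slow path: look in every pack for the object.
    for kept, oids in packs:
        counter[0] += 1
        if kept and oid in oids:
            return True
    return False


def count_probes(packs):
    """Total pack probes needed to classify every object in every pack."""
    counter = [0]
    for _kept, oids in packs:
        for oid in oids:
            is_in_kept_pack(oid, packs, counter)
    return counter[0]


def make_packs(num_packs, objs_per_pack):
    """Create unkept packs with distinct, made-up object names."""
    return [(False, {f"p{p}-o{o}" for o in range(objs_per_pack)})
            for p in range(num_packs)]


plain = make_packs(50, 20)
with_keep = plain + [(True, set())]   # one empty pack marked as kept

print(count_probes(plain))       # 0
print(count_probes(with_keep))   # 51000
```

Adding a single empty kept pack changes nothing about the result, yet the probe count goes from zero to objects times packs, which matches the shape of the slowdown in the timings above.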
2021-02-22 | p5303: add missing &&-chains | Jeff King | 1 | -2/+2
These are in a helper function, so the usual chain-lint doesn't notice them. This function is still not perfect, as it has some git invocations on the left-hand-side of the pipe, but its primary purpose is timing, not finding bugs or correctness issues.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 | p7519: add trace logging during perf test | Jeff Hostetler | 3 | -2/+35
Add optional trace logging to allow us to better compare performance of various fsmonitor providers and compare results with non-fsmonitor runs.

Currently, this includes Trace2 logging, but may be extended to include other trace targets, such as GIT_TRACE_FSMONITOR if desired.

Using this logging helped me explain an odd behavior on macOS where the kernel was dropping events and causing the hook to Watchman to time out.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 | p7519: move watchman cleanup earlier in the test | Jeff Hostetler | 1 | -8/+17
Shut down Watchman after the Watchman-based tests and before the block of "no fsmonitor" tests. This helps ensure that Watchman cannot affect the test results for the other block.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 | p7519: fix watchman watch-list test on Windows | Jeff Hostetler | 1 | -1/+1
Only use the final portion of the test trash directory file name when verifying that Watchman was started. On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style path with forward slashes and a "/c/" drive name. However `watchman watch-list` reports a proper Windows-style pathname with drive letters and backslashes. This causes the grep to fail.

Since we don't really care about the full pathname (and we really don't want to bother with normalizing them), just see if the test-name portion of the path is found.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 | p7519: do not rely on "xargs -d" in test | Jeff Hostetler | 1 | -1/+11
Convert the test to use a more portable method to update the mtime on a large number of files under version control.

The Mac version of xargs does not support the "-d" option. Likewise, the "-0" and "--null" options are not portable.

Furthermore, use `test-tool chmtime` rather than `touch` to update the mtime to ensure that it is actually updated (especially on file systems with only whole second resolution).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10Merge branch 'jk/pretty-lazy-load-commit'Junio C Hamano1-1/+1
Some pretty-format specifiers do not need the data in commit object (e.g. "%H"), but we were over-eager to load and parse it, which has been made even lazier. * jk/pretty-lazy-load-commit: pretty: lazy-load commit data when expanding user-format
2021-02-08Merge branch 'jk/p5303-sed-portability-fix' into maintJunio C Hamano1-4/+8
A perf script was made more portable. * jk/p5303-sed-portability-fix: p5303: avoid sed GNU-ism
2021-02-05Merge branch 'jk/p5303-sed-portability-fix'Junio C Hamano1-4/+8
A perf script was made more portable. * jk/p5303-sed-portability-fix: p5303: avoid sed GNU-ism
2021-02-05Merge branch 'nk/perf-fsmonitor-cleanup' into maintJunio C Hamano1-1/+6
Test fix. * nk/perf-fsmonitor-cleanup: p7519: allow running without watchman prereq
2021-01-29p5303: avoid sed GNU-ismJeff King1-4/+8
Using "1~5" isn't portable. Nobody seems to have noticed, since perhaps people don't tend to run the perf suite on more exotic platforms. Still, it's better to set a good example. We can use: perl -ne 'print if $. % 5 == 1' instead. But we can further observe that perl does a good job of the other parts of this pipeline, and fold the whole thing together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28pretty: lazy-load commit data when expanding user-formatJeff King1-1/+1
When we expand a user-format, we try to avoid work that isn't necessary for the output. For instance, we don't bother parsing the commit header until we know we need the author, subject, etc. But we do always load the commit object's contents from disk, even if the format doesn't require it (e.g., just "%H").

Traditionally this didn't matter much, because we'd have loaded it as part of the traversal anyway, and we'd typically have those bytes attached to the commit struct (or these days, cached in a commit-slab). But when we have a commit-graph, we might easily get to the point of pretty-printing a commit without ever having looked at the actual object contents. We should push off that load (and reencoding) until we're certain that it's needed.

I think the results of p4205 show the advantage pretty clearly (we serve parent and tree oids out of the commit struct itself, so they benefit as well):

# using git.git as the test repo
Test                         HEAD^             HEAD
----------------------------------------------------------------------
4205.1: log with %H          0.40(0.39+0.01)   0.03(0.02+0.01) -92.5%
4205.2: log with %h          0.45(0.44+0.01)   0.09(0.09+0.00) -80.0%
4205.3: log with %T          0.40(0.39+0.00)   0.04(0.04+0.00) -90.0%
4205.4: log with %t          0.46(0.46+0.00)   0.09(0.08+0.01) -80.4%
4205.5: log with %P          0.39(0.39+0.00)   0.03(0.03+0.00) -92.3%
4205.6: log with %p          0.46(0.46+0.00)   0.10(0.09+0.00) -78.3%
4205.7: log with %h-%h-%h    0.52(0.51+0.01)   0.15(0.14+0.00) -71.2%
4205.8: log with %an-%ae-%s  0.42(0.41+0.00)   0.42(0.41+0.01) +0.0%

# using linux.git as the test repo
Test                         HEAD^             HEAD
----------------------------------------------------------------------
4205.1: log with %H          7.12(6.97+0.14)   0.76(0.65+0.11) -89.3%
4205.2: log with %h          7.35(7.19+0.16)   1.30(1.19+0.11) -82.3%
4205.3: log with %T          7.58(7.42+0.15)   1.02(0.94+0.08) -86.5%
4205.4: log with %t          8.05(7.89+0.15)   1.55(1.41+0.13) -80.7%
4205.5: log with %P          7.12(7.01+0.10)   0.76(0.69+0.07) -89.3%
4205.6: log with %p          7.38(7.27+0.10)   1.32(1.20+0.12) -82.1%
4205.7: log with %h-%h-%h    7.81(7.67+0.13)   1.79(1.67+0.12) -77.1%
4205.8: log with %an-%ae-%s  7.90(7.74+0.15)   7.81(7.66+0.15) -1.1%

I added the final test to show where we don't improve (the 1% there is just lucky noise), but also as a regression test to make sure we're not doing anything stupid like loading the commit multiple times when there are several placeholders that need it.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15Merge branch 'nk/perf-fsmonitor-cleanup'Junio C Hamano1-1/+6
Test fix. * nk/perf-fsmonitor-cleanup: p7519: allow running without watchman prereq
2021-01-06Merge branch 'es/perf-export-fix'Junio C Hamano1-4/+1
Remove unneeded recursion from a test framework helper function. * es/perf-export-fix: t/perf: avoid unnecessary test_export() recursion
2021-01-06p7519: allow running without watchman prereqTaylor Blau1-1/+6
p7519 measures the performance of the fsmonitor code. To do this, it uses the installed copy of Watchman. If Watchman isn't installed, a noop integration script is installed in its place. When in the latter mode, it is expected that the script should not write a "last update token": in fact, it doesn't write anything at all since the script is blank. Commit 33226af42b (t/perf/fsmonitor: improve error message if typoing hook name, 2020-10-26) made sure that running 'git update-index --fsmonitor' did not write anything to stderr, but this is not the case when using the empty Watchman script, since Git will complain that:

  $ which watchman
  watchman not found
  $ cat .git/hooks/fsmonitor-empty
  $ git -c core.fsmonitor=.git/hooks/fsmonitor-empty update-index --fsmonitor
  warning: Empty last update token.

Prior to 33226af42b, the output wasn't checked at all, which allowed this noop mode to work. But 33226af42b breaks p7519 when running it without a 'watchman(1)' on your system. Handle this by checking that stderr is empty only when running with a real watchman executable. Otherwise, assert that the error message is the expected one when running in the noop mode.

Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
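The two-way check described above could look roughly like this. It is a sketch under assumptions, not the code in p7519: the helper name and its "yes"/other first argument are invented for illustration, and only the warning text is taken from the commit message.

```shell
# Hypothetical helper.
#   $1: "yes" when a real watchman executable is in use, anything else
#       for the no-op hook
#   $2: file holding the captured stderr of "git update-index --fsmonitor"
check_update_index_stderr () {
	if test "$1" = yes
	then
		# Real Watchman: nothing may be written to stderr.
		! test -s "$2"
	else
		# No-op hook: exactly the known warning is expected.
		test "$(cat "$2")" = "warning: Empty last update token."
	fi
}
```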
2020-12-22t/perf: avoid unnecessary test_export() recursionEric Sunshine1-4/+1
test_export() has been self-recursive since its inception even though a simple for-loop would have served just as well to append its arguments to the `test_export_` variable separated by the pipe character "|". Recently `test_export_` was changed instead to a space-separated list of tokens to be exported, an operation which can be accomplished via a single simple assignment, with no need for looping or recursion. Therefore, simplify the implementation. While at it, take advantage of the fact that variable names to be exported are shell identifiers, and thus won't be composed of special characters or whitespace, so a simple `$*` can be used rather than the magical `"$@"`. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-18Merge branch 'es/perf-export-fix'Junio C Hamano1-2/+7
Dev-support fix for BSD. * es/perf-export-fix: t/perf: fix test_export() failure with BSD `sed`
2020-12-16t/perf: fix test_export() failure with BSD `sed`Eric Sunshine1-2/+7
test_perf() runs each test in its own subshell which makes it difficult to persist variables between tests. test_export() addresses this shortcoming by grabbing the values of specified variables after a test runs but before the subshell exits, and writes those values to a file which is loaded into the environment of subsequent tests. To grab the values to be persisted, test_export() pipes the output of the shell's builtin `set` command through `sed` which plucks them out using a regular expression along the lines of `s/^(var1|var2)/.../p`. Unfortunately, though, this use of alternation is not portable. For instance, BSD-lineage `sed` (including macOS `sed`) does not support it in the default "basic regular expression" mode (BRE). It may be possible to enable "extended regular expression" mode (ERE) in some cases with `sed -E`, however, `-E` is neither portable nor part of POSIX. Fortunately, alternation is unnecessary in this case and can easily be avoided, so replace it with a series of simple expressions such as `s/^var1/.../p;s/^var2/.../p`. While at it, tighten the expressions so they match the variable names exactly rather than matching prefixes (i.e. use `s/^var1=/.../p`). If the requirements of test_export() become more complex in the future, then an alternative would be to replace `sed` with `perl` which supports alternation on all platforms, however, the simple elimination of alternation via multiple `sed` expressions suffices for the present. Reported-by: Sangeeta <sangunb09@gmail.com> Diagnosed-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
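The alternation-free approach can be sketched as below. This is an illustration only: `export_lines` is a hypothetical name, the replacement text `export NAME=` stands in for the elided `...` in the commit message, and values spanning several lines are not handled.

```shell
# Hypothetical helper: print one "export NAME=value" line for each named
# variable, using one simple sed expression per name instead of the
# non-portable alternation "s/^(var1|var2)/.../p".
export_lines () {
	script=
	for var in "$@"
	do
		# Anchor on "NAME=" so that FOO does not also match FOOBAR.
		script="$script${script:+;}s/^$var=/export $var=/p"
	done
	set | sed -n "$script"
}
```

The exact quoting of the values depends on how the shell's builtin `set` prints them.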
2020-12-08Merge branch 'nk/perf-fsmonitor-cleanup'Junio C Hamano1-2/+1
Test clean-up. * nk/perf-fsmonitor-cleanup: perf/fsmonitor: use test_must_be_empty helper
2020-12-08Merge branch 'ps/update-ref-multi-transaction'Junio C Hamano1-13/+7
"git update-ref --stdin" learns to take multiple transactions in a single session. * ps/update-ref-multi-transaction: update-ref: disallow "start" for ongoing transactions p1400: use `git-update-ref --stdin` to test multiple transactions update-ref: allow creation of multiple transactions t1400: avoid touching refs on filesystem
2020-11-30perf/fsmonitor: use test_must_be_empty helperNipunn Koorapati1-2/+1
Simplify test and make error messages more clear here. Per feedback from Junio in 33226af42b (t/perf/fsmonitor: improve error message if typoing hook name, 2020-10-26) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16p1400: use `git-update-ref --stdin` to test multiple transactionsPatrick Steinhardt1-13/+7
In commit 0a0fbbe3ff (refs: remove lookup cache for reference-transaction hook, 2020-08-25), a new benchmark was added to p1400 which has the intention to exercise creation of multiple transactions in a single process. As git-update-ref wasn't yet able to create multiple transactions with a single run we instead used git-push. As its non-atomic version creates a transaction per reference update, this was the best approximation we could make at that point in time. Now that `git-update-ref --stdin` supports creation of multiple transactions, let's convert the benchmark to use that instead. It has less overhead and it's also a lot clearer what the actual intention is. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
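To make the idea concrete, here is a sketch of how input for many transactions might be generated. It is not the p1400 code: the function name and the `refs/heads/perf-N` ref names are invented, and it relies on the `start`, `create` and `commit` commands of `git update-ref --stdin`.

```shell
# Hypothetical generator: emit $1 separate transactions, each creating one
# ref that points at object id $2.
gen_transactions () {
	count=$1
	oid=$2
	i=1
	while test "$i" -le "$count"
	do
		printf 'start\ncreate refs/heads/perf-%d %s\ncommit\n' "$i" "$oid"
		i=$((i + 1))
	done
}
```

A run such as `gen_transactions 1000 "$(git rev-parse HEAD)" | git update-ref --stdin` would then create a thousand transactions in a single process.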
2020-10-26t/perf/fsmonitor: add benchmark for dirty statusNipunn Koorapati1-0/+5
This benchmark covers the git status time for a heavily dirty directory, benchmarking fsmonitor's refresh. When running to compare our perl script vs rs-git-fsmonitor, we see that the perl script incurs significant overhead, which is further motivation to provide a faster implementation within git.

7519.7: status (dirty) (fsmonitor=query-watchman)     10.05(7.78+1.56)
7519.20: status (dirty) (fsmonitor=rs-git-fsmonitor)  6.72(4.37+1.64)
7519.33: status (dirty) (fsmonitor=disabled)          5.62(4.24+2.03)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: perf comparison of multiple fsmonitor integrationsNipunn Koorapati1-8/+14
Allows for simple perf comparison of different integrations. I ran it to compare our perl script w/ rs-git-fsmonitor and found 20-30ms of overhead on every command. Output looks like this (extra newlines added for readability):

Test                                                        this tree
---------------------------------------------------------------------------
7519.4: status (fsmonitor=query-watchman)                   0.42(0.37+0.05)
7519.5: status -uno (fsmonitor=query-watchman)              0.19(0.12+0.07)
7519.6: status -uall (fsmonitor=query-watchman)             1.36(0.73+0.62)
7519.7: diff (fsmonitor=query-watchman)                     0.14(0.09+0.05)
7519.8: diff -- 0_files (fsmonitor=query-watchman)          0.14(0.11+0.03)
7519.9: diff -- 10_files (fsmonitor=query-watchman)         0.14(0.10+0.04)
7519.10: diff -- 100_files (fsmonitor=query-watchman)       0.14(0.09+0.05)
7519.11: diff -- 1000_files (fsmonitor=query-watchman)      0.14(0.08+0.06)
7519.12: diff -- 10000_files (fsmonitor=query-watchman)     0.14(0.09+0.05)
7519.13: add (fsmonitor=query-watchman)                     2.04(1.32+0.66)

7519.16: status (fsmonitor=rs-git-fsmonitor)                0.39(0.32+0.08)
7519.17: status -uno (fsmonitor=rs-git-fsmonitor)           0.17(0.11+0.06)
7519.18: status -uall (fsmonitor=rs-git-fsmonitor)          1.33(0.71+0.61)
7519.19: diff (fsmonitor=rs-git-fsmonitor)                  0.11(0.07+0.04)
7519.20: diff -- 0_files (fsmonitor=rs-git-fsmonitor)       0.11(0.09+0.03)
7519.21: diff -- 10_files (fsmonitor=rs-git-fsmonitor)      0.11(0.09+0.03)
7519.22: diff -- 100_files (fsmonitor=rs-git-fsmonitor)     0.11(0.07+0.04)
7519.23: diff -- 1000_files (fsmonitor=rs-git-fsmonitor)    0.11(0.06+0.06)
7519.24: diff -- 10000_files (fsmonitor=rs-git-fsmonitor)   0.11(0.06+0.06)
7519.25: add (fsmonitor=rs-git-fsmonitor)                   2.03(1.28+0.69)

7519.28: status (fsmonitor=disabled)                        0.77(0.59+0.99)
7519.29: status -uno (fsmonitor=disabled)                   0.42(0.33+0.85)
7519.30: status -uall (fsmonitor=disabled)                  1.59(1.02+1.34)
7519.31: diff (fsmonitor=disabled)                          0.35(0.30+0.81)
7519.32: diff -- 0_files (fsmonitor=disabled)               0.11(0.08+0.04)
7519.33: diff -- 10_files (fsmonitor=disabled)              0.11(0.07+0.04)
7519.34: diff -- 100_files (fsmonitor=disabled)             0.11(0.08+0.03)
7519.35: diff -- 1000_files (fsmonitor=disabled)            0.11(0.10+0.02)
7519.36: diff -- 10000_files (fsmonitor=disabled)           0.12(0.07+0.06)
7519.37: add (fsmonitor=disabled)                           2.24(1.48+1.44)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: initialize test with git resetNipunn Koorapati1-2/+6
Previously, the git add from the previous suite run would pollute the numbers in the second run.

Before:

Test                                                         this tree
-----------------------------------------------------------------------------
7519.4: status (fsmonitor=fsmonitor-watchman)                0.40(0.36+0.04)
7519.5: status -uno (fsmonitor=fsmonitor-watchman)           0.19(0.12+0.07)
7519.6: status -uall (fsmonitor=fsmonitor-watchman)          1.36(0.74+0.61)
7519.7: diff (fsmonitor=fsmonitor-watchman)                  0.14(0.10+0.04)
7519.8: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.14(0.10+0.04)
7519.9: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.14(0.09+0.05)
7519.10: diff -- 100_files (fsmonitor=fsmonitor-watchman)    0.14(0.10+0.04)
7519.11: diff -- 1000_files (fsmonitor=fsmonitor-watchman)   0.14(0.08+0.06)
7519.12: diff -- 10000_files (fsmonitor=fsmonitor-watchman)  0.14(0.10+0.04)
7519.13: add (fsmonitor=fsmonitor-watchman)                  2.03(1.28+0.69)
7519.16: status (fsmonitor=disabled)                         0.64(0.49+0.90)
7519.17: status -uno (fsmonitor=disabled)                    1.15(0.92+1.00)
7519.18: status -uall (fsmonitor=disabled)                   2.32(1.46+1.55)
7519.19: diff (fsmonitor=disabled)                           1.44(1.12+1.76)
7519.20: diff -- 0_files (fsmonitor=disabled)                0.11(0.07+0.05)
7519.21: diff -- 10_files (fsmonitor=disabled)               0.11(0.06+0.05)
7519.22: diff -- 100_files (fsmonitor=disabled)              0.11(0.08+0.03)
7519.23: diff -- 1000_files (fsmonitor=disabled)             0.11(0.08+0.04)
7519.24: diff -- 10000_files (fsmonitor=disabled)            0.12(0.06+0.07)
7519.25: add (fsmonitor=disabled)                            2.25(1.47+1.47)

After:

Test                                                         this tree
-----------------------------------------------------------------------------
7519.4: status (fsmonitor=fsmonitor-watchman)                0.41(0.33+0.09)
7519.5: status -uno (fsmonitor=fsmonitor-watchman)           0.20(0.14+0.07)
7519.6: status -uall (fsmonitor=fsmonitor-watchman)          1.37(0.78+0.58)
7519.7: diff (fsmonitor=fsmonitor-watchman)                  0.14(0.10+0.04)
7519.8: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.14(0.08+0.06)
7519.9: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.14(0.09+0.05)
7519.10: diff -- 100_files (fsmonitor=fsmonitor-watchman)    0.14(0.10+0.05)
7519.11: diff -- 1000_files (fsmonitor=fsmonitor-watchman)   0.14(0.11+0.04)
7519.12: diff -- 10000_files (fsmonitor=fsmonitor-watchman)  0.14(0.09+0.05)
7519.13: add (fsmonitor=fsmonitor-watchman)                  2.04(1.27+0.71)
7519.16: status (fsmonitor=disabled)                         0.78(0.59+0.99)
7519.17: status -uno (fsmonitor=disabled)                    0.43(0.32+0.88)
7519.18: status -uall (fsmonitor=disabled)                   1.58(0.96+1.38)
7519.19: diff (fsmonitor=disabled)                           0.36(0.31+0.79)
7519.20: diff -- 0_files (fsmonitor=disabled)                0.11(0.08+0.03)
7519.21: diff -- 10_files (fsmonitor=disabled)               0.11(0.07+0.04)
7519.22: diff -- 100_files (fsmonitor=disabled)              0.11(0.08+0.04)
7519.23: diff -- 1000_files (fsmonitor=disabled)             0.11(0.07+0.05)
7519.24: diff -- 10000_files (fsmonitor=disabled)            0.12(0.08+0.05)
7519.25: add (fsmonitor=disabled)                            2.25(1.48+1.47)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: factor setup for fsmonitor into functionNipunn Koorapati1-2/+6
This prepares for it being called multiple times when testing different hooks Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: silence initial git commitNipunn Koorapati1-1/+1
It is extremely verbose, printing >10K non-useful lines Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: shorten DESC to basenameNipunn Koorapati1-1/+5
The full name is lengthy and makes it hard to read.

Before:
7519.3: status (fsmonitor=/home/nipunn/src/server/.git/hooks/rs-git-fsmonitor) 0.02(0.01+0.00)

After:
7519.3: status (fsmonitor=rs-git-fsmonitor) 0.03(0.02+0.00)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: factor description out for readabilityNipunn Koorapati1-10/+12
There was much duplication here. Prepares for making changes to the description. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: improve error message if typoing hook nameNipunn Koorapati1-1/+3
Previously, the perf suite would silently run without using fsmonitor, because fsmonitor errors are not hard failures. Now it errors loudly.

  GIT_PERF_7519_FSMONITOR="$HOME/rs-git-fsmonitorr" ./p7519-fsmonitor.sh -i -v
  fatal: cannot run /home/nipunn/rs-git-fsmonitorr: No such file or directory
  not ok 2 - setup for fsmonitor

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: move watchman setup to one-time-repo-setupNipunn Koorapati1-7/+9
It is only required to be set up once. This prepares for testing multiple hooks in one invocation. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26t/perf/fsmonitor: separate one time repo initializationNipunn Koorapati1-8/+11
In preparation for testing multiple fsmonitor hooks Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20p7519-fsmonitor: add a git add benchmarkNipunn Koorapati1-0/+4
Test                                                                    v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                1.48(0.79+0.67)   1.48(0.79+0.67) +0.0%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)           0.16(0.11+0.05)   0.17(0.13+0.04) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)          1.36(0.77+0.58)   1.37(0.72+0.63) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                  0.84(0.21+0.63)   0.14(0.11+0.03) -83.3%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.07+0.05)   0.13(0.09+0.04) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.09+0.04)   0.13(0.07+0.06) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.08+0.05)   0.12(0.08+0.05) +0.0%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)    0.12(0.08+0.05)   0.13(0.09+0.04) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)  0.14(0.08+0.06)   0.13(0.07+0.06) -7.1%
7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman)                  2.75(1.41+1.27)   2.03(1.26+0.70) -26.2%
7519.13: status (fsmonitor=)                                            1.38(1.03+1.04)   1.37(1.04+1.04) -0.7%
7519.14: status -uno (fsmonitor=)                                       1.11(0.83+0.98)   1.10(0.89+0.90) -0.9%
7519.15: status -uall (fsmonitor=)                                      2.30(1.57+1.42)   2.31(1.49+1.50) +0.4%
7519.16: diff (fsmonitor=)                                              1.43(1.13+1.76)   1.46(1.19+1.72) +2.1%
7519.17: diff -- 0_files (fsmonitor=)                                   0.10(0.08+0.04)   0.11(0.08+0.04) +10.0%
7519.18: diff -- 10_files (fsmonitor=)                                  0.10(0.07+0.05)   0.11(0.08+0.04) +10.0%
7519.19: diff -- 100_files (fsmonitor=)                                 0.10(0.07+0.04)   0.11(0.07+0.05) +10.0%
7519.20: diff -- 1000_files (fsmonitor=)                                0.10(0.08+0.03)   0.11(0.08+0.04) +10.0%
7519.21: diff -- 10000_files (fsmonitor=)                               0.11(0.08+0.05)   0.12(0.07+0.06) +9.1%
7519.22: add (fsmonitor=)                                               2.26(1.46+1.49)   2.27(1.42+1.55) +0.4%

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20p7519-fsmonitor: refactor to avoid code duplicationNipunn Koorapati1-99/+37
Much of the benchmark code is redundant. This is easier to understand and edit. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20perf lint: add make test-lint to perf testsNipunn Koorapati2-4/+7
Perf tests have not been linted for some time, and some uses of seq instead of test_seq have crept in. This runs the existing lints on the perf tests as well. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20t/perf: add fsmonitor perf test for git diffNipunn Koorapati1-0/+71
Results for the git-diff fsmonitor optimization in the patch in the parent rev (using a 400k file repo to test). As you can see here, git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)!

GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh

Test                                                                    v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)           0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)          1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                  0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)    0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)  0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%
7519.12: status (fsmonitor=)                                            1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
7519.13: status -uno (fsmonitor=)                                       0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
7519.14: status -uall (fsmonitor=)                                      1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
7519.15: diff (fsmonitor=)                                              0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
7519.16: diff -- 0_files (fsmonitor=)                                   0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
7519.17: diff -- 10_files (fsmonitor=)                                  0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
7519.18: diff -- 100_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
7519.19: diff -- 1000_files (fsmonitor=)                                0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
7519.20: diff -- 10000_files (fsmonitor=)                               0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%

I also added a benchmark for a tiny git diff workload w/ a pathspec. I see an overhead of approximately .02 seconds added w/ and w/o fsmonitor.

From looking at these results, I suspected that refresh_fsmonitor is already happening during git diff, independent of this patch series' optimization. Confirmed that suspicion by breaking on refresh_fsmonitor.

(gdb) bt [simplified]
0 refresh_fsmonitor at fsmonitor.c:176
1 ie_match_stat at read-cache.c:375
2 match_stat_with_submodule at diff-lib.c:237
4 builtin_diff_files at builtin/diff.c:260
5 cmd_diff at builtin/diff.c:541
6 run_builtin at git.c:450
7 handle_builtin at git.c:700
8 run_argv at git.c:767
9 cmd_main at git.c:898
10 main at common-main.c:52

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20t/perf/p7519-fsmonitor.sh: warm cache on first git statusNipunn Koorapati1-1/+2
The first git status would be inflated due to warming of filesystem cache. This makes the results comparable.

Before:

Test                                                            this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)        2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)   0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)  1.36(0.73+0.62)
7519.7: status (fsmonitor=)                                     0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                               1.53(0.93+1.32)

After:

Test                                                            this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)        0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)   0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)  1.34(0.77+0.56)
7519.7: status (fsmonitor=)                                     0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                               1.55(1.01+1.25)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20t/perf/README: elaborate on output formatNipunn Koorapati1-0/+2
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22Merge branch 'jk/dont-count-existing-objects-twice'Junio C Hamano1-0/+4
There is logic to estimate how many objects are in the repository, which is meant to run once per process invocation, but it ran every time the estimated value was requested. * jk/dont-count-existing-objects-twice: packfile: actually set approximate_object_count_valid
2020-09-17packfile: actually set approximate_object_count_validJeff King1-0/+4
The approximate_object_count() function tries to compute the count only once per process. But ever since it was introduced in 8e3f52d778 (find_unique_abbrev: move logic out of get_short_sha1(), 2016-10-03), we failed to actually set the "valid" flag, meaning we'd compute it fresh on every call.

This turns out not to be _too_ bad, because we're only iterating through the packed_git list, and not making any system calls. But since it may get called for every abbreviated hash we output, even this can add up if you have many packs.

Here are before-and-after timings for a new perf test which just asks rev-list to abbreviate each commit hash (the test repo is linux.git, with commit-graphs):

Test                            origin              HEAD
----------------------------------------------------------------------------
5303.3: rev-list (1)            28.91(28.46+0.44)   29.03(28.65+0.38) +0.4%
5303.4: abbrev-commit (1)       1.18(1.06+0.11)     1.17(1.02+0.14) -0.8%
5303.7: rev-list (50)           28.95(28.56+0.38)   29.50(29.17+0.32) +1.9%
5303.8: abbrev-commit (50)      3.67(3.56+0.10)     3.57(3.42+0.15) -2.7%
5303.11: rev-list (1000)        30.34(29.89+0.43)   30.82(30.35+0.46) +1.6%
5303.12: abbrev-commit (1000)   86.82(86.52+0.29)   77.82(77.59+0.22) -10.4%
5303.15: load 10,000 packs      0.08(0.02+0.05)     0.08(0.02+0.06) +0.0%

It doesn't help at all when we have 1 pack (5303.4), but we get a 10% speedup when there are 1000 packs (5303.12). That's a modest speedup for a case that's already slow and we'd hope to avoid in general (note how slow it is even after, because we have to look in each of those packs for abbreviations). But it's a one-line change that clearly matches the original intent, so it seems worth doing.

The included perf test may also be useful for keeping an eye on any regressions in the overall abbreviation code.

Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-31Merge branch 'ps/ref-transaction-hook'Junio C Hamano1-3/+10
Code simplification by removing ineffective optimization. * ps/ref-transaction-hook: refs: remove lookup cache for reference-transaction hook
2020-08-25refs: remove lookup cache for reference-transaction hookPatrick Steinhardt1-3/+10
When adding the reference-transaction hook, there were concerns about the performance impact it may have on setups which do not make use of the new hook at all. After all, it gets executed every time a reftx is prepared, committed or aborted, which linearly scales with the number of reference-transactions created per session. And as there are code paths like `git push` which create a new transaction for each reference to be updated, this may translate to calling `find_hook()` quite a lot.

To address this concern, a cache was added with the intention to not repeatedly do negative hook lookups. Turns out this cache caused a regression, which was fixed via e5256c82e5 (refs: fix interleaving hook calls with reference-transaction hook, 2020-08-07). In the process of discussing the fix, we realized that the cache doesn't really help even in the negative-lookup case. While performance tests added to benchmark this did show a slight improvement in the 1% range, this really doesn't warrant having a cache. Furthermore, it's quite flaky, too. E.g. running it twice in succession produces the following results:

Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.79(2.16+0.74)   2.73(2.12+0.71) -2.2%
1400.3: update-ref --stdin   0.22(0.08+0.14)   0.21(0.08+0.12) -4.5%

Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.70(2.09+0.72)   2.74(2.13+0.71) +1.5%
1400.3: update-ref --stdin   0.21(0.10+0.10)   0.21(0.08+0.13) +0.0%

One case notably absent from those benchmarks is a single executable searching for the hook hundreds of times, which is exactly the case for which the negative cache was added. p1400.2 will spawn a new update-ref for each transaction and p1400.3 only has a single reference-transaction for all reference updates. So this commit adds a third benchmark, which performs a non-atomic push of a thousand references. This will create a new reference transaction per reference. But even for this case, the negative cache doesn't consistently improve performance:

Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.4: nonatomic push       6.63(6.50+0.13)   6.81(6.67+0.14) +2.7%
1400.4: nonatomic push       6.35(6.21+0.14)   6.39(6.23+0.16) +0.6%
1400.4: nonatomic push       6.43(6.31+0.13)   6.42(6.28+0.15) -0.2%

So let's just remove the cache altogether to simplify the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-21p5302: count up to online-cpus for thread testsJeff King1-23/+24
When PERF_EXTRA is enabled, p5302 checks the performance of index-pack with various numbers of threads. This can be useful for deciding what the default should be (which is currently capped at 3 threads based on the results of this script). However, we only go up to 8 threads, and modern machines may have more. Let's get the number of CPUs from test-tool, and test various numbers of threads between one and that maximum. Note that the current tests aren't all identical, as we have to set GIT_FORCE_THREADS for the --threads=1 test (which measures the overhead of starting a single worker thread versus the "0" case of using the main thread). To keep the loop simple, we'll keep the "0" case out of it, and set GIT_FORCE_THREADS=1 for all of the other cases (it's a noop for all but the "1" case, since numbers higher than 1 would always need threads). Note also that we could skip running "test-tool" if PERF_EXTRA isn't set. However, there's some small value in knowing the number of threads, so that we can mark each test as skipped in the output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
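One way to pick "various numbers of threads between one and that maximum" is to step through powers of two. That choice, and the helper name `thread_counts`, are assumptions made for this sketch; the commit message does not say which counts p5302 actually uses.

```shell
# Hypothetical helper: print the powers of two from 1 up to the CPU count
# given as $1, separated by spaces.
thread_counts () {
	max=$1
	n=1
	list=
	while test "$n" -le "$max"
	do
		list="$list${list:+ }$n"
		n=$((n * 2))
	done
	printf '%s\n' "$list"
}
```

A loop such as `for t in $(thread_counts "$cpus")` could then define one test per thread count, with `$cpus` taken from test-tool inside the test suite.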
2020-08-21 | p5302: disable thread-count parameter tests by default | Jeff King | 3 | -5/+16
The primary function of the perf suite is to detect regressions (or improvements) between versions of Git. The only numbers we show a direct comparison for are timings between the same test run on two different versions. However, it can sometimes be used to collect other information. For instance, p5302 runs the same index-pack operation with different thread counts. The output doesn't directly compare these, but anybody interested in working on index-pack can manually compare the results. For a normal regression run of the full perf-suite, though, this incurs a significant cost to generate numbers nobody will actually look at; about 25% of the total time of the test suite is spent in p5302. And the low-thread-count runs are the most expensive part of it, since they're (unsurprisingly) not using as many threads. Let's skip these tests by default, but make it possible for people working on index-pack to still run them by setting an environment variable. Rather than make this specific to p5302, let's introduce a generic mechanism. This makes it possible to run the full suite with every possible test if somebody really wants to burn some CPU. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-19 | refs: implement reference transaction hook | Patrick Steinhardt | 1 | -0/+32
The low-level reference transactions used to update references are currently completely opaque to the user. While certainly desirable in most usecases, there are some which might want to hook into the transaction to observe all queued reference updates as well as observing the abortion or commit of a prepared transaction.

One such usecase would be to have a set of replicas of a given Git repository, where we perform Git operations on all of the repositories at once and expect the outcome to be the same in all of them. While there exist hooks already for a certain subset of Git commands that could be used to implement a voting mechanism for this, many others currently don't have any mechanism for this.

The above scenario is the motivation for the new "reference-transaction" hook that reaches directly into Git's reference transaction mechanism. The hook receives as parameter the current state the transaction was moved to ("prepared", "committed" or "aborted") and gets via its standard input all queued reference updates. While the exit code gets ignored in the "committed" and "aborted" states, a non-zero exit code in the "prepared" state will cause the transaction to be aborted prematurely.

Given the usecase described above, a voting mechanism can now be implemented via this hook: as soon as it gets called, it will take all of stdin and use it to cast a vote to a central service. When all replicas of the repository agree, the hook will exit with zero, otherwise it will abort the transaction by returning non-zero. The most important upside is that this will catch _all_ commands writing references at once, allowing to implement strong consistency for reference updates via a single mechanism.

In order to test the impact on the case where we don't have any "reference-transaction" hook installed in the repository, this commit introduces two new performance tests for git-update-ref(1).
Run against an empty repository, it produces the following results:

    Test                         origin/master     HEAD
    --------------------------------------------------------------------
    1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
    1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%

The performance test p1400.2 creates, updates and deletes a branch a thousand times, thus averaging runtime of git-update-ref over 3000 invocations. p1400.3 instead calls `git update-ref --stdin` three times and queues a thousand creations, updates and deletes respectively.

As expected, p1400.3 consistently shows no noticeable impact, as for each batch of updates there's a single call to access(3P) for the negative hook lookup. On the other hand, for p1400.2, one can see an impact caused by this patchset. But doing five runs of the performance tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead ranged from -1.5% to +1.1%. These inconsistent performance numbers can be explained by the overhead of spawning 3000 processes. This shows that the overhead of assembling the hook path and executing access(3P) once to check if it's there is mostly outweighed by the operating system's overhead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
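The voting scheme described in this commit message could be sketched roughly as follows. This is an illustration only: it assumes each stdin line has the form `<old-value> <new-value> <ref-name>`, and the `vote` callback stands in for a hypothetical central service that the commit message does not specify.

```python
from typing import Callable, Iterable, List, Tuple

Update = Tuple[str, str, str]  # (old value, new value, reference name)


def parse_updates(lines: Iterable[str]) -> List[Update]:
    """Parse queued reference updates, one per line (assumed format)."""
    updates = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        old, new, ref = line.split(" ", 2)
        updates.append((old, new, ref))
    return updates


def run_hook(state: str, lines: Iterable[str],
             vote: Callable[[List[Update]], bool]) -> int:
    """Return the exit code the hook would use for the given transaction state."""
    updates = parse_updates(lines)
    if state == "prepared":
        # Only in this state does a non-zero exit code abort the transaction.
        return 0 if vote(updates) else 1
    # In the "committed" and "aborted" states the exit code is ignored.
    return 0

# A real hook script would end with something like:
#   sys.exit(run_hook(sys.argv[1], sys.stdin, ask_central_service))
# where ask_central_service is a hypothetical function supplied by the deployment.
```

Keeping the decision in a callback makes the state handling testable without any network service.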
2020-05-04 | pack-bitmap: pass object filter to fill-in traversal | Jeff King | 1 | -0/+5
Sometimes a bitmap traversal still has to walk some commits manually, because those commits aren't included in the bitmap packfile (e.g., due to a push or commit since the last full repack). If we're given an object filter, we don't pass it down to this traversal. It's not necessary for correctness because the bitmap code has its own filters to post-process the bitmap result (which it must, to filter out the objects that _are_ mentioned in the bitmapped packfile).

And with blob filters, there was no performance reason to pass along those filters, either. The fill-in traversal could omit them from the result, but it wouldn't save us any time to do so, since we'd still have to walk each tree entry to see if it's a blob or not.

But now that we support tree filters, there's opportunity for savings. A tree:depth=0 filter means we can avoid accessing trees entirely, since we know we won't need them (or any of the subtrees or blobs they point to).

The new test in p5310 shows this off (the "partial bitmap" state is one where HEAD~100 and its ancestors are all in a bitmapped pack, but HEAD~100..HEAD are not). Here are the results (run against linux.git):

    Test                                                  HEAD^             HEAD
    -------------------------------------------------------------------------------------------------
    [...]
    5310.16: rev-list with tree filter (partial bitmap)   0.19(0.17+0.02)   0.03(0.02+0.01) -84.2%

The absolute number of savings isn't _huge_, but keep in mind that we only omitted 100 first-parent links (in the version of linux.git here, that's 894 actual commits). In a more pathological case, we might have a much larger proportion of non-bitmapped commits. I didn't bother creating such a case in the perf script because the setup is expensive, and this is plenty to show the savings as a percentage.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-04 | pack-bitmap.c: support 'tree:0' filtering | Taylor Blau | 1 | -0/+5
In the previous patch, we made it easy to define other filters that exclude all objects of a certain type. Use that in order to implement bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is equal to 0.

The general case is not helped by bitmaps, since for values of 'n > 0', the object filtering machinery requires a full-blown tree traversal in order to determine the depth of a given tree. Caching this is non-obvious, too, since the same tree object can have a different depth depending on the context (e.g., a tree was moved up in the directory hierarchy between two commits).

But, the 'n = 0' case can be helped, and this patch does so. Running p5310.11 in this tree and on master with the kernel, we can see that this case is helped substantially:

    Test                                  master              this tree
    --------------------------------------------------------------------------------
    5310.11: rev-list count with tree:0   10.68(10.39+0.27)   0.06(0.04+0.01) -99.4%

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-28 | Merge branch 'jk/fast-import-use-hashmap' | Junio C Hamano | 1 | -0/+23
The custom hash function used by "git fast-import" has been replaced with the one from hashmap.c, which gave us a nice performance boost. * jk/fast-import-use-hashmap: fast-import: replace custom hash with hashmap.c
2020-04-06 | fast-import: replace custom hash with hashmap.c | Jeff King | 1 | -0/+23
We use a custom hash in fast-import to store the set of objects we've imported so far. It has a fixed set of 2^16 buckets and chains any collisions with a linked list. As the number of objects grows larger than that, the load factor increases and we degrade to O(n) lookups and O(n^2) insertions.

We can scale better by using our hashmap.c implementation, which will resize the bucket count as we grow. This does incur an extra memory cost of 8 bytes per object, as hashmap stores the integer hash value for each entry in its hashmap_entry struct (which we really don't care about here, because we're just reusing the embedded object hash). But I think the numbers below justify this (and our per-object memory cost is already much higher).

I also looked at using khash, but it seemed to perform slightly worse than hashmap at all sizes, and worse even than the existing code for small sizes. It's also awkward to use here, because we want to look up a "struct object_entry" from a "struct object_id", and it doesn't handle mismatched keys as well. Making a mapping of object_id to object_entry would be more natural, but that would require pulling the embedded oid out of the object_entry or incurring an extra 32 bytes per object.

In a synthetic test creating as many cheap, tiny objects as possible

    perl -e '
        my $bits = shift;
        my $nr = 2**$bits;

        for (my $i = 0; $i < $nr; $i++) {
            print "blob\n";
            print "data 4\n";
            print pack("N", $i);
        }
    ' $bits | git fast-import

I got these results:

    nr_objects   master       khash        hashmap
    2^20         0m4.317s     0m5.109s     0m3.890s
    2^21         0m10.204s    0m9.702s     0m7.933s
    2^22         0m27.159s    0m17.911s    0m16.751s
    2^23         1m19.038s    0m35.080s    0m31.963s
    2^24         4m18.766s    1m10.233s    1m6.793s

which points to hashmap as the winner. We didn't have any perf tests for fast-export or fast-import, so I added one as a more real-world case. It uses an export without blobs since that's significantly cheaper than a full one, but still is an interesting case people might use (e.g., for rewriting history).
It will emphasize this change in some ways (as a percentage we spend more time making objects and less shuffling blob bytes around) and less in others (the total object count is lower).

Here are the results for linux.git:

    Test                        HEAD^                 HEAD
    ----------------------------------------------------------------------------
    9300.1: export (no-blobs)   67.64(66.96+0.67)     67.81(67.06+0.75) +0.3%
    9300.2: import (no-blobs)   284.04(283.34+0.69)   198.09(196.01+0.92) -30.3%

It only has ~5.2M commits and trees, so this is a larger effect than I expected (the 2^23 case above only improved by 50s or so, but here we gained almost 90s). This is probably due to actually performing more object lookups in a real import with trees and commits, as opposed to just dumping a bunch of blobs into a pack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
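The scaling argument in this commit message comes down to the load factor. The sketch below works through the arithmetic for a table fixed at 2^16 buckets and contrasts it with a table that doubles its bucket count. The growth policy (doubling once the table is more than 80% full, starting from 64 buckets) is an illustrative assumption and is not claimed to match hashmap.c.

```python
FIXED_BUCKETS = 2 ** 16  # bucket count of the old custom hash, per the commit message


def average_chain_length(nr_objects: int, buckets: int = FIXED_BUCKETS) -> float:
    """Average number of entries per bucket when collisions are chained."""
    return nr_objects / buckets


def resized_bucket_count(nr_objects: int, initial: int = 64) -> int:
    """Bucket count after growing by doubling whenever load exceeds 4/5 (assumed policy)."""
    buckets = initial
    while nr_objects * 5 > buckets * 4:
        buckets *= 2
    return buckets


# Average chain lengths for the object counts used in the synthetic test.
fixed_chains = {bits: average_chain_length(2 ** bits) for bits in (20, 21, 22, 23, 24)}
resized_chain_at_2_24 = average_chain_length(2 ** 24, resized_bucket_count(2 ** 24))
```

With a fixed table, 2^24 objects mean an average chain of 256 entries per bucket, so every lookup walks a long list; with resizing the average stays below one entry per bucket.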
2020-03-27 | p5310: stop timing non-bitmap pack-to-disk | Jeff King | 1 | -4/+0
Commit 645c432d61 (pack-objects: use reachability bitmap index when generating non-stdout pack, 2016-09-10) added two timing tests for packing to an on-disk file, both with and without bitmaps. However, the non-bitmap one isn't interesting to have as part of p5310's regression suite. It _could_ be used as a baseline to show off the improvement in the bitmap case, but: - the point of the t/perf suite is to find performance regressions, and it won't help with that. We don't compare the numbers between two tests (which the perf suite has no idea are even related), and any change in its numbers would have nothing to do with bitmaps. - it did show off the improvement in the commit message of 645c432d61, but it wasn't even necessary there. The bitmap case already shows an improvement (because before the patch, it behaved the same as the non-bitmap case), and the perf suite is even able to show the difference between the before and after measurements. On top of that, it's one of the most expensive tests in the suite, clocking in around 60s for linux.git on my machine (as compared to 16s for the bitmapped version). And by default when using "./run", we'd run it three times! So let's just drop it. It's not useful and is adding minutes to perf runs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14 | pack-objects: support filters with bitmaps | Jeff King | 1 | -0/+4
Just as rev-list recently learned to combine filters and bitmaps, let's do the same for pack-objects. The infrastructure is all there; we just need to pass along our filter options, and the pack-bitmap code will decide to use bitmaps or not.

This unsurprisingly makes things faster for partial clones of large repositories (here we're cloning linux.git):

    Test                               HEAD^               HEAD
    ------------------------------------------------------------------------------
    5310.11: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14 | pack-bitmap: implement BLOB_LIMIT filtering | Jeff King | 1 | -0/+5
Just as the previous commit implemented BLOB_NONE, we can support BLOB_LIMIT filters by looking at the sizes of any blobs in the result and unsetting their bits as appropriate. This is slightly more expensive than BLOB_NONE, but still produces a noticeable speedup (these results are on git.git):

    Test                                         HEAD~2            HEAD
    ------------------------------------------------------------------------------------
    5310.9: rev-list count with blob:none        1.80(1.77+0.02)   0.22(0.20+0.02) -87.8%
    5310.10: rev-list count with blob:limit=1k   1.99(1.96+0.03)   0.29(0.25+0.03) -85.4%

The implementation is similar to the BLOB_NONE one, with the exception that we have to go object-by-object while walking the blob-type bitmap (since we can't mask out the matches, but must look up the size individually for each blob). The trick with using ctz64() is taken from show_objects_for_type(), which likewise needs to find individual bits (but wants to quickly skip over big chunks without blobs).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
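The object-by-object walk with a count-trailing-zeros step might look like the sketch below. It is a toy model, not the C implementation: bitmaps are plain lists of 64-bit words, `sizes` is a hypothetical lookup from bit position to blob size, and the filter is assumed to drop blobs whose size is at least the limit.

```python
from typing import Dict, List

WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1


def ctz64(word: int) -> int:
    """Count trailing zero bits of a non-zero 64-bit word."""
    return (word & -word).bit_length() - 1


def filter_blobs_by_size(result: List[int], blobs: List[int],
                         sizes: Dict[int, int], limit: int) -> List[int]:
    """Clear the result bit of every blob whose size is at least `limit`."""
    out = list(result)
    for i, (res_word, blob_word) in enumerate(zip(result, blobs)):
        word = res_word & blob_word      # blobs present in the result
        while word:                      # all-zero words are skipped at once
            offset = ctz64(word)         # position of the lowest set bit
            word &= word - 1             # clear that bit and continue
            position = i * WORD_BITS + offset
            if sizes[position] >= limit:
                out[i] &= ~(1 << offset) & WORD_MASK
    return out
```

Words that contain no blobs cost one comparison each, which is the "quickly skip over big chunks" property the commit message mentions.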
2020-02-14 | pack-bitmap: implement BLOB_NONE filtering | Jeff King | 1 | -0/+5
We can easily support BLOB_NONE filters with bitmaps. Since we know the types of all of the objects, we just need to clear the result bits of any blobs.

Note two subtleties in the implementation (which I also called out in comments):

  - we have to include any blobs that were specifically asked for (and not reached through graph traversal) to match the non-bitmap version

  - we have to handle in-pack and "ext_index" objects separately. Arguably prepare_bitmap_walk() could be adding these ext_index objects to the type bitmaps. But it doesn't for now, so let's match the rest of the bitmap code here (it probably wouldn't be an efficiency improvement to do so since the cost of extending those bitmaps is about the same as our loop here, but it might make the code a bit simpler).

Here are perf results for the new test on git.git:

    Test                                    HEAD^             HEAD
    --------------------------------------------------------------------------------
    5310.9: rev-list count with blob:none   1.67(1.62+0.05)   0.22(0.21+0.02) -86.8%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
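The core of the blob:none filter, including the first subtlety noted in this commit message, reduces to a couple of bitwise operations. The following is a minimal sketch using Python integers as bitmaps; the separate handling of "ext_index" objects is deliberately left out, and all names are illustrative.

```python
def filter_blob_none(result: int, blob_type: int, explicitly_requested: int) -> int:
    """Clear blob bits from `result`, keeping blobs that were asked for by name."""
    to_clear = blob_type & ~explicitly_requested
    return result & ~to_clear


# Bits 2 and 3 are blobs; the blob at bit 2 was named explicitly on the command line.
filtered = filter_blob_none(result=0b11111, blob_type=0b01100,
                            explicitly_requested=0b00100)
```

Because the types of all objects are already known from the type bitmap, no object has to be opened to apply the filter.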
2020-02-14 | rev-list: allow commit-only bitmap traversals | Jeff King | 1 | -0/+8
Ever since we added reachability bitmap support, we've been able to use it with rev-list to get the full list of objects, like:

    git rev-list --objects --use-bitmap-index --all

But you can't do so without --objects, since we weren't ready to just show the commits. However, the internals of the bitmap code are mostly ready for this: they avoid opening up trees when walking to fill in the bitmaps. We just need to actually pass in the rev_info to traverse_bitmap_commit_list() so it knows which types to bother triggering our callback for.

For completeness, the perf test now covers both the existing --objects case, as well as the new commits-only behavior (the objects one got way faster when we introduced bitmaps, but obviously isn't improved now).

Here are numbers for linux.git:

    Test                         HEAD^             HEAD
    ------------------------------------------------------------------------
    5310.7: rev-list (commits)   8.29(8.10+0.19)   1.76(1.72+0.04) -78.8%
    5310.8: rev-list (objects)   8.06(7.94+0.12)   8.14(7.94+0.13) +1.0%

That run was cheating a little, as I didn't have any commit-graph in the repository, and we'd build it by default these days when running git-gc. Here are numbers with a commit-graph:

    Test                         HEAD^             HEAD
    ------------------------------------------------------------------------
    5310.7: rev-list (commits)   0.70(0.58+0.12)   0.51(0.46+0.04) -27.1%
    5310.8: rev-list (objects)   6.20(6.09+0.10)   6.27(6.16+0.11) +1.1%

Still an improvement, but a lot less impressive. We could have the perf script remove any commit-graph to show the out-sized effect, but it probably makes sense to leave it in what would be a more typical setup.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-16 | Merge branch 'cs/store-packfiles-in-hashmap' | Junio C Hamano | 1 | -0/+18
In a repository with many packfiles, the cost of the procedure that avoids registering the same packfile twice was unnecessarily high by using an inefficient search algorithm, which has been corrected. * cs/store-packfiles-in-hashmap: packfile.c: speed up loading lots of packfiles
2019-12-10 | Merge branch 'jk/perf-wo-git-dot-pm' | Junio C Hamano | 1 | -2/+7
Test cleanup. * jk/perf-wo-git-dot-pm: t/perf: don't depend on Git.pm
2019-12-06 | Merge branch 'tg/perf-remove-stale-result' | Junio C Hamano | 2 | -11/+5
PerfTest fix to avoid stale result mixed up with the latest round of test results. * tg/perf-remove-stale-result: perf-lib: use a single filename for all measurement types
2019-12-03 | packfile.c: speed up loading lots of packfiles | Colin Stolley | 1 | -0/+18
When loading packfiles on start-up, we traverse the internal packfile list once per file to avoid reloading packfiles that have already been loaded. This check runs in quadratic time, so for poorly maintained repos with a large number of packfiles, it can be pretty slow.

Add a hashmap containing the packfile names as we load them so that the average runtime cost of checking for already-loaded packs becomes constant.

Add a perf test to p5303 to show speed-up.

The existing p5303 test runtimes are dominated by other factors and do not show an appreciable speed-up. The new test in p5303 clearly exposes a speed-up in bad cases. In this test we create 10,000 packfiles and measure the start-up time of git rev-parse, which does little else besides load in the packs.

Here are the numbers for the new p5303 test:

    Test                         HEAD^             HEAD
    ---------------------------------------------------------------------
    5303.12: load 10,000 packs   1.03(0.92+0.10)   0.12(0.02+0.09) -88.3%

Signed-off-by: Colin Stolley <cstolley@runbox.com>
Helped-by: Jeff King <peff@peff.net>
[jc: squashed the change to call hashmap in install_packed_git() by peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
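To see why the list walk is quadratic and the hashmap is not, one can count the work each approach does. This sketch is a toy model with made-up pack names; it counts name comparisons for a list scan and lookups for a hash-based set.

```python
def linear_scan_comparisons(n_packs: int) -> int:
    """Count name comparisons when each new pack is checked against a list."""
    loaded = []
    comparisons = 0
    for i in range(n_packs):
        name = f"pack-{i:05d}.pack"  # hypothetical pack name
        already_loaded = False
        for other in loaded:
            comparisons += 1
            if other == name:
                already_loaded = True
                break
        if not already_loaded:
            loaded.append(name)
    return comparisons


def hashed_lookups(n_packs: int) -> int:
    """Count lookups when loaded pack names are kept in a hash-based set."""
    loaded = set()
    lookups = 0
    for i in range(n_packs):
        name = f"pack-{i:05d}.pack"
        lookups += 1
        if name not in loaded:
            loaded.add(name)
    return lookups
```

For n distinct packs the list scan performs n(n-1)/2 comparisons, which for the 10,000 packs in the new test is 49,995,000, while the set needs only 10,000 lookups.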
2019-12-01 | Merge branch 'jk/optim-in-pack-idx-conversion' | Junio C Hamano | 1 | -0/+1
Code clean-up. * jk/optim-in-pack-idx-conversion: pack-objects: avoid pointless oe_map_new_pack() calls
2019-11-27 | t/perf: don't depend on Git.pm | Jeff King | 1 | -2/+7
The perf suite's aggregate.perl depends on Git.pm, which is a mild annoyance if you've built git with NO_PERL. It turns out that the only thing we use it for is a single call of the command_oneline() helper. We can just replace this with backticks or similar. Annoyingly, perl has no backtick equivalent that avoids a shell eval, which means our $arg would require quoting. This probably doesn't matter for our purposes, but it's better to be safe and model good style. So we'll just provide a short helper around open(), which takes its arguments as a list. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
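The idea of passing arguments as a list, so that no shell ever re-parses them, carries over to other languages. The commit implements its helper in Perl around open(); the following is only an analogous sketch in Python, and `first_line_of` is a name invented for this illustration.

```python
import subprocess
import sys


def first_line_of(*argv: str) -> str:
    """Run a command given as an argument list and return its first output line."""
    completed = subprocess.run(list(argv), check=True,
                               capture_output=True, text=True)
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""


# The argument contains shell metacharacters, but no shell is involved,
# so it reaches the child process unchanged and needs no quoting.
echoed = first_line_of(sys.executable, "-c", "print('$HOME; echo hi')")
```

Since the arguments never pass through a shell, quoting mistakes cannot turn data into commands.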
2019-11-27 | perf-lib: use a single filename for all measurement types | Jeff King | 2 | -11/+5
The perf tests write files recording the results of tests. These results are later aggregated by 'aggregate.perl'. If the tests are run multiple times, those results are overwritten by the new results. This works just fine as long as there are only perf tests measuring the times, whose results are stored in "$base".times files.

However 22bec79d1a ("t/perf: add infrastructure for measuring sizes", 2018-08-17) introduced a new type of test for measuring the size of something. The results of this are written to "$base".size files.

"$base" is essentially made up of the basename of the script plus the test number. So if test numbers shift because a new test was introduced earlier in the script we might end up with both a ".times" and a ".size" file for the same test. In the aggregation script the ".times" file is preferred over the ".size" file, so some size tests might end with performance numbers from a previous run of the test.

This is mainly relevant when writing perf tests that check both performance and sizes, and can get quite confusing during development.

We could fix this by doing a more thorough job of cleaning out old ".times" and ".size" files before running each test. However, an even easier solution is to just use the same filename for both types of measurement, meaning we'll always overwrite the previous result. We don't even need to change the file format to distinguish the two; aggregate.perl already decides which is which based on a regex of the content (this may become ambiguous if we add new types in the future, but we could easily add a header field to the file at that point).

Based on an initial patch from Thomas Gummerer, who discovered the problem and did all of the analysis (which I stole for the commit message above):

  https://public-inbox.org/git/20191119185047.8550-1-t.gummerer@gmail.com/

Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
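Telling the two measurement types apart from the content alone could work along these lines. The formats used here are assumptions for the sake of illustration: a timing line of the shape `[H:]M:S.ss user sys` and a size line consisting of a bare integer. The real aggregation script is written in Perl and may differ in detail.

```python
import re

# Assumed formats: "[H:]M:S[.frac] user sys" for timings, a bare integer for sizes.
TIMES_RE = re.compile(
    r"^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)$"
)
SIZE_RE = re.compile(r"^\d+$")


def parse_result(line: str):
    """Classify one result line as ('times', elapsed, user, sys) or ('size', n)."""
    line = line.strip()
    match = TIMES_RE.match(line)
    if match:
        hours = float(match.group(1) or 0)
        elapsed = (hours * 60 + float(match.group(2))) * 60 + float(match.group(3))
        return ("times", elapsed, float(match.group(4)), float(match.group(5)))
    if SIZE_RE.match(line):
        return ("size", int(line))
    raise ValueError(f"bad input line: {line!r}")
```

As the commit message notes, content sniffing like this stays unambiguous only while the set of measurement types is small; a header field would be the sturdier choice if more types are added.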
2019-11-12 | pack-objects: avoid pointless oe_map_new_pack() calls | Jeff King | 1 | -0/+1
This patch fixes an extreme slowdown in pack-objects when you have more than 1023 packs. See below for numbers.

Since 43fa44fa3b (pack-objects: move in_pack out of struct object_entry, 2018-04-14), we use a complicated system to save some per-object memory.

Each object_entry struct gets a 10-bit field to store the index of the pack it's in. We map those indices into pointers using packing_data->in_pack_by_idx, which we initialize at the start of the program. If we have 2^10 or more packs, then we instead create an array of pack pointers, one per object. This is packing_data->in_pack.

So far so good. But there's one other tricky case: if a new pack arrives after we've initialized in_pack_by_idx, it won't have an index yet. We solve that by calling oe_map_new_pack(), which just switches on the fly to the less-optimal in_pack mechanism, allocating the array and back-filling it for already-seen objects.

But that logic kicks in even when we've switched to it already (whether because we really did see a new pack, or because we had too many packs in the first place). The result doesn't produce a wrong outcome, but it's very slow. What happens is this:

  - imagine you have a repo with 500k objects and 2000 packs that you want to repack.

  - before looking at any objects, we call prepare_in_pack_by_idx(). It starts allocating an index for each pack. On the 1024th pack, it sees there are too many, so it bails, leaving in_pack_by_idx as NULL.

  - while actually adding objects to the packing list, we call oe_set_in_pack(), which checks whether the pack already has an index. If it's one of the packs after the first 1023, then it doesn't have one, and we'll call oe_map_new_pack(). But there's no useful work for that function to do. We're already using in_pack, so it just uselessly walks over the complete list of objects, trying to backfill in_pack. And we end up doing this for almost 1000 packs (each of which may be triggered by more than one object).
And each time it triggers, we may iterate over up to 500k objects. So in the absolute worst case, this is quadratic in the number of objects.

The solution is simple: we don't need to bother checking whether the pack has an index if we've already converted to using in_pack, since by definition we're not going to use it. So we can just push the "does the pack have a valid index" check down into that half of the conditional, where we know we're going to use it.

The current test in p5303 sadly doesn't notice this problem, since it maxes out at 1000 packs. If we add a new test to it at 2000 packs, it does show the improvement:

    Test                     HEAD^               HEAD
    ----------------------------------------------------------------------
    5303.12: repack (2000)   26.72(39.68+0.67)   15.70(28.70+0.66) -41.2%

However, these many-pack test cases are rather expensive to run, so adding larger and larger numbers isn't appealing. Instead, we can show it off more easily by using GIT_TEST_FULL_IN_PACK_ARRAY, which forces us into the absolute worst case: no pack has an index, so we'll trigger oe_map_new_pack() pointlessly for every single object, making it truly quadratic.

Here are the numbers (on git.git) with the included change to p5303:

    Test                      HEAD^               HEAD
    ----------------------------------------------------------------------
    5303.3: rev-list (1)      2.05(1.98+0.06)     2.06(1.99+0.06) +0.5%
    5303.4: repack (1)        33.45(33.46+0.19)   2.75(2.73+0.22) -91.8%
    5303.6: rev-list (50)     2.07(2.01+0.06)     2.06(2.01+0.05) -0.5%
    5303.7: repack (50)       34.21(35.18+0.16)   3.49(4.50+0.12) -89.8%
    5303.9: rev-list (1000)   2.87(2.78+0.08)     2.88(2.80+0.07) +0.3%
    5303.10: repack (1000)    41.26(51.30+0.47)   10.75(20.75+0.44) -73.9%

Again, those improvements aren't realistic for the 1-pack case (because in the real world, the full-array solution doesn't kick in), but it's more useful to be testing the more-complicated code path.

While we're looking at this issue, we'll tweak one more thing: in oe_map_new_pack(), we call REALLOC_ARRAY(pack->in_pack).
But we'd never expect to get here unless we're back-filling it for the first time, in which case it would be NULL. So let's switch that to ALLOC_ARRAY() for clarity, and add a BUG() to document the expectation. Unfortunately this code isn't well-covered in the test suite because it's inherently racy (it only kicks in if somebody else adds a new pack while we're in the middle of repacking). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
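The effect of moving the index check can be illustrated with a toy model that counts how many objects the back-fill walks. This is a heavily simplified sketch of the control flow described in the commit message, not a translation of the C code; the class, its fields and the method names are invented for the illustration.

```python
class PackingModel:
    """Toy model of the per-object pack bookkeeping (not the real data structures)."""

    def __init__(self, use_index_table: bool):
        self.use_index_table = use_index_table  # False once the array is in use
        self.in_pack = []                       # one entry per object seen so far
        self.objects_walked = 0                 # work spent back-filling the array

    def map_new_pack(self) -> None:
        # Switch to the per-object array and back-fill it for objects seen so far.
        self.objects_walked += len(self.in_pack)
        self.use_index_table = False

    def set_in_pack_before(self, pack: str, pack_has_index: bool) -> None:
        # Old order: the index check runs even if the array is already in use.
        if not pack_has_index:
            self.map_new_pack()
        self.in_pack.append(pack)

    def set_in_pack_after(self, pack: str, pack_has_index: bool) -> None:
        # New order: only consult the pack's index while the index table is in use.
        if self.use_index_table and not pack_has_index:
            self.map_new_pack()
        self.in_pack.append(pack)


def simulate(n_objects: int, fixed: bool) -> int:
    """Worst case: the array is in use from the start and no pack has an index."""
    model = PackingModel(use_index_table=False)
    for _ in range(n_objects):
        if fixed:
            model.set_in_pack_after("pack", pack_has_index=False)
        else:
            model.set_in_pack_before("pack", pack_has_index=False)
    return model.objects_walked
```

In this worst case the old order walks 0 + 1 + ... + (n - 1) objects in total, which is quadratic in n, while the new order never triggers the back-fill at all.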
2019-11-10 | Fix spelling errors in messages shown to users | Elijah Newren | 1 | -1/+1
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-12 | t/perf: rename duplicate-numbered test script | Jeff King | 1 | -0/+0
There are two perf scripts numbered p5600, but with otherwise different names ("clone-reference" versus "partial-clone"). We store timing results in files named after the whole script, so internally we don't get confused between the two. But "aggregate.perl" just prints the test number for each result, giving multiple entries for "5600.3". It also makes it impossible to skip one test but not the other with GIT_SKIP_TESTS. Let's renumber the one that appeared later (by date -- the source of the problem is that the two were developed on independent branches). For the non-perf test suite, our test-lint rule would have complained about this when the two were merged, but t/perf never learned that trick. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
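A lint check for duplicate script numbers, of the kind this commit message says t/perf never learned, could be as small as the sketch below. The file names in the usage example are assumptions built from the two name fragments the commit message mentions, and the `pNNNN-` naming pattern is likewise assumed.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def duplicate_numbers(script_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group perf scripts by their four-digit number and return the collisions."""
    by_number = defaultdict(list)
    for name in script_names:
        match = re.match(r"p(\d{4})-", name)
        if match:
            by_number[match.group(1)].append(name)
    return {num: names for num, names in by_number.items() if len(names) > 1}


collisions = duplicate_numbers([
    "p5310-pack-bitmaps.sh",        # hypothetical file name
    "p5600-clone-reference.sh",     # hypothetical file name
    "p5600-partial-clone.sh",       # hypothetical file name
])
```

Any non-empty result means two scripts would produce indistinguishable test numbers in the aggregated report.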
2019-07-01 | check_everything_connected: assume alternate ref tips are valid | Jeff King | 1 | -0/+27
When we receive a remote ref update to sha1 "X", we want to check that we have all of the objects needed by "X". We can assume that our repository is not currently corrupted, and therefore if we have a ref pointing at "Y", we have all of its objects. So we can stop our traversal from "X" as soon as we hit "Y".

If we make the same non-corruption assumption about any repositories we use to store alternates, then we can also use their ref tips to shorten the traversal.

This is especially useful when cloning with "--reference", as we otherwise do not have any local refs to check against, and have to traverse the whole history, even though the other side may have sent us few or no objects. Here are results for the included perf test (which shows off more or less the maximal savings, getting one new commit and sharing the whole history):

    Test                        HEAD^               HEAD
    --------------------------------------------------------------------
    [on git.git]
    5600.3: clone --reference   2.94(2.86+0.08)     0.09(0.08+0.01) -96.9%
    [on linux.git]
    5600.3: clone --reference   45.74(45.34+0.41)   0.36(0.30+0.08) -99.2%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
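The saving comes from where the traversal is allowed to stop. A minimal sketch, using a toy commit graph stored as a dictionary of parent lists (the commit names and helper functions are invented for the illustration):

```python
from typing import Dict, List, Set


def commits_walked(parents: Dict[str, List[str]], start: str,
                   known_tips: Set[str]) -> int:
    """Count commits visited from `start` before reaching already-trusted tips."""
    seen = set()
    stack = [start]
    walked = 0
    while stack:
        commit = stack.pop()
        if commit in seen or commit in known_tips:
            continue  # trusted tips are assumed complete, so stop here
        seen.add(commit)
        walked += 1
        stack.extend(parents.get(commit, []))
    return walked


def build_linear_history(n: int) -> Dict[str, List[str]]:
    """Build a straight-line history c0 <- c1 <- ... <- c(n-1)."""
    return {f"c{i}": ([f"c{i - 1}"] if i > 0 else []) for i in range(n)}


history = build_linear_history(1001)  # c1000 is the single new commit
without_alternates = commits_walked(history, "c1000", set())
with_alternate_tip = commits_walked(history, "c1000", {"c999"})
```

With no trusted tips the whole history is walked; once the alternate's tip is trusted, only the one new commit is visited, which matches the "one new commit, shared history" scenario of the perf test.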
2019-05-19 | Merge branch 'ab/perf-installed-fix' | Junio C Hamano | 4 | -27/+53
Performance test framework has been broken and measured the version of Git that happens to be on $PATH, not the specified one to measure, for a while, which has been corrected.

* ab/perf-installed-fix:
  perf-lib.sh: forbid the use of GIT_TEST_INSTALLED
  perf tests: add "bindir" prefix to git tree test results
  perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh
  perf-lib.sh: make "./run <revisions>" use the correct gits
  perf aggregate: remove GIT_TEST_INSTALLED from --codespeed
  perf README: correct docs for 3c8f12c96c regression
2019-05-13 | Merge branch 'jk/perf-aggregate-wo-libjson' | Junio C Hamano | 1 | -2/+2
The script to aggregate perf result unconditionally depended on libjson-perl even though it did not have to, which has been corrected. * jk/perf-aggregate-wo-libjson: t/perf: depend on perl JSON only when using --codespeed
2019-05-13 | Merge branch 'jk/p5302-avoid-collision-check-cost' | Junio C Hamano | 1 | -13/+18
Fix index-pack perf test so that the repeated invocations always run in an empty repository, which emulates the initial clone situation better. * jk/p5302-avoid-collision-check-cost: p5302: create the repo in each index-pack test
2019-05-13 | Merge branch 'ew/repack-with-bitmaps-by-default' | Junio C Hamano | 2 | -3/+1
The connectivity bitmaps are created by default in bare repositories now; also the pathname hash-cache is created by default to avoid making crappy deltas when repacking.

* ew/repack-with-bitmaps-by-default:
  pack-objects: default to writing bitmap hash-cache
  t5310: correctly remove bitmaps for jgit test
  repack: enable bitmaps by default on bare repos
2019-05-13 | Merge branch 'js/partial-clone-connectivity-check' | Junio C Hamano | 1 | -0/+26
During an initial "git clone --depth=..." partial clone, it is pointless to spend cycles for a large portion of the connectivity check that enumerates and skips promisor objects (which by definition is all objects fetched from the other side). This has been optimized out.

* js/partial-clone-connectivity-check:
  t/perf: add perf script for partial clones
  clone: do faster object check for partial clones
2019-05-08 | perf-lib.sh: forbid the use of GIT_TEST_INSTALLED | Ævar Arnfjörð Bjarmason | 2 | -0/+13
As noted in preceding commits, setting GIT_TEST_INSTALLED has never been supported or documented, and as noted in an earlier t/perf/README change, to the extent that it's been documented, nobody's noticed that the example hasn't worked since 3c8f12c96c ("test-lib: reorder and include GIT-BUILD-OPTIONS a lot earlier", 2012-06-24).

We could directly support GIT_TEST_INSTALLED for invocations without the "run" script, such as:

    GIT_TEST_INSTALLED=../../ ./p0000-perf-lib-sanity.sh
    GIT_TEST_INSTALLED=/home/avar/g/git ./p0000-perf-lib-sanity.sh

But while not having this "error" will "work", it won't write the resulting "test-results/*" files to the right place, and thus a subsequent call to aggregate.perl won't work as expected.

Let's just tell the user that they need to use the "run" script, which'll correctly deal with this and set the right PERF_RESULTS_PREFIX.

If someone's in desperate need of bypassing "run" for whatever reason they can trivially do so by setting "PERF_SET_GIT_TEST_INSTALLED", but now we won't have people who expect GIT_TEST_INSTALLED to just work wondering why their aggregation doesn't work, even though they're running the right "git".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2019-05-08perf tests: add "bindir" prefix to git tree test resultsÆvar Arnfjörð Bjarmason2-2/+4
Change the output file names in test-results/ to be "test-results/bindir_<munged dir>" rather than just "test-results/<munged dir>". This is for consistency with the "build_" directories we have for built revisions, i.e. "test-results/build_<SHA-1>".

There are no user-visible functional changes here; it just makes it easier to see at a glance what "test-results" files are of what "type" as they're all explicitly grouped together now, and to grep this code to find both the run_dirs_helper() implementation and its corresponding aggregate.perl code.

Note that we already guarantee that the rest of the PERF_RESULTS_PREFIX is an absolute path, and since it'll start with e.g. "/" which we munge to "_" we'll end up with a readable string like "bindir_home_avar[...]".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
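The munging described in this message can be sketched in shell. This is a minimal illustration and not the code in the "run" script: munge_dir and results_prefix are hypothetical names, the character class is an assumption, and the directory passed in is assumed to be absolute already.

```shell
#!/bin/sh
# munge_dir PATH: replace every character outside [a-zA-Z0-9] with "_".
# (Hypothetical helper; the exact character class is an assumption.)
munge_dir () {
	printf '%s' "$1" | LC_ALL=C tr -c 'a-zA-Z0-9' '_'
}

# results_prefix ABSOLUTE_DIR: prepend "bindir" to the munged directory,
# mirroring the "bindir_<munged dir>" naming from the commit message.
results_prefix () {
	printf 'bindir%s\n' "$(munge_dir "$1")"
}

results_prefix /home/avar/g/git   # prints: bindir_home_avar_g_git
```

Because the leading "/" of an absolute path is itself munged to "_", the separator between "bindir" and the rest comes for free.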
2019-05-08perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.shÆvar Arnfjörð Bjarmason3-32/+38
Follow up on my preceding change, which fixed the immediate "./run <revisions>" regression in 0baf78e7bc ("perf-lib.sh: rely on test-lib.sh for --tee handling", 2019-03-15), and entirely get rid of GIT_TEST_INSTALLED from perf-lib.sh (and aggregate.perl).

As noted in that change, the dance we're doing with GIT_TEST_INSTALLED in perf-lib.sh isn't necessary, but there I was doing the most minimal set of changes to quickly fix a regression.

But it's much simpler to never deal with the "GIT_TEST_INSTALLED" we were setting in perf-lib.sh at all. Instead the run_dirs_helper() sets the previously inferred $PERF_RESULTS_PREFIX directly.

Setting this at the callsite that's already best positioned to exhaustively know about all the different cases we need to handle where PERF_RESULTS_PREFIX isn't what we want already (the empty string) makes the most sense. In one-off cases like:

    ./run ./p0000-perf-lib-sanity.sh
    ./p0000-perf-lib-sanity.sh

We'll just do the right thing because PERF_RESULTS_PREFIX will be empty, and test-lib.sh takes care of finding where our git is.

Any refactoring of this code needs to change both the shell code and the Perl code in aggregate.perl, because when running e.g.:

    ./run ../../ -- <test>

The "../../" path to a relative bindir needs to be munged to a filename containing the results, and critically aggregate.perl does not get passed the path to those aggregations, just "../..".

Let's fix cases where aggregate.perl would print e.g. ".." in its report output for this, and "git" for "/home/avar/g/git", i.e. it would always pick the last element. Now we'll always print the full path instead.

This also makes the code sturdier, e.g. you can feed "../.." to "./run" and then an absolute path to the aggregate.perl script, as long as the absolute path and "../.." resolved to the same directory printing the aggregation will work.

Also simplify the "[_*]" on the RHS of "tr -c", we're trimming everything to "_", so we don't need that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2019-05-08perf-lib.sh: make "./run <revisions>" use the correct gitsÆvar Arnfjörð Bjarmason2-2/+10
Fix a really bad regression in 0baf78e7bc ("perf-lib.sh: rely on test-lib.sh for --tee handling", 2019-03-15). Since that change all runs of different <revisions> of git have used the git found in the user's $PATH, e.g. /usr/bin/git instead of the <revision> we just built and wanted to performance test. The problem starts with GIT_TEST_INSTALLED not working like our non-perf tests with the "run" script. I.e. you can't run performance tests against a given installed git. Instead we expect to use it ourselves to point GIT_TEST_INSTALLED to the <revision> we just built. However, we had been relying on '$(cd "$GIT_TEST_INSTALLED" && pwd)' to resolve that relative $GIT_TEST_INSTALLED to an absolute path *before* test-lib.sh was loaded, in cases where it was e.g. "build/<rev>/bin-wrappers" and we wanted "<abs_path>build/...". This change post-dates another proposed solution by a few days[1], I didn't notice that version when I initially wrote this. I'm doing the most minimal thing to solve the regression here, a follow-up change will move this result prefix selection logic entirely into the "run" script. This makes e.g. these cases all work: ./run . $PWD/../../ origin/master origin/next HEAD -- <tests> As well as just a plain one-off: ./run <tests> And, since we're passing down the new GIT_PERF_DIR_MYDIR_REL we make sure the bug relating to aggregate.perl not finding our files as described in 0baf78e7bc doesn't happen again. What *doesn't* work is setting GIT_TEST_INSTALLED to a relative path, this will subtly fail in test-lib.sh. This has always been the case even before 0baf78e7bc, and as documented in t/README the GIT_TEST_INSTALLED variable should be set to an absolute path (needs to be set "to the bindir", which is always absolute), and the "perf" framework expects to munge it itself. 
Perhaps that should be dealt with in the future to allow manually setting GIT_TEST_INSTALLED, but as a preceding commit showed the user can just use the "run" script, which'll also pick the right output directory for the test results as expected by aggregate.perl. 1. https://public-inbox.org/git/20190502222409.GA15631@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2019-05-08perf aggregate: remove GIT_TEST_INSTALLED from --codespeedÆvar Arnfjörð Bjarmason1-3/+0
Remove the setting of the "environment" from the --codespeed output. I don't think this is useful, and it helps with a later refactoring where we stop munging/reading GIT_TEST_INSTALLED in the perf tests in so many places.

This was added in 05eb1c37ed ("perf/aggregate: implement codespeed JSON output", 2018-01-05), but since the "run" script uses "GIT_TEST_INSTALLED" internally this was only ever useful for one-off runs of a single revision, as all the "environment" values would be ones for whatever directory the "run" script ran last.

Let's instead fall back on the "uname -r" case, which is the sort of thing the environment should be set to, not something that duplicates other parts of the codespeed output.

For setting the "environment" to something custom the perf.repoName variable can be used. See 19cf57a92e ("perf/run: read GIT_PERF_REPO_NAME from perf.repoName", 2018-01-05).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2019-05-08perf README: correct docs for 3c8f12c96c regressionÆvar Arnfjörð Bjarmason1-1/+1
Since 3c8f12c96c ("test-lib: reorder and include GIT-BUILD-OPTIONS a lot earlier", 2012-06-24) the suggested advice of overriding GIT_BUILD_DIR has not worked. We've printed a hard error like this given e.g. GIT_BUILD_DIR=/home/avar/g/git:

    /bin-wrappers/git is not executable; using GIT_EXEC_PATH
    error: You haven't built things yet, have you?

Let's just suggest that the user run other gits via the "run" script. That'll do the right thing for setting the path to the other git, and running the "aggregate.perl" script afterwards will work.

As an aside, if setting GIT_BUILD_DIR had still worked, then the MODERN_GIT feature/fix added in 1a0962dee5 ("t/perf: fix regression in testing older versions of git", 2016-06-22) would have broken.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2019-05-05t/perf: add perf script for partial clonesJeff King1-0/+26
We don't cover the partial clone feature at all in t/perf. Let's at least run a few basic tests so that we'll notice any regressions.

We'll do a no-blob clone, and split it into two parts: the actual object transfer, and the subsequent checkout (which will of course require another transfer to get the blobs). That will help us more clearly assess the performance of each.

There are obviously a lot more possibilities besides just a no-blob partial clone, but this should serve as a canary that alerts us to any generic slow-downs (and we can add more tests later for cases that aren't exercised here).

There are a few non-ideal things here that make this not an entirely accurate test, but are probably OK for our purposes:

  1. We have to do some extra prep/cleanup work inside the timing tests, since they impact the on-disk state and the perf harness may run each one multiple times. In practice this is probably OK, since these bits should be much less expensive than the operations we are measuring.

  2. The clone time is likely to be dominated by the server's object enumeration. In the real world, a repo large enough to drive people to partial clones is likely to have reachability bitmaps enabled.

     And in the opposite direction, our object transfer is happening at the speed of a local pipe, whereas in the real world it would bottle-neck on the network.

So any percentage speedups should be taken with a grain of salt. But hopefully any regressions will produce enough of an effect to be noticeable.

This script also demonstrates the recent improvement from dfa33a298d (clone: do faster object check for partial clones, 2019-04-19):

    Test                          dfa33a298d^         dfa33a298d
    -------------------------------------------------------------------------
    5600.2: clone without blobs   18.41(22.72+1.09)   6.83(11.65+0.50) -62.9%
    5600.3: checkout of result    1.82(3.24+0.26)     1.84(3.24+0.26) +1.1%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
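The percentage column in the timings of this message is consistent with the relative change in elapsed time (the number before the parentheses) against the first column. A minimal sketch of that arithmetic follows; rel_change is a hypothetical helper written for illustration, not a function of the perf framework, and the (new - old) / old convention is inferred from the figures in the message.

```shell
#!/bin/sh
# rel_change OLD NEW: print the change from OLD to NEW as a signed
# percentage with one decimal place. Hypothetical helper, for illustration.
rel_change () {
	LC_ALL=C awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", 100 * (new - old) / old }'
}

rel_change 18.41 6.83   # 5600.2: clone without blobs, prints -62.9%
rel_change 1.82 1.84    # 5600.3: checkout of result, prints +1.1%
```

A change as small as the +1.1% in the second row is the kind of value that is usually within run-to-run noise.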
2019-04-25Merge branch 'jk/revision-rewritten-parents-in-prio-queue'Junio C Hamano1-0/+18
Performance fix for "rev-list --parents -- pathspec". * jk/revision-rewritten-parents-in-prio-queue: revision: use a prio_queue to hold rewritten parents
2019-04-24t/perf: depend on perl JSON only when using --codespeedJeff King1-2/+2
Commit 05eb1c37ed (perf/aggregate: implement codespeed JSON output, 2018-01-05) added a dependency on the perl JSON module to show output from aggregate.perl, but we only need it when the user asks for --codespeed output. While the module is pretty common, it's not part of the base system, and this dependency can get in the way of producing the default human-readable output. Let's bump the "use" down to a "require" in the code path that needs it, which will be interpreted at run-time instead of compile-time. People not using "--codespeed" won't even load the module, and anybody using it should see the same results (including the same perl error if they don't have it). Note that this skips the importing step, so we'll have to fully qualify our function call. We could accomplish the same thing in other ways. E.g., calling JSON->import() ourselves, or wrapping "use JSON" in an eval. Since there's only one such call, this seems like the least-magical way of doing it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-23p5302: create the repo in each index-pack testJeff King1-13/+18
The p5302 script runs "index-pack --stdin" in each timing test. It does two things to try to get good timings: 1. we do the repo creation in a separate (non-timed) setup test, so that our timing is purely the index-pack run 2. we use a separate repo for each test; this is important because the presence of existing objects in the repo influences the result (because we'll end up doing collision checks against them) But this forgets one thing: we generally run each timed test multiple times to reduce the impact of noise. Which means that repeats of each test after the first will be subject to the collision slowdown from point 2, and we'll generally just end up taking the first time anyway. Instead, let's create the repo in the test (effectively undoing point 1). That does add a constant amount of extra work to each iteration, but it's quite small compared to the actual effects we're interested in measuring. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-04revision: use a prio_queue to hold rewritten parentsJeff King1-0/+18
This patch fixes a quadratic list insertion in rewrite_one() when pathspec limiting is combined with --parents. What happens is something like this:

  1. We see that some commit X touches the path, so we try to rewrite its parents.

  2. rewrite_one() loops forever, rewriting parents, until it finds a relevant parent (or hits the root and decides there are none). The heavy lifting is done by process_parents(), which uses try_to_simplify_commit() to drop parents.

  3. process_parents() puts any intermediate parents into the &revs->commits list, inserting by commit date as usual.

So if commit X is recent, and then there's a large chunk of history that doesn't touch the path, we may add a lot of commits to &revs->commits. And insertion by commit date is O(n) in the worst case, making the whole thing quadratic.

We tried to deal with this long ago in fce87ae538 (Fix quadratic performance in rewrite_one., 2008-07-12). In that scheme, we cache the oldest commit in the list; if the new commit to be added is older, we can start our linear traversal there. This often works well in practice because parents are older than their descendants, and thus we tend to add older and older commits as we traverse.

But this isn't guaranteed, and in fact there's a simple case where it is not: merges. Imagine we look at the first parent of a merge and see a very old commit (let's say 3 years old). And on the second parent, as we go back 3 years in history, we might have many commits. That one first-parent commit has polluted our oldest-commit cache; it will remain the oldest while we traverse a huge chunk of history, during which we have to fall back to the slow, linear method of adding to the list.

Naively, one might imagine that instead of caching the oldest commit, we'd start at the last-added one. But that just makes some cases faster while making others slower (and indeed, while it made a real-world test case much faster, it does quite poorly in the perf test included here).
Fundamentally, these are just heuristics; our worst case is still quadratic, and some cases will approach that.

Instead, let's use a data structure with better worst-case performance. Swapping out revs->commits for something else would have repercussions all over the code base, but we can take advantage of one fact: for the rewrite_one() case, nobody actually needs to see those commits in revs->commits until we've finished generating the whole list.

That leaves us with two obvious options:

  1. We can generate the list _unordered_, which should be O(n), and then sort it afterwards, which would be O(n log n) total. This is "sort-after" below.

  2. We can insert the commits into a separate data structure, like a priority queue. This is "prio-queue" below.

I expected that sort-after would be the fastest (since it saves us the extra step of copying the items into the linked list), but surprisingly the prio-queue seems to be a bit faster.

Here are timings for the new p0001.6 for all three techniques across a few repositories, as compared to master:

    master              cache-last                sort-after               prio-queue
    --------------------------------------------------------------------------------------------
    GIT_PERF_REPO=git.git
    0.52(0.50+0.02)     0.53(0.51+0.02) +1.9%     0.37(0.33+0.03) -28.8%   0.37(0.32+0.04) -28.8%
    GIT_PERF_REPO=linux.git
    20.81(20.74+0.07)   20.31(20.24+0.07) -2.4%   0.94(0.86+0.07) -95.5%   0.91(0.82+0.09) -95.6%
    GIT_PERF_REPO=llvm-project.git
    83.67(83.57+0.09)   4.23(4.15+0.08) -94.9%    3.21(3.15+0.06) -96.2%   2.98(2.91+0.07) -96.4%

A few items to note:

  - the cache-last tweak does improve the bad case for llvm-project.git that started my digging into this problem. But it performs terribly on linux.git, barely helping at all.

  - the sort-after and prio-queue techniques work well. They approach the timing for running without --parents at all, which is what you'd expect (see below for more data).

  - prio-queue just barely outperforms sort-after. As I said, I'm not really sure why this is the case, but it is. You can see it even more prominently in this real-world case on llvm-project.git:

        git rev-list --parents 07ef786652e7 -- llvm/test/CodeGen/Generic/bswap.ll

    where prio-queue routinely outperforms sort-after by about 7%. One guess is that the prio-queue may just be more efficient because it uses a compact array.

There are three new perf tests:

  - "rev-list --parents" gives us a baseline for running with --parents. This isn't sped up meaningfully here, because the bad case is triggered only with simplification. But it's good to make sure we don't screw it up (now, or in the future).

  - "rev-list -- dummy" gives us a baseline for just traversing with pathspec limiting. This gives a lower bound for the next test (and it's also a good thing for us to be checking in general for regressions, since we don't seem to have any existing tests).

  - "rev-list --parents -- dummy" shows off the problem (and our fix)

Here are the timings for those three on llvm-project.git, before and after the fix:

    Test                                 master              prio-queue
    ------------------------------------------------------------------------------
    0001.3: rev-list --parents           2.24(2.12+0.12)     2.22(2.11+0.11) -0.9%
    0001.5: rev-list -- dummy            2.89(2.82+0.07)     2.92(2.89+0.03) +1.0%
    0001.6: rev-list --parents -- dummy  83.67(83.57+0.09)   2.98(2.91+0.07) -96.4%

Changes in the first two are basically noise, and you can see we approach our lower bound in the final one.

Note that we can't fully get rid of the list argument from process_parents(). Other callers do have lists, and it would be hard to convert them. They also don't seem to have this problem (probably because they actually remove items from the list as they loop, meaning it doesn't grow so large in the first place). So this basically just drops the "cache_ptr" parameter (which was used only by the one caller we're fixing here) and replaces it with a prio_queue.
Callers are free to use either data structure, depending on what they're prepared to handle. Reported-by: Björn Pettersson A <bjorn.a.pettersson@ericsson.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-18perf-lib.sh: rely on test-lib.sh for --tee handlingJeff King1-23/+11
Since its inception, the perf-lib.sh script has manually handled the "--tee" option (and other options which imply it, like "--valgrind") with a cut-and-pasted block from test-lib.sh. That block has grown stale over the years, and has at least three problems: 1. It uses $SHELL to re-exec the script, whereas the version in test-lib.sh learned to use $TEST_SHELL_PATH. 2. It does an ad-hoc search of the "$*" string, whereas test-lib.sh learned to carefully parse the arguments left to right. 3. It never learned about --verbose-log (which also implies --tee), so it would not trigger for that option. This last one was especially annoying, because t/perf/run uses the GIT_TEST_OPTS from your config.mak to run the perf scripts. So if you've set, say, "-x --verbose-log" there, it will be passed as part of most perf runs. And while this script doesn't recognize the option, the test-lib.sh that we source _does_, and the behavior ends up being much more annoying: - as the comment at the top of the block says, we have to run this tee code early, before we start munging variables (it says GIT_BUILD_DIR, but the problematic variable is actually GIT_TEST_INSTALLED). - since we don't recognize --verbose-log, we don't trigger the block. We go on to munge GIT_TEST_INSTALLED, converting it from a relative to an absolute path. - then we source test-lib.sh, which _does_ recognize --verbose-log. It re-execs the script, which runs again. But this time with an absolute version of GIT_TEST_INSTALLED. - As a result, we copy the absolute version of GIT_TEST_INSTALLED into perf_results_prefix. Instead of writing our results to the expected "test-results/build_1234abcd.p1234-whatever.times", we instead write them to "test-results/_full_path_to_repo_t_perf_build_1234...". The aggregate.perl script doesn't expect this, and so it prints "<missing>" for each result (even though it spent considerable time running the tests!). 
We can solve all of these in one blow by just deleting our custom handling, and relying on the inclusion of test-lib.sh to handle --tee, --verbose-log, etc. There's one catch, though. We want to handle GIT_TEST_INSTALLED after we've included test-lib.sh, since we want it un-munged in the re-exec'd version of the script. But if we want to convert it from a relative to an absolute path, we must do so before we load test-lib.sh, since it will change our working directory. So we compute the absolute directory first, store it away, then include test-lib.sh, and finally assign to GIT_TEST_INSTALLED as appropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
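The ordering constraint in the last paragraph (resolve the path first, change directory second) can be sketched as follows. This is an illustration under stated assumptions, not the perf-lib.sh code: abs_dir is a hypothetical helper, a scratch directory stands in for a perf checkout, and a plain "cd" stands in for sourcing test-lib.sh.

```shell
#!/bin/sh
# abs_dir DIR: print the absolute path of DIR. The "cd" runs in a
# subshell, so the caller's working directory is left alone.
abs_dir () {
	(cd "$1" && pwd)
}

# Demo: a scratch tree stands in for a perf checkout, and a plain "cd"
# stands in for sourcing test-lib.sh (which changes the working directory).
top=$(mktemp -d) || exit 1
mkdir -p "$top/build/rev/bin-wrappers" "$top/trash"
cd "$top" || exit 1

installed=build/rev/bin-wrappers        # relative, as passed by the caller
installed_abs=$(abs_dir "$installed")   # 1. resolve while it is still valid
cd trash || exit 1                      # 2. stand-in for loading test-lib.sh

# 3. Only the stored absolute path still points at the directory.
if test -d "$installed"; then echo "relative: found"; else echo "relative: not found"; fi
if test -d "$installed_abs"; then echo "absolute: found"; else echo "absolute: not found"; fi

cd / && rm -rf "$top"
```

Storing the resolved value in a separate variable, rather than overwriting the original, is what leaves the un-munged value available to a re-exec'd copy of the script.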
2019-03-18pack-objects: default to writing bitmap hash-cacheJeff King2-3/+1
Enabling pack.writebitmaphashcache should always be a performance win. It costs only 4 bytes per object on disk, and the timings in ae4f07fbcc (pack-bitmap: implement optional name_hash cache, 2013-12-21) show it improving fetch and partial-bitmap clone times by 40-50%.

The only reason we didn't enable it by default at the time is that early versions of JGit's bitmap reader complained about the presence of optional header bits it didn't understand. But that was changed in JGit's d2fa3987a (Use bitcheck to check for presence of OPT_FULL option, 2013-10-30), which made it into JGit v3.5.0 in late 2014.

So let's turn this option on by default. It's backwards-compatible with all versions of Git, and if you are also using JGit on the same repository, you'd only run into problems using a version that's almost 5 years old.

We'll drop the manual setting from all of our test scripts, including perf tests. This isn't strictly necessary, but it has two advantages:

  1. If the hash-cache ever stops being enabled by default, our perf regression tests will notice.

  2. We can use the modified perf tests to show off the behavior of an otherwise unconfigured repo, as shown below.

These are the results of a few of the perf tests against linux.git that showed interesting results. You can see the expected speedup in 5310.4, which was noted in ae4f07fbcc. Curiously, 5310.8 did not improve (and actually got slower), despite seeing the opposite in ae4f07fbcc. I don't have an explanation for that.

The tests from p5311 did not exist back then, but do show improvements (a smaller pack due to better deltas, which we found in less time).

    Test                             HEAD^               HEAD
    -------------------------------------------------------------------------------------
    5310.4: simulated fetch          7.39(22.70+0.25)    5.64(11.43+0.22) -23.7%
    5310.8: clone (partial bitmap)   18.45(24.83+1.19)   19.94(28.40+1.36) +8.1%
    5311.31: server (128 days)       0.41(1.13+0.05)     0.34(0.72+0.02) -17.1%
    5311.32: size (128 days)         7.4M                7.0M -4.8%
    5311.33: client (128 days)       1.33(1.49+0.06)     1.29(1.37+0.12) -3.0%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14prune: use bitmaps for reachability traversalJeff King1-0/+11
Pruning generally has to traverse the whole commit graph in order to see which objects are reachable. This is the exact problem that reachability bitmaps were meant to solve, so let's use them (if they're available, of course).

Here are timings on git.git:

    Test                         HEAD^             HEAD
    ------------------------------------------------------------------------
    5304.6: prune with bitmaps   3.65(3.56+0.09)   1.01(0.92+0.08) -72.3%

And on linux.git:

    Test                         HEAD^               HEAD
    --------------------------------------------------------------------------
    5304.6: prune with bitmaps   35.05(34.79+0.23)   3.00(2.78+0.21) -91.4%

The tests show a pretty optimal case, as we'll have just repacked and should have pretty good coverage of all refs with our bitmaps. But that's actually pretty realistic: normally prune is run via "gc" right after repacking.

A few notes on the implementation:

  - the change is actually in reachable.c, so it would improve reachability traversals by "reflog expire --stale-fix", as well. Those aren't performed regularly, though (a normal "git gc" doesn't use --stale-fix), so they're not really worth measuring. There's a low chance of regressing that caller, since the use of bitmaps is totally transparent from the caller's perspective.

  - The bitmap case could actually get away without creating a "struct object", and instead the caller could just look up each object id in the bitmap result. However, this would be a marginal improvement in runtime, and it would make the callers much more complicated. They'd have to handle both the bitmap and non-bitmap cases separately, and in the case of git-prune, we'd also have to tweak prune_shallow(), which relies on our SEEN flags.

  - Because we do create real object structs, we go through a few contortions to create ones of the right type. This isn't strictly necessary (lookup_unknown_object() would suffice), but it's more memory efficient to use the correct types, since we already know them.
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14prune: lazily perform reachability traversalJeff King1-0/+24
The general strategy of "git prune" is to do a full reachability walk, then for each loose object see if we found it in our walk. But if we don't have any loose objects, we don't need to do the expensive walk in the first place.

This patch postpones that walk until the first time we need to see its results.

Note that this is really a specific case of a more general optimization, which is that we could traverse only far enough to find the object under consideration (i.e., stop the traversal when we find it, then pick up again when asked about the next object, etc). That could save us in some instances from having to do a full walk. But it's actually a bit tricky to do with our traversal code, and you'd need to do a full walk anyway if you have even a single unreachable object (which you generally do, if any objects are actually left after running git-repack).

So in practice this lazy-load of the full walk catches one easy but common case (i.e., you've just repacked via git-gc, and there's nothing unreachable).

The perf script is fairly contrived, but it does show off the improvement:

    Test                            HEAD^             HEAD
    -------------------------------------------------------------------------
    5304.4: prune with no objects   3.66(3.60+0.05)   0.00(0.00+0.00) -100.0%

and would let us know if we accidentally regress this optimization.

Note also that we need to take special care with prune_shallow(), which relies on us having performed the traversal. So this optimization can only kick in for a non-shallow repository. Since this is easy to get wrong and is not covered by existing tests, let's add an extra test to t5304 that covers this case explicitly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
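The "postpone the walk until first use" idea can be illustrated with a small shell sketch. It is a toy model and not git's implementation (which is C code): every name is hypothetical, the expensive traversal is replaced by a counter, and every object is treated as unreachable so that the control flow stays visible.

```shell
#!/bin/sh
# Toy model only: all names are hypothetical and the "walk" is a counter.
walks=0
walk_done=

full_walk () {
	walks=$((walks + 1))   # stands in for the expensive traversal
	walk_done=1
}

# is_reachable OBJECT: run the walk on first use, then answer from it.
is_reachable () {
	test -n "$walk_done" || full_walk
	return 1               # placeholder answer: nothing is reachable
}

# prune_loose [OBJECT...]: consult the walk once per loose object.
prune_loose () {
	for obj
	do
		is_reachable "$obj" || echo "would prune $obj"
	done
}

prune_loose                 # no loose objects: the walk never runs
echo "walks after empty prune: $walks"
prune_loose aaaa bbbb       # the walk runs once, on the first lookup
echo "walks after second prune: $walks"
```

With no loose objects the loop body never executes, so the walk is skipped entirely; that is the case the perf test in this commit measures.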
2018-11-20tests: send "bug in the test script" errors to the script's stderrSZEDER Gábor1-2/+2
Some of the functions in our test library check that they were invoked properly with conditions like this:

    test "$#" = 2 ||
    error "bug in the test script: not 2 parameters to test-expect-success"

If this particular condition is triggered, then 'error' will abort the whole test script with a bold red error message [1] right away.

However, under certain circumstances the test script will be aborted completely silently, namely if:

  - a similar condition in a test helper function like 'test_line_count' is triggered,
  - which is invoked from the test script's "main" shell [2],
  - and the test script is run manually (i.e. './t1234-foo.sh' as opposed to 'make t1234-foo.sh' or 'make test') [3]
  - and without the '--verbose' option,

because the error message is printed from within 'test_eval_', where standard output is redirected either to /dev/null or to a log file. The only indication that something is wrong is that not all tests in the script are executed and at the end of the test script's output there is no "# passed all N tests" message, which are subtle and can easily go unnoticed, as I had to experience myself.

Send these "bug in the test script" error messages directly to the test script's standard error and thus to the terminal, so those bugs will be much harder to overlook. Instead of updating all ~20 such 'error' calls with a redirection, let's add a BUG() function to 'test-lib.sh', wrapping an 'error' call with the proper redirection and also including the common prefix of those error messages, and convert all those call sites [4] to use this new BUG() function instead.

[1] That particular error message from 'test_expect_success' is printed in color only when running with or without '--verbose'; with '--tee' or '--verbose-log' the error is printed without color, but it is printed to the terminal nonetheless.

[2] If such a condition is triggered in a subshell of a test, then 'error' won't be able to abort the whole test script, but only the subshell, which in turn causes the test to fail in the usual way, indicating loudly and clearly that something is wrong.

[3] Well, 'error' aborts the test script the same way when run manually or by 'make' or 'prove', but both 'make' and 'prove' pay attention to the test script's exit status, and even a silently aborted test script would then trigger those tools' usual noticeable error messages.

[4] Strictly speaking, not all those 'error' calls need that redirection to send their output to the terminal, see e.g. 'test_expect_success' in the opening example, but I think it's better to be consistent.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
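The wrapper described in this message can be sketched as a standalone script. This is a minimal illustration under assumptions: the choice of file descriptor 7 for the saved stderr, the simplified error() and the check_two_args helper are all stand-ins written for this example, not the definitions in test-lib.sh.

```shell
#!/bin/sh
# Sketch only. Keep a copy of the script's original stderr on fd 7 so it
# stays reachable while a test body runs with its output redirected.
# The fd number and this simplified error() are assumptions.
exec 7>&2

error () {
	echo "error: $*"
	exit 1
}

# BUG MESSAGE...: report a bug in the test script on the saved stderr.
BUG () {
	error >&7 "bug in the test script: $*"
}

# A helper that validates its arguments, in the spirit of test_line_count.
check_two_args () {
	test "$#" = 2 || BUG "not 2 parameters to check_two_args"
}

# Simulate a test body whose stdout and stderr go to /dev/null: the
# message still reaches the terminal through fd 7.
( check_two_args only-one ) >/dev/null 2>&1 ||
echo "helper aborted with status $?"
```

Putting the redirection and the common message prefix in one function means a call site only has to spell out what was wrong.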
2018-11-12p3400: replace calls to `git checkout -b' by `git checkout -B'Alban Gruin1-5/+5
p3400 makes a copy of the current repository to test git-rebase performance, and creates new branches in the copy with `git checkout -b'. If the original repository has branches with the same name as the script is trying to create, this operation will fail. This replaces these calls by `git checkout -B' to force the creation and update of these branches. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-10Merge branch 'ab/fsck-skiplist'Junio C Hamano2-0/+53
Update fsck.skipList implementation and documentation. * ab/fsck-skiplist: fsck: support comments & empty lines in skipList fsck: use oidset instead of oid_array for skipList fsck: use strbuf_getline() to read skiplist file fsck: add a performance test for skipList fsck: add a performance test fsck: document that skipList input must be unabbreviated fsck: document and test commented & empty line skipList input fsck: document and test sorted skipList input fsck tests: add a test for no skipList input fsck tests: setup of bogus commit object
2018-09-12fsck: add a performance test for skipListRené Scharfe1-0/+40
Create a performance test to see how the skipList implementation performs. First we set up N bad commits, then we see how fsck does as we progressively work our way from 0 up to N in increments of 10x. I.e. the needle(s) in the haystack get progressively more numerous. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-12fsck: add a performance testÆvar Arnfjörð Bjarmason1-0/+13
Add a plain performance test for "fsck". This test will not be used to / referred to in any upcoming commit of mine in this series, but having a simple test for fsck performance is valuable, so let's add it while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20t/perf: add perf tests for fetches from a bitmapped serverJeff King1-0/+45
A server with bitmapped packs can serve a clone very quickly. However, fetches are not necessarily made any faster, because we spend a lot less time in object traversal (which is what bitmaps help with) and more time finding deltas (because we may have to throw out on-disk deltas if the client does not have the base). As a first step to making this faster, this patch introduces a new perf script to measure fetches into a repo of various ages from a fully-bitmapped server. We separately measure the work done by the server (in pack-objects) and that done by the client (in index-pack). Furthermore, we measure the size of the resulting pack. Breaking it down like this (instead of just doing a regular "git fetch") lets us see how much each side benefits from any changes. And since we know the pack size, if we estimate the network speed, then one could calculate a complete wall-clock time for the operation (though the script does not do this automatically). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
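The wall-clock estimate mentioned at the end of this message is not computed by the script, but the arithmetic can be sketched. The helper below is hypothetical, and it assumes the three phases run strictly one after another (server work, transfer, client work), which ignores any overlap a real fetch would have; the input numbers are made up for illustration.

```shell
#!/bin/sh
# estimate_wallclock SERVER_SECS CLIENT_SECS PACK_BYTES BYTES_PER_SEC
# Print server time + transfer time + client time, in seconds.
# Hypothetical helper; assumes the phases do not overlap.
estimate_wallclock () {
	LC_ALL=C awk -v s="$1" -v c="$2" -v bytes="$3" -v bps="$4" \
		'BEGIN { printf "%.2f\n", s + bytes / bps + c }'
}

# Illustrative inputs: 0.34s server, 1.29s client, a 7,000,000-byte pack
# over a link assumed to carry 1,000,000 bytes per second.
estimate_wallclock 0.34 1.29 7000000 1000000   # prints 8.63
```

On a slow link the transfer term dominates, which is why the pack size is worth tracking alongside the two CPU timings.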
2018-08-20t/perf: add infrastructure for measuring sizesJeff King3-5/+81
The main objective of scripts in the perf framework is to run "test_perf", which measures the time it takes to run some operation. However, it can also be interesting to see the change in the output size of certain operations. This patch introduces test_size, which records a single numeric output from the test and shows it in the aggregated output (with pretty printing and relative size comparison). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20t/perf: factor out percent calculationsJeff King1-9/+12
This will let us reuse the code when we add new values to aggregate besides times. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20t/perf: factor boilerplate out of test_perfJeff King1-26/+35
About half of test_perf() is boilerplate preparing to run _any_ test, and the other half is specifically running a timing test. Let's split it into two functions, so that we can reuse the boilerplate in future commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-23Merge branch 'cc/perf-bisect'Junio C Hamano1-0/+6
Performance test updates. * cc/perf-bisect: perf/bisect_run_script: disable codespeed
2018-05-06perf/bisect_run_script: disable codespeedChristian Couder1-0/+6
When bisecting a performance regression using a config file, `./bisect_regression --config my_perf.conf` for example, the config file can contain Codespeed configuration which would instruct the 'aggregate.perl' script called by the 'run' script to output results in the Codespeed format and maybe to try to send this output to a Codespeed server. This is unfortunate because the 'bisect_run_script' relies on the regular output from 'aggregate.perl' to measure performance, so let's disable Codespeed output and sending results to a Codespeed server. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-26perf/aggregate: use Getopt::Long for option parsingChristian Couder1-36/+26
When passing an option '--foo' that it does not recognize, the aggregate.perl script should die with a helpful error message like: Unknown option: foo ./aggregate.perl [options] [--] [<dir_or_rev>...] [--] \ [<test_script>...] > Options: --codespeed * Format output for Codespeed --reponame <str> * Send given reponame to codespeed --sort-by <str> * Sort output (only "regression" \ criteria is supported) rather than: fatal: Needed a single revision rev-parse --verify --foo: command returned error: 128 To implement that let's use Getopt::Long for option parsing instead of the current manual and sloppy parsing. This should save some code and make option parsing simpler, tighter and safer. This will avoid something like 'foo--sort-by=regression' being handled as if '--sort-by=regression' had been used, for example. As Getopt::Long eats '--' at the end of options, this slightly changes the way '--' is handled as we can now have '--' both after the options and before the scripts. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-25Merge branch 'cc/perf-bisect'Junio C Hamano3-10/+166
Performance measuring framework in t/perf learned to help bisecting performance regressions. * cc/perf-bisect: t/perf: add scripts to bisect performance regressions perf/run: add --subsection option
2018-04-11t/perf: add scripts to bisect performance regressionsChristian Couder2-0/+120
The new bisect_regression script can be used to automatically bisect performance regressions. It will pass the new bisect_run_script to `git bisect run`. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11perf/run: add --subsection optionChristian Couder1-10/+46
This new option makes it possible to run perf tests as defined in only one subsection of a config file. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11Merge branch 'nd/combined-test-helper'Junio C Hamano5-13/+13
Small test-helper programs have been consolidated into a single binary. * nd/combined-test-helper: (36 commits) t/helper: merge test-write-cache into test-tool t/helper: merge test-wildmatch into test-tool t/helper: merge test-urlmatch-normalization into test-tool t/helper: merge test-subprocess into test-tool t/helper: merge test-submodule-config into test-tool t/helper: merge test-string-list into test-tool t/helper: merge test-strcmp-offset into test-tool t/helper: merge test-sigchain into test-tool t/helper: merge test-sha1-array into test-tool t/helper: merge test-scrap-cache-tree into test-tool t/helper: merge test-run-command into test-tool t/helper: merge test-revision-walking into test-tool t/helper: merge test-regex into test-tool t/helper: merge test-ref-store into test-tool t/helper: merge test-read-cache into test-tool t/helper: merge test-prio-queue into test-tool t/helper: merge test-path-utils into test-tool t/helper: merge test-online-cpus into test-tool t/helper: merge test-mktemp into test-tool t/helper: merge (unused) test-mergesort into test-tool ...
2018-03-27perf/aggregate: add --sort-by=regression optionChristian Couder1-1/+58
One of the most interesting things one can look for in performance test results is possible performance regressions. This new option makes it easy to spot such possible regressions. This new option is named '--sort-by=regression' to make it possible and easy to add other ways to sort the results, like for example '--sort-by=utime'. If we would like to sort according to how much the stime regressed we could also add a new option called '--sort-by=regression:stime'. Then '--sort-by=regression' could become a synonym for '--sort-by=regression:rtime'. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27perf/aggregate: add display_dir()Christian Couder1-4/+7
This new helper function will be reused in a subsequent commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-write-cache into test-toolNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-string-list into test-toolNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-read-cache into test-toolNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-drop-caches into test-toolNguyễn Thái Ngọc Duy1-6/+6
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27t/helper: merge test-lazy-init-name-hash into test-toolNguyễn Thái Ngọc Duy1-4/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27perf: use GIT_PERF_REPEAT_COUNT=3 by default even without config fileRené Scharfe1-5/+3
9ba95ed23c (perf/run: update get_var_from_env_or_config() for subsections) stopped setting a default value for GIT_PERF_REPEAT_COUNT if no perf config file is present, because get_var_from_env_or_config returns early in that case. Fix it by setting the default value after calling this function. Its fifth parameter is not used for any other variable, so remove the associated code. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
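The fix described above amounts to applying the default after the helper has run, so that an early return cannot skip it. A minimal sketch of that pattern follows; `get_var_from_config_sketch` and `apply_repeat_count_default` are hypothetical names used only for illustration and are not the functions in the `run` script.

```shell
#!/bin/sh
# Hypothetical stand-in for the real helper: it returns early when no
# config file is in use, so it cannot be relied on to apply a default.
get_var_from_config_sketch () {
	test -n "${GIT_PERF_CONFIG_FILE-}" || return 0
	# Reading the config file is omitted from this sketch.
	return 0
}

# Hypothetical wrapper: call the helper, then apply the default.
apply_repeat_count_default () {
	get_var_from_config_sketch
	# Setting the default after the call covers every code path,
	# including the early return above.
	: "${GIT_PERF_REPEAT_COUNT:=3}"
	export GIT_PERF_REPEAT_COUNT
}
```

A value already present in the environment is left untouched, since `:=` only assigns when the variable is unset or empty.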
2018-02-15Merge branch 'cc/perf-aggregate'Junio C Hamano1-11/+37
"make perf" enhancement. * cc/perf-aggregate: perf/aggregate: sort JSON fields in output perf/aggregate: add --reponame option perf/aggregate: add --subsection option
2018-02-13Merge branch 'ab/simplify-perl-makefile'Junio C Hamano1-1/+1
The build procedure for perl/ part has been greatly simplified by weaning ourselves off of MakeMaker. * ab/simplify-perl-makefile: perl: treat PERLLIB_EXTRA as an extra path again perl: avoid *.pmc and fix Error.pm further Makefile: replace perl/Makefile.PL with simple make rules
2018-02-02perf/aggregate: sort JSON fields in outputChristian Couder1-1/+1
It is much easier to diff the output against a previous one when the fields are sorted. Helped-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02perf/aggregate: add --reponame optionChristian Couder1-2/+13
This makes it easier to use the aggregate script on the command line when one wants to get the "environment" fields set in the codespeed output. Previously setting GIT_REPO_NAME was needed for this purpose. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02perf/aggregate: add --subsection optionChristian Couder1-9/+24
This makes it easier to use the aggregate script on the command line, to get results from subsections. Previously setting GIT_PERF_SUBSECTION was needed for this purpose. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-23Merge branch 'cc/codespeed'Junio C Hamano2-54/+137
"perf" test output can be sent to codespeed server. * cc/codespeed: perf/run: read GIT_PERF_REPO_NAME from perf.repoName perf/run: learn to send output to codespeed server perf/run: learn about perf.codespeedOutput perf/run: add conf_opts argument to get_var_from_env_or_config() perf/aggregate: implement codespeed JSON output perf/aggregate: refactor printing results perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}
2018-01-05perf/run: read GIT_PERF_REPO_NAME from perf.repoNameChristian Couder1-0/+3
The GIT_PERF_REPO_NAME env variable is used in the `aggregate.perl` script to set the 'environment' field in the JSON Codespeed output. Let's make it easy to set this variable by setting it in a config file. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/run: learn to send output to codespeed serverChristian Couder1-1/+11
Let's make it possible to set in a config file the URL of a codespeed server. And then let's make the `run` script send the perf test results to this URL at the end of the tests. This should make it possible to easily automate the process of running perf tests and having their results available in Codespeed. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/run: learn about perf.codespeedOutputChristian Couder1-1/+6
Let's make it possible to set in a config file the output format (regular or codespeed) of the perf tests. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/run: add conf_opts argument to get_var_from_env_or_config()Christian Couder1-5/+6
Let's make it possible to use `git config` type specifiers like `--int` or `--bool`, so that config values are converted to the canonical form and easier to use. This additional argument is now the fourth argument of get_var_from_env_or_config() instead of the fifth because we want the default value argument to be unset if it is not passed, and this is simpler if it is the last argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/aggregate: implement codespeed JSON outputChristian Couder1-2/+62
Codespeed (https://github.com/tobami/codespeed/) is an open source project that can be used to track how some software performs over time. It stores performance test results in a database and can show nice graphs and charts on a web interface. As it can be interesting to use Codespeed to see how Git performance evolves over time and releases, let's implement a Codespeed output in "perf/aggregate.perl". Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/aggregate: refactor printing resultsChristian Couder1-46/+50
As we want to implement another kind of output than the current output for the perf test results, let's refactor the existing code that outputs the results in its own print_default_results() function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}Christian Couder1-1/+1
The way we check ENV{GIT_PERF_SUBSECTION} could trigger comparison between undef and "" that may be flagged by use of strict & warnings. Let's fix that. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-04perf: amend the grep tests to test grep.threadsÆvar Arnfjörð Bjarmason2-21/+86
Ever since 5b594f457a ("Threaded grep", 2010-01-25) the number of threads git-grep uses under PTHREADS has been hardcoded to 8, but there's no performance test to check whether this is an optimal setting. Amend the existing tests for the grep engines to support a mode where this can be tested, e.g.: GIT_PERF_GREP_THREADS='1 8 16' GIT_PERF_LARGE_REPO=~/g/linux ./run p782* Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
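A space-separated value such as `GIT_PERF_GREP_THREADS='1 8 16'` can be expanded into one test case per thread count. The sketch below only illustrates that expansion; `grep_thread_cases` is a hypothetical helper, and the real perf scripts may structure the loop differently.

```shell
#!/bin/sh
# Hypothetical helper: print one grep.threads setting per value in a
# space-separated list such as "1 8 16".
grep_thread_cases () {
	# $1 is left unquoted on purpose so the shell splits it on whitespace.
	for threads in $1
	do
		printf 'grep.threads=%s\n' "$threads"
	done
}

# Fall back to 8 (the previously hardcoded thread count) when unset.
grep_thread_cases "${GIT_PERF_GREP_THREADS:-8}"
```

Each printed setting could then be passed to a command as `git -c grep.threads=<n> grep ...`.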
2017-12-28Merge branch 'bp/fsmonitor'Junio C Hamano1-2/+1
Test fix. * bp/fsmonitor: p7519: improve check for prerequisite WATCHMAN
2017-12-18p7519: improve check for prerequisite WATCHMANRené Scharfe1-2/+1
The return code of command -v with a non-existing command is 1 in bash and 127 in dash. Use that return code directly to allow the script to work with dash and without watchman (e.g. on Debian). While at it stop redirecting the output. stderr is redirected to /dev/null by test_lazy_prereq already, and stdout can actually be useful -- the path of the found watchman executable is sent there, but it's shown only if the script was run with --verbose. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
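Because the failure status of `command -v` differs between shells, a portable check should only distinguish zero from non-zero. A minimal sketch, where `have_command` is a hypothetical helper and not the prerequisite code in p7519:

```shell
#!/bin/sh
# Hypothetical helper: succeed if "$1" is found, fail otherwise.
# The exact non-zero status is shell-specific (the message above cites
# 1 in bash and 127 in dash), so callers must not compare against a
# particular number.
have_command () {
	command -v "$1"
}

# stdout is left alone: on success it carries the path that was found.
if have_command watchman
then
	echo "watchman is available"
else
	echo "watchman is missing"
fi
```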
2017-12-13Merge branch 'ds/for-each-file-in-obj-micro-optim'Junio C Hamano1-0/+4
The code to iterate over loose object files got optimized. * ds/for-each-file-in-obj-micro-optim: sha1_file: use strbuf_add() instead of strbuf_addf()
2017-12-11Makefile: replace perl/Makefile.PL with simple make rulesÆvar Arnfjörð Bjarmason1-1/+1
Replace the perl/Makefile.PL and the fallback perl/Makefile used under NO_PERL_MAKEMAKER=NoThanks with a much simpler implementation heavily inspired by how the i18n infrastructure's build process works[1]. The reason for having the Makefile.PL in the first place is that it was initially[2] building a perl C binding to interface with libgit, functionality that was removed[3] before Git.pm ever made it to the master branch. We've since started maintaining a fallback perl/Makefile, as MakeMaker wouldn't work on some platforms[4]. That's just the tip of the iceberg. We have the PM.stamp hack in the top-level Makefile[5] to detect whether we need to regenerate the perl/perl.mak, which I fixed just recently to deal with issues like the perl version changing from under us[6]. There is absolutely no reason why this needs to be so complex anymore. All we're getting out of this elaborate Rube Goldberg machine is copying perl/* to perl/blib/* as we do a string-replacement on the *.pm files to hardcode @@LOCALEDIR@@ in the source, as well as pod2man-ing Git.pm & friends. So replace the whole thing with something that's pretty much a copy of how we generate po/build/**.mo from po/*.po, just with a small sed(1) command instead of msgfmt. As that's being done rename the files from *.pm to *.pmc just to indicate that they're generated (see "perldoc -f require"). While I'm at it, change the fallback for Error.pm from being something where we'll ship our own Error.pm if one doesn't exist at build time to one where we just use a Git::Error wrapper that'll always prefer the system-wide Error.pm, only falling back to our own copy if it really doesn't exist at runtime. It's now shipped as Git::FromCPAN::Error, making it easy to add other modules to Git::FromCPAN::* in the future if that's needed. Functional changes: * This will not always install into perl's idea of its global "installsitelib". 
This only potentially matters for packagers that need to expose Git.pm for non-git use, and as explained in the INSTALL file there's a trivial workaround. * The scripts themselves will 'use lib' the target directory, but if INSTLIBDIR is set it overrides it. It doesn't have to be this way, it could be set in addition to INSTLIBDIR, but my reading of [7] is that this is the desired behavior. * We don't build man pages for all of the perl modules as we used to, only Git(3pm). As discussed on-list[8], the fact that we were building installed manpages for purely internal APIs like Git::I18N or private-Error.pm was always a bug anyway, and all the Git::SVN::* ones say they're internal APIs. There are apparently external users of Git.pm, but I don't expect there to be any of the others. As a side-effect of these general changes the perl documentation is now only installed by install-{doc,man}, not a mere "install" as before. 1. 5e9637c629 ("i18n: add infrastructure for translating Git with gettext", 2011-11-18) 2. b1edc53d06 ("Introduce Git.pm (v4)", 2006-06-24) 3. 18b0fc1ce1 ("Git.pm: Kill Git.xs for now", 2006-09-23) 4. f848718a69 ("Make perl/ build procedure ActiveState friendly.", 2006-12-04) 5. ee9be06770 ("perl: detect new files in MakeMaker builds", 2012-07-27) 6. c59c4939c2 ("perl: regenerate perl.mak if perl -V changes", 2017-03-29) 7. 0386dd37b1 ("Makefile: add PERLLIB_EXTRA variable that adds to default perl path", 2013-11-15) 8. 87bmjjv1pu.fsf@evledraar.booking.com ("Re: [PATCH] Makefile: replace perl/Makefile.PL with simple make rules") Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-06Merge branch 'jk/fewer-pack-rescan'Junio C Hamano3-23/+82
Internally we use 0{40} as a placeholder object name to signal the codepath that there is no such object (e.g. the fast-forward check while "git fetch" stores a new remote-tracking ref says "we know there is no 'old' thing pointed at by the ref, as we are creating it anew" by passing 0{40} for the 'old' side), and expect that a codepath to locate an in-core object to return NULL as a sign that the object does not exist. A look-up for an object that does not exist however is quite costly with a repository with a large number of packfiles. This access pattern has been optimized. * jk/fewer-pack-rescan: sha1_file: fast-path null sha1 as a missing object everything_local: use "quick" object existence check p5551: add a script to test fetch pack-dir rescans t/perf/lib-pack: use fast-import checkpoint to create packs p5550: factor out nonsense-pack creation
2017-12-06Merge branch 'cc/perf-run-config'Junio C Hamano3-15/+89
* cc/perf-run-config: perf: store subsection results in "test-results/$GIT_PERF_SUBSECTION/" perf/run: show name of rev being built perf/run: add run_subsection() perf/run: update get_var_from_env_or_config() for subsections perf/run: add get_subsections() perf/run: add calls to get_var_from_env_or_config() perf/run: add GIT_PERF_DIRS_OR_REVS perf/run: add get_var_from_env_or_config() perf/run: add '--config' option to the 'run' script
2017-12-04sha1_file: use strbuf_add() instead of strbuf_addf()Derrick Stolee1-0/+4
Replace use of strbuf_addf() with strbuf_add() when enumerating loose objects in for_each_file_in_obj_subdir(). Since we already check the length and hex-values of the string before consuming the path, we can prevent extra computation by using the lower-level method. One consumer of for_each_file_in_obj_subdir() is the abbreviation code. OID abbreviations use a cached list of loose objects (per object subdirectory) to make repeated queries fast, but there is significant cache load time when there are many loose objects. Most repositories do not have many loose objects before repacking, but in the GVFS case the repos can grow to have millions of loose objects. Profiling 'git log' performance in GitForWindows on a GVFS-enabled repo with ~2.5 million loose objects revealed 12% of the CPU time was spent in strbuf_addf(). Add a new performance test to p4211-line-log.sh that is more sensitive to this cache-loading. By limiting to 1000 commits, we more closely resemble user wait time when reading history into a pager. For a copy of the Linux repo with two ~512 MB packfiles and ~572K loose objects, running 'git log --oneline --parents --raw -1000' had the following performance: HEAD~1 HEAD ---------------------------------------- 7.70(7.15+0.54) 7.44(7.09+0.29) -3.4% Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
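The -3.4% figure in the timing table above can be reproduced from the two elapsed times. The sketch below assumes the relative change is computed as 100 * (new - base) / base and printed with one decimal place. That formula is an assumption made for illustration, and `pct_change` is a hypothetical helper, not part of the perf framework.

```shell
#!/bin/sh
# Hypothetical helper: signed percentage change from a baseline time
# to a new time, rounded to one decimal place.
pct_change () {
	awk -v base="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", 100 * (new - base) / base }'
}

# Elapsed times from the table above: HEAD~1 = 7.70s, HEAD = 7.44s.
pct_change 7.70 7.44    # prints -3.4%
```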
2017-11-21Merge branch 'bp/fsmonitor'Junio C Hamano1-0/+184
We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64
2017-11-21p5551: add a script to test fetch pack-dir rescansJeff King1-0/+55
Since fetch often deals with object-ids we don't have (yet), it's an easy mistake for it to use a function like parse_object() that gives the correct result (e.g., NULL) but does so very slowly (because after failing to find the object, we re-scan the pack directory looking for new packs). The regular test suite won't catch this because the end result is correct, but we would want to know about performance regressions, too. Let's add a test to the regression suite. Note that this uses a synthetic repository that has a large number of packs. That's not ideal, as it means we're not testing what "normal" users see (in fact, some of these problems have existed for ages without anybody noticing simply because a rescan on a normal repository just isn't that expensive). So what we're really looking for here is the spike you'd notice in a pathological case (a lot of unknown objects coming into a repo with a lot of packs). If that's fast, then the normal cases should be, too. Note that the test also makes liberal use of $MODERN_GIT for setup; some of these regressions go back a ways, and we should be able to use it to find the problems there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21t/perf/lib-pack: use fast-import checkpoint to create packsJeff King1-7/+3
We currently use fast-import only to create a large number of objects, and then run O(n) invocations of pack-objects to turn them into packs. We can do this faster by just asking fast-import to checkpoint and create a pack for each (after telling it not to turn loose tiny packs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21p5550: factor out nonsense-pack creationJeff King2-23/+31
We have a function to create a bunch of irrelevant packs to measure the expense of reprepare_packed_git(). Let's make that available to other perf scripts. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-13p4211-line-log.sh: add log --oneline --raw --parents perf testDerrick Stolee1-0/+4
Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01fsmonitor: add a performance testBen Peart1-0/+184
Add a test utility (test-drop-caches) that flushes all changes to disk then drops file system cache on Windows, Linux, and OSX. Add a perf test (p7519-fsmonitor.sh) for fsmonitor. By default, the performance test will utilize the Watchman file system monitor if it is installed. If Watchman is not installed, it will use a dummy integration script that does not report any new or modified files. The dummy script has very little overhead which provides optimistic results. The performance test will also use the untracked cache feature if it is available as fsmonitor uses it to speed up scanning for untracked files. There are 4 environment variables that can be used to alter the default behavior of the performance test: GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests The big win for using fsmonitor is the elimination of the need to scan the working directory looking for changed and untracked files. If the file information is all cached in RAM, the benefits are reduced. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf: store subsection results in "test-results/$GIT_PERF_SUBSECTION/"Christian Couder2-3/+9
When tests are run for a subsection defined in a config file, it is better if the results for the current subsection do not overwrite the results of a previous subsection. So let's store the results for a subsection in a subdirectory of "test-results/" with the subsection name. The aggregate.perl script, when it is run for a subsection, should then aggregate the results found in "test-results/$GIT_PERF_SUBSECTION/". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
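The directory layout described above can be sketched as a small path computation. `results_dir_for` is a hypothetical helper for illustration; the actual scripts may build the path differently.

```shell
#!/bin/sh
# Hypothetical helper: choose the results directory, adding a
# per-subsection subdirectory only when a subsection name is given.
results_dir_for () {
	base=$1
	subsection=${2-}
	if test -n "$subsection"
	then
		printf '%s/%s\n' "$base" "$subsection"
	else
		printf '%s\n' "$base"
	fi
}

# With GIT_PERF_SUBSECTION unset, this prints the plain base directory.
results_dir_for test-results "${GIT_PERF_SUBSECTION-}"
```

Keeping each subsection in its own subdirectory means a later run cannot clobber the results of an earlier one.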
2017-09-24perf/run: show name of rev being builtChristian Couder1-2/+3
It is nice to show the user not just the sha1 of the revision currently being built but also the actual name of this revision. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add run_subsection()Christian Couder1-12/+35
Let's actually use the subsections we find in the config file to run the perf tests separately for each subsection. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: update get_var_from_env_or_config() for subsectionsChristian Couder1-12/+20
As we will set some config options in subsections, let's teach get_var_from_env_or_config() to get the config options from the subsections if they are set there. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add get_subsections()Christian Couder1-0/+7
This function makes it possible to find subsections, so that we will be able to run different tests for different subsections in a later commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add calls to get_var_from_env_or_config()Christian Couder1-0/+3
These calls make it possible to have the make command or the make options in a config file, instead of in environment variables. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add GIT_PERF_DIRS_OR_REVSChristian Couder1-0/+3
This environment variable can be set to some revisions or directories whose Git versions should be tested, in addition to the revisions or directories passed as arguments to the 'run' script. This enables a "perf.dirsOrRevs" configuration variable to be used to set revisions or directories whose Git versions should be tested. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add get_var_from_env_or_config()Christian Couder2-3/+21
Add get_var_from_env_or_config() to easily set variables from a config file if they are defined there and not already set. This can also set them to a default value if one is provided. As an example, use this function to set GIT_PERF_REPEAT_COUNT from the perf.repeatCount config option or from the default value. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24perf/run: add '--config' option to the 'run' scriptChristian Couder1-1/+6
It is error prone and tiring to use many long environment variables to give parameters to the 'run' script. Let's make it easy to store some parameters in a config file and to pass them to the run script. The GIT_PERF_CONFIG_FILE variable will be set to the argument of the '--config' option. This variable is not used yet. It will be used in a following commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-21perf: add test for writing the indexKevin Willford1-0/+29
Add a performance test for writing the index, to be able to determine whether changes to how the ondisk structure is allocated help. Signed-off-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05Merge branch 'rs/sha1-name-readdir-optim'Junio C Hamano1-0/+16
Optimize "what are the object names already taken in an alternate object database?" query that is used to derive the length of prefix an object name is uniquely abbreviated to. * rs/sha1-name-readdir-optim: sha1_file: guard against invalid loose subdirectory numbers sha1_file: let for_each_file_in_obj_subdir() handle subdir names p4205: add perf test script for pretty log formats sha1_name: cache readdir(3) results in find_short_object_filename()
2017-06-24p4205: add perf test script for pretty log formatsRené Scharfe1-0/+16
Add simple performance tests for expanded log format placeholders. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13Merge branch 'jh/memihash-opt' into maintJunio C Hamano1-5/+42
perf-test update. * jh/memihash-opt: p0004: don't error out if test repo is too small p0004: don't abort if multi-threaded is too slow p0004: use test_perf p0004: avoid using pipes p0004: simplify calls of test-lazy-init-name-hash
2017-06-05perf: work around the tested repo having an index.lockÆvar Arnfjörð Bjarmason1-1/+8
When the tested repo has an index.lock file it should be removed. This file may be present if e.g. git-status previously crashed in that repo, and it will make a lot of git commands fail. Let's try harder and remove the lock. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
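The workaround can be sketched as a guarded removal of the lock file in the copy of the repository that the tests run against. `remove_stale_index_lock` is a hypothetical helper, and the sketch assumes a non-bare layout with a `.git` directory. It is not the code in perf-lib.sh.

```shell
#!/bin/sh
# Hypothetical helper: remove a leftover index.lock from the working
# copy at "$1". Intended for a private copy of a repository, never for
# one that another git process may be using.
remove_stale_index_lock () {
	lock="$1/.git/index.lock"
	if test -f "$lock"
	then
		rm -f "$lock"
	fi
}
```

Calling it on a directory without a lock file does nothing, so it is safe to run unconditionally after copying a repository.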
2017-06-02Merge branch 'ab/grep-preparatory-cleanup'Junio C Hamano6-3/+221
The internal implementation of "git grep" has seen some clean-up. * ab/grep-preparatory-cleanup: (31 commits) grep: assert that threading is enabled when calling grep_{lock,unlock} grep: given --threads with NO_PTHREADS=YesPlease, warn pack-objects: fix buggy warning about threads pack-objects & index-pack: add test for --threads warning test-lib: add a PTHREADS prerequisite grep: move is_fixed() earlier to avoid forward declaration grep: change internal *pcre* variable & function names to be *pcre1* grep: change the internal PCRE macro names to be PCRE1 grep: factor test for \0 in grep patterns into a function grep: remove redundant regflags assignments grep: catch a missing enum in switch statement perf: add a comparison test of log --grep regex engines with -F perf: add a comparison test of log --grep regex engines perf: add a comparison test of grep regex engines with -F perf: add a comparison test of grep regex engines perf: emit progress output when unpacking & building perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do grep: add tests to fix blind spots with \0 patterns grep: prepare for testing binary regexes containing rx metacharacters grep: add a test helper function for less verbose -f \0 tests ...
2017-05-30Merge branch 'jh/memihash-opt'Junio C Hamano1-5/+42
perf-test update. * jh/memihash-opt: p0004: don't error out if test repo is too small p0004: don't abort if multi-threaded is too slow p0004: use test_perf p0004: avoid using pipes p0004: simplify calls of test-lazy-init-name-hash
2017-05-30Merge branch 'ab/perf-wildmatch'Junio C Hamano3-4/+59
Add perf-test for wildmatch. * ab/perf-wildmatch: perf: add test showing exponential growth in path globbing perf: add function to setup a fresh test repo
2017-05-26perf: add a comparison test of log --grep regex engines with -FÆvar Arnfjörð Bjarmason1-0/+44
Add a performance comparison test of log --grep regex engines given fixed strings. See the preceding fixed-string t/perf change ("perf: add a comparison test of grep regex engines with -F", 2017-04-21) for notes about this. In particular, this mostly tests exactly the same codepath now, but might not in the future:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p4221-log-grep-engines-fixed.sh
    [...]
    Test                                    this tree
    -------------------------------------------------------
    4221.1: fixed log --grep='int'          5.99(5.55+0.40)
    4221.2: basic log --grep='int'          5.92(5.56+0.31)
    4221.3: extended log --grep='int'       6.01(5.51+0.45)
    4221.4: perl log --grep='int'           5.99(5.56+0.38)
    4221.6: fixed log --grep='uncommon'     5.06(4.76+0.27)
    4221.7: basic log --grep='uncommon'     5.02(4.78+0.21)
    4221.8: extended log --grep='uncommon'  4.99(4.78+0.20)
    4221.9: perl log --grep='uncommon'      5.00(4.72+0.26)
    4221.11: fixed log --grep='æ'           5.35(5.12+0.20)
    4221.12: basic log --grep='æ'           5.34(5.11+0.20)
    4221.13: extended log --grep='æ'        5.39(5.10+0.22)
    4221.14: perl log --grep='æ'            5.44(5.16+0.23)
Only the non-ASCII -i case is different:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_4221_LOG_OPTS=' -i' ./run p4221-log-grep-engines-fixed.sh
    [...]
    Test                                       this tree
    ----------------------------------------------------------
    4221.1: fixed log -i --grep='int'          6.17(5.77+0.35)
    4221.2: basic log -i --grep='int'          6.16(5.59+0.39)
    4221.3: extended log -i --grep='int'       6.15(5.70+0.39)
    4221.4: perl log -i --grep='int'           6.15(5.69+0.38)
    4221.6: fixed log -i --grep='uncommon'     5.10(4.88+0.21)
    4221.7: basic log -i --grep='uncommon'     5.04(4.76+0.25)
    4221.8: extended log -i --grep='uncommon'  5.07(4.82+0.23)
    4221.9: perl log -i --grep='uncommon'      5.03(4.78+0.22)
    4221.11: fixed log -i --grep='æ'           5.93(5.65+0.25)
    4221.12: basic log -i --grep='æ'           5.88(5.62+0.25)
    4221.13: extended log -i --grep='æ'        6.02(5.69+0.29)
    4221.14: perl log -i --grep='æ'            5.36(5.06+0.29)
See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of log --grep regex enginesÆvar Arnfjörð Bjarmason1-0/+53
Add a very basic performance comparison test comparing the POSIX basic, extended and perl engines with patterns matching log messages via --grep=<pattern>.
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p4220-log-grep-engines.sh
    [...]
    Test                                                 this tree
    --------------------------------------------------------------------
    4220.1: basic log --grep='how.to'                    6.22(6.00+0.21)
    4220.2: extended log --grep='how.to'                 6.23(5.98+0.23)
    4220.3: perl log --grep='how.to'                     6.07(5.79+0.25)
    4220.5: basic log --grep='^how to'                   6.19(5.93+0.22)
    4220.6: extended log --grep='^how to'                6.19(5.93+0.23)
    4220.7: perl log --grep='^how to'                    6.14(5.88+0.24)
    4220.9: basic log --grep='[how] to'                  6.96(6.65+0.28)
    4220.10: extended log --grep='[how] to'              6.96(6.69+0.24)
    4220.11: perl log --grep='[how] to'                  6.95(6.58+0.33)
    4220.13: basic log --grep='\(e.t[^ ]*\|v.ry\) rare'  7.10(6.80+0.27)
    4220.14: extended log --grep='(e.t[^ ]*|v.ry) rare'  7.07(6.80+0.26)
    4220.15: perl log --grep='(e.t[^ ]*|v.ry) rare'      7.70(7.46+0.22)
    4220.17: basic log --grep='m\(ú\|u\)lt.b\(æ\|y\)te'  6.12(5.87+0.24)
    4220.18: extended log --grep='m(ú|u)lt.b(æ|y)te'     6.14(5.84+0.26)
    4220.19: perl log --grep='m(ú|u)lt.b(æ|y)te'         6.16(5.93+0.20)
With -i:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_4220_LOG_OPTS=' -i' ./run p4220-log-grep-engines.sh
    [...]
    Test                                                    this tree
    -----------------------------------------------------------------------
    4220.1: basic log -i --grep='how.to'                    6.74(6.41+0.32)
    4220.2: extended log -i --grep='how.to'                 6.78(6.55+0.22)
    4220.3: perl log -i --grep='how.to'                     6.06(5.77+0.28)
    4220.5: basic log -i --grep='^how to'                   6.80(6.57+0.22)
    4220.6: extended log -i --grep='^how to'                6.83(6.52+0.29)
    4220.7: perl log -i --grep='^how to'                    6.16(5.94+0.20)
    4220.9: basic log -i --grep='[how] to'                  7.87(7.61+0.24)
    4220.10: extended log -i --grep='[how] to'              7.85(7.57+0.27)
    4220.11: perl log -i --grep='[how] to'                  7.03(6.75+0.25)
    4220.13: basic log -i --grep='\(e.t[^ ]*\|v.ry\) rare'  8.68(8.41+0.25)
    4220.14: extended log -i --grep='(e.t[^ ]*|v.ry) rare'  8.80(8.44+0.28)
    4220.15: perl log -i --grep='(e.t[^ ]*|v.ry) rare'      7.85(7.56+0.26)
    4220.17: basic log -i --grep='m\(ú\|u\)lt.b\(æ\|y\)te'  6.94(6.68+0.24)
    4220.18: extended log -i --grep='m(ú|u)lt.b(æ|y)te'     7.04(6.76+0.24)
    4220.19: perl log -i --grep='m(ú|u)lt.b(æ|y)te'         6.26(5.92+0.29)
See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Before commit ("log: make --regexp-ignore-case work with --perl-regexp", 2017-05-20) this test will almost definitely fail (depending on the repo) if passed the -i option, since it wasn't properly supported under PCRE. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of grep regex engines with -FÆvar Arnfjörð Bjarmason1-0/+41
Add a performance comparison test of grep regex engines given fixed strings. The current logic in compile_regexp() ignores the engine parameter and uses kwset() to search for these, so this test shows no difference between engines right now:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7821-grep-engines-fixed.sh
    [...]
    Test                            this tree
    -----------------------------------------------
    7821.1: fixed grep int          0.56(1.67+0.68)
    7821.2: basic grep int          0.57(1.70+0.57)
    7821.3: extended grep int       0.59(1.76+0.51)
    7821.4: perl grep int           1.08(1.71+0.55)
    7821.6: fixed grep uncommon     0.23(0.55+0.50)
    7821.7: basic grep uncommon     0.24(0.55+0.50)
    7821.8: extended grep uncommon  0.26(0.55+0.52)
    7821.9: perl grep uncommon      0.24(0.58+0.47)
    7821.11: fixed grep æ           0.36(1.30+0.42)
    7821.12: basic grep æ           0.36(1.32+0.40)
    7821.13: extended grep æ        0.38(1.30+0.42)
    7821.14: perl grep æ            0.35(1.24+0.48)
Only when run with -i via GIT_PERF_7821_GREP_OPTS=' -i' do we avoid going through the same kwset.[ch] codepath, see the "Even when -F..." comment in grep.c. This only kicks in for the non-ASCII case:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7821_GREP_OPTS=' -i' ./run p7821-grep-engines-fixed.sh
    [...]
    Test                               this tree
    --------------------------------------------------
    7821.1: fixed grep -i int          0.62(2.10+0.57)
    7821.2: basic grep -i int          0.68(1.90+0.61)
    7821.3: extended grep -i int       0.78(1.94+0.57)
    7821.4: perl grep -i int           0.98(1.78+0.74)
    7821.6: fixed grep -i uncommon     0.24(0.44+0.64)
    7821.7: basic grep -i uncommon     0.25(0.56+0.54)
    7821.8: extended grep -i uncommon  0.27(0.62+0.45)
    7821.9: perl grep -i uncommon      0.24(0.59+0.49)
    7821.11: fixed grep -i æ           0.30(0.96+0.39)
    7821.12: basic grep -i æ           0.27(0.92+0.44)
    7821.13: extended grep -i æ        0.28(0.90+0.46)
    7821.14: perl grep -i æ            0.28(0.74+0.49)
I'm planning to change how fixed-string searching happens. This test gives a baseline for comparing performance before & after any such change. See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of grep regex enginesÆvar Arnfjörð Bjarmason1-0/+56
Add a very basic performance comparison test comparing the POSIX basic, extended and perl engines. In theory the "basic" and "extended" engines should be implemented using the same underlying code with a slightly different pattern parser, but some implementations may not do this. Jump through some slight hoops to test both, which is worthwhile since "basic" is the default. Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a checkout of linux.git & latest upstream PCRE, both PCRE and git compiled with -O3 using gcc 7.1.1:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
    [...]
    Test                                           this tree
    --------------------------------------------------------------
    7820.1: basic grep 'how.to'                    0.34(1.24+0.53)
    7820.2: extended grep 'how.to'                 0.33(1.23+0.45)
    7820.3: perl grep 'how.to'                     0.31(1.05+0.56)
    7820.5: basic grep '^how to'                   0.32(1.24+0.42)
    7820.6: extended grep '^how to'                0.33(1.20+0.44)
    7820.7: perl grep '^how to'                    0.57(2.67+0.42)
    7820.9: basic grep '[how] to'                  0.51(2.16+0.45)
    7820.10: extended grep '[how] to'              0.49(2.20+0.43)
    7820.11: perl grep '[how] to'                  0.56(2.60+0.43)
    7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare'  0.66(3.25+0.40)
    7820.14: extended grep '(e.t[^ ]*|v.ry) rare'  0.65(3.19+0.46)
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare'      1.05(5.74+0.34)
    7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te'  0.34(1.28+0.47)
    7820.18: extended grep 'm(ú|u)lt.b(æ|y)te'     0.34(1.38+0.38)
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'         0.39(1.56+0.44)
Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS environment variable. There are various modes such as "-v" that have very different performance profiles, but handling the combinatorial explosion of testing all those options would make this script much more complex and harder to maintain. Instead just add the ability to do one-shot runs with arbitrary options, e.g.:
    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh
    [...]
    Test                                              this tree
    -----------------------------------------------------------------
    7820.1: basic grep -i 'how.to'                    0.49(1.72+0.38)
    7820.2: extended grep -i 'how.to'                 0.46(1.64+0.42)
    7820.3: perl grep -i 'how.to'                     0.44(1.45+0.45)
    7820.5: basic grep -i '^how to'                   0.47(1.76+0.38)
    7820.6: extended grep -i '^how to'                0.47(1.70+0.42)
    7820.7: perl grep -i '^how to'                    0.65(2.72+0.37)
    7820.9: basic grep -i '[how] to'                  0.86(3.64+0.42)
    7820.10: extended grep -i '[how] to'              0.84(3.62+0.46)
    7820.11: perl grep -i '[how] to'                  0.73(3.06+0.39)
    7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare'  1.63(8.13+0.36)
    7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare'  1.64(8.01+0.44)
    7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare'      1.44(6.88+0.44)
    7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te'  0.66(2.67+0.44)
    7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te'     0.66(2.67+0.43)
    7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te'         0.59(2.31+0.37)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
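The test titles in the output above suggest why the option variable is given with a leading space (" -i"). A rough sketch, assuming the script simply splices the variable into each test title and command line; the loop and variable names other than GIT_PERF_7820_GREP_OPTS are illustrative and not taken from p7820 itself:

```shell
#!/bin/sh
# Illustrative only: shows how a value like ' -i' would slot into
# per-engine titles and command lines when it is spliced in verbatim.
GIT_PERF_7820_GREP_OPTS=' -i'
pattern='how.to'

for engine in basic extended perl
do
	title="$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'"
	# grep.patternType selects the engine for a single invocation.
	command="git -c grep.patternType=$engine grep$GIT_PERF_7820_GREP_OPTS -- '$pattern'"
	echo "$title"
	echo "    $command"
	last_title=$title
done
```

With the variable unset or empty, the same splice yields titles without any option, so one script can serve both the default and the one-shot runs.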
2017-05-21perf: emit progress output when unpacking & buildingÆvar Arnfjörð Bjarmason1-0/+2
Amend the t/perf/run output so that in addition to the "Running N tests" heading currently being emitted, it also emits "Unpacking $rev" and "Building $rev" when setting up the build/$rev directory & when building it, respectively. This makes it easier to see what's going on and what revision is being tested as the output scrolls by. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't doÆvar Arnfjörð Bjarmason2-3/+25
Add a GIT_PERF_MAKE_COMMAND variable to complement the existing GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell command to execute instead of 'make'. This is useful e.g. in cases where the name, semantics or defaults of a Makefile flag have changed over time. It can even be used to change the contents of the tree, which is useful for monkeypatching ancient versions of git to get them to build. This opens Pandora's box in some ways: it's now possible to "jailbreak" the perf environment and e.g. modify the source tree via this arbitrary command instead of just issuing a custom "make" command. Such a command has to be re-entrant in the sense that subsequent perf runs will re-use the possibly modified tree. It would be pointless to try to mitigate or work around that caveat in a tool purely aimed at Git developers, so this change makes no attempt to do so. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
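The relationship between the two variables can be sketched like this. The helper choose_build_command is hypothetical and only prints the command that would be used; the actual 'run' script may structure this differently.

```shell
#!/bin/sh
# Hypothetical sketch: prefer GIT_PERF_MAKE_COMMAND when it is set,
# otherwise fall back to plain 'make' with GIT_PERF_MAKE_OPTS.
choose_build_command () {
	if test -n "$GIT_PERF_MAKE_COMMAND"
	then
		printf '%s\n' "$GIT_PERF_MAKE_COMMAND"
	else
		printf '%s\n' "make $GIT_PERF_MAKE_OPTS"
	fi
}

GIT_PERF_MAKE_OPTS='-j8'
GIT_PERF_MAKE_COMMAND=
default_cmd=$(choose_build_command)

GIT_PERF_MAKE_COMMAND='make -j8 CFLAGS=-O3'
custom_cmd=$(choose_build_command)

echo "default: $default_cmd"
echo "custom:  $custom_cmd"
```

Because the custom command is an arbitrary shell string, anything it changes in the build tree persists into later runs, which is the re-entrancy caveat the message describes.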
2017-05-16p0004: don't error out if test repo is too smallRené Scharfe1-5/+8
Repositories with fewer than 4000 entries are always handled using a single thread, causing test-lazy-init-name-hash --multi to error out. Don't abort the whole test script in that case, but simply skip the multi-threaded performance check. We can still use it to compare the single-threaded speed of different versions in that case. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-16p0004: don't abort if multi-threaded is too slowRené Scharfe1-4/+0
If the single-threaded variant beats the multi-threaded one then we may have a performance bug, but that doesn't justify aborting the test. Drop that check; we can compare the results for --single and --multi using the actual performance tests. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-16p0004: use test_perfRené Scharfe1-0/+36
The perf test suite (more specifically: t/perf/aggregate.perl) requires each test script to write test results into a file, otherwise it aborts when aggregating. Add actual performance tests with test_perf to allow p0004 to be run together with other perf scripts. Calibrate the value for the parameter --count based on the size of the test repository, in order to get meaningful results with smaller repos yet still be able to finish the script against huge ones without having to wait for hours. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
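The calibration idea can be sketched as below. The thresholds and repetition counts are made up for illustration and are not the values used by p0004; the function name calibrate_count is likewise hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: pick a repetition count that shrinks as the
# number of index entries grows, so small repos still produce
# measurable timings and huge repos still finish in reasonable time.
# All numbers below are illustrative assumptions.
calibrate_count () {
	entries=$1
	if test "$entries" -lt 1000
	then
		echo 10000
	elif test "$entries" -lt 100000
	then
		echo 100
	else
		echo 1
	fi
}

small_count=$(calibrate_count 500)
huge_count=$(calibrate_count 2000000)
echo "small repo: count=$small_count"
echo "huge repo:  count=$huge_count"
```

The resulting number is what would be handed to the helper through its --count parameter.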
2017-05-16p0004: avoid using pipesRené Scharfe1-3/+5
The return code of commands on the producing end of a pipe is ignored. Evaluate the outcome of test-lazy-init-name-hash by calling sort separately. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
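A small demonstration of the problem, using 'false' as a stand-in for a failing producer such as a crashing test helper. It assumes a plain POSIX shell without any pipefail-style option enabled.

```shell
#!/bin/sh
# In a pipeline, only the exit status of the last command is reported,
# so the failure of the producer is hidden behind sort's success.
pipeline_status=0
false | sort >/dev/null || pipeline_status=$?

# Running the producer on its own, with its output going to a file,
# makes its exit status visible before sort is called separately.
tmp=$(mktemp)
producer_status=0
false >"$tmp" || producer_status=$?
sort <"$tmp" >/dev/null
rm -f "$tmp"

echo "pipeline status: $pipeline_status, producer status: $producer_status"
```

In a test script the separate form lets a `&&` chain stop at the failing helper instead of continuing with empty input.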
2017-05-16p0004: simplify calls of test-lazy-init-name-hashRené Scharfe1-3/+3
The test library puts helpers into $PATH, so we can simply call them without specifying their location. The suffix $X is also not necessary because .exe files on Windows can be started without specifying their extension, and on other platforms it's empty anyway. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-12perf: add test showing exponential growth in path globbingÆvar Arnfjörð Bjarmason1-0/+43
Add a test showing that runtimes of the wildmatch() function used for globbing in git grow exponentially in the face of some pathological globs. This issue affects both globs matching filenames via e.g. ls-files, and globs matching refnames via e.g. for-each-ref. As noted in the test description this is a test to see whether Git suffers from the issue noted in an article Russ Cox posted today about common bugs in various glob implementations: https://research.swtch.com/glob Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-12perf: add function to setup a fresh test repoÆvar Arnfjörð Bjarmason2-4/+16
Add a function to set up a fresh test repo via 'git init' to complement the existing functions to copy over a normal & large repo. Some performance tests don't need any existing repository data at all to be significant, e.g. tests which stress glob matches against single pathological revisions or files, which I'm about to add in a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08p3400: add perf tests for rebasing many changesChristian Couder1-1/+21
Rebasing onto many changes is interesting, but it's also interesting to see what happens when rebasing many changes. And while at it, let's also look at the impact of using a split index. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-01Merge branch 'ab/align-perf-descriptions'Junio C Hamano2-0/+5
Output from perf tests has been updated to align their titles. * ab/align-perf-descriptions: t/perf: correctly align non-ASCII descriptions in output
2017-04-26Merge branch 'jh/add-index-entry-optim'Junio C Hamano4-0/+263
"git checkout" that handles a lot of paths has been optimized by reducing the number of unnecessary checks of paths in the has_dir_name() function. * jh/add-index-entry-optim: read-cache: speed up has_dir_name (part 2) read-cache: speed up has_dir_name (part 1) read-cache: speed up add_index_entry during checkout p0006-read-tree-checkout: perf test to time read-tree read-cache: add strcmp_offset function
2017-04-23Merge branch 'jh/string-list-micro-optim'Junio C Hamano1-0/+49
The string-list API used a custom reallocation strategy that was very inefficient, instead of using the usual ALLOC_GROW() macro, which has been fixed. * jh/string-list-micro-optim: string-list: use ALLOC_GROW macro when reallocing string_list
2017-04-23t/perf: correctly align non-ASCII descriptions in outputÆvar Arnfjörð Bjarmason2-0/+5
Change the test descriptions from being treated as binary blobs by perl to being treated as UTF-8. This ensures that e.g. a test description like "æ" is counted as 1 character, not 2. I have WIP performance tests for non-ASCII grep patterns on another topic that are affected by this. Now instead of: $ ./run p0000-perf-lib-sanity.sh [...] 0000.4: export a weird var 0.00(0.00+0.00) 0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00) 0000.7: important variables available in subshells 0.00(0.00+0.00) [...] We emit: [...] 0000.4: export a weird var 0.00(0.00+0.00) 0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00) 0000.7: important variables available in subshells 0.00(0.00+0.00) [...] Fixes code originally added in 342e9ef2d9 ("Introduce a performance testing framework", 2012-02-17). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19Merge branch 'jh/memihash-opt'Junio C Hamano1-0/+0
Hotfix for a topic that is already in 'master'. * jh/memihash-opt: p0004: make perf test executable t3008: skip lazy-init test on a single-core box test-online-cpus: helper to return cpu count name-hash: fix buffer overrun
2017-04-19p0006-read-tree-checkout: perf test to time read-treeJeff Hostetler4-0/+263
Created t/perf/repos/many-files.sh to generate large, but artificial repositories. Created t/perf/inflate-repo.sh to alter an EXISTING repo to have a set of large commits. This can be used to create a branch with 1M+ files in repositories like git.git or linux.git, but with more realistic content. It does this by making multiple copies of the entire worktree in a series of sub-directories. The branch name and ballast structure created by both scripts match, so either script can be used to generate very large test repositories for the following perf test. Created t/perf/p0006-read-tree-checkout.sh to measure performance on various read-tree, checkout, and update-index operations. This test can run using either normal repos or ones from the above scripts. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18p0004: make perf test executableChristian Couder1-0/+0
It looks like in 89c3b0ad43 (name-hash: add perf test for lazy_init_name_hash, 2017-03-23) p0004 was not created with the Unix execute permission. Let's fix that. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-15string-list: use ALLOC_GROW macro when reallocing string_listJeff Hostetler1-0/+49
Use the ALLOC_GROW() macro when reallocing a string_list array rather than simply increasing it by 32. This is a performance optimization. During status on a very large repo with many changes, a significant percentage of the total run time is spent reallocing the wt_status.changes array. This change decreases the time in wt_status_collect_changes_worktree() from 125 seconds to 45 seconds on my very large repository. This produced a modest gain on my 1M file artificial repo, but broke even on linux.git.
    Test                                           HEAD^^           HEAD
    ---------------------------------------------------------------------------------------
    0005.2: read-tree status br_ballast (1000001)  8.29(5.62+2.62)  8.22(5.57+2.63) -0.8%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28Merge branch 'jh/memihash-opt'Junio C Hamano1-0/+19
The name-hash used for detecting paths that are different only in cases (which matter on case insensitive filesystems) has been optimized to take advantage of multi-threading when it makes sense. * jh/memihash-opt: name-hash: add test-lazy-init-name-hash to .gitignore name-hash: add perf test for lazy_init_name_hash name-hash: add test-lazy-init-name-hash name-hash: perf improvement for lazy_init_name_hash hashmap: document memihash_cont, hashmap_disallow_rehash api hashmap: add disallow_rehash setting hashmap: allow memihash computation to be continued name-hash: specify initial size for istate.dir_hash table
2017-03-24name-hash: add perf test for lazy_init_name_hashJeff Hostetler1-0/+19
Created t/perf/p0004-lazy-init-name-hash.sh test to demonstrate correctness and performance gains with the multithreaded version of lazy_init_name_hash(). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-14Merge branch 'dp/filter-branch-prune-empty'Junio C Hamano1-0/+5
"git filter-branch --prune-empty" drops a single-parent commit that becomes a no-op, but did not drop a root commit whose tree is empty. * dp/filter-branch-prune-empty: p7000: add test for filter-branch with --prune-empty filter-branch: fix --prune-empty on parentless commits t7003: ensure --prune-empty removes entire branch when applicable t7003: ensure --prune-empty can prune root commit
2017-03-03p7000: add test for filter-branch with --prune-emptyDevin J. Pohly1-0/+5
Signed-off-by: Devin J. Pohly <djpohly@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-03t/perf: add fallback for pre-bin-wrappers versions of gitJeff King1-0/+3
It's tempting to say: ./run v1.0.0 HEAD to see how we've sped up Git over the years. Unfortunately, this doesn't quite work because versions of Git prior to v1.7.0 lack bin-wrappers, so our "run" script doesn't correctly put them in the PATH. Worse, it means we silently find whatever other "git" is in the PATH, and produce test results that have no bearing on what we asked for. Let's fall back to the main git directory when bin-wrappers isn't present. Many modern perf scripts won't run with such an antique version of Git, of course, but at least those failures are detected and reported (and you're free to write a limited perf script that works across many versions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
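The fallback can be sketched as follows. The function git_path_for_build is hypothetical, and two temporary directories stand in for a modern and an antique build tree; the real "run" script may implement the check differently.

```shell
#!/bin/sh
# Hypothetical sketch: decide which directory of a build tree should be
# put in front of PATH so that the freshly built "git" is the one found.
git_path_for_build () {
	build_dir="$1"
	if test -d "$build_dir/bin-wrappers"
	then
		printf '%s\n' "$build_dir/bin-wrappers"
	else
		# Trees without bin-wrappers keep the binary at the top level.
		printf '%s\n' "$build_dir"
	fi
}

# Demonstration with throwaway directories instead of real builds.
modern=$(mktemp -d)
mkdir "$modern/bin-wrappers"
antique=$(mktemp -d)

modern_path=$(git_path_for_build "$modern")
antique_path=$(git_path_for_build "$antique")
echo "modern:  $modern_path"
echo "antique: $antique_path"
rm -rf "$modern" "$antique"
```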
2017-03-03t/perf: use $MODERN_GIT for all repo-copying stepsJeff King1-2/+2
Since 1a0962dee (t/perf: fix regression in testing older versions of git, 2016-06-22), we point "$MODERN_GIT" to a copy of git that matches the t/perf script itself, and which can be used for tasks outside of the actual timings. This is needed because the setup done by perf scripts keeps moving forward in time, and may use features that the older versions of git we are testing do not have. That commit used $MODERN_GIT to fix a case where we relied on the relatively recent --git-path option. But if you go back further still, there are more problems. Since 7501b5921 (perf: make the tests work in worktrees, 2016-05-13), we use "git -C", but versions of git older than 44e1e4d67 (git: run in a directory given with -C option, 2013-09-09) don't know about "-C". So testing an old version of git with a new version of t/perf will fail the setup step. We can fix this by using $MODERN_GIT during the setup; there's no need to use the antique version, since it doesn't affect the timings. Likewise, we'll adjust the "init" invocation; antique versions of git called this "init-db". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-03t/perf: export variable used in other blocksJonathan Tan1-1/+2
In p0001, a variable was created in a test_expect_success block to be used in later test_perf blocks, but was not exported. This caused the variable to not appear in those blocks (this can be verified by writing 'test -n "$commit"' in those blocks), resulting in a slightly different invocation than what was intended. Export that variable. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
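The effect is easy to reproduce outside the test library. In this sketch a child `sh -c` process plays the role of the separately run test_perf block, and the value abc123 is a placeholder rather than a real commit name.

```shell
#!/bin/sh
# A plain shell variable is not passed to child processes; only an
# exported one is. The child shell stands in for a later test block.
unset commit
commit=abc123

seen_without_export=$(sh -c 'printf "%s" "$commit"')

export commit
seen_with_export=$(sh -c 'printf "%s" "$commit"')

echo "without export: '$seen_without_export'"
echo "with export:    '$seen_with_export'"
```

An empty value in the later block does not necessarily make the command fail, which is why the mistake only showed up as a slightly different invocation.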
2017-02-10Merge branch 'rs/p5302-create-repositories-before-tests'Junio C Hamano1-0/+7
Adjust a perf test to the new world order where commands that do require a repository are really strict about having a repository. * rs/p5302-create-repositories-before-tests: p5302: create repositories for index-pack results explicitly