| Age | Commit message (Collapse) | Author | Files | Lines |
|
Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.
* ab/lose-grep-debug:
grep/log: remove hidden --debug and --grep-debug options
|
|
Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.
* jk/use-oid-pos:
oid_pos(): access table through const pointers
hash_pos(): convert to oid_pos()
rerere: use strmap to store rerere directories
rerere: tighten rr-cache dirname check
rerere: check dirname format while iterating rr_cache directory
commit_graft_pos(): take an oid instead of a bare hash
|
|
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Our setting of GitHub CI test jobs were a bit too eager to give up
once there is even one failure found. Tweak the knob to allow
other jobs keep running even when we see a failure, so that we can
find more failures in a single run.
* pb/ci-matrix-wo-shortcut:
ci: do not cancel all jobs of a matrix if one fails
|
|
Test fix.
* pb/blame-funcname-range-userdiff:
annotate-tests: quote variable expansions containing path names
|
|
A perf script was made more portable.
* jk/p5303-sed-portability-fix:
p5303: avoid sed GNU-ism
|
|
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.
* ab/branch-sort:
branch: show "HEAD detached" first under reverse sort
branch: sort detached HEAD based on a flag
ref-filter: move ref_sorting flags to a bitfield
ref-filter: move "cmp_fn" assignment into "else if" arm
ref-filter: add braces to if/else if/else chain
branch tests: add to --sort tests
branch: change "--local" to "--list" in comment
|
|
Code clean-up.
* ma/more-opaque-lock-file:
read-cache: try not to peek into `struct {lock_,temp}file`
refs/files-backend: don't peek into `struct lock_file`
midx: don't peek into `struct lock_file`
commit-graph: don't peek into `struct lock_file`
builtin/gc: don't peek into `struct lock_file`
|
|
Text encoding fix for "git p4".
* dl/p4-encode-after-kw-expansion:
git-p4: fix syncing file types with pattern
|
|
Test update.
* ar/t6016-modernise:
t6016: move to lib-log-graph.sh framework
|
|
Clean up option descriptions in "git cmd --help".
* zh/arg-help-format:
builtin/*: update usage format
parse-options: format argh like error messages
|
|
Doc update.
* ma/doc-pack-format-varint-for-sizes:
pack-format.txt: document sizes at start of delta data
|
|
Code clean-up.
* ma/t1300-cleanup:
t1300: don't needlessly work with `core.foo` configs
t1300: remove duplicate test for `--file no-such-file`
t1300: remove duplicate test for `--file ../foo`
|
|
A 3-year old test that was not testing anything useful has been
corrected.
* fc/t6030-bisect-reset-removes-auxiliary-files:
test: bisect-porcelain: fix location of files
|
|
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Test framework fix.
* sg/test-stress-jobs:
test-lib: prevent '--stress-jobs=X' from being ignored
|
|
We've carried compatibility codepaths for compilers without
variadic macros for quite some time, but the world may be ready for
them to be removed. Force compilation failure on exotic platforms
where variadic macros are not available to find out who screams in
such a way that we can easily revert if it turns out that the world
is not yet ready.
* jk/weather-balloon-require-variadic-macro:
git-compat-util: always enable variadic macros
|
|
Our setting of GitHub CI test jobs were a bit too eager to give up
once there is even one failure found. Tweak the knob to allow
other jobs keep running even when we see a failure, so that we can
find more failures in a single run.
* pb/ci-matrix-wo-shortcut:
ci: do not cancel all jobs of a matrix if one fails
|
|
Test fix.
* pb/blame-funcname-range-userdiff:
annotate-tests: quote variable expansions containing path names
|
|
A perf script was made more portable.
* jk/p5303-sed-portability-fix:
p5303: avoid sed GNU-ism
|
|
The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.
* jv/pack-objects-narrower-ref-iteration:
builtin/pack-objects.c: avoid iterating all refs
|
|
When removing many branches and tags, the code used to do so one
ref at a time. There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.
* ph/use-delete-refs:
use delete_refs when deleting tags or branches
|
|
The ls-refs protocol operation has been optimized to narrow the
sub-hierarchy of refs/ it walks to produce response.
* tb/ls-refs-optim:
ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets
ls-refs.c: initialize 'prefixes' before using it
refs: expose 'for_each_fullref_in_prefixes'
|
|
"git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use. A new option --deduplicate has been introduced.
* zh/ls-files-deduplicate:
ls-files.c: add --deduplicate option
ls_files.c: consolidate two for loops into one
ls_files.c: bugfix for --deleted and --modified
|
|
Document, clean-up and optimize the code around the cache-tree
extension in the index.
* ds/cache-tree-basics:
cache-tree: speed up consecutive path comparisons
cache-tree: use ce_namelen() instead of strlen()
index-format: discuss recursion of cache-tree better
index-format: update preamble to cache tree extension
index-format: use 'cache tree' over 'cached tree'
cache-tree: trace regions for prime_cache_tree
cache-tree: trace regions for I/O
cache-tree: use trace2 in cache_tree_update()
unpack-trees: add trace2 regions
tree-walk: report recursion counts
|
|
ORT merge strategy learns more support for merge conflicts.
* en/ort-conflict-handling:
merge-ort: add handling for different types of files at same path
merge-ort: copy find_first_merges() implementation from merge-recursive.c
merge-ort: implement format_commit()
merge-ort: copy and adapt merge_submodule() from merge-recursive.c
merge-ort: copy and adapt merge_3way() from merge-recursive.c
merge-ort: flesh out implementation of handle_content_merge()
merge-ort: handle book-keeping around two- and three-way content merge
merge-ort: implement unique_path() helper
merge-ort: handle directory/file conflicts that remain
merge-ort: handle D/F conflict where directory disappears due to merge
|
|
"git log" learned a new "--diff-merges=<how>" option.
* so/log-diff-merge: (32 commits)
t4013: add tests for --diff-merges=first-parent
doc/git-show: include --diff-merges description
doc/rev-list-options: document --first-parent changes merges format
doc/diff-generate-patch: mention new --diff-merges option
doc/git-log: describe new --diff-merges options
diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
diff-merges: add old mnemonic counterparts to --diff-merges
diff-merges: let new options enable diff without -p
diff-merges: do not imply -p for new options
diff-merges: implement new values for --diff-merges
diff-merges: make -m/-c/--cc explicitly mutually exclusive
diff-merges: refactor opt settings into separate functions
diff-merges: get rid of now empty diff_merges_init_revs()
diff-merges: group diff-merge flags next to each other inside 'rev_info'
diff-merges: split 'ignore_merges' field
diff-merges: fix -m to properly override -c/--cc
t4013: add tests for -m failing to override -c/--cc
t4013: support test_expect_failure through ':failure' magic
diff-merges: revise revs->diff flag handling
diff-merges: handle imply -p on -c/--cc logic for log.c
...
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Build fix.
* js/skip-dashed-built-ins-from-config-mak:
SKIP_DASHED_BUILT_INS: respect `config.mak`
|
|
Doc fix for packfile URI feature.
* jt/packfile-as-uri-doc:
Doc: clarify contents of packfile sent as URI
|
|
Documentation for "git fsck" lost stale bits that has become
incorrect.
* ab/fsck-doc-fix:
fsck doc: remove ancient out-of-date diagnostics
|
|
When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side. Now it
does.
* jk/log-cherry-pick-duplicate-patches:
patch-ids: handle duplicate hashmap entries
|
|
Newline characters in the host and path part of git:// URL are
now forbidden.
* jk/forbid-lf-in-git-url:
fsck: reject .gitmodules git:// urls with newlines
git_connect_git(): forbid newlines in host and path
|
|
Fix for procedure to building CI test environment for mac.
* jc/macos-install-dependencies-fix:
ci/install-depends: attempt to fix "brew cask" stuff
|
|
Doc update.
* tb/local-clone-race-doc:
Documentation/git-clone.txt: document race with --local
|
|
Doc update.
* bc/doc-status-short:
docs: rephrase and clarify the git status --short format
|
|
Comments update.
* ab/gettext-charset-comment-fix:
gettext.c: remove/reword a mostly-useless comment
Makefile: remove a warning about old GETTEXT_POISON flag
|
|
Doc update.
* ug/doc-lose-dircache:
doc: remove "directory cache" from man pages
|
|
Test fix.
* ad/t4129-setfacl-target-fix:
t4129: fix setfacl-related permissions failure
|
|
Test fix.
* jk/t5516-deflake:
t5516: loosen "not our ref" error check
|
|
Doc update.
* vv/send-email-with-less-secure-apps-access:
git-send-email.txt: mention less secure app access with Gmail
|
|
Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.
* pb/mergetool-tool-help-fix:
mergetool--lib: fix '--tool-help' to correctly show available tools
|
|
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.
* ds/for-each-repo-noopfix:
for-each-repo: do nothing on empty config
|
|
Doc update.
* jc/sign-off:
SubmittingPatches: tighten wording on "sign-off" procedure
|
|
Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.
* mt/t4129-with-setgid-dir:
t4129: don't fail if setgid is set in the test directory
|
|
"git stash" did not work well in a sparsely checked out working
tree.
* en/stash-apply-sparse-checkout:
stash: fix stash application in sparse-checkouts
stash: remove unnecessary process forking
t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
|
|
Test fix.
* nk/perf-fsmonitor-cleanup:
p7519: allow running without watchman prereq
|
|
Diagnose command line error of "git rebase" early.
* rs/rebase-commit-validation:
rebase: verify commit parameter
|
|
Doc fix.
* pb/doc-modules-git-work-tree-typofix:
gitmodules.txt: fix 'GIT_WORK_TREE' variable name
|
|
Doc fix.
* ta/doc-typofix:
doc: fix some typos
|
|
"git fetch --recurse-submodules" fix (second attempt).
* pk/subsub-fetch-fix-take-2:
submodules: fix of regression on fetching of non-init subsub-repo
|
|
|
|
The .use_shell flag in struct child_process that is passed to
run_command() API has been clarified with a bit more documentation.
* jk/run-command-use-shell-doc:
run-command: document use_shell option
|
|
The peel_ref() API has been replaced with peel_iterated_oid().
* jk/peel-iterated-oid:
refs: switch peel_ref() to peel_iterated_oid()
|
|
Build fix.
* js/skip-dashed-built-ins-from-config-mak:
SKIP_DASHED_BUILT_INS: respect `config.mak`
|
|
Doc fix for packfile URI feature.
* jt/packfile-as-uri-doc:
Doc: clarify contents of packfile sent as URI
|
|
Test clean-up plus UI improvement by hiding extra refs that
the prefetch task uses from "log --decorate" output.
* ds/maintenance-prefetch-cleanup:
t7900: clean up some broken refs
maintenance: set log.excludeDecoration durin prefetch
|
|
Documentation for "git fsck" lost stale bits that has become
incorrect.
* ab/fsck-doc-fix:
fsck doc: remove ancient out-of-date diagnostics
|
|
The test case added by 9466e3809d ("blame: enable funcname blaming with
userdiff driver", 2020-11-01) forgot to quote variable expansions. This
causes failures when the current directory contains blanks.
One variable that the test case introduces will not have IFS characters
and could remain without quotes, but let's quote all expansions for
consistency, not just the one that has the path name.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Using "1~5" isn't portable. Nobody seems to have noticed, since perhaps
people don't tend to run the perf suite on more exotic platforms. Still,
it's better to set a good example.
We can use:
perl -ne 'print if $. % 5 == 1'
instead. But we can further observe that perl does a good job of the
other parts of this pipeline, and fold the whole thing together.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are looking up an oid in an array, we obviously don't need to
write to the array. Let's mark it as const in the function interfaces,
as well as in the local variables we use to derference the void pointer
(note a few cases use pointers-to-pointers, so we mark everything
const).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
All of our callers are actually looking up an object_id, not a bare
hash. Likewise, the arrays they are looking in are actual arrays of
object_id (not just raw bytes of hashes, as we might find in a pack
.idx; those are handled by bsearch_hash()).
Using an object_id gives us more type safety, and makes the callers
slightly shorter. It also gets rid of the word "sha1" from several
access functions, though we could obviously also rename those with
s/sha1/hash/.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We store a struct for each directory we access under .git/rr-cache. The
structs are kept in an array sorted by the binary hash associated with
their name (and we do lookups with a binary search).
This works OK, but there are a few small downsides:
- the amount of code isn't huge, but it's more than we'd need using one
of our other stock data structures
- the insertion into a sorted array is quadratic (though in practice
it's unlikely anybody has enough conflicts for this to matter)
- it's intimately tied to the representation of an object hash. This
isn't a big deal, as the conflict ids we generate use the same hash,
but it produces a few awkward bits (e.g., we are the only user of
hash_pos() that is not using object_id).
Let's instead just treat the directory names as strings, and store them
in a strmap. This is less code, and removes the use of hash_pos().
Insertion is now non-quadratic, though we probably use a bit more
memory. Besides the hash table overhead, and storing hex bytes instead
of a binary hash, we actually store each name twice. Other code expects
to access the name of a rerere_dir struct from the struct itself, so we
need a copy there. But strmap keeps its own copy of the name, as well.
Using a bare hashmap instead of strmap means we could use the name for
both, but at the cost of extra code (e.g., our own comparison function).
Likewise, strmap has a feature to use a pointer to the in-struct name at
the cost of a little extra code. I didn't do either here, as simple code
seemed more important than squeezing out a few bytes of efficiency.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We check only that get_sha1_hex() doesn't complain, which means we'd
match an all-hex name with trailing cruft after it. This probably
doesn't matter much in practice, since there shouldn't be anything else
in the rr-cache directory, but it could possibly cause us to mix up sha1
and sha256 entries (which also shouldn't be intermingled, but could be
leftovers from a repository conversion).
Note that "get_sha1_hex()" is a confusing historical name. It is
actually using the_hash_algo, so it would be sha256 in a sha256 repo.
We'll switch to using parse_oid_hex(), because that conveniently
advances our pointer. But it also gets rid of the sha1 name. Arguably
it's a little funny to use "object_id" here for something that isn't
actually naming an object, but it's unlikely to be a problem (and is
contained in a single function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In rerere_gc(), we walk over the .git/rr_cache directory and create a
struct for each entry we find. We feed any name we get from readdir() to
find_rerere_dir(), which then calls get_sha1_hex() on it (since we use
the binary hash as a lookup key). If that fails (i.e., the directory
name is not what we expected), it returns NULL. But the comment in
find_rerere_dir() says "BUG".
It _would_ be a bug for the call from new_rerere_id_hex(), the only
other code path, to fail here; it's generating the hex internally. But
the call in rerere_gc() is using it say "is this a plausible directory
name".
Let's instead have rerere_gc() do its own "is this plausible" check.
That has two benefits:
- we can now reliably BUG() inside find_rerere_dir(), which would
catch bugs in the other code path (and we now will never return NULL
from the function, which makes it easier to see that a rerere_id
struct will always have a non-NULL "collection" field).
- it makes the use of the binary hash an implementation detail of
find_rerere_dir(), not known by callers. That will free us up to
change it in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
All of our callers have an object_id, and are just dereferencing the
hash field to pass to us. Let's take the actual object_id instead. We
still access the hash to pass to hash_pos, but it's a step in the right
direction.
This makes the callers slightly simpler, but also gets rid of the
untyped pointer, as well as the now-inaccurate name "sha1".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We allow variadic macros in the code base, but only if there is fallback
code for platforms that lack it. This leads to some annoyances:
- the code is more complicated because of the fallbacks (e.g.,
trace_printf(), etc, is implemented twice with a set of parallel
wrappers).
- some constructs are just impossible and we've had to live without
them (e.g., a cross between FLEX_ALLOC and xstrfmt)
Since this feature is present in C99, we may be able to start counting
on it being available everywhere. Let's start with a weather balloon
patch to find out.
This patch makes the absolute minimal change by always setting
HAVE_VARIADIC_MACROS. If somebody runs into a platform where it's a
problem, they can undo it by commenting out the define. Likewise, if we
have to revert this, it would be quite unlikely to cause conflicts.
Once we feel comfortable that this is the right direction, then we can
start ripping out all the spots that actually look at the flag, and
removing the dead code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The CI/PR GitHub Actions workflow uses the 'matrix' strategy for the
"windows-test", "vs-test", "regular" and "dockerized" jobs. The default
behaviour of GitHub Actions is to cancel all in-progress jobs in a
matrix if one of the job of the matrix fails [1].
This is not ideal as a failure early in a job, like during installation of
the build/test dependencies on a specific platform, leads to the
cancellation of all other jobs in the matrix.
Set the 'fail-fast' variable to 'false' for all four matrix jobs in the
workflow.
[1] https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstrategyfail-fast
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'./t1234-foo.sh --stress-jobs=X ...' is supposed to run that test
script in X parallel jobs, but the number of jobs specified on the
command line is entirely ignored if other '--stress'-related options
follow. I.e. both './t1234-foo.sh --stress-jobs=X --stress-limit=Y'
and './t1234-foo.sh --stress-jobs=X --stress' fall back to using twice
the number of CPUs parallel jobs instead.
The former has been broken since commit de69e6f6c9 (tests: let
--stress-limit=<N> imply --stress, 2019-03-03) [1], which started to
unconditionally overwrite the $stress variable holding the specified
number of jobs in its effort to imply '--stress'. The latter has been
broken since f545737144 (tests: introduce --stress-jobs=<N>,
2019-03-03), because it didn't consider that handling '--stress' will
overwrite that variable as well.
We could fix this by being more careful about (over)writing that
$stress variable and checking first whether it has already been set.
But I think it's cleaner to use a dedicated variable to hold the
number of specified parallel jobs, so let's do that instead.
[1] In de69e6f6c9 there was no '--stress-jobs=X' option yet, the
number of parallel jobs had to be specified via '--stress=X', so,
strictly speaking, de69e6f6c9 broke './t1234-foo.sh --stress=X
--stress-limit=Y'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove the hidden "grep --debug" and "log --grep-debug" options added
in 17bf35a3c7b (grep: teach --debug option to dump the parse tree,
2012-09-13).
At the time these options seem to have been intended to go along with
a documentation discussion and to help the author of relevant tests to
perform ad-hoc debugging on them[1].
Reasons to want this gone:
1. They were never documented, and the only (rather trivial) use of
them in our own codebase for testing is something I removed back
in e01b4dab01e (grep: change non-ASCII -i test to stop using
--debug, 2017-05-20).
2. Googling around doesn't show any in-the-wild uses I could dig up,
and on the Git ML the only mentions after the original discussion
seem to have been when they came up in unrelated diff contexts, or
that test commit of mine.
3. An exception to that is c581e4a7499 (grep: under --debug, show
whether PCRE JIT is enabled, 2019-08-18) where we added the
ability to dump out when PCREv2 has the JIT in effect.
The combination of that and my earlier b65abcafc7a (grep: use PCRE
v2 for optimized fixed-string search, 2019-07-01) means Git prints
this out in its most common in-the-wild configuration:
$ git log --grep-debug --grep=foo --grep=bar --grep=baz --all-match
pcre2_jit_on=1
pcre2_jit_on=1
pcre2_jit_on=1
[all-match]
(or
pattern_body<body>foo
(or
pattern_body<body>bar
pattern_body<body>baz
)
)
$ git grep --debug \( -e foo --and -e bar \) --or -e baz
pcre2_jit_on=1
pcre2_jit_on=1
pcre2_jit_on=1
(or
(and
patternfoo
patternbar
)
patternbaz
)
I.e. for each pattern we're considering for the and/or/--all-match
etc. debugging we'll now diligently spew out another identical line
saying whether the PCREv2 JIT is on or not.
I think that nobody's complained about that rather glaringly obviously
bad output says something about how much this is used, i.e. it's
not.
The need for this debugging aid for the composed grep/log patterns
seems to have passed, and the desire to dump the JIT config seems to
have been another one-off around the time we had JIT-related issues on
the PCREv2 codepath. That the original author of this debugging
facility seemingly hasn't noticed the bad output since then[2] is
probably some indicator.
1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/
2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Follow-up fixes and improvements to ab/mailmap topic.
* ab/mailmap-fixup:
t4203: make blame output massaging more robust
mailmap doc: use correct environment variable 'GIT_WORK_TREE'
t4203: stop losing return codes of git commands
test-lib-functions.sh: fix usage for test_commit()
|
|
Abstract accesses to in-core revindex that allows enumerating
objects stored in a packfile in the order they appear in the pack,
in preparation for introducing an on-disk precomputed revindex.
* tb/pack-revindex-api: (21 commits)
for_each_object_in_pack(): clarify pack vs index ordering
pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
pack-revindex: hide the definition of 'revindex_entry'
pack-revindex: remove unused 'find_revindex_position()'
pack-revindex: remove unused 'find_pack_revindex()'
builtin/gc.c: guess the size of the revindex
for_each_object_in_pack(): convert to new revindex API
unpack_entry(): convert to new revindex API
packed_object_info(): convert to new revindex API
retry_bad_packed_offset(): convert to new revindex API
get_delta_base_oid(): convert to new revindex API
rebuild_existing_bitmaps(): convert to new revindex API
try_partial_reuse(): convert to new revindex API
get_size_by_pos(): convert to new revindex API
show_objects_for_type(): convert to new revindex API
bitmap_position_packfile(): convert to new revindex API
check_object(): convert to new revindex API
write_reused_pack_verbatim(): convert to new revindex API
write_reused_pack_one(): convert to new revindex API
write_reuse_object(): convert to new revindex API
...
|
|
Update the Code-of-conduct to version 2.0 from the upstream (we've
been using version 1.4).
* ab/coc-update-to-2.0:
CoC: update to version 2.0 + local changes
CoC: explicitly take any whitespace breakage
CoC: Update word-wrapping to match upstream
|
|
Introduce two new ways to feed configuration variable-value pairs
via environment variables, and tweak the way GIT_CONFIG_PARAMETERS
encodes variable/value pairs to make it more robust.
* ps/config-env-pairs:
config: allow specifying config entries via envvar pairs
environment: make `getenv_safe()` a public function
config: store "git -c" variables using more robust format
config: parse more robust format in GIT_CONFIG_PARAMETERS
config: extract function to parse config pairs
quote: make sq_dequote_step() a public function
config: add new way to pass config via `--config-env`
git: add `--super-prefix` to usage string
|
|
A bit of code refactoring.
* cc/write-promisor-file:
pack-write: die on error in write_promisor_file()
fetch-pack: refactor writing promisor file
fetch-pack: rename helper to create_promisor_file()
|
|
"git bundle" learns "--stdin" option to read its refs from the
standard input. Also, it now does not lose refs whey they point
at the same object.
* jx/bundle:
bundle: arguments can be read from stdin
bundle: lost objects when removing duplicate pendings
test: add helper functions for git-bundle
|
|
Clean-up docs, codepaths and tests around mailmap.
* ab/mailmap: (22 commits)
shortlog: remove unused(?) "repo-abbrev" feature
mailmap doc + tests: document and test for case-insensitivity
mailmap tests: add tests for empty "<>" syntax
mailmap tests: add tests for whitespace syntax
mailmap tests: add a test for comment syntax
mailmap doc + tests: add better examples & test them
tests: refactor a few tests to use "test_commit --append"
test-lib functions: add an --append option to test_commit
test-lib functions: add --author support to test_commit
test-lib functions: document arguments to test_commit
test-lib functions: expand "test_commit" comment template
mailmap: test for silent exiting on missing file/blob
mailmap tests: get rid of overly complex blame fuzzing
mailmap tests: add a test for "not a blob" error
mailmap tests: remove redundant entry in test
mailmap tests: improve --stdin tests
mailmap tests: modernize syntax & test idioms
mailmap tests: use our preferred whitespace syntax
mailmap doc: start by mentioning the comment syntax
check-mailmap doc: note config options
...
|
|
"git fetch" learns to treat ref updates atomically in all-or-none
fashion, just like "git push" does, with the new "--atomic" option.
* ps/fetch-atomic:
fetch: implement support for atomic reference updates
fetch: allow passing a transaction to `s_update_ref()`
fetch: refactor `s_update_ref` to use common exit path
fetch: use strbuf to format FETCH_HEAD updates
fetch: extract writing to FETCH_HEAD
|
|
When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side. Now it
does.
* jk/log-cherry-pick-duplicate-patches:
patch-ids: handle duplicate hashmap entries
|
|
Prepare tests not to be affected by the name of the default branch
"git init" creates.
* js/default-branch-name-tests-final-stretch: (28 commits)
tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed
t99*: adjust the references to the default branch name "main"
tests(git-p4): transition to the default branch name `main`
t9[5-7]*: adjust the references to the default branch name "main"
t9[0-4]*: adjust the references to the default branch name "main"
t8*: adjust the references to the default branch name "main"
t7[5-9]*: adjust the references to the default branch name "main"
t7[0-4]*: adjust the references to the default branch name "main"
t6[4-9]*: adjust the references to the default branch name "main"
t64*: preemptively adjust alignment to prepare for `master` -> `main`
t6[0-3]*: adjust the references to the default branch name "main"
t5[6-9]*: adjust the references to the default branch name "main"
t55[4-9]*: adjust the references to the default branch name "main"
t55[23]*: adjust the references to the default branch name "main"
t551*: adjust the references to the default branch name "main"
t550*: adjust the references to the default branch name "main"
t5503: prepare aligned comment for replacing `master` with `main`
t5[0-4]*: adjust the references to the default branch name "main"
t5323: prepare centered comment for `master` -> `main`
t4*: adjust the references to the default branch name "main"
...
|
|
After expiring a reflog and making a single commit, the reflog for
the branch would record a single entry that knows both @{0} and
@{1}, but we failed to answer "what commit were we on?", i.e. @{1}
* dl/reflog-with-single-entry:
refs: allow @{n} to work with n-sized reflog
refs: factor out set_read_ref_cutoffs()
|
|
"git diff" showed a submodule working tree with untracked cruft as
"Submodule commit <objectname>-dirty", but a natural expectation is
that the "-dirty" indicator would align with "git describe --dirty",
which does not consider having untracked files in the working tree
as source of dirtiness. The inconsistency has been fixed.
* sj/untracked-files-in-submodule-directory-is-not-dirty:
diff: do not show submodule with untracked files as "-dirty"
|
|
Warn loudly when the "pack-redundant" command, which has been left
stale with almost unusable performance issues, gets used, as we no
longer want to recommend its use (instead just "repack -d" instead).
* jc/deprecate-pack-redundant:
pack-redundant: gauge the usage before proposing its removal
|
|
Newline characters in the host and path part of git:// URL are
now forbidden.
* jk/forbid-lf-in-git-url:
fsck: reject .gitmodules git:// urls with newlines
git_connect_git(): forbid newlines in host and path
|
|
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.
* ab/branch-sort:
branch: show "HEAD detached" first under reverse sort
branch: sort detached HEAD based on a flag
ref-filter: move ref_sorting flags to a bitfield
ref-filter: move "cmp_fn" assignment into "else if" arm
ref-filter: add braces to if/else if/else chain
branch tests: add to --sort tests
branch: change "--local" to "--list" in comment
|
|
File-level rename detection updates.
* en/diffcore-rename:
diffcore-rename: remove unnecessary duplicate entry checks
diffcore-rename: accelerate rename_dst setup
diffcore-rename: simplify and accelerate register_rename_src()
t4058: explore duplicate tree entry handling in a bit more detail
t4058: add more tests and documentation for duplicate tree entry handling
diffcore-rename: reduce jumpiness in progress counters
diffcore-rename: simplify limit check
diffcore-rename: avoid usage of global in too_many_rename_candidates()
diffcore-rename: rename num_create to num_destinations
|
|
Code clean-up.
* ma/more-opaque-lock-file:
read-cache: try not to peek into `struct {lock_,temp}file`
refs/files-backend: don't peek into `struct lock_file`
midx: don't peek into `struct lock_file`
commit-graph: don't peek into `struct lock_file`
builtin/gc: don't peek into `struct lock_file`
|
|
Rename detection is added to the "ORT" merge strategy.
* en/merge-ort-3:
merge-ort: add implementation of type-changed rename handling
merge-ort: add implementation of normal rename handling
merge-ort: add implementation of rename collisions
merge-ort: add implementation of rename/delete conflicts
merge-ort: add implementation of both sides renaming differently
merge-ort: add implementation of both sides renaming identically
merge-ort: add basic outline for process_renames()
merge-ort: implement compare_pairs() and collect_renames()
merge-ort: implement detect_regular_renames()
merge-ort: add initial outline for basic rename detection
merge-ort: add basic data structures for handling renames
|
|
"git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".
* ab/mktag: (23 commits)
mktag: add a --[no-]strict option
mktag: mark strings for translation
mktag: convert to parse-options
mktag: allow omitting the header/body \n separator
mktag: allow turning off fsck.extraHeaderEntry
fsck: make fsck_config() re-usable
mktag: use fsck instead of custom verify_tag()
mktag: use puts(str) instead of printf("%s\n", str)
mktag: remove redundant braces in one-line body "if"
mktag: use default strbuf_read() hint
mktag tests: test verify_object() with replaced objects
mktag tests: improve verify_object() test coverage
mktag tests: test "hash-object" compatibility
mktag tests: stress test whitespace handling
mktag tests: run "fsck" after creating "mytag"
mktag tests: don't create "mytag" twice
mktag tests: don't redirect stderr to a file needlessly
mktag tests: remove needless SHA-1 hardcoding
mktag tests: use "test_commit" helper
mktag tests: don't needlessly use a subshell
...
|
|
During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage. If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.
When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.
Introduce a new '--deduplicate' option to suppress them.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This will make it easier to show only one entry per filename in the
next step.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: corrected the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This situation may occur in the original code: lstat() failed
but we use `&st` to feed ie_modified() later.
Therefore, we can directly execute show_ce without the judgment of
ie_modified() when lstat() has failed.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: fixed misindented code]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
ls-refs performs a single revision walk over the whole ref namespace,
and sends ones that match with one of the given ref prefixes down to the
user.
This can be expensive if there are many refs overall, but the portion of
them covered by the given prefixes is small by comparison.
To attempt to reduce the difference between the number of refs
traversed, and the number of refs sent, only traverse references which
are in the longest common prefix of the given prefixes. This is very
reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
multi-patterned 'git for-each-ref' invocations.
The callback 'send_ref' is resilient to ignore extra patterns by
discarding any arguments which do not begin with at least one of the
specified prefixes.
Similarly, the code introduced in b31e2680c4 is resilient to stop early
at metacharacters, but we only pass strict prefixes here. At worst we
would return too many results, but the double checking done by send_ref
will throw away anything that doesn't start with something in the prefix
list.
Finally, if no prefixes were provided, then implicitly add the empty
string (which will match all references) since this matches the existing
behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()").
Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Correctly initialize the "prefixes" strvec using strvec_init() instead
of simply zeroing it via the earlier memset().
There's no way to trigger a crash, since the first 'ref-prefix' command
will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()'
(the alloc and nr variables are already zero'd, so the call to
ALLOC_GROW is valid).
If no "ref-prefix" command was given, then the call to
'ls-refs.c:ref_match()' will abort early after it reads the zero in
'prefixes->nr'. Likewise, strvec_clear() will only call free() on the
array, which is NULL, so we're safe there, too.
But, all of this is dangerous and requires more reasoning than it would
if we simply called 'strvec_init()', so do that.
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function was used in the ref-filter.c code to find the longest
common prefix of among a set of refspecs, and then to iterate all of the
references that descend from that prefix.
A future patch will want to use that same code from ls-refs.c, so
prepare by exposing and moving it to refs.c. Since there is nothing
specific to the ref-filter code here (other than that it was previously
the only caller of this function), this really belongs in the more
generic refs.h header.
The code moved in this patch is identical before and after, with the one
exception of renaming some arguments to be consistent with other
functions exposed in refs.h.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In git-pack-objects, we iterate over all the tags if the --include-tag
option is passed on the command line. For some reason this uses
for_each_ref which is expensive if the repo has many refs. We should
use for_each_tag_ref instead.
Because the add_ref_tag callback will now only visit tags we
simplified it a bit.
The motivation for this change is that we observed performance issues
with a repository on gitlab.com that has 500,000 refs but only 2,000
tags. The fetch traffic on that repo is dominated by CI, and when we
changed CI to fetch with 'git fetch --no-tags' we saw a dramatic
change in the CPU profile of git-pack-objects. This lead us to this
particular ref walk. More details in:
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It's unclear how run-command's use_shell option should impact the
arguments fed to a command. Plausibly it could mean that we glue all of
the arguments together into a string to pass to the shell, in which case
that opens the question of whether the caller needs to quote them.
But in fact we don't implement it that way (and even if we did, we'd
probably auto-quote the arguments as part of the glue step). And we must
not receive quoted arguments, because we might actually optimize out the
shell entirely (i.e., the caller does not even know if a shell will be
involved in the end or not).
Since this ambiguity may have been the cause of a recent bug, let's
document the option a bit.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'git tag -d' accepts one or more tag refs to delete, but each deletion
is done by calling `delete_ref` on each argv. This is very slow when
removing from packed refs. Use delete_refs instead so all the removals
can be done inside a single transaction with a single update.
Do the same for 'git branch -d'.
Since delete_refs performs all the packed-refs delete operations
inside a single transaction, if any of the deletes fail then all
them will be skipped. In practice, none of them should fail since
we verify the hash of each one before calling delete_refs, but some
network error or odd permissions problem could have different results
after this change.
Also, since the file-backed deletions are not performed in the same
transaction, those could succeed even when the packed-refs transaction
fails.
After deleting branches, remove the branch config only if the branch
ref was removed and was not subsequently added back in.
A manual test deleting 24,000 tags took about 30 minutes using
delete_ref. It takes about 5 seconds using delete_refs.
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The peel_ref() interface is confusing and error-prone:
- it's typically used by ref iteration callbacks that have both a
refname and oid. But since they pass only the refname, we may load
the ref value from the filesystem again. This is inefficient, but
also means we are open to a race if somebody simultaneously updates
the ref. E.g., this:
int some_ref_cb(const char *refname, const struct object_id *oid, ...)
{
if (!peel_ref(refname, &peeled))
printf("%s peels to %s",
oid_to_hex(oid), oid_to_hex(&peeled);
}
could print nonsense. It is correct to say "refname peels to..."
(you may see the "before" value or the "after" value, either of
which is consistent), but mentioning both oids may be mixing
before/after values.
Worse, whether this is possible depends on whether the optimization
to read from the current iterator value kicks in. So it is actually
not possible with:
for_each_ref(some_ref_cb);
but it _is_ possible with:
head_ref(some_ref_cb);
which does not use the iterator mechanism (though in practice, HEAD
should never peel to anything, so this may not be triggerable).
- it must take a fully-qualified refname for the read_ref_full() code
path to work. Yet we routinely pass it partial refnames from
callbacks to for_each_tag_ref(), etc. This happens to work when
iterating because there we do not call read_ref_full() at all, and
only use the passed refname to check if it is the same as the
iterator. But the requirements for the function parameters are quite
unclear.
Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().
There are two other directions I considered but rejected:
- we could pass the peel information into the each_ref_fn callback.
However, we don't know if the caller actually wants it or not. For
packed-refs, providing it is essentially free. But for loose refs,
we actually have to peel the object, which would be wasteful in most
cases. We could likewise pass in a flag to the callback indicating
whether the peeled information is known, but that complicates those
callbacks, as they then have to decide whether to manually peel
themselves. Plus it requires changing the interface of every
callback, whether they care about peeling or not, and there are many
of them.
- we could make a function to return the peeled value of the current
iterated ref (computing it if necessary), and BUG() otherwise. I.e.:
int peel_current_iterated_ref(struct object_id *out);
Each of the current callers is an each_ref_fn callback, so they'd
mostly be happy. But:
- we use those callbacks with functions like head_ref(), which do
not use the iteration code. So we'd need to handle the fallback
case there, anyway.
- it's possible that a caller would want to call into generic code
that sometimes is used during iteration and sometimes not. This
encapsulates the logic to do the fast thing when possible, and
fallback when necessary.
The implementation is mostly obvious, but I want to call out a few
things in the patch:
- the test-tool coverage for peel_ref() is now meaningless, as it all
collapses to a single peel_object() call (arguably they were pretty
uninteresting before; the tricky part of that function is the
fast-path we see during iteration, but these calls didn't trigger
that). I've just dropped it entirely, though note that some other
tests relied on the tags we created; I've moved that creation to the
tests where it matters.
- we no longer need to take a ref_store parameter, since we'd never
look up a ref now. We do still rely on a global "current iterator"
variable which _could_ be kept per-ref-store. But in practice this
is only useful if there are multiple recursive iterations, at which
point the more appropriate solution is probably a stack of
iterators. No caller used the actual ref-store parameter anyway
(they all call the wrapper that passes the_repository).
- the original only kicked in the optimization when the "refname"
pointer matched (i.e., not string comparison). We do likewise with
the "oid" parameter here, but fall back to doing an actual oideq()
call. This in theory lets us kick in the optimization more often,
though in practice no current caller cares. It should never be
wrong, though (peeling is a property of an object, so two refs
pointing to the same object would peel identically).
- the original took care not to touch the peeled out-parameter unless
we found something to put in it. But no caller cares about this, and
anyway, it is enforced by peel_object() itself (and even in the
optimized iterator case, that's where we eventually end up). We can
shorten the code and avoid an extra copy by just passing the
out-parameter through the stack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When `SKIP_DASHED_BUILT_INS` is specified in `config.mak`, the dashed
form of the built-ins was still generated.
By moving the `SKIP_DASHED_BUILT_INS` handling after `config.mak` was
read, this can be avoided.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove diagnostics that haven't been emitted by "fsck" or its
predecessors for around 15 years. This documentation was added in
c64b9b88605 (Reference documentation for the core git commands.,
2005-05-05), but was out-of-date quickly after that.
Notes on individual diagnostics:
- "expect dangling commits": Added in bcee6fd8e71 (Make 'fsck' able
to[...], 2005-04-13), documented in c64b9b88605. Not emitted since
1024932f019 (fsck-cache: walk the 'refs' directory[...],
2005-05-18).
- "missing sha1 directory": Added in 20222118ae4 (Add first cut at
"fsck-cache"[...], 2005-04-08), documented in c64b9b88605. Not
emitted since 230f13225df (Create object subdirectories on demand,
2005-10-08).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Clarify that, when the packfile-uri feature is used, the client should
not assume that the extra packfiles downloaded would only contain a
single blob, but support packfiles containing multiple objects of all
types.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The tests for the 'prefetch' task create remotes and fetch refs into
'refs/prefetch/<remote>/' and tags into 'refs/tags/'. These tests use
the remotes to create objects not intended to be seen by the "local"
repository.
In that sense, the incrmental-repack tasks did not have these objects
and refs in mind. That test replaces the object directory with a
specific pack-file layout for testing the batch-size logic. However,
this causes some operations to start showing warnings such as:
error: refs/prefetch/remote1/one does not point to a valid object!
error: refs/tags/one does not point to a valid object!
This only shows up if you run the tests verbosely and watch the output.
It caught my eye and I _thought_ that there was a bug where 'git gc' or
'git repack' wouldn't check 'refs/prefetch/' before pruning objects.
That is incorrect. Those commands do handle 'refs/prefetch/' correctly.
All that is left is to clean up the tests in t7900-maintenance.sh to
remove these tags and refs that are not being repacked for the
incremental-repack tests. Use update-ref to ensure this works with all
ref backends.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'prefetch' task fetches refs from all remotes and places them in the
refs/prefetch/<remote>/ refspace. As this task is intended to run in the
background, this allows users to keep their local data very close to the
remote servers' data while not updating the users' understanding of the
remote refs in refs/remotes/<remote>/.
However, this can clutter 'git log' decorations with copies of the refs
with the full name 'refs/prefetch/<remote>/<branch>'.
The log.excludeDecoration config option was added in a6be5e67 (log: add
log.excludeDecoration config option, 2020-05-16) for exactly this
purpose.
Ensure we set this only for users that would benefit from it by
assigning it at the beginning of the prefetch task. Other alternatives
would be during 'git maintenance register' or 'git maintenance start',
but those might assign the config even when the prefetch task is
disabled by existing config. Further, users could run 'git maintenance
run --task=prefetch' using their own scripting or scheduling. This
provides the best coverage to automatically update the config when
valuable.
It is improbable, but possible, that users might want to run the
prefetch task _and_ see these refs in their log decorations. This seems
incredibly unlikely to me, but users can always opt-in on a
command-by-command basis using --decorate-refs=refs/prefetch/.
Test that this works in a few cases. In particular, ensure that our
assignment of log.excludeDecoration=refs/prefetch/ is additive to other
existing exclusions. Further, ensure we do not add multiple copies in
multiple runs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous change reduced time spent in strlen() while comparing
consecutive paths in verify_cache(), but we can do better. The
conditional checks the existence of a directory separator at the correct
location, but only after doing a string comparison. Swap the order to be
logically equivalent but perform fewer string comparisons.
To test the effect on performance, I used a repository with over three
million paths in the index. I then ran the following command on repeat:
git -c index.threads=1 commit --amend --allow-empty --no-edit
Here are the measurements over 10 runs after a 5-run warmup:
Benchmark #1: v2.30.0
Time (mean ± σ): 854.5 ms ± 18.2 ms
Range (min … max): 825.0 ms … 892.8 ms
Benchmark #2: Previous change
Time (mean ± σ): 833.2 ms ± 10.3 ms
Range (min … max): 815.8 ms … 849.7 ms
Benchmark #3: This change
Time (mean ± σ): 815.5 ms ± 18.1 ms
Range (min … max): 795.4 ms … 849.5 ms
This change is 2% faster than the previous change and 5% faster than
v2.30.0.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use the name length field of cache entries instead of calculating its
value anew.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The end of the cache tree index extension format trails off with
ellipses ever since 23fcc98 (doc: technical details about the index
file format, 2011-03-01). While an intuitive reader could gather what
this means, it could be better to use "and so on" instead.
Really, this is only justified because I also wanted to point out that
the number of subtrees in the index format is used to determine when the
recursive depth-first-search stack should be "popped." This should help
to add clarity to the format.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
I had difficulty in my efforts to learn about the cache tree extension
based on the documentation and code because I had an incorrect
assumption about how it behaved. This might be due to some ambiguity in
the documentation, so this change modifies the beginning of the cache
tree format by expanding the description of the feature.
My hope is that this documentation clarifies a few things:
1. There is an in-memory recursive tree structure that is constructed
from the extension data. This structure has a few differences, such
as where the name is stored.
2. What does it mean for an entry to be invalid?
3. When exactly are "new" trees created?
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The index has a "cache tree" extension. This corresponds to a
significant API implemented in cache-tree.[ch]. However, there are a few
places that refer to this erroneously as "cached tree". These are rare,
but notably the index-format.txt file itself makes this error.
The only other reference is in t7104-reset-hard.sh.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commands such as "git reset --hard" rebuild the in-memory representation
of the cache tree index extension by parsing tree objects starting at a
known root tree. The performance of this operation can vary widely
depending on the width and depth of the repository's working directory
structure. Measure the time in this operation using trace2 regions in
prime_cache_tree().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As we write or read the cache tree index extension, it can be good to
isolate how much of the file I/O time is spent constructing this
in-memory tree from the existing index or writing it out again to the
new index file. Use trace2 regions to indicate that we are spending time
on this operation.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix for procedure to building CI test environment for mac.
* jc/macos-install-dependencies-fix:
ci/install-depends: attempt to fix "brew cask" stuff
|
|
Doc update.
* tb/local-clone-race-doc:
Documentation/git-clone.txt: document race with --local
|
|
Doc update.
* bc/doc-status-short:
docs: rephrase and clarify the git status --short format
|
|
Text encoding fix for "git p4".
* dl/p4-encode-after-kw-expansion:
git-p4: fix syncing file types with pattern
|
|
Comments update.
* ab/gettext-charset-comment-fix:
gettext.c: remove/reword a mostly-useless comment
Makefile: remove a warning about old GETTEXT_POISON flag
|
|
Doc update.
* ug/doc-lose-dircache:
doc: remove "directory cache" from man pages
|
|
Test fix.
* ad/t4129-setfacl-target-fix:
t4129: fix setfacl-related permissions failure
|
|
Test fix.
* jk/t5516-deflake:
t5516: loosen "not our ref" error check
|
|
Doc update.
* vv/send-email-with-less-secure-apps-access:
git-send-email.txt: mention less secure app access with Gmail
|
|
Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.
* pb/mergetool-tool-help-fix:
mergetool--lib: fix '--tool-help' to correctly show available tools
|
|
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.
* ds/for-each-repo-noopfix:
for-each-repo: do nothing on empty config
|
|
Doc update.
* jc/sign-off:
SubmittingPatches: tighten wording on "sign-off" procedure
|
|
Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.
* mt/t4129-with-setgid-dir:
t4129: don't fail if setgid is set in the test directory
|
|
Follow-up on the "maintenance part-3" which introduced scheduled
maintenance tasks to support platforms whose native scheduling
methods are not 'cron'.
* ds/maintenance-part-4:
maintenance: use Windows scheduled tasks
maintenance: use launchctl on macOS
maintenance: include 'cron' details in docs
maintenance: extract platform-specific scheduling
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Bash completion (in contrib/) update to make it easier for
end-users to add completion for their custom "git" subcommands.
* fc/completion-aliases-support:
completion: add proper public __git_complete
test: completion: add tests for __git_complete
completion: bash: improve function detection
completion: bash: add __git_have_func helper
|
|
"git stash" did not work well in a sparsely checked out working
tree.
* en/stash-apply-sparse-checkout:
stash: fix stash application in sparse-checkouts
stash: remove unnecessary process forking
t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
|
|
Test update.
* ar/t6016-modernise:
t6016: move to lib-log-graph.sh framework
|
|
Clean up option descriptions in "git cmd --help".
* zh/arg-help-format:
builtin/*: update usage format
parse-options: format argh like error messages
|
|
Test fix.
* nk/perf-fsmonitor-cleanup:
p7519: allow running without watchman prereq
|
|
The topological walk codepath is covered by new trace2 stats.
* ds/trace2-topo-walk:
revision: trace topo-walk statistics
|
|
Diagnose command line error of "git rebase" early.
* rs/rebase-commit-validation:
rebase: verify commit parameter
|
|
Retire more names with "sha1" in it.
* ma/sha1-is-a-hash:
hash-lookup: rename from sha1-lookup
sha1-lookup: rename `sha1_pos()` as `hash_pos()`
object-file.c: rename from sha1-file.c
object-name.c: rename from sha1-name.c
|
|
Doc update.
* ma/doc-pack-format-varint-for-sizes:
pack-format.txt: document sizes at start of delta data
|
|
Code clean-up.
* ma/t1300-cleanup:
t1300: don't needlessly work with `core.foo` configs
t1300: remove duplicate test for `--file no-such-file`
t1300: remove duplicate test for `--file ../foo`
|
|
Doc fix.
* pb/doc-modules-git-work-tree-typofix:
gitmodules.txt: fix 'GIT_WORK_TREE' variable name
|
|
Doc fix.
* ta/doc-typofix:
doc: fix some typos
|
|
"git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.
* bc/rev-parse-path-format:
rev-parse: add option for absolute or relative path formatting
abspath: add a function to resolve paths with missing components
|
|
The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.
* ew/decline-core-abbrev:
core.abbrev=no disables abbreviations
|
|
While we currently have the `GIT_CONFIG_PARAMETERS` environment variable
which can be used to pass runtime configuration data to git processes,
it's an internal implementation detail and not supposed to be used by
end users.
Next to being for internal use only, this way of passing config entries
has a major downside: the config keys need to be parsed as they contain
both key and value in a single variable. As such, it is left to the user
to escape any potentially harmful characters in the value, which is
quite hard to do if values are controlled by a third party.
This commit thus adds a new way of adding config entries via the
environment which gets rid of this shortcoming. If the user passes the
`GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment
variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each
`i` in `[0,n)`.
While the same can be achieved with `git -c <name>=<value>`, one may
wish to not do so for potentially sensitive information. E.g. if one
wants to set `http.extraHeader` to contain an authentication token,
doing so via `-c` would trivially leak those credentials via e.g. ps(1),
which typically also shows command arguments.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `getenv_safe()` helper function helps to safely retrieve multiple
environment values without the need to depend on platform-specific
behaviour for the return value's lifetime. We'll make use of this
function in a following patch, so let's make it available by making it
non-static and adding a declaration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous commit added a new format for $GIT_CONFIG_PARAMETERS which
is able to robustly handle subsections with "=" in them. Let's start
writing the new format. Unfortunately, this does much less than you'd
hope, because "git -c" itself has the same ambiguity problem! But it's
still worth doing:
- we've now pushed the problem from the inter-process communication
into the "-c" command-line parser. This would free us up to later
add an unambiguous format there (e.g., separate arguments like "git
--config key value", etc).
- for --config-env, the parser already disallows "=" in the
environment variable name. So:
git --config-env section.with=equals.key=ENVVAR
will robustly set section.with=equals.key to the contents of
$ENVVAR.
The new test shows the improvement for --config-env.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we stuff config options into GIT_CONFIG_PARAMETERS, we shell-quote
each one as a single unit, like:
'section.one=value1' 'section.two=value2'
On the reading side, we de-quote to get the individual strings, and then
parse them by splitting on the first "=" we find. This format is
ambiguous, because an "=" may appear in a subsection. So the config
represented in a file by both:
[section "subsection=with=equals"]
key = value
and:
[section]
subsection = with=equals.key=value
ends up in this flattened format like:
'section.subsection=with=equals.key=value'
and we can't tell which was desired. We have traditionally resolved this
by taking the first "=" we see starting from the left, meaning that we
allowed arbitrary content in the value, but not in the subsection.
Let's make our environment format a bit more robust by separately
quoting the key and value. That turns those examples into:
'section.subsection=with=equals.key'='value'
and:
'section.subsection'='with=equals.key=value'
respectively, and we can tell the difference between them. We can detect
which format is in use for any given element of the list based on the
presence of the unquoted "=". That means we can continue to allow the
old format to work to support any callers which manually used the old
format, and we can even intermingle the two formats. The old format
wasn't documented, and nobody was supposed to be using it. But it's
likely that such callers exist in the wild, so it's nice if we can avoid
breaking them. Likewise, it may be possible to trigger an older version
of "git -c" that runs a script that calls into a newer version of "git
-c"; that new version would see the intermingled format.
This does create one complication, which is that the obvious format in
the new scheme for
[section]
some-bool
is:
'section.some-bool'
with no equals. We'd mistake that for an old-style variable. And it even
has the same meaning in the old style, but:
[section "with=equals"]
some-bool
does not. It would be:
'section.with=equals=some-bool'
which we'd take to mean:
[section]
with = equals=some-bool
in the old, ambiguous style. Likewise, we can't use:
'section.some-bool'=''
because that's ambiguous with an actual empty string. Instead, we'll
again use the shell-quoting to give us a hint, and use:
'section.some-bool'=
to show that we have no value.
Note that this commit just expands the reading side. We'll start writing
the new format via "git -c" in a future patch. In the meantime, the
existing "git -c" tests will make sure we didn't break reading the old
format. But we'll also add some explicit coverage of the two formats to
make sure we continue to handle the old one after we move the writing
side over.
And one final note: since we're now using the shell-quoting as a
semantically meaningful hint, this closes the door to us ever allowing
arbitrary shell quoting, like:
'a'shell'would'be'ok'with'this'.key=value
But we have never supported that (only what sq_quote() would produce),
and we are probably better off keeping things simple, robust, and
backwards-compatible, than trying to make it easier for humans. We'll
continue not to advertise the format of the variable to users, and
instead keep "git -c" as the recommended mechanism for setting config
(even if we are trying to be kind not to break users who may be relying
on the current undocumented format).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the "git blame --porcelain" output, lines that ends with three
integers may not be the line that shows a commit object with line
numbers and block length (the contents from the blamed file or the
summary field can have a line that happens to match). Also, the
names of the author may have more than three SP separated tokens
("git blame -L242,+1 cf6de18aabf7 Documentation/SubmittingPatches"
gives an example). The existing "grep -E | cut" pipeline is a bit
too loose on these two points.
While they can be assumed on the test data, it is not so hard to
use the right pattern from the documented format, so let's do so.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
gitmailmap(5) uses 'GIT_WORK_DIR' to refer to the root of the
repository, but this environment variable does not exist.
Use the correct spelling for that variable, 'GIT_WORK_TREE'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We run "git pull" against "$cask_repo"; clarify that we are
expecting not to have any of our own modifications and running "git
pull" to merely update, by passing "--ff-only" on the command line.
Also, the "brew cask install" command line triggers an error message
that says:
Error: Calling brew cask install is disabled! Use brew install
[--cask] instead.
In addition, "brew install caskroom/cask/perforce" step triggers an
error that says:
Error: caskroom/cask was moved. Tap homebrew/cask instead.
Attempt to see if blindly following the suggestion in these error
messages gets us into a better shape.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We may return objects in one of two orders: how they appear in the .idx
(sorted by object id) or how they appear in the packfile itself. To
further complicate matters, we have two ordering variables, "i" and
"pos", and it is not clear to which order they apply.
Let's clarify this by using an unambiguous name where possible, and
leaving a comment for the variable that does double-duty.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The usage comment for test_commit() shows that the --author option
should be given as `--author=<author>`. However, this is incorrect as it
only works when given as `--author <author>`. Correct this erroneous
text.
Also, for the sake of correctness, fix the description as well since we
invoke `git commit` with `--author <author>`, not `--author=<author>`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
write_promisor_file() already uses xfopen(), so it would die
if the file cannot be opened for writing. To be consistent
with this behavior and not overlook issues, let's also die if
there are errors when we are actually writing to the file.
Suggested-by: Jeff King <peff@peff.net>
Suggested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To prepare for on-disk reverse indexes, remove a spot in
'offset_to_pack_pos()' that looks at the 'revindex' array in 'struct
packed_git'.
Even though this use of the revindex pointer is within pack-revindex.c,
this clean up is still worth doing. Since the 'revindex' pointer will be
NULL when reading from an on-disk reverse index (instead the
'revindex_data' pointer will be mmaped to the 'pack-*.rev' file), this
call-site would have to include a conditional to lookup the offset for
position 'mi' each iteration through the search.
So instead of open-coding 'pack_pos_to_offset()', call it directly from
within 'offset_to_pack_pos()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that all spots outside of pack-revindex.c that reference 'struct
revindex_entry' directly have been removed, it is safe to hide the
implementation by moving it from pack-revindex.h to pack-revindex.c.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that all 'find_revindex_position()' callers have been removed (and
converted to the more descriptive 'offset_to_pack_pos()'), it is almost
safe to get rid of 'find_revindex_position()' entirely. Almost, except
for the fact that 'offset_to_pack_pos()' calls
'find_revindex_position()'.
Inline 'find_revindex_position()' into 'offset_to_pack_pos()', and
then remove 'find_revindex_position()' entirely.
This is a straightforward refactoring with one minor snag.
'offset_to_pack_pos()' used to load the index before calling
'find_revindex_position()'. That means that by the time
'find_revindex_position()' starts executing, 'p->num_objects' can be
safely read. After inlining, be careful to not read 'p->num_objects'
until _after_ 'load_pack_revindex()' (which loads the index as a
side-effect) has been called.
Another small fix that is included is converting the upper- and
lower-bounds to be unsigned's instead of ints. This dates back to
92e5c77c37 (revindex: export new APIs, 2013-10-24)--ironically, the last
time we introduced new APIs here--but this unifies the types.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that no callers of 'find_pack_revindex()' remain, remove the
function's declaration and implementation entirely.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'estimate_repack_memory()' takes into account the amount of memory
required to load the reverse index in memory by multiplying the assumed
number of objects by the size of the 'revindex_entry' struct.
Prepare for hiding the definition of 'struct revindex_entry' by removing
a 'sizeof()' of that type from outside of pack-revindex.c. Instead,
guess that one off_t and one uint32_t are required per object. Strictly
speaking, this is a worse guess than asking for 'sizeof(struct
revindex_entry)' directly, since the true size of this struct is 16
bytes with padding on the end of the struct in order to align the offset
field.
But, this is an approximation anyway, and it does remove a use of the
'struct revindex_entry' from outside of pack-revindex internals.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid looking at the 'revindex' pointer directly and instead call
'pack_pos_to_index()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove direct manipulation of the 'struct revindex_entry' type as well
as calls to the deprecated API in 'packfile.c:unpack_entry()'. Usual
clean-up is performed (replacing '->nr' with calls to
'pack_pos_to_index()' and so on).
Add an additional check to make sure that 'obj_offset()' points at a
valid object. In the case this check is violated, we cannot call
'mark_bad_packed_object()' because we don't know the OID. At the top of
the call stack is do_oid_object_info_extended() (via
packed_object_info()), which does mark the object.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert another call of 'find_pack_revindex()' to its replacement
'pack_pos_to_offset()'. Likewise:
- Avoid manipulating `struct packed_git`'s `revindex` pointer directly
by removing the pointer-as-array indexing.
- Add an additional guard to check that the offset 'obj_offset()'
points to a real object. This should be the case with well-behaved
callers to 'packed_object_info()', but isn't guarenteed.
Other blocks that fill in various other values from the 'struct
object_info' request handle bad inputs by setting the type to
'OBJ_BAD' and jumping to 'out'. Do the same when given a bad offset
here.
The previous code would have segfaulted when given a bad
'obj_offset' value, since 'find_pack_revindex()' would return
'NULL', and then the line that fills 'oi->disk_sizep' would try to
access 'NULL[1]' with a stride of 16 bytes (the width of 'struct
revindex_entry)'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Perform exactly the same conversion as in the previous commit to another
caller within 'packfile.c'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace direct accesses to the 'struct revindex' type with a call to
'pack_pos_to_index()'.
Likewise drop the old-style 'find_pack_revindex()' with its replacement
'offset_to_pack_pos()' (while continuing to perform the same error
checking).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove another instance of looking at the revindex directly by instead
calling 'pack_pos_to_index()'. Unlike other patches, this caller only
cares about the index position of each object in the loop.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove another instance of direct revindex manipulation by calling
'pack_pos_to_offset()' instead (the caller here does not care about the
index position of the object at position 'pos').
Note that we cannot just use the existing "offset" variable to store the
value we get from pack_pos_to_offset(). It is incremented by
unpack_object_header(), but we later need the original value. Since
we'll no longer have revindex->offset to read it from, we'll store that
in a separate variable ("header" since it points to the entry's header
bytes).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove another caller that holds onto a 'struct revindex_entry' by
replacing the direct indexing with calls to 'pack_pos_to_offset()' and
'pack_pos_to_index()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid storing the revindex entry directly, since this structure will
soon be removed from the public interface. Instead, store the offset and
index position by calling 'pack_pos_to_offset()' and
'pack_pos_to_index()', respectively.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace find_revindex_position() with its counterpart in the new API,
offset_to_pack_pos().
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace direct accesses to the revindex with calls to
'offset_to_pack_pos()' and 'pack_pos_to_index()'.
Since this caller already had some error checking (it can jump to the
'give_up' label if it encounters an error), we can easily check whether
or not the provided offset points to an object in the given pack. This
error checking existed prior to this patch, too, since the caller checks
whether the return value from 'find_pack_revindex()' was NULL or not.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace a direct access to the revindex array with
'pack_pos_to_offset()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace direct revindex accesses with calls to 'pack_pos_to_offset()'
and 'pack_pos_to_index()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
First replace 'find_pack_revindex()' with its replacement
'offset_to_pack_pos()'. This prevents any bogus OFS_DELTA that may make
its way through until 'write_reuse_object()' from causing a bad memory
read (if 'revidx' is 'NULL')
Next, replace a direct access of '->nr' with the wrapper function
'pack_pos_to_index()'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the next several patches, we will prepare for loading a reverse index
either in memory (mapping the inverse of the .idx's contents in-core),
or directly from a yet-to-be-introduced on-disk format. To prepare for
that, we'll introduce an API that avoids the caller explicitly indexing
the revindex pointer in the packed_git structure.
There are four ways to interact with the reverse index. Accordingly,
four functions will be exported from 'pack-revindex.h' by the time that
the existing API is removed. A caller may:
1. Load the pack's reverse index. This involves opening up the index,
generating an array, and then sorting it. Since opening the index
can fail, this function ('load_pack_revindex()') returns an int.
Accordingly, it takes only a single argument: the 'struct
packed_git' the caller wants to build a reverse index for.
This function is well-suited for both the current and new API.
Callers will have to continue to open the reverse index explicitly,
but this function will eventually learn how to detect and load a
reverse index from the on-disk format, if one exists. Otherwise, it
will fallback to generating one in memory from scratch.
2. Convert a pack position into an offset. This operation is now
called `pack_pos_to_offset()`. It takes a pack and a position, and
returns the corresponding off_t.
Any error simply calls BUG(), since the callers are not well-suited
to handle a failure and keep going.
3. Convert a pack position into an index position. Same as above; this
takes a pack and a position, and returns a uint32_t. This operation
is known as `pack_pos_to_index()`. The same thinking about error
conditions applies here as well.
4. Find the pack position for a given offset. This operation is now
known as `offset_to_pack_pos()`. It takes a pack, an offset, and a
pointer to a uint32_t where the position is written, if an object
exists at that offset. Otherwise, -1 is returned to indicate
failure.
Unlike some of the callers that used to access '->offset' and '->nr'
directly, the error checking around this call is somewhat more
robust. This is important since callers should always pass an offset
which points at the boundary of two objects. The API, unlike direct
access, enforces that that is the case.
This will become important in a subsequent patch where a caller
which does not but could check the return value treats the signed
`-1` from `find_revindex_position()` as an index into the 'revindex'
array.
Two design warts are carried over into the new API:
- Asking for the index position of an out-of-bounds object will result
in a BUG() (since no such object exists), but asking for the offset
of the non-existent object at the end of the pack returns the total
size of the pack.
This makes it convenient for callers who always want to take the
difference of two adjacent object's offsets (to compute the on-disk
size) but don't want to worry about boundaries at the end of the
pack.
- offset_to_pack_pos() lazily loads the reverse index, but
pack_pos_to_index() doesn't (callers of the former are well-suited
to handle errors, but callers of the latter are not).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update the CoC added in 5cdf2301 (add a Code of Conduct document,
2019-09-24 from version 1.4 to version 2.0. This is the version found
at [1] with the following minor changes:
- We preserve the change to the CoC in 3f9ef874a73 (CODE_OF_CONDUCT:
mention individual project-leader emails, 2019-09-26)
- We preserve the custom intro added in 5cdf2301d4a (add a Code of
Conduct document, 2019-09-24)
This change intentionally preserves a warning emitted on "git diff
--check". It's better to make it easily diff-able with upstream than
to fix whitespace changes in our version while we're at it.
1. https://www.contributor-covenant.org/version/2/0/code_of_conduct/code_of_conduct.md
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Elijah Newren <newren@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylor.com>
Acked-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Let's replace the 2 different pieces of code that write a
promisor file in 'builtin/repack.c' and 'fetch-pack.c'
with a new function called 'write_promisor_file()' in
'pack-write.c' and 'pack.h'.
This might also help us in the future, if we want to put
back the ref names and associated hashes that were in
the promisor files we are repacking in 'builtin/repack.c'
as suggested by a NEEDSWORK comment just above the code
we are refactoring.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As we are going to refactor the code that actually writes
the promisor file into a separate function in a following
commit, let's rename the current write_promisor_file()
function to create_promisor_file().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove support for the magical "repo-abbrev" comment in .mailmap
files. This was added to .mailmap parsing in [1], as a generalized
feature of the git-shortlog Perl script added earlier in [2].
There was no documentation or tests for this feature, and I don't
think it's used in practice anymore.
What it did was to allow you to specify a single string to be
search-replaced with "/.../" in the .mailmap file. E.g. for
linux.git's current .mailmap:
git archive --remote=git@gitlab.com:linux-kernel/linux.git \
HEAD -- .mailmap | grep -a repo-abbrev
# repo-abbrev: /pub/scm/linux/kernel/git/
Then when running e.g.:
git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge
We'd emit (the [...] is mine):
Merge tag [...]git://git.kernel.org/.../tip/tip
But will now emit:
Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip
I think at this point this is just a historical artifact we can get
rid of. It was initially meant for Linus's own use when we integrated
the Perl script[2], but since then it seems he's stopped using it.
Digging through Linus's release announcements on the LKML[3] the last
release I can find that made use of this output is Linux 2.6.25-rc6
back in March 2008[4]. Later on Linus started using --no-merges[5],
and nowadays seems to prefer some custom not-quite-shortlog format of
merges from lieutenants[6].
You will still see it on linux.git if you run "git shortlog" manually
yourself with --merges, with this removed you can still get the same
output with:
git log --pretty=fuller v5.10-rc7..v5.10 |
sed 's!/pub/scm/linux/kernel/git/!/.../!g' |
git shortlog
Arguably we should do the same for the search-replacing of "[PATCH]"
at the beginning with "". That seems to be another relic of a bygone
era when linux.git patches would have their E-Mail subject lines
applied as-is by "git am" or whatever. But we documented that feature
in "git-shortlog(1)", and it seems more widely applicable than
something purely kernel-specific.
1. 7595e2ee6ef (git-shortlog: make common repository prefix
configurable with .mailmap, 2006-11-25)
2. fa375c7f1b6 (Add git-shortlog perl script, 2005-06-04)
3. https://lore.kernel.org/lkml/
4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/
5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/
6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add documentation and more tests for case-insensitivity. The existing
test only matched on the E-Mail part, but as shown here we also match
the name with strcasecmp().
This behavior was last discussed on the mailing list in the thread
starting at [1]. It seems we're keeping it like this, so let's
document it.
1. https://lore.kernel.org/git/87czykvg19.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add tests for mailmap's handling of "<>", which is allowed on the RHS,
but not the LHS of a "<LHS> <RHS>" pair.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add tests for mailmap's handling of whitespace, i.e. how it trims
space within "<>" and around author names.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a test for mailmap comment syntax. As noted in [1] there was no
test coverage for this. Let's make sure a future change doesn't break
it.
1. https://lore.kernel.org/git/CAN0heSoKYWXqskCR=GPreSHc6twCSo1345WTmiPdrR57XSShhA@mail.gmail.com/
Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change the mailmap documentation added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) to continue discussing the
Jane/Joe example. I think this makes things a lot less confusing as
we're building up more complex examples using one set of data which
covers all the things we'd like to discuss.
Also add tests to assert that what our documentation says is what's
actually happening. This is mostly (or entirely) covered by existing
tests which I'm not deleting, but having these tests for the synopsis
makes it easier to follow-along while reading the tests & docs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Refactor a few more tests to use the new "--append" option to
"test_commit". I added it for use in the mailmap tests, but this
demonstrates how useful it is in general.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add an --append option to test_commit to append <contents> to the
<file> we're writing to. This simplifies a lot of test setup, as shown
in some of the tests being changed here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add support for --author to "test_commit". This will simplify some
current and future tests, one of those is being changed here.
Let's also line-wrap the "git commit" command invocation to make diffs
that add subsequent options easier to add, as they'll only need to add
a new option line.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The --notick argument was added in [1] and was followed by --signoff
in [2], but neither of these commits added any documentation for these
options. When -C was added in [3] a comment was added to document it,
but not the other options. Let's document all of these options.
1. 44b85e89d7 (t7003: add test to filter a branch with a commit at
epoch, 2012-07-12),
2. 5ed75e2a3f (cherry-pick: don't forget -s on failure, 2012-09-14).
3. 6f94351b0a (test-lib-functions.sh: teach test_commit -C <dir>,
2016-12-08)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Expand the comment template for "test_commit" to match that of
"test_commit_bulk" added in b1c36cb849 (test-lib: introduce
test_commit_bulk, 2019-07-02). It has several undocumented options,
which won't all fit on one line. Follow-up commit(s) will document
them.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
That we silently ignore missing mailmap.file or mailmap.blob values is
intentional. See 938a60d64f (mailmap: clean up read_mailmap error
handling, 2012-12-12). However, nothing tested for this. Let's do that
by checking that stderr is empty in those cases.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change a test that used a custom fuzzing function since
bfdfa3d414 (t4203 (mailmap): stop hardcoding commit ids and dates,
2010-10-15) to just use the "blame --porcelain" output instead.
We could use the same pattern as 0ba9c9a0fb (t8008: rely on
rev-parse'd HEAD instead of sha1 value, 2017-07-26) does to do this,
but there wouldn't be any point. We're not trying to test "blame"
output here in general, just that "blame" pays attention to the
mailmap.
So it's sufficient to get the blamed line(s) and authors from the
output, which is much easier with the "--porcelain" option.
It would still be possible for there to be a bug in "blame" such that
it uses the mailmap for its "--porcelain" output, but not the regular
output. Let's test for that simply by checking if specifying the
mailmap changes the output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a test for one of the error conditions added in
938a60d64f (mailmap: clean up read_mailmap error handling,
2012-12-12).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove a redundant line in a test added in d20d654fe8 (Change current
mailmap usage to do matching on both name and email of
author/committer., 2009-02-08).
This didn't conceivably test anything useful and is most likely a
copy/paste error.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The --stdin tests setup the "contact" file in the main setup, let's
instead set it up in the test that uses it.
Also refactor the first test so it's obvious that the point of it is
that "check-mailmap" will spew its input as-is when given no
argument. For that one we can just use the "expect" file as-is.
Also add tests for how other "--stdin" cases are handled, e.g. one
where we actually do a mapping.
For the rest of --stdin testing we just assume we're going to get the
same output. We could follow-up and make sure everything's
round-tripped through both --stdin and the file/blob backends, but I
don't think there's much point in that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Refactor the mailmap tests to:
* Setup "actual" test files in the body of "test_expect_success"
* Don't have X of "test_expect_success X Y" be an unquoted string.
* Not to carry over test config between tests, and instead use
"test_config".
* Replace various "echo" a line-at-a-time patterns with here-docs.
* Change a case of "log.mailmap=False" to use the lower-case
"false". Both work, but this ends up in git-config's boolean
parsing and these atypical values are tested for elsewhere. Let's
use the lower-case to not draw the reader's attention to this
abnormality.
* Remove commentary asserting that things work a given way in favor
of simply testing for it, i.e. in the case of a .mailmap file
outside of the repository.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change these tests to use the preferred whitespace around ">",
"<<-EOF" etc. This is an initial step in larger and more meaningful
refactoring of the file, which makes a subsequent commit easier to
read.
I'm not changing the whitespace of "echo <str> > file" patterns to
"echo <str> >file" because all of those will be changed to here-docs
in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Mentioning the comment syntax and blank line support first is in line
with how "git help config" describes its format. See
b8936cf060 (config.txt grammar, typo, and asciidoc fixes, 2006-06-08)
for the paragraph I'm copying & amending from its documentation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a passing mention of the mailmap.file and mailmap.blob
configuration options. Before this addition a reader of the
"check-mailmap" manpage would have no idea that a custom map could be
specified, unless they'd happen to e.g. come across it in the "config"
manpage first.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Quote the mailmap.file and mailmap.blob configuration variables as
`mailmap.file` and `mailmap.blob`, and link to git-config(1). This is
in line with the preferred way of doing this in the rest of our
documentation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Create a gitmailmap(5) page similar to how .gitmodules and .gitignore
have their own pages at gitmodules(5) and gitignore(5). Now instead of
"check-mailmap", "blame" and "shortlog" documentation including the
description of the format we link to one canonical place.
This makes things easier for readers, since in our manpage or
web-based[1] output it's not clear that the "MAPPING AUTHORS" sections
aren't subtly different, as opposed to just included.
1. https://git-scm.com/docs/git-check-mailmap
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|