| Age | Commit message (Collapse) | Author | Files | Lines |
|
Add a test script, `t/t1463-refs-optimize.sh`, for the new `git refs
optimize` command.
This script acts as a simple driver, leveraging the shared test library
created in the preceding commit. It works by overriding the
`$pack_refs` variable to "refs optimize" and then sourcing the
shared library (`t/pack-refs-tests.sh`).
This approach ensures that `git refs optimize` is tested against the
entire comprehensive test suite of `git pack-refs`, verifying
that it acts as a compatible drop-in replacement.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for adding tests for the new `git refs optimize` command,
refactor the existing t0601 test suite to make its logic shareable.
Move the core test logic from `t0601-reffiles-pack-refs.sh` into a new
`pack-refs-tests.sh` file. Inside this new script, replace hardcoded
calls to "pack-refs" with the `$pack_refs` variable.
The original `t0601-reffiles-pack-refs.sh` script now becomes a simple
"driver". It is responsible for setting the default value of the
variable and then sourcing the test library.
This new structure follows the established pattern used for sharing
tests between `git-for-each-ref` and `git-refs list` and prepares the
test suite for the `refs optimize` tests to be added in a subsequent
commit.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It is likely that those who came to Git after 3.0 switched the
default initial branch name to 'main' would still try to follow
tutorials that were written before 3.0 happened and with the
assumption that the tool would call the initial branch 'master'.
To help these new users after 3.0 boundary, let's retain one part of
the hint we will be giving before the default changes, namely, how
to rename the branch an unconfigured Git has created just once.
We do this without telling them how to permanently configure the
default name of the initial branch, and that design choice is very
much deliberate. The whole point of switching the default name was
because we did not want to force individual users to configure their
default branch name but while the hard wired default was 'master',
they _had_ to configure it away from 'master' in order to conform to
the recent norm, and a hint that tells them how to do so is useful.
But once the default is renamed to 'main', that no longer is true.
A narrower audience who are new users that follow an instruction
that assumes the initial branch name is 'master' would only need to
learn "here is how to change the branch name to match the tutorial
you are following in the repository you created for practice", and
"here is how you keep creating repositories with the first branch
with a name everybody hates" is unnecessary.
It also needs to be noted that the advise token to squelch the
message is the same advice.defaultBranchName as before, which is
also very much deliberate. The users who do have that configured
are those who _have_ been using Git since before 3.0, and they are
not the target audience for the new advice message. Reusing the
same advise token ensures that they do not have to turn the message
off.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adjust to the way newer versions of cURL selectivel enables tracing
options, so that our tests can continue to work.
* jk/curl-global-trace-components:
curl: add support for curl_global_trace() components
|
|
A corner case bug in "git log -L..." has been corrected.
* sg/line-log-boundary-fixes:
line-log: show all line ranges touched by the same diff range
line-log: fix assertion error
|
|
"core.commentChar=auto" that attempts to dynamically pick a
suitable comment character is non-workable, as it is too much
trouble to support for little benefit, and is marked as deprecated.
* pw/3.0-commentchar-auto-deprecation:
commit: print advice when core.commentString=auto
config: warn on core.commentString=auto
breaking-changes: deprecate support for core.commentString=auto
|
|
If the user uses a prepare-commit-msg hook to add comments to the
commit message template and sets commit.cleanup to remove them when the
commit is created then the comments will not be removed when rebase
commits the final command in a chain of "fixup" commands[1]. This
happens because f7d42ceec52 (rebase -i: do leave commit message intact
in fixup! chains, 2021-01-28) started passing the VERBATIM_MSG flag
when committing the final command in a chain of "fixup" commands. That
change was added in response to a bug report[2] where the commit
message was being cleaned up when it should not be. The cause of that
bug was that before f7d42ceec52 the sequencer passed CLEANUP_MSG
when committing the final fixup. That commit should have simply
removed the CLEANUP_MSG flag, not changed it to VERBATIM_MSG. Using
VERBATIM_MSG ignores the user's commit.cleanup config when committing
the final fixup which means it behaves differently to an ordinary
"pick" command which respects commit.cleanup.
Fix this by not setting an explicit cleanup flag when committing the
final fixup which matches the way "pick" commands behave. The test
added in f7d42ceec52 is replaced with one that checks that "fixup"
and "pick" commands do not clean up the message when commit.cleanup
is not set and do clean up the message when it is set.
[1] https://lore.kernel.org/git/CA+itcS3DxbgpFy2aPRvHQvTAYE=dU0kfeDdidVwWLU=rBAWR4w@mail.gmail.com
[2] https://lore.kernel.org/git/CANVGpwZGbzYLMeMze64e_OU9p3bjyEgzC5thmNBr6LttBt+YGw@mail.gmail.com
Reported-by: Simon Cheng <cyqsimon@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The recently introduced new subcommand git-last-modified(1) runs into an
error in some scenarios. It then would exit with the message:
BUG: paths remaining beyond boundary in last-modified
This seems to happens for example when criss-cross merges are involved.
In that scenario, the function diff_tree_combined() gets called.
The function diff_tree_combined() copies the `struct diff_options` from
the input `struct rev_info` to override some flags. One flag is
`recursive`, which is always set to 1. This has been the case since the
inception of this function in af3feefa1d (diff-tree -c: show a merge
commit a bit more sensibly., 2006-01-24).
This behavior is incompatible with git-last-modified(1), when called
non-recursive (which is the default).
The last-modified machinery uses a hashmap for all the paths it wants to
get the last-modified commit for. Through log_tree_commit() the callback
mark_path() is called. The diff machinery uses diff_tree_combined()
internally, and due to it's recursive behavior the callback receives
entries inside subtrees, but not the subtree entries themselves. So a
directory is never expelled from the hashmap, and the BUG() statement
gets hit.
Because there are many callers calling into diff_tree_combined(), both
directly and indirectly, we cannot simply change it's behavior.
Instead, add a flag `no_recursive_diff_tree_combined` which supresses
the behavior of diff_tree_combined() to override `recursive` and set
this flag in builtin/last-modified.c.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous commit added tests for shadowing deprecated builtins.
Let’s make the test suite more complete by exercising a sample of
the builtins and in turn test the documentation for git-config(1):
To avoid confusion and troubles with script usage, aliases that hide
existing Git commands are ignored except for deprecated commands.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
git-whatchanged(1) is deprecated and you need to pass
`--i-still-use-this` in order to force it to work as before.
There are two affected users, or usages:
1. people who use the command in scripts; and
2. people who are used to using it interactively.
For (1) the replacement is straightforward.[1] But people in (2) might
like the name or be really used to typing it.[3]
An obvious first thought is to suggest aliasing `whatchanged` to the
git-log(1) equivalent.[1] But this doesn’t work and is awkward since you
cannot shadow builtins via aliases.
Now you are left in an uncomfortable limbo; your alias won’t work until
the command is removed for good.
Let’s lift this limitation by allowing *deprecated* builtins to be
shadowed by aliases.
The only observed demand for aliasing has been for git-whatchanged(1),
not for git-pack-redundant(1). But let’s be consistent and treat all
deprecated commands the same.
[1]:
git log --raw --no-merges
With a minor caveat: you get different outputs if you happen to
have empty commits (no changes)[2]
[2]: https://lore.kernel.org/git/20250825085428.GA367101@coredump.intra.peff.net/
[3]: https://lore.kernel.org/git/BL3P221MB0449288C8B0FA448A227FD48833AA@BL3P221MB0449.NAMP221.PROD.OUTLOOK.COM/
Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A '--signed-commits=<mode>' option is already available when using
`git fast-export` to decide what should be done at export time about
commit signatures. At import time though, there is no option, or
other way, in `git fast-import` to decide about commit signatures.
To remediate that, let's add a '--signed-commits=<mode>' option to
`git fast-import` too.
For now the supported <mode>s are the same as those supported by
`git fast-export`.
The code responsible for consuming a signature is refactored into
the import_one_signature() and discard_one_signature() functions,
which makes it easier to follow the logic and add new modes in the
future.
In the 'strip' and 'warn-strip' modes, we deliberately use
discard_one_signature() to discard the signature without parsing it.
This ensures that even malformed signatures, which would cause the
parser to fail, can be successfully stripped from a commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous commit added the necessary validation and checks for F/D
conflicts in the files backend when working on case insensitive systems.
There is still a possibility for D/F conflicts. This is a different from
the F/D since for F/D conflicts, there would not be a conflict during
the lock creation phase:
refs/heads/foo.lock
refs/heads/foo/bar.lock
However there would be a conflict when the locks are committed, since we
cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of
conflicts are checked and resolved in
`refs_verify_refnames_available()`, so the previous commit ensured that
for case-insensitive filesystems, we would lowercase the inputs to that
function.
For D/F conflicts, there is a conflict during the lock creation phase
itself:
refs/heads/foo/bar.lock
refs/heads/foo.lock
As in `lock_raw_ref()` after creating the lock, we also check for D/F
conflicts. This can occur in case-insensitive filesystems when trying to
fetch case-conflicted references like:
refs/heads/Foo/new
refs/heads/foo
D/F conflicts can also occur in case-sensitive filesystems, when the
repository already contains a directory with a lock file
'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This
doesn't concern directories containing garbage files as those are
handled on a higher level.
To fix this, simply categorize the error as a name conflict. Also remove
this reference from the list of valid refnames for availability checks.
By categorizing the error and removing it from the list of valid
references, batched updates now knows to reject such reference updates
and apply the other reference updates.
Fix a small typo in `ref_transaction_maybe_set_rejected()` while here.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using the files-backend on case-insensitive filesystems, there is
possibility of hitting F/D conflicts when creating references within a
single transaction, such as:
- 'refs/heads/foo'
- 'refs/heads/Foo/bar'
Ideally such conflicts are caught in `refs_verify_refnames_available()`
which is responsible for checking F/D conflicts within a given
transaction. This utility function is shared across the reference
backends. As such, it doesn't consider the issues of using a
case-insensitive file system, which only affects the files-backend.
While one solution would be to make the function aware of such issues,
this feels like leaking implementation details of file-backend specific
issues into the utility function. So opt for the more simpler option, of
lowercasing all references sent to this function when on a
case-insensitive filesystem and operating on the files-backend.
To do this, simply use a `struct strbuf` to convert the refname to
lowercase and append it to the list of refnames to be checked. Since we
use a `struct strbuf` and the memory is cleared right after, make sure
that the string list duplicates all provided string.
Without this change, the user would simply be left with a repository
with '.lock' files which were created in the 'prepare' phase of the
transaction, as the 'commit' phase would simply abort and not do the
necessary cleanup.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When fetching references into a repository, if a lock for a particular
reference exists, then `lock_raw_ref()` throws:
- REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict
because the transaction contains conflicting references while being
on a case-insensitive filesystem.
- REF_TRANSACTION_ERROR_GENERIC: for all other errors.
The latter causes the entire set of batched updates to fail, even in
case sensitive filessystems.
Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This
allows batched updates to reject the individual update which conflicts
with the existing file, while updating the rest of the references.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
During the 'prepare' phase of a reference transaction in the files
backend, we create the lock files for references to be created. When
using batched updates on case-insensitive filesystems, the entire
batched updates would be aborted if there are conflicting names such as:
refs/heads/Foo
refs/heads/foo
This affects all commands which were migrated to use batched updates in
Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before
that, reference updates would be applied serially with one transaction
used per update. When users fetched multiple references on
case-insensitive systems, subsequent references would simply overwrite
any earlier references. So when fetching:
refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6
refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3
refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56
The user would simply end up with:
refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3
refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56
This is buggy behavior since the user is never informed about the
overrides performed and missing references. Nevertheless, the user is
left with a working repository with a subset of the references. Since
Git 2.51, in such situations fetches would simply fail without updating
any references. Which is also buggy behavior and worse off since the
user is left without any references.
The error is triggered in `lock_raw_ref()` where the files backend
attempts to create a lock file. When a lock file already exists the
function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens,
the entire batched updates, not individual operation, is aborted as if
it were in a transaction.
Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to
aid the batched update mechanism to simply reject such errors. The
change only affects batched updates since batched updates will reject
individual updates with non-generic errors. So specifically this would
only affect:
1. git fetch
2. git receive-pack
3. git update-ref --batch-updates
This bubbles the error type up to `files_transaction_prepare()` which
tries to lock each reference update. So if the locking fails, we check
if the rejection type can be ignored, which is done by calling
`ref_transaction_maybe_set_rejected()`.
As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT',
the specific reference update would simply be rejected, while other
updates in the transaction would continue to be applied. This allows
partial application of references in case-insensitive filesystems when
fetching colliding references.
While the earlier implementation allowed the last reference to be
applied overriding the initial references, this change would allow the
first reference to be applied while rejecting consequent collisions.
This should be an okay compromise since with the files backend, there is
no scenario possible where we would retain all colliding references.
Let's also be more proactive and notify users on case-insensitive
filesystems about such problems by providing a brief about the issue
while also recommending using the reftable backend, which doesn't have
the same issue.
Reported-by: Joe Drew <joe.drew@indexexchange.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With the current implementation of 'git sparse-checkout clean', we
notice that a file that was in a conflicted state does not get cleaned
up because of some internal details around the SKIP_WORKTREE bit.
This test is documenting the current behavior before we update it in the
following change.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'git sparse-checkout clean' subcommand is focused on directories,
deleting any tracked sparse directories to clean up the worktree and
make the sparse index feature work optimally.
However, this directory-focused approach can leave users wondering why
those directories exist at all. In my experience, these files are left
over due to ignore or exclude patterns, Windows file handles, or
possibly merge conflict resolutions.
Add a new '--verbose' option for users to see all the files that are
being deleted (with '--force') or would be deleted (with '--dry-run').
Based on usage, users may request further context on this list of files for
states such as tracked/untracked, unstaged/staged/conflicted, etc.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A broken or malicious "git fetch" can say that it has the same
object for many many times, and the upload-pack serving it can
exhaust memory storing them redundantly, which has been corrected.
* ps/upload-pack-oom-protection:
upload-pack: don't ACK non-commits repeatedly in protocol v2
t5530: modernize tests
|
|
Fixes multiple crashes around midx write-out codepaths.
* ds/midx-write-fixes:
midx-write: simplify error cases
midx-write: reenable signed comparison errors
midx-write: use uint32_t for preferred_pack_idx
midx-write: use cleanup when incremental midx fails
midx-write: put failing response value back
midx-write: only load initialized packs
|
|
"repo info" learns a short-hand option "-z" that is the same as
"--format=nul", and learns to report the objects format used in the
repository.
* lo/repo-info-step-2:
repo: add the field objects.format
repo: add the flag -z as an alias for --format=nul
|
|
Test updates.
* tc/t0450-harden:
t0450: add allowlist for builtins with missing .adoc
t0450: fix test for out-of-tree builds
|
|
"git refs exists" that works like "git show-ref --exists" has been
added.
* ms/refs-exists:
t: add test for git refs exists subcommand
t1422: refactor tests to be shareable
t1403: split 'show-ref --exists' tests into a separate file
builtin/refs: add 'exists' subcommand
|
|
Further code clean-up for multi-pack-index code paths.
* ps/object-store-midx-dedup-info:
midx: compute paths via their source
midx: stop duplicating info redundant with its owning source
midx: write multi-pack indices via their source
midx: load multi-pack indices via their source
midx: drop redundant `struct repository` parameter
odb: simplify calling `link_alt_odb_entry()`
odb: return newly created in-memory sources
odb: consistently use "dir" to refer to alternate's directory
odb: allow `odb_find_source()` to fail
odb: store locality in object database sources
|
|
The 'git sparse-checkout clean' subcommand is somewhat similar to 'git
clean' in that it will delete files that should not be in the worktree.
The big difference is that it focuses on the directories that should not
be in the worktree due to cone-mode sparse-checkout. It also does not
discriminate in the kinds of files and focuses on deleting entire
directories.
However, there are some restrictions that would be good to bring over
from 'git clean', specifically how it refuses to do anything without the
'-f'/'--force' or '-n'/'--dry-run' arguments. The 'clean.requireForce'
config can be set to 'false' to imply '--force'.
Add this behavior to avoid accidental deletion of files that cannot be
recovered from Git.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When users change their sparse-checkout definitions to add new
directories and remove old ones, there may be a few reasons why
directories no longer in scope remain (ignored or excluded files still
exist, Windows handles are still open, etc.). When these files still
exist, the sparse index feature notices that a tracked, but sparse,
directory still exists on disk and thus the index expands. This causes a
performance hit _and_ the advice printed isn't very helpful. Using 'git
clean' isn't enough (generally '-dfx' may be needed) but also this may
not be sufficient.
Add a new subcommand to 'git sparse-checkout' that removes these
tracked-but-sparse directories.
The implementation details provide a clear definition of what is happening,
but it is difficult to describe this without including the internal
implementation details. The core operation converts the index to a sparse
index (in memory if not already on disk) and then deletes any directories in
the worktree that correspond with a sparse directory entry in that sparse
index.
In the most common case, this means that a file will be removed if it is
contained within a directory that is both tracked and outside of the
sparse-checkout definition. However, there can be exceptions depending on
the current state of the index:
* If the worktree has a modification to a tracked, sparse file, then that
file's parent directories will be expanded instead of represented as
sparse directories. Siblings of those parent directories may be
considered sparse.
* If the user staged a sparse file with "git add --sparse", then that file
loses the SKIP_WORKTREE bit until the sparse-checkout is reapplied. Until
then, that file's parent directories are not represented as sparse
directory entries and thus will not be removed. Siblings of those parent
directories may be considered sparse. (There may be other reasons why
the SKIP_WORKTREE bit was removed for a file and this impact on the
sparse directories will apply to those as well.)
* If the user has a merge conflict outside of the sparse-checkout
definition, then those conflict entries prevent the parent directories
from being represented as sparse directory entries and thus are not
removed.
* The cases above present reasons why certain _file conditions_ will impact
which _directories_ are considered sparse. The list of tracked
directories that are outside of the sparse-checkout definition but not
represented as a sparse directory further reduces the list of files that
will be removed.
For these complicated reasons, the documentation details a potential list of
files that will be "considered for removal" instead of defining the list
concretely. The special cases can be handled by resolving conflicts,
committing staged changes, and running 'git sparse-checkout reapply' to
update the SKIP_WORKTREE bits as expected by the sparse-checkout definition.
It is important to make clear that this operation will remove ignored and
excluded files which would normally be ignored even by 'git clean -f' unless
the '-x' or '-X' option is provided. This is the most extreme method for
doing this, but it works when the sparse-checkout is in cone mode and is
expected to rescope based on directories, not files.
The current implementation always deletes these sparse directories
without warning. This is unacceptable for a released version, but those
features will be added in changes coming immediately after this one.
Note that this will not remove an untracked directory (or any of its
contents) if its parent is a tracked directory within the sparse-checkout
definition. This is required to prevent removing data created by tools that
perform caching operations for editors or build tools.
Thus, 'git sparse-checkout clean' is both more aggressive and more careful
than 'git clean -fx':
* It is more aggressive because it will remove _tracked_ files within the
sparse directories.
* It is less aggressive because it will leave _untracked_ files that are
not contained in sparse directories.
These special cases will be handled more explicitly in a future change that
expands tests for the 'git sparse-checkout clean' command. We handle some of
the modified, staged, and committed states including some impact on 'git
status' after cleaning.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the Git 2.51 release cycle we've refactored the object database layer
to access objects via `struct object_database` directly. To make the
transition a bit easier we have retained some of the old-style functions
in case those were widely used.
Now that Git 2.51 has been released it's time to clean up though and
drop these old wrappers. Do so and adapt the small number of newly added
users to use the new functions instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update clar to fcbed04 (Merge pull request #123 from
pks-gitlab/pks-sandbox-ubsan, 2025-09-10). The most significant changes
since the last version include:
- Fixed platform support for HP-UX.
- Fixes for how clar handles the `-q` flag.
- A couple of leak fixes for reported clar errors.
- A new `cl_invoke()` function that retains line information.
- New infrastructure to create temporary directories.
- Improved printing of error messages so that all lines are now
properly indented.
- Proper selftests for the clar.
Most of these changes are somewhat irrelevant to us, but neither do we
have to adjust to any of these changes, either. What _is_ interesting to
us though is especially the fixed support for HP-UX, and eventually we
may also want to use `cl_invoke()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As the tests are all run in separate repositories, set the branch
name to "master" when creating the repository for the tests where
the result depends on the branch name. In order to make it easier to
change the branch name in the future a helper function is used. This
reduces the number of tests that depend on the default branch name
being "master" and removes the last instance of a test file using
"GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master".
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove the penultimate use of "GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=
master" in our test suite. We have slowly been removing these ever
since we started to switch the default branch name used in tests to
"main".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove one of the last remaining uses of
"TEST_GIT_DEFAULT_INITIAL_BRANCH= main" in the test suite. We have
been steadily be converting tests from using "master" as the default
branch name since the introduction of TEST_GIT_DEFAULT_INITIAL_BRANCH
in 704fed9ea22 (tests: start moving to a different default main branch
name, 2020-10-23) The changes here are purely mechanical replacing
"master" with "main"
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since 1296cbe4b46 (init: document `init.defaultBranch` better,
2020-12-11) "git-init.adoc" has advertised that the default name
of the initial branch may change in the future. The name "main"
is chosen to match the default used by the big Git forge web sites.
The advice printed when init.defaultBranch is not set is updated
to say that the default will change to "main" in Git 3.0. Building
with WITH_BREAKING_CHANGES enabled removes the advice and changes
the default branch name to "main". The code in guess_remote_head()
that looks for "refs/heads/master" is left unchanged as that is only
called when the remote server does not support the symref capability
in the v0 protocol or the symref extension to the ls-refs list in the
v2 protocol. Such an old server is more likely to be using "master"
as the default branch name.
With the exception of the "git-init.adoc" the documentation is left
unchanged. I had hoped to parameterize the name of the default branch
by using an asciidoc attribute. Unfortunately attribute expansion
is inhibited by backticks and we use backticks to mark up ref names
so that idea does not work. As the changes to git-init.adoc show
inserting ifdef's around each instance of the branch name "master"
is cumbersome and makes the documentation sources harder to read.
Apart from "git-init.adoc" there are some other files where "master" is
used as the name of the initial branch rather than as an example of a
branch name such as "user-manual.adoc" and "gitcore-tutorial.adoc". The
name appears a lot in those so updating it with ifdef's is not really
practical. We can update that document in the 3.0 release cycle. The
other documentation where master is used as an example branch name
can be gradually converted over time.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A new command "git last-modified" has been added to show the closest
ancestor commit that touched each path.
* tc/last-modified:
last-modified: use Bloom filters when available
t/perf: add last-modified perf script
last-modified: new subcommand to show when files were last modified
|
|
"git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
pathspec; the code is taught to expand the index on demand to avoid
this.
* ds/ls-files-lazy-unsparse:
ls-files: conditionally leave index sparse
|
|
"git repack --path-walk" lost objects in some corner cases, which
has been corrected.
* ds/path-walk-repack-fix:
path-walk: create initializer for path lists
path-walk: fix setup of pending objects
|
|
Makefile tried to run multiple "cargo build" which would not work
very well; serialize their execution to work it around.
* da/cargo-serialize:
Makefile: build libgit-rs and libgit-sys serially
|
|
Color options like color.interactive and color.diff should fall back to
the value of color.ui if they aren't set. In add-interactive, we check
the specific options (e.g., color.diff) via repo_config_get_value(),
which does not depend on the main command having loaded any color config
via the git_config() callback mechanism.
But then we call want_color() on the result; if our specific config is
unset then that function uses the value of git_use_color_default. That
variable is typically set from color.ui by the git_color_config()
callback, which is called by the main command in its own git_config()
callback function.
This works fine for "add -p", whose add_config() callback calls into
git_color_config(). But it doesn't work for other commands like
"checkout -p", which is otherwise unaware of color at all. People tend
not to notice because the default is "auto", and that's what they'd set
color.ui to as well. But something like:
git -c color.ui=false checkout -p
should disable color, and it doesn't.
This regression goes back to 0527ccb1b5 (add -i: default to the built-in
implementation, 2021-11-30). In the perl version we got the color config
from "git config --get-colorbool", which did the full lookup for us.
The obvious fix is for git-checkout to add a call to git_color_config()
to its own config callback. But we'd have to do so for every command
with this problem, which is error-prone. Let's see if we can fix it more
centrally.
It is tempting to teach want_color() to look up the value of
repo_config_get_value("color.ui") itself. But I think that would have
disastrous consequences. Plumbing commands, especially older ones, avoid
porcelain config like "color.*" by simply not parsing it in their config
callbacks. Looking up the value of color.ui under the hood would
undermine that.
Instead, let's do that lookup in the add-interactive setup code. We're
already demand-loading other color config there, which is probably fine
(even in a plumbing command like "git reset", the interactive mode is
inherently porcelain-ish). That catches all commands that use the
interactive code, whether they were calling git_color_config()
themselves or not.
Reported-by: Isaac Oscar Gariano <isaacoscar@live.com.au>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The old perl git-add--interactive.perl script used the color.diff config
option to decide whether to color diffs (and if not set, it fell back to
the value of color.ui via git-config's --get-colorbool option). When we
switched to the builtin version, this was lost: we respect only
color.ui. So for example:
git -c color.diff=false add -p
would color the diff, even when it should not.
The culprit is this line in add-interactive.c's parse_diff():
if (want_color_fd(1, -1))
That "-1" means "no config has been set", which causes it to fall back
to the color.ui setting. We should instead be passing the value of
color.diff. But the problem is that we never even parse that config
option!
Instead the builtin interactive code parses only the value of
color.interactive, which is used for prompts and other messages. One
could perhaps argue that this should cover interactive diff coloring,
too, but historically it did not. The perl script treated
color.interactive and color.diff separately. So we should grab the
values for both, keeping separate fields in our add_i_state variable,
rather than a single use_color field.
We also load individual color slots (e.g., color.interactive.prompt),
leaving them as the empty string when color is disabled. This happens
via the init_color() helper in add-interactive, which checks that
use_color field. Now that there are two such fields, we need to pass the
appropriate one for each color.
The colors are mostly easy to divide up; color.interactive.* follows
color.interactive, and color.diff.* follows color.diff. But the "reset"
color is tricky. It is used for both types of coloring, but the two can
be configured independently. So we introduce two separate reset colors,
and use each in the appropriate spot.
There are two new tests. The first enables interactive prompt colors but
disables color.diff. We should see a colored prompt but not a colored
diff, showing that we are now respecting color.diff (and not
color.interactive or color.ui).
The second does the opposite. We disable color.interactive but turn on
color.diff with a custom fragment color. When we split a hunk, the
interactive code has to re-color the hunk header, which lets us check
that we correctly loaded the color.diff.frag config based on color.diff,
not color.interactive.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After a partial stash, we may clear out the working tree by capturing
the output of diff-tree and piping it into git-apply (and likewise we
may use diff-index to restore the index). So we most definitely do not
want color diff output from that diff-tree process. And it normally
would not produce any, since its stdout is not going to a tty, and the
default value of color.ui is "auto".
However, if GIT_PAGER_IN_USE is set in the environment, that overrides
the tty check, and we'll produce a colorized diff that chokes git-apply:
$ echo y | GIT_PAGER_IN_USE=1 git stash -p
[...]
Saved working directory and index state WIP on main: 4f2e2bb foo
error: No valid patches in input (allow with "--allow-empty")
Cannot remove worktree changes
Setting this variable is a relatively silly thing to do, and not
something most users would run into. But we sometimes do it in our tests
to stimulate color. And it is a user-visible bug, so let's fix it rather
than work around it in the tests.
The root issue here is that diff-tree (and other diff plumbing) should
probably not ever produce color by default. It does so not by parsing
color.ui, but because of the baked-in "auto" default from 4c7f1819b3
(make color.ui default to 'auto', 2013-06-10). But changing that is
risky; we've had discussions back and forth on the topic over the years.
E.g.:
https://lore.kernel.org/git/86D0A377-8AFD-460D-A90E-6327C6934DFC@gmail.com/.
So let's accept that as the status quo for now and protect ourselves by
passing --no-color to the child processes. This is the same thing we did
for add-interactive itself in 1c6ffb546b (add--interactive.perl: specify
--no-color explicitly, 2020-09-07).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Let's make it possible for a client to check if these fields match
what it expects before accepting a promisor remote.
We allow this by introducing a new "promisor.checkFields"
configuration variable. It should contain a comma or space separated
list of fields that will be checked.
By limiting the protocol to specific well-defined fields, we ensure
both server and client have a shared understanding of field
semantics and usage.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For now the "promisor-remote" protocol capability can only pass "name"
and "url" information from a server to a client in the form
"name=<remote_name>,url=<remote_url>".
To allow clients to make more informed decisions about which promisor
remotes they accept, let's make it possible to pass more information
by introducing a new "promisor.sendFields" configuration variable.
On the server side, information about a remote `foo` is stored in
configuration variables named `remote.foo.<variable-name>`. To make
it clearer and simpler, we use `field` and `field name` like this:
* `field name` refers to the <variable-name> part of such a
configuration variable, and
* `field` refers to both the `field name` and the value of such a
configuration variable.
The "promisor.sendFields" configuration variable should contain a
comma or space separated list of field names that will be looked up
in the configuration of the remote on the server to find the values
that will be passed to the client.
Only a set of predefined field names are allowed. The only field
names in this set are "partialCloneFilter" and "token". The
"partialCloneFilter" field name specifies the filter definition used
by the promisor remote, and the "token" field name can provide an
authentication credential for accessing it.
For example, if "promisor.sendFields" is set to "partialCloneFilter",
and the server has the "remote.foo.partialCloneFilter" config
variable set to a value, then that value will be passed in the
"partialCloneFilter" field in the form "partialCloneFilter=<value>"
after the "name" and "url" fields.
A following commit will allow the client to use the information to
decide if it accepts the remote or not. For now the client doesn't do
anything with the additional information it receives.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When a client performs a fetch or clone they can optionally send "have"
lines to tell the server which objects they already have available
locally. These object IDs are stored by the server in an object array so
that it can remember any objects it doesn't have to include in the pack
sent to the client.
While there isn't any reason to do so, clients are free to send the same
"have" line repeatedly. git-upload-pack(1) already knows to handle this
well: every commit it has seen via a "have" line gets marked with the
`THEY_HAVE` flag, and if such a commit is seen repeatedly we know to not
process it another time. This also has the effect that we only store the
object ID once, only, in the `have_obj` array.
There is an edge case though: if the client sends an object ID that does
not refer to a commit we neither store nor check the `THEY_HAVE` flag.
This means that we repeatedly store the same object ID in our `have_obj`
array, with two consequences:
- In protocol v2 we deduplicate ACKs for commits, but not for any
other objects as we send ACKs for every object ID in the `have_obj`
array.
- The `have_obj` array can grow in size indefinitely with both
protocols.
The potentially-more-serious issue is the second one, as we basically
have a way for an adversary to allocate arbitrarily large buffers now.
Ultimately, this doesn't seem to be all that serious though: on my
machine, the growth of that array is at around 4MB/s, and after roughly
five minutes I was only at 1GB RSS. So this is concerning, but only
mildly so.
Fix this bug by storing the `THEY_HAVE` flag independent of the object
type so that we don't store duplicate object IDs in `have_obj` anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Refactor tests to follow modern best practices:
- Merge together tests that set up and verify a single use case.
- Drop empty newlines at the beginning and end of test bodies.
- Don't change directories in the main test body.
- Remove an unused `D` variable.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This instance of setting the result to 1 before going to cleanup was
accidentally removed in fcb2205b77 (midx: implement support for writing
incremental MIDX chains, 2024-08-06). Build upon a test that already deletes
a packfile to verify that this error propagates to full command failure.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The fill_packs_from_midx() method was refactored in fcb2205b77 (midx:
implement support for writing incremental MIDX chains, 2024-08-06) to
allow for preferred packfiles and incremental multi-pack-indexes.
However, this led to some conditions that can cause improperly
initialized memory in the context's list of packfiles.
The conditions caring about the preferred pack name or the incremental
flag are currently necessary to load a packfile. But the context is
still being populated with pack_info structs based on the packfile array
for the existing multi-pack-index even if prepare_midx_pack() isn't
called.
Add a new test that breaks under --stress when compiled with
SANITIZE=address. The chosen number of 100 packfiles was selected to get
the --stress output to fail about 50% of the time, while 50 packfiles
could not get a failure in most --stress runs.
The test case is marked as EXPENSIVE not only because of the number of
packfiles it creates, but because some CI environments were reporting
errors during the test that I could not reproduce, specifically around
being unable to open the packfiles or their pack-indexes.
When it fails under SANITIZE=address, it provides the following error:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==3263517==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000027
==3263517==The signal is caused by a READ memory access.
==3263517==Hint: address points to the zero page.
#0 0x562d5d82d1fb in close_pack_windows packfile.c:299
#1 0x562d5d82d3ab in close_pack packfile.c:354
#2 0x562d5d7bfdb4 in write_midx_internal midx-write.c:1490
#3 0x562d5d7c7aec in midx_repack midx-write.c:1795
#4 0x562d5d46fff6 in cmd_multi_pack_index builtin/multi-pack-index.c:305
...
This failure stack trace is disconnected from the real fix because the bad
pointers are accessed later when closing the packfiles from the context.
There are a few different aspects to this fix that are worth noting:
1. We return to the previous behavior of fill_packs_from_midx to not
rely on the incremental flag or existence of a preferred pack.
2. The behavior to scan all layers of an incremental midx is kept, so
this is not a full revert of the change.
3. We skip allocating more room in the pack_info array if the pack
fails prepare_midx_pack().
4. The method has always returned 0 for success and 1 for failure, but
the condition checking for error added a check for a negative result
for failure, so that is now updated.
5. The call to open_pack_index() is removed, but this is needed later
in the case of a preferred pack. That call is moved to immediately
before its result is needed (checking for the object count).
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The flag `--show-object-format` from git-rev-parse is used for
retrieving the object storage format. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.
Add a new field `objects.format` to the git-repo-info subcommand
containing that information.
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Other Git commands that have nul-terminated output (e.g. git-config,
git-status, git-ls-files) have a flag `-z` for using the null character
as the record separator.
Add the `-z` flag to git-repo-info as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before we were silently skipping all builtins that don't have a matching
.adoc file. This is overly loose and might skip documentation files
when it shouldn't, for example when there was a typo in the filename.
To ensure no new builtins are added without documentation, add an
allowlist: t0450/adoc-missing. In this file only builtin commands that
do *not* have a corresponding .adoc file shall be listed. If there is a
mismatch, fail the test. This should force future contributions to
either add an .adoc, or add the builtin name to the allowlist file.
Signed-off-by: Toon Claes <toon@iotcl.com>
[jc: squashed Patrick's "missing file fix" in]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a test script, `t/t1462-refs-exists.sh`, for the `git refs exists`
command.
This script acts as a simple driver, leveraging the shared test library
created in the preceding commit. It works by overriding the
`$git_show_ref_exists` variable to "git refs exists" and then sourcing the
shared library (`t/show-ref-exists-tests.sh`).
This approach ensures that `git refs exists` is tested against the
entire comprehensive test suite of `git show-ref --exists`, verifying
that it acts as a compatible drop-in replacement.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for adding tests for the `git refs exists` command,
refactor the existing t1422 test suite to make its logic shareable.
Move the core test logic from `t1422-show-ref-exists.sh` to
`show-ref-exists-tests.sh` file. Inside this script, replace hardcoded
calls to "git show-ref --exists" with the `$git_show_ref_exists`
variable.
The original `t1422-show-ref-exists.sh` script now becomes a simple
"driver". It is responsible for setting the default value of the
variable and then sourcing the test library.
This structure follows an established pattern for sharing tests and
prepares the test suite for the `refs exists` tests to be added in a
subsequent commit.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The test file for git-show-ref(1), `t1403-show-ref.sh`, contains a group
of tests for the '--exists' flag. To improve organization and to prepare
for refactoring these tests to be shareable, move the '--exists' tests
and their corresponding setup logic into a self-contained test suite,
`t1422-show-ref-exists.sh`.
This is a pure code-movement refactoring with no change in test coverage
or behavior.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ps/object-store-midx-dedup-info:
midx: compute paths via their source
midx: stop duplicating info redundant with its owning source
midx: write multi-pack indices via their source
midx: load multi-pack indices via their source
midx: drop redundant `struct repository` parameter
odb: simplify calling `link_alt_odb_entry()`
odb: return newly created in-memory sources
odb: consistently use "dir" to refer to alternate's directory
odb: allow `odb_find_source()` to fail
odb: store locality in object database sources
|
|
"git describe <blob>" misbehaves and/or crashes in some corner
cases, which has been taught to exit with failure gracefully.
* jk/describe-blob:
describe: pass commit to describe_commit()
describe: handle blob traversal with no commits
describe: catch unborn branch in describe_blob()
describe: error if blob not found
describe: pass oid struct by const pointer
|
|
"git fetch" can clobber a symref that is dangling when the
remote-tracking HEAD is set to auto update, which has been
corrected.
* jk/no-clobber-dangling-symref-with-fetch:
refs: do not clobber dangling symrefs
t5510: prefer "git -C" to subshell for followRemoteHEAD tests
t5510: stop changing top-level working directory
t5510: make confusing config cleanup more explicit
|
|
Code clean-ups.
* ps/reftable-libgit2-cleanup:
refs/reftable: always reload stacks when creating lock
reftable: don't second-guess errors from flock interface
reftable/stack: handle outdated stacks when compacting
reftable/stack: allow passing flags to `reftable_stack_add()`
reftable/stack: fix compiler warning due to missing braces
reftable/stack: reorder code to avoid forward declarations
reftable/writer: drop Git-specific `QSORT()` macro
reftable/writer: fix type used for number of records
|
|
Test fix.
* ad/t1517-short-help-tests-fix:
t/t1517: mark tests that fail with GIT_TEST_INSTALLED
|
|
This just runs some simple last-modified commands. We already test
correctness in the regular suite, so this is just about finding
performance regressions from one version to another.
Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Similar to git-blame(1), introduce a new subcommand
git-last-modified(1). This command shows the most recent modification to
paths in a tree. It does so by expanding the tree at a given commit,
taking note of the current state of each path, and then walking
backwards through history looking for commits where each path changed
into its final commit ID.
Based-on-patch-by: Jeff King <peff@peff.net>
Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When running 'git ls-files' with a pathspec, the index entries get
filtered according to that pathspec before iterating over them in
show_files(). In 78087097b8 (ls-files: add --sparse option,
2021-12-22), this iteration was prefixed with a check for the '--sparse'
option which allows the command to output directory entries; this
created a pre-loop call to ensure_full_index().
However, when a user runs 'git ls-files' where the pathspec matches
directories that are recursively matched in the sparse-checkout, there
are not any sparse directories that match the pathspec so they would not
be written to the output. The expansion in this case is just a
performance drop for no behavior difference.
Replace this global check to expand the index with a check inside the
loop for a matched sparse directory. If we see one, then expand the
index and continue from the current location. This is safe since the
previous entries in the index did not have any sparse directories and
thus would remain stable in this expansion.
A test in t1092 confirms that this changes the behavior.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In addition to the regular trace information produced by
CURLOPT_VERBOSE, recent curl versions can enable or disable tracing of
specific subsystems using a call to curl_global_trace().
This level of detail may or may not be useful for us in Git as mere
users of libcurl, but there's one case where we need it for a test. In
t5564, we set up a socks proxy, access it with GIT_TRACE_CURL set, and
expect to find socks-related messages in the output. This test is broken
in the release candidates for libcurl 8.16, as those socks messages are
no longer produced in the trace.
The problem bisects to curl's commit ab5e0bfddc (pytest: add SOCKS tests
and scoring, 2025-07-21). There the socks messages were moved from
generic infof() messages to the component-specific CURL_TRC_CF() system.
And so we do not see them by default, but only if "socks" is enabled as
a logging component.
Teach Git's http code to accept a component list from the
environment and pass it into curl_global_trace(). We can then use
that in the test to enable the correct component.
It should be safe to do so unconditionally. In older versions of curl
which don't support this call, setting the environment variable is a
noop. Likewise, any versions of curl which don't recognize the "socks"
component should silently ignore it. The manpage for curl_global_trace()
says this:
The config string is a list of comma-separated component names. Names
are case-insensitive and unknown names are ignored. The special name
"all" applies to all components. Names may be prefixed with '+' or '-'
to enable or disable detailed logging for a component.
The list of component names is not part of curl's public API. Names may
be added or disappear in future versions of libcurl. Since unknown
names are silently ignored, outdated log configurations does not cause
errors when upgrading libcurl. Given that, some names can be expected
to be fairly stable and are listed below for easy reference.
So this should let us make the test work on all versions without
worrying about confusing older (or newer) versions. For the same reason,
I've opted not to document this interface. This is deep internal voodoo
for which we can make no promises to users. In fact, I was tempted to
simply hard-code "socks" to let our test pass and not expose anything.
But I suspect a little run-time flexibility may come in handy in the
future when debugging or dealing with similar logging issues.
I also considered just putting "all" into such a hard-coded default. But
if you try it, you will see that many of the components are quite
verbose and likely not interesting. They would clutter up our trace
output if we enabled them by default.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"make -JN" with INCLUDE_LIBGIT_RS enabled causes cargo lock warnings
and can trigger ld errors during the build.
The build errors are caused by two inner "make" invocations getting
triggered concurrently: once inside of libgit-sys and another inside of
libgit-rs.
Make libgit-rs depend on libgit-sys so that "make" prevents them
from running concurrently. Apply the same logic to the test invocations.
Use cargo's "--manifest-path" option instead of "cd" in the recipes.
Signed-off-by: David Aguilar <davvid@gmail.com>
Acked-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add some advice on how to change the config settings when
"core.commentString=auto" or "core.commentChar=auto". The advice
includes instructions for clearing the config setting or setting a
fixed comment string. To try and be as specific as possible, the advice
is customized based on the user's config. If "core.commentString=auto"
is set in the system config and the user does not have write
access then the advice omits the instructions to clear the config
and recommends changing the global config instead. An alternative
approach would be to advise the user to run "git config --show-origin"
and leave them to figure out how to fix it themselves but that seems
rather unfriendly. As we're forcing them to update their config we
should try and make that as easy as possible.
In order to generate this advice we need to record each file where
either of the config keys is set and whether a key occurs more that
once in a given file. This lets us generate the list of commands to
remove all the keys and also tells us which key the "auto" setting
comes from.
As we want the user to update their config we do not provide a way
for this advice to be disabled other than changing the value of
"core.commentChar" or "core.commentString".
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As support for this setting was deprecated in the last commit print a
warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set.
Avoid bombarding the user with warnings by only printing it (a) when
running commands that call "git commit" and (b) only once per command.
Some scaffolding is added to repo_read_config() to allow it to
detect deprecated config settings and warn about them. As both
"core.commentChar" and "core.commentString" set the comment
character we record which one of them is used and tailor the
warning message appropriately.
Note the odd combination of die_message() followed by die(NULL)
is to allow the next commit to insert a call to advise() in the middle.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "core.commentString" is set to "auto" then "git commit" will
automatically select the comment character ensuring that it is not the
first character on any of the lines in the commit message. This was
introduced by commit 84c9dc2c5a2 (commit: allow core.commentChar=auto
for character auto selection, 2014-05-17). The motivation seems to be
to avoid commenting out lines from the existing message when amending
a commit that was created with a message from a file.
Unfortunately this feature does not work with:
* commit message templates that contain comments.
* prepare-commit-msg hooks that introduce comments.
* "git commit --cleanup=strip --edit -F <file>" which means that it
is incompatible with
- the "fixup" and "squash" commands of "git rebase -i" as the
comments added by those commands are then treated as part of
the commit message.
- the conflict comments added to the commit message by "git
cherry-pick", "git rebase" etc. as these comments are then
treated as part of the commit message.
It is also ignored by "git notes" when amending a note.
The issues with comments coming from a template, hook or file are a
consequence of the design of this feature and are therefore hard to
fix.
As the costs of this feature outweigh the benefits, deprecate it and
remove it in Git 3.0. If someone comes up with some patches that fix
all the issues in a maintainable way then I'd be happy to see this
change reverted.
The next commits will add a warning and some advice for users on how
they can update their config settings.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A new subcommand "git repo" gives users a way to grab various
repository characteristics.
* lo/repo-info:
repo: add the --format flag
repo: add the field layout.shallow
repo: add the field layout.bare
repo: add the field references.format
repo: declare the repo command
|
|
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.
* ps/commit-graph-wo-globals:
commit-graph: stop passing in redundant repository
commit-graph: stop using `the_repository`
commit-graph: stop using `the_hash_algo`
commit-graph: refactor `parse_commit_graph()` to take a repository
commit-graph: store the hash algorithm instead of its length
commit-graph: stop using `the_hash_algo` via macros
|
|
Test clean-up.
* dk/t7005-editor-updates:
t7005: sanitize test environment for subsequent tests
t7005: stop abusing --exec-path
t7005: use modern test style
|
|
"git diff-tree" learned "--max-depth" option.
* tc/diff-tree-max-depth:
diff: teach tree-diff a max-depth parameter
within_depth: fix return for empty path
combine-diff: zero memory used for callback filepairs
|
|
"git cmd --help-all" now works outside repositories.
* dk/help-all:
builtin: also setup gently for --help-all
parse-options: refactor flags for usage_with_options_internal
|
|
Users reported an issue where objects were missing from their local
repositories after a full repack using 'git repack -adf --path-walk'.
This was alarming and took a while to create a reproducer. Here, we fix
the bug and include a test case that would fail without this fix.
The root cause is that certain objects existed in the index and had no
second versions. These objects are usually blobs, though trees can be
included if a cache-tree exists. The issue is that the revision walk
adds these objects to the "pending" list and the path-walk API forgets
to mark the lists it creates at this point as "maybe_interesting". If
these paths only ever have a single version in the history of the repo
(including the current staged version) then the parent directory never
tries to add a new object to the list and mark the list as
"maybe_interesting". Thus, when walking the list later, the group is
skipped as it is expected that no objects are interesting. This happens
even when there are actually no UNINTERESTING objects at all! This is
based on the optimization enabled by the pack.useSparse=true config
option, which is the default.
Thus, we create a test case that demonstrates the many cases of this
issue for reproducibility:
1. File a/b/c has only one committed version.
2. Files a/i and x/y only exist as staged changes.
3. Tree x/ only exists in the cache-tree.
After performing a non-path-walk repack to force all loose objects into
packfiles, run a --path-walk repack followed by 'git fsck'. This fsck is
what fails with the following errors:
error: invalid object 100644 f2e41136... for 'a/b/c'
This is the dropped instance of the single-versioned a/b/c file.
broken link from tree cfda31d8...
to tree 3f725fcd...
This is the missing tree for the single-versioned a/b/ directory.
missing blob 0ddf2bae... (a/i)
missing blob 975fbec8... (x/y)
missing blob a60d869d... (file)
missing blob f2e41136... (a/b/c)
missing tree 3f725fcd... (a/b/)
dangling tree 5896d7e... (staged root tree)
Note that since the staged root tree is missing, the fsck output cannot
even report that the staged x/ tree is missing as well.
The core problem here is that the "maybe_interesting" member of 'struct
type_and_oid_list' is not initialized to '1'. This member was added in
6333e7ae0b (path-walk: mark trees and blobs as UNINTERESTING,
2024-12-20) in a way to help when creating packfiles for a small commit
range using the sparse path algorithm (enabled by pack.useSparse=true).
The idea here is that the list is marked as "maybe_interesting" if an
object is added that does not have the UNINTERESTING flag on it. Later,
this is checked again in case all objects in the list were marked
UNINTERESTING after that point in time. In this case, the algorithm
skips the list as there is no reason to visit it.
This leads to the problem where the "maybe_interesting" member was not
appropriately initialized when the list is created from pending objects.
Initializing this in the correct places fixes the bug.
To reduce risk of similar bugs around initializing this structure, a
follow-up change will make initializing lists use a shared method.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Various options to "git diff" that makes comparison ignore certain
aspects of the differences (like "space changes are ignored",
"differences in lines that match these regular expressions are
ignored") did not work well with "--name-only" and friends.
* ly/diff-name-only-with-diff-from-content:
diff: ensure consistent diff behavior with ignore options
|
|
Code clean-up.
* ac/deglobal-fmt-merge-log-config:
builtin/fmt-merge-msg: stop depending on 'the_repository'
environment: remove the global variable 'merge_log_config'
|
|
"git diff --no-index" run inside a subdirectory under control of a
Git repository operated at the top of the working tree and stripped
the prefix from the output, and oddballs like "-" (stdin) did not
work correctly because of it. Correct the set-up by undoing what
the set-up sequence did to cwd and prefix.
* jc/diff-no-index-in-subdir:
diff: --no-index should ignore the worktree
|
|
The "list" subcommand of "git refs" acts as a front-end for
"git for-each-ref".
* ms/refs-list:
t: add test for git refs list subcommand
t6300: refactor tests to be shareable
builtin/refs: add list subcommand
builtin/for-each-ref: factor out core logic into a helper
builtin/for-each-ref: align usage string with the man page
doc: factor out common option
|
|
Revision traversal limited with pathspec, like "git log dir/*",
used to ignore changed-paths Bloom filter when the pathspec
contained wildcards; now they take advantage of the filter when
they can.
* ly/changed-path-traversal-with-magic-pathspec:
bloom: enable bloom filter with wildcard pathspec in revision traversal
|
|
Various bugs about rename handling in "ort" merge strategy have
been fixed.
* en/ort-rename-fixes:
merge-ort: fix directory rename on top of source of other rename/delete
merge-ort: fix incorrect file handling
merge-ort: clarify the interning of strings in opt->priv->path
t6423: fix missed staging of file in testcases 12i,12j,12k
t6423: document two bugs with rename-to-self testcases
merge-ort: drop unnecessary temporary in check_for_directory_rename()
merge-ort: update comments to modern testfile location
|
|
Test shuffling.
* ua/t1517-short-help-tests:
t5304: move `prune -h` test from t1517
t5200: move `update-server-info -h` test from t1517
t/t1517: automate `git subcmd -h` tests outside a repository
|
|
Test fix for breakage introduced in Git 2.50.
* rj/t6137-cygwin-fix:
t6137-*.sh: fix test failure on cygwin
|
|
"git push" had a code path that led to BUG() but it should have
been a die(), as it is a response to a usual but invalid end-user
action to attempt pushing an object that does not exist.
* dl/push-missing-object-error:
remote.c: convert if-else ladder to switch
remote.c: remove BUG in show_push_unqualified_ref_name_error()
t5516: remove surrounding empty lines in test bodies
|
|
string_list_split*() family of functions have been extended to
simplify common use cases.
* jc/string-list-split:
string-list: split-then-remove-empty can be done while splitting
string-list: optionally omit empty string pieces in string_list_split*()
diff: simplify parsing of diff.colormovedws
string-list: optionally trim string pieces split by string_list_split*()
string-list: unify string_list_split* functions
string-list: align string_list_split() with its _in_place() counterpart
string-list: report programming error with BUG
|
|
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.
* ps/remote-rename-fix:
builtin/remote: only iterate through refs that are to be renamed
builtin/remote: rework how remote refs get renamed
builtin/remote: determine whether refs need renaming early on
builtin/remote: fix sign comparison warnings
refs: simplify logic when migrating reflog entries
refs: pass refname when invoking reflog entry callback
|
|
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
|
|
During interactive rebase, using 'drop' on a merge commit lead to
an error, which was incorrect.
* js/rebase-i-allow-drop-on-a-merge:
rebase -i: permit 'drop' of a merge commit
|
|
* lo/repo-info:
repo: add the --format flag
repo: add the field layout.shallow
repo: add the field layout.bare
repo: add the field references.format
repo: declare the repo command
|
|
When line-level log is invoked with more than one disjoint line range
in the same file, and one of the commits happens to change that file
such that one diff range modifies more than one line range, then
changes to all modified line ranges should be shown, but only the
changes in the first modified line range are:
$ git log --oneline -p
80ca903 (HEAD -> master) Initial
diff --git a/file b/file
new file mode 100644
index 0000000..00935f1
--- /dev/null
+++ b/file
@@ -0,0 +1,10 @@
+Line 1
+Line 2
+Line 3
+Line 4
+Line 5
+Line 6
+Line 7
+Line 8
+Line 9
+Line 10
$ git log --oneline -L1,2:file -L4,5:file -L7,8:file
80ca903 (HEAD -> master) Initial
diff --git a/file b/file
--- /dev/null
+++ b/file
@@ -0,0 +1,2 @@
+Line 1
+Line 2
The line-log-specific diff printer is already clever enough to handle
the case when one line range covers multiple diff ranges, but the
possibility of one diff range touching multiple disjoint line ranges
was apparently overlooked.
Add the necessary condition to dump_diff_hacky_one() to handle this case
as well, and show all modified line ranges:
$ git log --oneline -L1,2:file -L4,5:file -L7,8:file
0f9a5b4 (HEAD -> master) Initial
diff --git a/file b/file
--- /dev/null
+++ b/file
@@ -0,0 +1,2 @@
+Line 1
+Line 2
@@ -0,0 +4,2 @@
+Line 4
+Line 5
@@ -0,0 +7,2 @@
+Line 7
+Line 8
This bug was already present in the initial line-log implementation
added in 2da1d1f6f (Implement line-history search (git log -L),
2013-03-28). Interestingly, that commit already contained a canned
test case covering a similar scenario:
"-L '/long f/',/^}/:a.c -L /main/,/^}/:a.c simple"
This test case looks for two line ranges in the same file, and both
trace back disjointly to the test repository's inital commit,
therefore changes to both line ranges should have been shown for the
initial commit, but only changes for the first line range are shown.
So this test case should have failed from the very beginning, but it
never did, because, unfortunately, the canned expected result is
incorrect, as it doesn't include changes for the second line range.
A similar test with a similarly incorrect canned expected result was
added later in 209618860c (log -L: fix overlapping input ranges,
2013-04-05).
Correct these two canned expected results to contain the changes for
the second line range for the initial commit as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When line-level log is invoked with more than one disjoint line range
in the same file, and one of the commits happens to change that file
such that:
- the last line of a line range R(n) immediately preceeds the first line
modified or added by a hunk H, and
- subtracting the number of lines added by hunk H from the start and
end of the subsequent line range R(n+1) would result in a range
overlapping with line range R(n),
then git aborts with an assertion error, because those overlapping
line ranges violate the invariants:
$ git log --oneline -p
73e4e2f (HEAD -> master) Add lines 6 7 8 9 10
diff --git a/file b/file
index 572d5d9..00935f1 100644
--- a/file
+++ b/file
@@ -3,3 +3,8 @@ Line 2
Line 3
Line 4
Line 5
+Line 6
+Line 7
+Line 8
+Line 9
+Line 10
66e3561 Add lines 1 2 3 4 5
diff --git a/file b/file
new file mode 100644
index 0000000..572d5d9
--- /dev/null
+++ b/file
@@ -0,0 +1,5 @@
+Line 1
+Line 2
+Line 3
+Line 4
+Line 5
$ git log --oneline -L3,5:file -L7,8:file
git: line-log.c:73: range_set_append: Assertion `rs->nr == 0 || rs->ranges[rs->nr-1].end <= a' failed.
Aborted (core dumped)
The line-log machinery encodes line and diff ranges internally as
[start, end) pairs, i.e. include 'start' but exclude 'end', and line
numbering starts at 0 (as opposed to the -LX,Y option, where it starts
at 1, IOW the parameter -L3,5 is represented internally as { start =
2, end = 5 }).
The reason for this assertion error and some related issues is that
there are a couple of places where 'end' is mistakenly considered to
be part of the range:
- When a commit modifies an interesting path, the line-log machinery
first checks which diff range (i.e. hunk) modify any line ranges.
This is done in diff_ranges_filter_touched(), where the outer loop
iterates over the diff ranges, and in each iteration the inner
loop advances the line ranges supposedly until the current line
range ends at or after the current diff range starts, and then the
current diff and line ranges are checked for overlap.
For HEAD in the above example the first line range [2, 5) ends
just before the diff range [5, 10) starts, so the inner loop
should advance, and then the second line range [6, 8) and the diff
range should be checked for overlap.
Unfortunately, the condition of the inner loop mistakenly
considers 'end' as part of the line range, and, seeing the diff
range starting at 5 and the line range ending at 5, it doesn't
skip the first range. Consequently, the diff range and the first
line range are checked for overlap, and after that the outer loop
runs out of diff ranges, and then the processing goes on in the
false belief that this commit didn't touch any of the interesting
line ranges.
The line-log machinery later shifts the line ranges to account for
any added/removed lines in the diff ranges preceeding each line
range. This leaves the first line range intact, but attempts to
shift the second line range [6, 8) by 5 lines towards the
beginning of the file, resulting in [1, 3), triggering the
assertion error, because the two overlapping line ranges violate
the invariants.
Fix that loop condition in diff_ranges_filter_touched() to not
treat 'end' as part of the line range.
- With the above fix the assertion error is gone... but, alas, we
now get stuck in an endless loop!
This happens in range_set_difference(), where a couple of nested
loops iterate over the line and diff ranges, and a condition is
supposed to break the middle loop when the current line range ends
before the current diff range, so processing could continue with
the next line range.
For HEAD in the above example the first line range [2, 5) ends
just before the diff range [5, 10) starts, so this condition
should trigger and break the middle loop.
Unfortunately, just like in the case of the assertion error, this
conditions mistakenly considers 'end' as part of the line range,
and, seeing the line range ending at 5 and the diff range starting
at 5, it doesn't break the loop, which will then go on and on.
Fix this condition in range_set_difference() to not treat 'end' as
part of the line range.
- With the above fix the endless loop is gone... but, alas, the
output is now wrong, as it shows both line ranges for HEAD, even
though the first line range is not modified by that commit:
$ git log --oneline -L3,5:file -L7,8:file
73e4e2f (HEAD -> master) Add lines 6 7 8 9 10
diff --git a/file b/file
--- a/file
+++ b/file
@@ -3,3 +3,3 @@
Line 3
Line 4
Line 5
@@ -6,0 +7,2 @@
+Line 7
+Line 8
66e3561 Add lines 1 2 3 4 5
diff --git a/file b/file
--- /dev/null
+++ b/file
@@ -0,0 +3,3 @@
+Line 3
+Line 4
+Line 5
In dump_diff_hacky_one() a couple of nested loops are responsible
for finding and printing the modified line ranges: the big outer
loop iterates over all line ranges, and the first inner loop skips
over the diff ranges that end before the start of the current line
range. This is followed by a condition checking whether the
current diff range starts after the end of the current line range,
which, when fulfilled, continues and advances the outer loop to
the next line range.
For HEAD in the above example the first line range [2, 5) ends
just before the diff range [5, 10), so this condition should
trigger, and the outer loop should advance to the second line
range.
Unfortunately, just like in the previous cases, this condition
mistakenly considers 'end' as part of the line range, and, seeing
the first line range ending at 5 and the diff range starting at 5,
it doesn't continue to advance the outher loop, but goes on to
show the (unmodified) first line range.
Fix this condition to not treat 'end' as part of the line range,
just like in the previous cases.
After all this the command in the above example finally finishes and
produces the right output:
$ git log --oneline -L3,5:file -L7,8:file
73e4e2f (HEAD -> master) Add lines 6 7 8 9 10
diff --git a/file b/file
--- a/file
+++ b/file
@@ -6,0 +7,2 @@
+Line 7
+Line 8
66e3561 Add lines 1 2 3 4 5
diff --git a/file b/file
--- /dev/null
+++ b/file
@@ -0,0 +3,3 @@
+Line 3
+Line 4
+Line 5
Add a canned test similar to the above example, with the line ranges
adjusted to the test repository's history.
Reported-by: Evgeni Chasnovski <evgeni.chasnovski@gmail.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When describing a blob, we traverse from HEAD, remembering each commit
we saw, and then checking each blob to report the containing commit.
But if we haven't seen any commits at all, we'll segfault (we store the
"current" commit as an oid initialized to the null oid, causing
lookup_commit_reference() to return NULL).
This shouldn't be able to happen normally. We always start our traversal
at HEAD, which must be a commit (a property which is enforced by the
refs code). But you can trigger the segfault like this:
blob=$(echo foo | git hash-object -w --stdin)
echo $blob >.git/HEAD
git describe $blob
We can instead catch this case and return an empty result, which hits
the usual "we didn't find $blob while traversing HEAD" error.
This is a minor lie in that we did "find" the blob. And this even hints
at a bigger problem in this code: what if the traversal pointed to the
blob as _not_ part of a commit at all, but we had previously filled in
the recorded "current commit"? One could imagine this happening due to a
tag pointing directly to the blob in question.
But that can't happen, because we only traverse from HEAD, never from
any other refs. And the intent of the blob-describing code is to find
blobs within commits.
So I think this matches the original intent as closely as we can (and
again, this segfault cannot be triggered without corrupting your
repository!).
The test here does not use the formula above, which works only for the
files backend (and not reftables). Instead we use another loophole to
create the bogus state using only Git commands. See the comment in the
test for details.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When given an expected "before" state, the ref-writing code will avoid
overwriting any ref that does not match that expected state. We use the
null oid as a sentinel value for "nothing should exist", and likewise
that is the sentinel value we get when trying to read a ref that does
not exist.
But there's one corner case where this is ambiguous: dangling symrefs.
Trying to read them will yield the null oid, but there is potentially
something of value there: the dangling symref itself.
For a normal recursive write, this is OK. Imagine we have a symref
"FOO_HEAD" that points to a ref "refs/heads/bar" that does not exist,
and we try to write to it with a create operation like:
oid=$(git rev-parse HEAD) ;# or whatever
git symbolic-ref FOO_HEAD refs/heads/bar
echo "create FOO_HEAD $oid" | git update-ref --stdin
The attempt to resolve FOO_HEAD will actually resolve "bar", yielding
the null oid. That matches our expectation, and the write proceeds. This
is correct, because we are not writing FOO_HEAD at all, but writing its
destination "bar", which in fact does not exist.
But what if the operation asked not to dereference symrefs? Like this:
echo "create FOO_HEAD $oid" | git update-ref --no-deref --stdin
Resolving FOO_HEAD would still result in a null oid, and the write will
proceed. But it will overwrite FOO_HEAD itself, removing the fact that
it ever pointed to "bar".
This case is a little esoteric; we are clobbering a symref with a
no-deref write of a regular ref value. But the same problem occurs when
writing symrefs. For example:
echo "symref-create FOO_HEAD refs/heads/other" |
git update-ref --no-deref --stdin
The "create" operation asked us to create FOO_HEAD only if it did not
exist. But we silently overwrite the existing value.
You can trigger this without using update-ref via the fetch
followRemoteHEAD code. In "create" mode, it should not overwrite an
existing value. But if you manually create a symref pointing to a value
that does not yet exist (either via symbolic-ref or with "remote add
-m"), create mode will happily overwrite it.
Instead, we should detect this case and refuse to write. The correct
specification to overwrite FOO_HEAD in this case is to provide an
expected target ref value, like:
echo "symref-update FOO_HEAD refs/heads/other ref refs/heads/bar" |
git update-ref --no-deref --stdin
Note that the non-symref "update" directive does not allow you to do
this (you can only specify an oid). This is a weakness in the update-ref
interface, and you'd have to overwrite unconditionally, like:
echo "update FOO_HEAD $oid" | git update-ref --no-deref --stdin
Likewise other symref operations like symref-delete do not accept the
"ref" keyword. You should be able to do:
echo "symref-delete FOO_HEAD ref refs/heads/bar"
but cannot (and can only delete unconditionally). This patch doesn't
address those gaps. We may want to do so in a future patch for
completeness, but it's not clear if anybody actually wants to perform
those operations. The symref update case (specifically, via
followRemoteHEAD) is what I ran into in the wild.
The code for the fix is relatively straight-forward given the discussion
above. But note that we have to implement it independently for the files
and reftable backends. The "old oid" checks happen as part of the
locking process, which is implemented separately for each system. We may
want to factor this out somehow, but it's beyond the scope of this
patch. (Another curiosity is that the messages in the reftable code are
marked for translation, but the ones in the files backend are not. I
followed local convention in each case, but we may want to harmonize
this at some point).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
These tests set config within a sub-repo using (cd two && git config),
and then a separate test_when_finished outside the subshell to clean it
up. We can't use test_config to do this, because the cleanup command it
registers inside the subshell would be lost. Nor can we do it before
entering the subshell, because the config has to be set after some other
commands are run.
Let's switch these tests to use "git -C" for each command instead of a
subshell. That lets us use test_config (with -C also) at the appropriate
part of the test. And we no longer need the manual cleanup command.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Several tests in t5510 do a bare "cd subrepo", not in a subshell. This
changes the working directory for subsequent tests. As a result, almost
every test has to start with "cd $D" to go back to the top-level.
Our usual style is to do per-test environment changes like this in a
subshell, so that tests can assume they are starting at the top-level
$TRASH_DIRECTORY.
Let's switch to that style, which lets us drop all of that extra
path-handling.
Most cases can switch to using a subshell, but in a few spots we can
simplify by doing "git init foo && git -C foo ...". We do have to make
sure that we weren't intentionally touching the environment in any code
which was moved into a subshell (e.g., with a test_when_finished), but
that isn't the case for any of these tests.
All of the references to the $D variable can go away, replaced generally
with $PWD or $TRASH_DIRECTORY (if we use it inside a chdir'd subshell).
Note in one test, "fetch --prune prints the remotes url", we make sure
to use $(pwd) to get the Windows-style path on that platform (for the
other tests, the exact form doesn't matter).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Several tests set a config variable in a sub-repo we chdir into via a
subshell, like this:
(
cd "$D" &&
cd two &&
git config foo.bar baz
)
But they also clean up the variable with a when_finished directive
outside of the subshell, like this:
test_when_finished "git config unset foo.bar"
At first glance, this shouldn't work! The cleanup clause cannot be run
from the subshell (since environment changes there are lost by the time
the test snippet finishes). But since the cleanup command runs outside
the subshell, our working directory will not have been switched into
"two".
But it does work. Why?
The answer is that an earlier test does a "cd two" that moves the whole
test's working directory out of $TRASH_DIRECTORY and into "two". So the
subshell is a bit of a red herring; we are already in the right
directory! That's why we need the "cd $D" at the top of the shell, to
put us back to a known spot.
Let's make this cleanup code more explicitly specify where we expect the
config command to run. That makes the script more robust against running
a subset of the tests, and ultimately will make it easier to refactor
the script to avoid these top-level chdirs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The changes added by 39fc408562 (t/t1517: automate `git subcmd -h` tests
outside a repository, 2025-08-08) to automatically loop over all "main"
Git commands will, when run against an installed build using
GIT_TEST_INSTALLED rather than the build in the build directory, include
some extra git-gui commands that are installed by `make install`, or
credential helpers that might be installed manually from the contrib
directories. These fail the test, so record them as such.
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When describing a blob, we search for it by traversing from HEAD. We do
this by feeding the name HEAD to setup_revisions(). But if we are on an
unborn branch, this will fail with a confusing message:
$ git describe $blob
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
It is OK for this to be an error (we cannot find $blob in an empty
traversal, so we'd eventually complain about that). But the error
message could be more helpful.
Let's resolve HEAD ourselves and pass the resolved object id to
setup_revisions(). If resolving fails, then we can print a more useful
message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If describe_blob() does not find the blob in question, it returns an
empty strbuf, and we print an empty line. This differs from
describe_commit(), which always either returns an answer or calls die()
itself. As the blob function was bolted onto the command afterwards, I
think its behavior is not intentional, and it is just a bug that it does
not report an error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add the --format flag to git-repo-info. By using this flag, the users
can choose the format for obtaining the data they requested.
Given that this command can be used for generating input for other
applications and for being read by end users, it requires at least two
formats: one for being read by humans and other for being read by
machines. Some other Git commands also have two output formats, notably
git-config which was the inspiration for the two formats that were
chosen here:
- keyvalue, where the retrieved data is printed one per line, using =
for delimiting the key and the value. This is the default format,
targeted for end users.
- nul, where the retrieved data is separated by NUL characters, using
the newline character for delimiting the key and the value. This
format is targeted for being read by machines.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This commit is part of the series that introduces the new subcommand
git-repo-info.
The flag `--is-shallow-repository` from git-rev-parse is used for
retrieving whether the repository is shallow. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.
Then, add a new field `layout.shallow` to the git-repo-info subcommand
containing that information.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This commit is part of the series that introduces the new subcommand
git-repo-info.
The flag --is-bare-repository from git-rev-parse is used for retrieving
whether the current repository is bare. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.
Then, add a new field layout.bare to the git-repo-info subcommand
containing that information.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This commit is part of the series that introduces the new subcommand
git-repo-info.
The flag `--show-ref-format` from git-rev-parse is used for retrieving
the reference format (i.e. `files` or `reftable`). This way, it is
used for querying repository metadata, fitting in the purpose of
git-repo-info.
Add a new field `references.format` to the repo-info subcommand
containing that information.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Many of the commit-graph related functions take in both a repository and
the object database source (directly or via `struct commit_graph`) for
which we are supposed to load such a commit-graph. In the best case this
information is simply redundant as the source already contains a
reference to its owning object database, which in turn has a reference
to its repository. In the worst case this information could even
mismatch when passing in a source that doesn't belong to the same
repository.
Refactor the code so that we only pass in the object database source in
those cases.
There is one exception though, namely `load_commit_graph_chain_fd_st()`,
which is responsible for loading a commit-graph chain. It is expected
that parts of the commit-graph chain aren't located in the same object
source as the chain file itself, but in a different one. Consequently,
this function doesn't work on the source level but on the database level
instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some of the editor tests manipulate the environment or config in ways
that affect future tests, but those modifications are visible to future
tests and create a footgun for them.
Use test_config, subshells, single-command environment overrides, and
test helpers to automatically undo environment and config modifications
once finished.
Best-viewed-with: --ignore-all-space
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We want the editors in this test on PATH, so put them there.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Tests in t7005 mask Git error codes and do not use our nice test
helpers. Improve that, move some code into the setup test, and drop a
few old-style blank lines while at it.
Best-viewed-with: --ignore-all-space
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `reftable_stack_add()` function is a simple wrapper to lock the
stack, add records to it via a callback and then commit the
result. One problem with it though is that it doesn't accept any flags
for creating the addition. This makes it impossible to automatically
reload the stack in case it was modified before we managed to lock the
stack.
Add a `flags` field to plug this gap and pass it through accordingly.
For now this new flag won't be used by us, but it will be used by
libgit2.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Multi-pack indices store some information that is redundant with their
owning source:
- The locality bit that tracks whether the source is the primary
object source or an alternate.
- The object directory path the multi-pack index is located in.
- The pointer to the owning parent directory.
All of this information is already contained in `struct odb_source`. So
now that we always have that struct available when loading a multi-pack
index we have it readily accessible.
Drop the redundant information and instead store a pointer to the object
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To load a multi-pack index the caller is expected to pass both the
repository and the object directory where the multi-pack index is
located. While this works, this layout has a couple of downsides:
- We need to pass in information reduntant with the owning source,
namely its object directory and whether the source is local or not.
- We don't have access to the source when loading the multi-pack
index. If we had that access, we could store a pointer to the owning
source in the MIDX and thus deduplicate some information.
- Multi-pack indices are inherently specific to the object source and
its format. With the goal of pluggable object backends in mind we
will eventually want the backends to own the logic of reading and
writing multi-pack indices. Making the logic work on top of object
sources is a step into that direction.
Refactor loading of multi-pack indices accordingly.
This surfaces one small problem though: git-multi-pack-index(1) and our
MIDX test helper both know to read and write multi-pack-indices located
in a different object directory. This issue is addressed by adding the
user-provided object directory as an in-memory alternate.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are a couple of functions that take both a `struct repository` and
a `struct multi_pack_index`. This provides redundant information though
without much benefit given that the multi-pack index already has a
pointer to its owning repository.
Drop the `struct repository` parameter from such functions. While at it,
reorder the list of parameters of `fill_midx_entry()` so that the MIDX
comes first to better align with our coding guidelines.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Refactor builtin/fmt-merge-msg.c to remove the dependancy on the global
'the_repository'. Remove the 'UNUSED' macro from the 'struct repository'
parameter and replace 'git_config()' with 'repo_config()' so that
configuration is read from the passed repository. Also, add a test to
make sure that "git fmt-merge-msg -h" can be called outside a
repository.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When traversing commits, a pathspec item can be used to limit the
traversal to commits that modify the specified paths. And the
commit-graph includes a Bloom filter to exclude commits that definitely
did not modify a given pathspec item. During commit traversal, the
Bloom filter can significantly improve performance. However, it is
disabled if the specified pathspec item contains wildcard characters
or magic signatures.
For performance reason, enable Bloom filter even if a pathspec item
contains wildcard characters by filtering only the non-wildcard part of
the pathspec item.
The function of pathspec magic signature is generally to narrow down
the path specified by the pathspecs. So, enable Bloom filter when
the magic signature is "top", "glob", "attr", "--depth" or "literal".
"exclude" is used to select paths other than the specified path, rather
than serving as a filtering function, so it cannot be used together with
the Bloom filter. Since Bloom filter is not case insensitive even in
case insensitive system (e.g. MacOS), it cannot be used together with
"icase" magic.
With this optimization, we get some improvements for pathspecs with
wildcards or magic signatures. First, in the Git repository we see these
modest results:
git log -100 -- "t/*"
Benchmark 1: new
Time (mean ± σ): 20.4 ms ± 0.6 ms
Range (min … max): 19.3 ms … 24.4 ms
Benchmark 2: old
Time (mean ± σ): 23.4 ms ± 0.5 ms
Range (min … max): 22.5 ms … 24.7 ms
git log -100 -- ":(top)t"
Benchmark 1: new
Time (mean ± σ): 16.2 ms ± 0.4 ms
Range (min … max): 15.3 ms … 17.2 ms
Benchmark 2: old
Time (mean ± σ): 18.6 ms ± 0.5 ms
Range (min … max): 17.6 ms … 20.4 ms
But in a larger repo, such as the LLVM project repo below, we get even
better results:
git log -100 -- "libc/*"
Benchmark 1: new
Time (mean ± σ): 16.0 ms ± 0.6 ms
Range (min … max): 14.7 ms … 17.8 ms
Benchmark 2: old
Time (mean ± σ): 26.7 ms ± 0.5 ms
Range (min … max): 25.4 ms … 27.8 ms
git log -100 -- ":(top)libc"
Benchmark 1: new
Time (mean ± σ): 15.6 ms ± 0.6 ms
Range (min … max): 14.4 ms … 17.7 ms
Benchmark 2: old
Time (mean ± σ): 19.6 ms ± 0.5 ms
Range (min … max): 18.6 ms … 20.6 ms
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lidong Yan <yldhome2d2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The act of giving "--no-index" tells Git to pretend that the current
directory is not under control of any Git index or repository, so
even when you happen to be in a Git controlled working tree, where
in that working tree should not matter.
But the start-up sequence tries to discover the top of the working
tree and chdir(2)'s there, even before Git passes control to the
subcommand being run. When diff_no_index() starts running, it
starts at a wrong (from the end-user's point of view who thinks
"git diff --no-index" is merely a better version of GNU diff)
directory, and the original directory the user started the command
is at "prefix".
Because the paths given from argv[] have already been adjusted to
account for this path shuffling by prepending the prefix, and
showing the resulting path by stripping the prefix, the effect of
these nonsense operations (nonsense in the context of "--no-index",
that is) is usually not observable.
Except for special cases like "-", where it is not preprocessed by
prepending the prefix.
Instead of papering over by adding more special cases only to cater
to the no-index codepath in the generic code, drive the diff
machinery more faithfully to what is going on. If the user started
"git diff --no-index" in directory X/Y/Z in a working tree
controlled by Git, and the start up sequence of Git chdir(2)'ed up
to directory X and left Y/Z in the prefix, revert the effect of the
start up sequence by chdir'ing back to Y/Z and emptying the prefix.
Reported-by: Gregoire Geis <opensource@gregoirege.is>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 6fd1106aa4 ("t3700: Skip a test with backslashes in pathspec",
2009-03-13) introduced the BSLASHPSPEC prerequisite. This prerequisite
allows tests to check for systems that can use backslashes in pathspecs
(e.g. to escape glob special characters). On windows (and cygwin), this
does not work because backslashes are used as directory separators, and
git eagerly converts them to forward slashes.
This test file uses the FUNNYNAMES prerequisite to skip this test file
on windows, despite not really being appropriate for this test, which
does not hold on cygwin. The FUNNYNAMES prerequisite is set when the
system can create files with embedded quotes ("), tabs or newlines in
the name. Since cygwin can satisfy FUNNYNAMES, but not BSLASHPSPEC, this
leads to test failures on cygwin.
In order to skip these tests on cygwin, replace the FUNNYNAMES prerequisite
with BSLASHPSPEC, so that this test file is skipped on both windows and
cygwin. While here, fix a few test titles as well.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git experts often check the help summary of a command to make sure they
spell options right when suggesting advice to colleagues. Further, they
might check hidden options when responding to queries about deprecated
options like git-rebase(1)'s "preserve merges" option. But some commands
don't support "--help-all" outside of a git directory. Running (for
example)
git rebase --help-all
outside a directory fails in "setup_git_directory", erroring with the
localized form of
fatal: not a git repository (or any of the parent directories): .git
Like 99caeed05d (Let 'git <command> -h' show usage without a git dir,
2009-11-09), we want to show the "--help-all" output even without a git
dir. Make "--help-all" where we expect "-h" to mean
"setup_git_directory_gently", and interpose early in the natural place
("show_usage_with_options_if_asked").
Do the same for usage callers with show_usage_if_asked.
The exception is merge-recursive, whose help block doesn't use newer
APIs.
Best-viewed-with: --ignore-space-change
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using Meson, builds are out-of-tree and $GIT_BUILD_DIR gets set to
the path where the build output is landing. To locate the Documentation
sources, test 't0450' was using that path.
Modify test 't0450' to use `$GIT_SOURCE_DIR/Documentation` to find the
documentation sources.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git push <remote> <src>:<dst>" does not spell out the
destination side of the ref fully, and when <src> is not given
as a reference but an object name, the code tries to give advice
messages based on the type of that object.
The type is determined by calling odb_read_object_info() and
signalled by its return value. The code however reported a
programming error with BUG() when this function said that there
is no such object, which happens when the object name is given
as a full hexadecimal (if the object name is given as a partial
hexadecimal or an non-existing ref, the function would have died
without returning, so this BUG() wouldn't have triggered). This
is wrong. It is an ordinary end-user mistake to give an object
name that does not exist and treated as such.
An example of the error message produced is as follows:
error: The destination you provided is not a full refname (i.e.,
starting with "refs/"). We tried to guess what you meant by:
- Looking for a ref that matches 'branch' on the remote side.
- Checking if the <src> being pushed ('0000000000000000000000000000000000000001')
is a ref in "refs/{heads,tags}/". If so we add a corresponding
refs/{heads,tags}/ prefix on the remote side.
Neither worked, so we gave up. You must fully qualify the ref.
BUG: remote.c:1221: '0000000000000000000000000000000000000001' should be commit/tag/tree/blob, is '-1'
fatal: the remote end hung up unexpectedly
Aborted (core dumped)
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This style with the empty lines in test bodies was from when the test
suite was being developed. Remove the empty lines to match the modern
test style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In git-diff, options like `-w` and `-I<regex>`, two files are considered
equivalent under the specified "ignore" rules, even when they are not
bit-for-bit identical. For options like `--raw`, `--name-status`,
and `--name-only`, git-diff deliberately compares only the SHA values
to determine whether two files are equivalent, for performance reasons.
As a result, a file shown in `git diff --name-status` may not appear
in `git diff --patch`.
To quickly determine whether two files are equivalent, add a helper
function diff_flush_patch_quietly() in diff.c. Add `.dry_run` field in
`struct diff_options`. When `.dry_run` is true, builtin_diff() returns
immediately upon finding any change. Call diff_flush_patch_quietly()
to determine if we should flush `--raw`, `--name-only` or `--name-status`
output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Lidong Yan <yldhome2d2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
t1517 is now focused on testing subcommands outside a repository.
Move the in-repo `-h` test for `prune` to t5304, which covers
this command.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
t1517 is now focused on testing subcommands outside a repository.
Move the in-repo `-h` test for `update-server-info` to t5200,
which covers this command.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Replace manual `-h` tests with a loop over all subcommands using
`git --list-cmds=main`. This ensures consistent coverage of `-h`
behavior outside a repo and future-proofs the test by covering
new commands automatically.
Known exceptions are skipped or marked as expected failures.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When you are doing a tree-diff, there are basically two options: do not
recurse into subtrees at all, or recurse indefinitely. While most
callers would want to always recurse and see full pathnames, some may
want the efficiency of looking only at a particular level of the tree.
This is currently easy to do for the top-level (just turn off
recursion), but you cannot say "show me what changed in subdir/, but do
not recurse".
This patch adds a max-depth parameter which is measured from the closest
pathspec match, so that you can do:
git log --raw --max-depth=1 -- a/b/c
and see the raw output for a/b/c/, but not those of a/b/c/d/
(instead of the raw output you would see for a/b/c/d).
Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The within_depth() function is used to check whether pathspecs limited
by a max-depth parameter are acceptable. It takes a path to check, a
maximum depth, and a "base" depth. It counts the components in the
path (by counting slashes), adds them to the base, and compares them to
the maximum.
However, if the base does not have any slashes at all, we always return
`true`. If the base depth is 0, then this is correct; no matter what the
maximum is, we are always within it. However, if the base depth is
greater than 0, then we might return an erroneous result.
This ends up not causing any user-visible bugs in the current code. The
call sites in dir.c always pass a base depth of 0, so are unaffected.
But tree_entry_interesting() uses this function differently: it will
pass the prefix of the current entry, along with a `1` if the entry is a
directory, in essence checking whether items inside the entry would be
of interest. It turns out not to make a difference in behavior, but the
reasoning is complex.
Given a tree like:
file
a/file
a/b/file
walking the tree and calling tree_entry_interesting() will yield the
following results:
(with max_depth=0):
file: yes
a: yes
a/file: no
a/b: no
(with max_depth=1):
file: yes
a: yes
a/file: yes
a/b: no
So we have inconsistent behavior in considering directories interesting.
If they are at the edge of our depth but at the root, we will recurse
into them, but then find all of their entries uninteresting (e.g., in
the first case, we will look at "a" but find "a/*" uninteresting). But
if they are at the edge of our depth and not at the root, then we will
not recurse (in the second example, we do not even bother entering
"a/b").
This turns out not to matter because the only caller which uses
max-depth pathspecs is cmd_grep(), which only cares about blob entries.
From its perspective, it is exactly the same to not recurse into a
subtree, or to recurse and find that it contains no matching entries.
Not recursing is merely an optimization.
It is debatable whether tree_entry_interesting() should consider such an
entry interesting. The only caller does not care if it sees the tree
itself, and can benefit from the optimization. But if we add a
"max-depth" limiter to regular diffs, then a diff with
DIFF_OPT_TREE_IN_RECURSIVE would probably want to show the tree itself,
but not what it contains.
This patch just fixes within_depth(), which means we consider such
entries uninteresting (and makes the current caller happy). If we want
to change that in the future, then this fix is still the correct first
step, as the current behavior is simply inconsistent.
This has the effect the function tree_entry_interesting() now behaves
like following on the first example:
(with max_depth=0):
file: yes
a: no
a/file: no
a/b: no
Meaning we won't step in "a/" no more to realize all "a/*" entries are
uninterested, but we stop at the tree entry itself.
Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
At GitHub, we've got a real-world repository that has been triggering
failures of the form:
git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
which comes from the line:
VERIFY_CI(newinfo);
Unfortunately, this one has been quite complex to unravel, and is a
bit complex to explain. So, I'm going to carefully try to explain each
relevant piece needed to understand the fix, then carefully build up
from a simple testcase to some of the relevant testcases.
== New special case we need to consider ==
Rename pairs in the diffcore machinery connect the source path of a
rename with the destination path of a rename. Since we have rename
pairs to consider on both sides of history since the merge base,
merging has to consider a few special cases of possible overlap:
A) two rename pairs having the same target path
B) two rename pairs having the same source path
C) the source path of one rename pair being the target path of a
different rename pair
Some of these came up often enough that we gave them names:
A) a rename/rename(2to1) conflict (looks similar to an add/add conflict)
B) a rename/rename(1to2) conflict, which represents the same path being
renamed differently on the two sides of history
C) not yet named
merge-ort is well-prepared to handle cases (A) and (B), as was
merge-recursive (which was merge-ort's predecessor). Case (C) was
briefly considered during the years of merge-recursive maintenance,
but the full extent of support it got was a few FIXME/TODO comments
littered around the code highlighting some of the places that would
probably need to be fixed to support it. When I wrote merge-ort I
ignored case (C) entirely, since I believed that case (C) was only
possible if we were to support break detection during merges. Not
only had break detection never been supported by any merge algorithm,
I thought break detection wasn't worth the effort to support in a
merge algorithm. However, it turns out that case (C) can be triggered
without break detection, if there's enough moving pieces.
Before I dive into how to trigger case (C) with directory renames plus
other renames, it might be helpful to use a simpler example with break
detection first. And before we get to that it may help to explain
some more basics of handling renames in the merge algorithm. So, let
me first backup and provide a quick refresher on each of
* handling renames
* what break detection would mean, if supported in merging
* handling directory renames
From there, I'll build up from a basic directory rename detection case
to one that triggers a failure currently.
== Handling renames ==
In the merge machinery when we have a rename of a path A -> B,
processing that rename needs to remove path A, and make sure that path B
has the relevant information. Note that if the content was also
modified on both sides, this may mean that we have 3 different stages
that need to be stored at path B instead of having some stored at path
A.
Having all stages stored at path B makes it much easier for users to
investigate and resolve the content conflict associated with a renamed
path. For example:
* "git status" doesn't have to figure out how to list paths A & B and
attempt to connect them for users; it can just list path B.
* Users can use "git ls-files -u B" (instead of trying to find the
previous name of the file so they can list both, i.e. "git ls-files
-u A B")
* Users can resolve via "git add B" (without needing to "git rm A")
== What break detection would mean ==
If break detection were supported, we might have cases where A -> B
*and* C -> A, meaning that both rename pairs might believe they need to
update A. In particular, the processing of A -> B would need to be
careful to not clear out all stages of A and mark it resolved, while
both renames would need to figure out which stages of A belong with A
and which belong with B, so that both paths have the right stages
associated with them.
merge-ort (like merge-recursive before it) makes no attempt to handle
break detection; it runs with break detection turned off. It would
need to be retrofitted to handle such cases.
== Directory rename detection ==
If one side of history renames directory D/ -> E/, and the other side of
history adds new files to D/, then directory rename detection notices
and suggests moving those new files to E/. A similar thing is done for
paths renamed into D/, causing them to be transitively renamed into E/.
The default in the merge machinery is to report a conflict whenever a
directory rename might modify the location of a path, so that users can
decide whether they wanted the original path or the
directory-rename-induced location. However, that means the default
codepath still runs through all the directory rename detection logic, it
just supplements it with providing conflict notices when it is done.
== Building up increasingly complex testcases ==
I'll start with a really simple directory rename example, and then
slowly add twists that explain new pieces until we get to the
problematic cases:
=== Testcase 1 ===
Let's start with a concrete example, where particular files/directories of
interest that exist or are changed on each side are called out:
Original: <nothing of note>
our side: rename B/file -> C/file
their side: rename C/ -> A/
For this case, we'd expect to see the original B/file appear not at
C/file but at A/file.
(We would also expect a conflict notice that the user will want to
choose between C/file and A/file, but I'm going to ignore conflict
notices from here on by assuming merge.directoryRenames is set to
`true` rather than `conflict`; the only difference that assumption
makes is whether that makes the merge be considered to be conflicted
and whether it prints a conflict notice; what is written to the index
or working directory is unchanged.)
=== Testcase 2 ===
Modify testcase 1 by having A/file exist from the start:
Original: A/file exists
our side: rename B/file -> C/file
their side: rename C/ -> A/
In such a case, to avoid user confusion at what looks kind of like an
add/add conflict (even though the original path at A/file was not added
by either side of the merge), we turn off directory rename detection for
this path and print a "in the way" warning to the user:
CONFLICT (implicit dir rename): Existing file/dir ... in the way ...
The testcases in section 5 of t6423 explore these in more detail.
=== Testcase 3 ===
Let's modify testcase 1 in a slightly different way: have A/file be
added by their side rather than it already existing.
Original: <nothing of note>
our side: rename B/file -> C/file
their side: rename C/ -> A/
add A/file
In this case, the directory rename detection basically transforms our
side's original B/file -> C/file into a B/file -> A/file, and so we
get a rename/add conflict, with one version of A/file coming from the
renamed file, and another coming from the new A/file, each stored as
stages 2 and 3 in conflicts. This kind of add/add conflict is perhaps
slightly more complex than a regular add/add conflict, but with the
printed messages it makes sense where it came from and we have
different stages of the file to work with to resolve the conflict.
=== Testcase 4 ===
Let's do something similar to testcase 3, but have the opposite side of
history add A/file:
Original: <nothing of note>
our side: rename B/file -> C/file
add A/file
their side: rename C/ -> A/
Now if we allow directory rename detection to modify C/file to A/file,
then we also get a rename/add conflict, but in this case we'd need both
higher order stages being recorded on side 2, which makes no sense. The
index can't store multiple stage 2 entries, and even if we could, it
would probably be confusing for users to work with. So, similar to what
we do when there was an A/file in the original version, we simply turn
off directory rename detection for cases like this and provide the "in
the way" CONFLICT notice to the user.
=== Testcase 5 ===
We're slowly getting closer. Let's mix it up by having A/file exist at
the beginning but not exist on their side:
original: A/file exists
our side: rename B/file -> C/file
their side: rename C/ -> A/
rename A/file -> D/file
For this case, you could say that since A/file -> D/file, it's no longer
in the way of C/file being moved by directory rename detection to
A/file. But that would give us a case where A/file is both the source
and the target of a rename, similar to break detection, which the code
isn't currently equipped to handle.
This is not yet the case that causes current failures; to the current
code, this kind of looks like testcase 4 in that A/file is in the way
on our side (since A/file was in the original and was umodified by our
side). So, it results in a "in the way" notification with directory
rename detection being turned off for A/file so that B/file ends up at
C/file.
Perhaps the resolution could be improved in the future, but our "in
the way" checks prevented such problems by noticing that A/file exists
on our side and thus turns off directory rename detection from
affecting C/file's location. So, while the merge result could be
perhaps improved, the fact that this is currently handled by giving
the user an "in the way" message gives the user a chance to resolve
and prevents the code from tripping itself up.
=== Testcase 6 ===
Let's modify testcase 5 a bit more, to also delete A/file on our side:
original: A/file exists
our side: rename B/file -> C/file
delete A/file
their side: rename C/ -> A/
rename A/file -> D/file
Now the "in the way" logic doesn't detect that there's an A/file in
the way (neither side has an A/file anymore), so it's fine to
transitively rename C/file further to A/file...except that we end up
with A/file being both the source of one rename, and the target of a
different rename. Each rename pair tries to handle the resolution of
the source and target paths of its own rename. But when we go to
process the second rename pair in process_renames(), we do not expect
either the source or the destination to be marked as handled already;
so, when we hit the sanity checks that these are not handled:
VERIFY_CI(oldinfo);
VERIFY_CI(newinfo);
then one of these is going to throw an assertion failure since the
previous rename pair already marked both of its paths as handled.
This will give us an error of the form:
git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
This is the failure we're currently triggering, and it fundamentally
depends on:
* a path existing in the original
* that original path being removed or renamed on *both* sides
* some kind of directory rename moving some *other* path into that
original path
This was added as testcase 12q in t6423.
=== Testcase 7 ===
Bonus bug found while investigating!
Let's go back to the comparison between testcases 2 & 3, and set up a
file present on their side that we need to consider:
Original: A/file exists
our side: rename B/file -> C/file
rename A/file -> D/file
their side: rename C/ -> A/
Here, there is no A/file in the way on our side like testcase 4.
There is an A/file present on their side like testcase 3, which was an
add/add conflict, but that's associated with the file be renamed to
D/file. So, that really shouldn't be an add/add conflict because we
instead want all modes of the original A/file to be transported to
D/file.
Unfortunately, the current code kind of treats it like an add/add
conflict instead...but even worse. There is also a valid mode for
A/file in the original, which normally goes to stage 1. However, an
add/add conflict should be represented in the index with no mode at
stage 1 (for the original side), only modes at stages 2 and 3 (for our
and their side), so for an add/add we'd expect that mode for A/file in
the original version to be cleared out (or be transported to D/file).
Unfortunately, the code currently leaves not only the stage 3 entry
for A/file intact, it also leaves the stage 1 entry for A/file. This
results in `git ls-files -u A/file` output of the form:
100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 1 A/file
100644 0cfbf08886fca9a91cb753ec8734c84fcbe52c9f 2 A/file
100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 3 A/file
This would likely cause users to believe this isn't an add/add
conflict; rather, this would lead them to believe that A/file was only
modified on our side and that therefore it should not have been a
conflict in the first place. And while resolving the conflict in
favor of our side is the correct resolution (because stages 1 and 3
should have been cleared out in the first place), this is certainly
likely to cause confusion for anyone attempting to investigate why
this path was marked as conflicted.
This was added as testcase 12p in t6423.
== Attempted solutions that I discarded ==
1) For each side of history, create a strset of the sources of each
rename on the other side of history. Then when using directory
renames to modify existing renames, verify that we aren't renaming
to a source of another rename.
Unfortunately, the "relevant renames" optimization in merge-ort
means we often don't detect renames -- we just see a delete and an
add -- which is easy to forget and makes debugging testcases harder,
but it also turns out that this solution in insufficient to solve
the related problems in the area (more on that below).
2) Modify the code to be aware of the possibility of renaming to
the source of another side's rename, and make all the conflict
resolution logic for each case (including existing
rename/rename(2to1) and rename/rename(1to2) cases) handle the
additional complexity. It turns out there was much more code to
audit than I wanted, for a really niche case. I didn't like how
many changes were needed, and aborted.
== Solution ==
We do not want the stages of unrelated files appearing at the same path
in the index except when dealing with an add/add conflict. While we
previously handled this for stages 2 & 3, we also need to worry about
stage 1. So check for a stage 1 index entry being in the way of a
directory rename.
However, if we can detect that the stage 1 index entry is actually from
a related file due to a directory-rename-causes-rename-to-self
situation, then we can allow the stage 1 entry to remain.
From this wording, you may note that it's not just rename cases that
are a problem; bugs could be triggered with directory renames vs simple
adds. That leads us to...
== Testcases 8+ ==
Another bonus bug, found via understanding our final solutions (and the
failure of our first attempted solution)!
Let's tweak testcase 7 a bit:
Original: A/file exists
our side: delete A/file
add -> C/file
their side: delete A/file
rename C/ -> A/
Here, there doesn't seem to be a big problem. Sure C/file gets modified
via the directory rename of C/ -> A/ so that it becomes A/file, but
there's no file in the way, right? Actually, here we have a problem
that the stage 1 entry of A/file would be combined with the stage 2
entry of C/file, and make it look like a modify/delete conflict.
Perhaps there is some extra checking that could be added to the code to
make it attempt to clear out the stage 1 entry of A/file, but the
various rename-to-self-via-directory-rename testcases make that a bit
more difficult. For now, it's easier to just treat this as a
path-in-the-way situation and not allow the directory rename to modify
C/file.
That sounds all well and good, but it does have an interesting side
effect. Due to the "relevant renames" optimizations in merge-ort (i.e.
only detect the renames you need), 100% renames whose files weren't
modified on the other side often go undetected. This means that if we
modify this testcase slightly to:
Original: A/file exists
our side: A/file -> C/file
their side: rename C/ -> A/
Then although this looks like where the directory rename just moves
C/file back to A/file and there's no problem, we may not detect the
A/file -> C/file rename. Instead it will look like a deletion of A/file
and an addition of C/file. The directory rename then appears to be
moving C/file to A/file, which is on top of an "unrelated" file (or at
least a file it doesn't know is related). So, we will report
path-in-the-way conflicts now in cases where we didn't before. That's
better than silently and accidentally combining stages of unrelated
files and making them look like a modify/delete; users can investigate
the reported conflict and simply resolve it.
This means we tweak the expected solution for testcases 12i, 12j, and
12k. (Those three tests are basically the same test repeated three
times, but I was worried when I added those that subtle differences in
parent/child, sibling/sibling, and toplevel directories might mess up
how rename-to-self testcases actually get handled.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have multiple bugs here -- accidental silent file deletion,
accidental silent file retention for files that should be deleted,
and incorrect number of entries left in the index.
The series merged at commit d3b88be1b450 (Merge branch
'en/merge-dir-rename-corner-case-fix', 2021-07-16) introduced testcase
12i-12k in t6423 which checked for rename-to-self cases, and fixed bugs
that merge-ort and merge-recursive had with these testcases. At the
time, I noted that merge-ort had one bug for these cases, while
merge-recursive had two. It turns out that merge-ort did in fact have
another bug, but the "relevant renames" optimizations were masking it.
If we modify testcase 12i from t6423 to modify the file in the commit
that renames it (but only modify it enough that it can still be detected
as a rename), then we can trigger silent deletion of the file.
Tweak testcase 12i slightly to make the file in question have more than
one line in it. This leaves the testcase intact other than changing the
initial contents of this one file. The purpose of this tweak is to
minimize the changes between this testcase and a new one that we want to
add. Then duplicate testcase 12i as 12i2, changing it so that it adds a
single line to the file in question when it is renamed; testcase 12i2
then serves as a testcase for this merge-ort bug that I previously
overlooked.
Further, commit 98a1a00d5301 (t6423: add a testcase causing a failed
assertion in process_renames, 2025-03-06), fixed an issue with
rename-to-self but added a new testcase, 12n, that only checked for
whether the merge ran to completion. A few commits ago, we modified
this test to check for the number of entries in the index -- but noted
that the number was wrong. And we also noted a
silently-keep-instead-of-delete bug at the same time in the new testcase
12n2.
In summary, we have the following bugs with rename-to-self cases:
* silent deletion of file expected to be kept (t6423 testcase 12i2)
* silent retention of file expected to be removed (t6423 testcase 12n2)
* wrong number of extries left in the index (t6423 testcase 12n)
All of these bugs arise because in a rename-to-self case, when we have a
rename A->B, both A and B name the same file. The code in
process_renames() assumes A & B are different, and tries to move the
higher order stages and file contents so that they are associated just
with the new path, but the assumptions of A & B being different can
cause A to be deleted when it's not supposed to be or mark B as resolved
and kept in place when it's supposed to be deleted. Since A & B are
already the same path in the rename-to-self case, simply skip the steps
in process_renames() for such files to fix these bugs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 806f83287f8d (t6423: test directory renames causing
rename-to-self, 2021-06-30) introduced testcase 12i-12k but omitted
staging one of the files and copy-pasted that mistake to the other
tests. This means the merge runs with an unstaged change, even though
that isn't related to what is being tested and makes the test look more
complicated than it is.
The cover letter for the series associated with the above commit (see
Message-ID: pull.1039.git.git.1624727121.gitgitgadget@gmail.com) noted
that these testcases triggered two bugs in merge-recursive but only one
in merge-ort; in merge-recursive these testcases also triggered a
silent deletion of the file in question when it shouldn't be deleted.
What I didn't realize at the time was that the deletion bug in merge-ort
was merely being sidestepped by the "relevant renames" optimization but
can actually be triggered. A subsequent commit will deal with that
additional bug, but it was complicated by the mistaken forgotten
staging, so this commit first fixes that issue.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When commit 98a1a00d5301 (t6423: add a testcase causing a failed
assertion in process_renames, 2025-03-06) was added, I tweaked
the commit message, and moved the test into t6423. However, that
still left two other things missing that made this test unlike the
others in the same testfile:
* It didn't have an English description of the test setup like
all other tests in t6423
* It didn't check that the right number of files were present at
the end
The former issue is a minor detail that isn't that critical, but the
latter feels more important. If it had been done, I might have noticed
another bug. In particular, this testcase involves
Side A: rename world -> tools/world
and
Side B: rename tools/ -> <the toplevel>
Side B: remove world
The tools/ -> <toplevel> rename turns the world -> tools/world rename
into world -> world, i.e. a rename-to-self case. But, it's a path
conflict because merge.directoryRenames defaults to false. There's
no content conflict because Side A didn't modify world, so we should
just take the content of world from Side B -- i.e. delete it. So, we
have a conflict on the path, but not on its content. We could consider
letting the content trump since it is unconflicted, but if we are going
to leave a conflict, it should certainly represent that 'world' existed
both in the base version and on Side A. Currently it doesn't.
Add a description of this test, add some checking of the number of
entries in the index at the end of the merge, and mark the test as
expecting to fail for now. A subsequent commit will fix this bug.
While at it, I found another related bug from a nearly identical setup
but setting merge.directoryRenames=true. Copy testcase 12n into 12n2,
changing it to use merge instead of cherry-pick, and turn on directory
renames for this test. In this case, since there is no content conflict
and no path conflict, it should be okay to delete the file.
Unfortunately, the code resolves without conflict but silently leaves
world despite the fact it should be deleted. It might also be okay if
the code spuriously thought there was a modify/delete conflict here;
that would at least notify users to look closer and then when they
notice there was no change since the base version, they can easily
resolve. A conflict notice is much better than silently providing the
wrong resolution. Cover this with the 12n2 testcase, which for now is
marked as expecting to fail as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Squelch false-positive compiler warning.
* dl/squelch-maybe-uninitialized:
t/unit-tests/clar: fix -Wmaybe-uninitialized with -Og
remote: bail early from set_head() if missing remote name
|
|
It was recently reported [1] that renaming a remote that has dangling
symrefs is broken. This issue can be trivially reproduced:
$ git init repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo/
$ git remote add origin /dev/null
$ git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master
$ git remote rename origin renamed
$ git symbolic-ref refs/remotes/origin/HEAD
refs/remotes/origin/master
$ git symbolic-ref refs/remotes/renamed/HEAD
fatal: ref refs/remotes/renamed/HEAD is not a symbolic ref
As one can see, the "HEAD" reference did not get renamed but stays in
the same place. There are two issues here:
- We use `refs_resolve_ref_unsafe()` to resolve references, but we
don't pass the `RESOLVE_REF_NO_RECURSE` flag. Consequently, if the
reference does not resolve, the function will fail and we thus
ignore this branch.
- We use `refs_for_each_ref()` to iterate through the old remote's
references, but that function ignores broken references.
Both of these issues are easy to fix. But having a closer look at the
logic that renames remote references surfaces that it leaves a lot to be
desired overall.
The problem is that we're using O(|refs| + |symrefs| * 2) many reference
transactions to perform the renames. We first delete all symrefs, then
individually rename every direct reference and finally we recreate the
symrefs. On the one hand this isn't even remotely an atomic operation,
so if we hit any error we'll already have deleted all references.
But more importantly it is also extremely inefficient. The number of
transactions for symrefs doesn't really bother us too much, as there
should generally only be a single symref anyway ("HEAD"). But the
renames are very expensive:
- For the "reftable" backend we perform auto-compaction after every
single rename, which does add up.
- For the "files" backend we potentially have to rewrite the
"packed-refs" file on every single rename in case they are packed.
The consequence here is quadratic runtime performance. Renaming a
100k references takes hours to complete.
Refactor the code to use a single transaction to perform all the
reference updates atomically, which speeds up the transaction quite
significantly:
Benchmark 1: rename remote (refformat = files, revision = HEAD~)
Time (mean ± σ): 238.770 s ± 13.857 s [User: 91.473 s, System: 143.793 s]
Range (min … max): 204.863 s … 247.699 s 10 runs
Benchmark 2: rename remote (refformat = files, revision = HEAD)
Time (mean ± σ): 2.103 s ± 0.036 s [User: 0.360 s, System: 1.313 s]
Range (min … max): 2.011 s … 2.141 s 10 runs
Summary
rename remote (refformat = files, revision = HEAD) ran
113.53 ± 6.87 times faster than rename remote (refformat = files, revision = HEAD~)
For the "reftable" backend we see a significant speedup, as well, but
given that we don't have quadratic runtime behaviour there it's way less
extreme:
Benchmark 1: rename remote (refformat = reftable, revision = HEAD~)
Time (mean ± σ): 8.604 s ± 0.539 s [User: 4.985 s, System: 2.368 s]
Range (min … max): 7.880 s … 9.556 s 10 runs
Benchmark 2: rename remote (refformat = reftable, revision = HEAD)
Time (mean ± σ): 1.177 s ± 0.103 s [User: 0.446 s, System: 0.270 s]
Range (min … max): 1.023 s … 1.410 s 10 runs
Summary
rename remote (refformat = reftable, revision = HEAD) ran
7.31 ± 0.79 times faster than rename remote (refformat = reftable, revision = HEAD~)
There is one issue though with using atomic transactions: when nesting a
remote into itself it can happen that renamed references conflict with
the old referencse. For example, when we have a reference
"refs/remotes/origin/foo" and we rename "origin" to "origin/foo", then
we'll end up with an F/D conflict when we try to create the renamed
reference "refs/remotes/origin/foo/foo".
This situation is overall quite unlikely to happen: people tend to not
use nested remotes, and if they do they must at the same time also have
a conflicting refname. But the end result would be that the old remote
references stay intact whereas all the other parts of the repository
have been adjusted for the new remote name.
Address this by queueing and preparing the reference update before we
touch any other part of the repository. Like this we can make sure that
the reference update will go through before rewriting the configuration.
Otherwise, if the transaction fails to prepare we can gracefully abort
the whole operation without any changes having been performed in the
repository yet. Furthermore, we can detect the conflict and print some
helpful advice for how the user can resolve this situation. So overall,
the tradeoff is that:
- Reference transactions are now all-or-nothing. This is a significant
improvement over the previous state where we may have ended up with
partially-renamed references.
- Rewriting references is now significantly faster.
- We only rewrite the configuration in case we know that all
references can be updated.
- But we may refuse to rename a remote in case references conflict.
Overall this seems like an acceptable tradeoff.
While at it, fix the handling of symbolic/broken references by using
`refs_for_each_rawref()`. Add tests that cover both this reported issue
and tests that exercise nesting of remotes.
One thing to note: with this change we cannot provide a proper progress
monitor anymore as we queue the references into the transactions as we
iterate through them. Consequently, as we don't know yet how many refs
there are in total, we cannot report how many percent of the operation
is done anymore. But that's a small price to pay considering that you
now shouldn't need the progress monitor in most situations at all
anymore.
[1]: <CANrWfmQWa=RJnm7d3C7ogRX6Tth2eeuGwvwrNmzS2gr+eP0OpA@mail.gmail.com>
Reported-by: Han Jiang <jhcarl0814@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With `refs_for_each_reflog_ent()` callers can iterate through all the
reflog entries for a given reference. The callback that is being invoked
for each such entry does not receive the name of the reference that we
are currently iterating through. This isn't really a limiting factor, as
callers can simply pass the name via the callback data.
But this layout sometimes does make for a bit of an awkward calling
pattern. One example: when iterating through all reflogs, and for each
reflog we iterate through all refnames, we have to do some extra book
keeping to track which reference name we are currently yielding reflog
entries for.
Change the signature of the callback function so that the reference name
of the reflog gets passed through to it. Adapt callers accordingly and
start using the new parameter in trivial cases. The next commit will
refactor the reference migration logic to make use of this parameter so
that we can simplify its logic a bit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ps/reflog-migrate-fixes:
refs: fix invalid old object IDs when migrating reflogs
refs: stop unsetting REF_HAVE_OLD for log-only updates
refs/files: detect race when generating reflog entry for HEAD
refs: fix identity for migrated reflogs
ident: fix type of string length parameter
builtin/reflog: implement subcommand to write new entries
refs: export `ref_transaction_update_reflog()`
builtin/reflog: improve grouping of subcommands
Documentation/git-reflog: convert to use synopsis type
|
|
4c063c82e9 (rebase -i: improve error message when picking merge,
2024-05-30) added advice texts for cases when a merge commit is
passed as argument of sequencer command that cannot operate with
a merge commit. However, it forgot about the 'drop' command, so
that in this case the BUG() in the default branch is reached.
Handle 'drop' like 'merge', i.e., permit it without a message.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.
The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.
This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.
Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.
Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.
Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:
- Write multiple reflog entries at once.
- Insert reflog entries at arbitrary indices.
- Specify the date of the reflog entry.
- Insert reflog entries that refer to nonexistent objects.
If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The case where a new submodule takes a path where used to be a
completely different subproject is now dealt a bit better than
before.
* kj/renamed-submodule:
fixup! submodule: skip redundant active entries when pattern covers path
fixup! submodule: prevent overwriting .gitmodules on path reuse
submodule: skip redundant active entries when pattern covers path
submodule: prevent overwriting .gitmodules on path reuse
|
|
Add a test script, `t/t1461-refs-list.sh`, for the new `git refs list`
command.
This script acts as a simple driver, leveraging the shared test library
created in the preceding commit. It works by overriding the
`$git_for_each_ref` variable to "git refs list" and then sourcing the
shared library (`t/for-each-ref-tests.sh`).
This approach ensures that `git refs list` is tested against the
entire comprehensive test suite of `git for-each-ref`, verifying
that it acts as a compatible drop-in replacement.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for adding tests for the new `git refs list` command,
refactor the existing t6300 test suite to make its logic shareable.
Move the core test logic from `t6300-for-each-ref.sh` into a new
`for-each-ref-tests.sh` file. Inside this new script, replace hardcoded
calls to "git for-each-ref" with the `$git_for_each_ref` variable.
The original `t6300-for-each-ref.sh` script now becomes a simple
"driver". It is responsible for setting the default value of the
variable and then sourcing the test library.
This new structure follows the established pattern used for sharing
tests between `git-blame` and `git-annotate` and prepares the test suite
for the `refs list` tests to be added in a subsequent commit.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Usage string for `git for-each-ref` was out of sync with its official
documentation. The test `t0450-txt-doc-vs-help.sh` was marked as broken
due to this.
Update the usage string to match the documentation. This allows the test
to pass, so remove the corresponding 'known breakage' marker from the
test file.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When building with -Og on gcc 15.1.1, the build produces a warning. In
practice, though, this cannot be hit because `exact` acts as a guard and
that variable can only be set after `matchlen` is already initialized
Assign a default value to `matchlen` so that the warning is silenced.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Comment fix.
* jc/test-hashmap-is-still-here:
test-hashmap: document why it is no longer used but still there
|
|
Test fix.
* ch/t7450-recursive-clone-test-fix:
t7450: inspect the correct path a broken code would write to
|
|
Build fix.
* ps/meson-clar-decls-fix:
meson: ensure correct "clar-decls.h" header is used
|
|
"git add/etc -p" now honor the diff.context configuration variable,
and also they learn to honor the -U<n> command-line option.
* lm/add-p-context:
add-patch: add diff.context command line overrides
add-patch: respect diff.context configuration
t: use test_config in t4055
t: use test_grep in t3701 and t4055
|
|
The config API had a set of convenience wrapper functions that
implicitly use the_repository instance; they have been removed and
inlined at the calling sites.
* ps/config-wo-the-repository: (21 commits)
config: fix sign comparison warnings
config: move Git config parsing into "environment.c"
config: remove unused `the_repository` wrappers
config: drop `git_config_set_multivar()` wrapper
config: drop `git_config_get_multivar_gently()` wrapper
config: drop `git_config_set_multivar_in_file_gently()` wrapper
config: drop `git_config_set_in_file_gently()` wrapper
config: drop `git_config_set()` wrapper
config: drop `git_config_set_gently()` wrapper
config: drop `git_config_set_in_file()` wrapper
config: drop `git_config_get_bool()` wrapper
config: drop `git_config_get_ulong()` wrapper
config: drop `git_config_get_int()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string_multi()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get()` wrapper
config: drop `git_config_clear()` wrapper
...
|
|
Code clean-up.
* kn/for-each-ref-skip-updates:
ref-filter: use REF_ITERATOR_SEEK_SET_PREFIX instead of '1'
t6302: add test combining '--start-after' with '--exclude'
for-each-ref: reword the documentation for '--start-after'
for-each-ref: fix documentation argument ordering
ref-cache: use 'size_t' instead of int for length
|
|
A new test to ensure that a recent change will keep working.
* jb/t7510-gpg-program-path:
t7510: use $PWD instead of $(pwd) inside PATH
t7510: add test cases for non-absolute gpg program
|
|
Test clean-up.
* cc/t9350-cleanup:
t9350: redirect input to only fast-import
|
|
A few file descriptors left unclosed upon program completion in a
few test helper programs are now closed.
* hl/test-helper-fd-close:
test-delta: close output descriptor after use
test-delta: use strbufs to hold input files
test-delta: handle errors with die()
t/helper/test-truncate: close file descriptor after truncation
|
|
"git rebase -i" with bogus rebase.instructionFormat configuration
failed to produce the todo file after recording the state files,
leading to confused "git status"; this has been corrected.
* ow/rebase-verify-insn-fmt-before-initializing-state:
rebase: write script before initializing state
|
|
"git for-each-ref" learns "--start-after" option to help
applications that want to page its output.
* kn/for-each-ref-skip:
ref-cache: set prefix_state when seeking
for-each-ref: introduce a '--start-after' option
ref-filter: remove unnecessary else clause
refs: selectively set prefix in the seek functions
ref-cache: remove unused function 'find_ref_entry()'
refs: expose `ref_iterator` via 'refs.h'
|
|
Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, a common pattern
to split a string into a string list and then remove empty items in
the resulting list is no longer needed. Instead, just tell the
string_list_split*() to omit empty ones while splitting.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
omitted from the resulting string list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The string_list_split_in_place() function was updated by 52acddf3
(string-list: multi-delimiter `string_list_split_in_place()`,
2023-04-24) to take more than one delimiter characters, hoping that
we can later use it to replace our uses of strtok(). We however did
not make a matching change to the string_list_split() function,
which is very similar.
Before giving both functions more features in future commits, allow
string_list_split() to also take more than one delimiter characters
to make them closer to each other.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reftable unit tests are now ported to the "clar" unit testing
framework.
* sk/reftable-clarify-tests:
t/unit-tests: finalize migration of reftable-related tests
t/unit-tests: convert reftable stack test to use clar
t/unit-tests: convert reftable record test to use clar
t/unit-tests: convert reftable readwrite test to use clar
t/unit-tests: convert reftable table test to use clar
t/unit-tests: convert reftable pq test to use clar
t/unit-tests: convert reftable merged test to use clar
t/unit-tests: convert reftable block test to use clar
t/unit-tests: convert reftable basics test to use clar test framework
t/unit-tests: implement clar specific reftable test helper functions
|
|
"git pull" learned to pay attention to pull.autostash configuration
variable, which overrides rebase/merge.autostash.
* ly/pull-autostash:
pull: add pull.autoStash config option
|
|
Leakfix.
* jk/unleak-reflog-expire-entry:
reflog: close leak of reflog expire entry
|
|
pw/3.0-commentchar-auto-deprecation
* ps/config-wo-the-repository: (21 commits)
config: fix sign comparison warnings
config: move Git config parsing into "environment.c"
config: remove unused `the_repository` wrappers
config: drop `git_config_set_multivar()` wrapper
config: drop `git_config_get_multivar_gently()` wrapper
config: drop `git_config_set_multivar_in_file_gently()` wrapper
config: drop `git_config_set_in_file_gently()` wrapper
config: drop `git_config_set()` wrapper
config: drop `git_config_set_gently()` wrapper
config: drop `git_config_set_in_file()` wrapper
config: drop `git_config_get_bool()` wrapper
config: drop `git_config_get_ulong()` wrapper
config: drop `git_config_get_int()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string()` wrapper
config: drop `git_config_get_string_multi()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get_value()` wrapper
config: drop `git_config_get()` wrapper
config: drop `git_config_clear()` wrapper
...
|
|
Prior to 05e9cd64 (config: quote values containing CR character,
2025-05-19), a repository can trick "clone --recurse-submodules"
into running a post-checkout hook shipped with the project. The
test was written to make sure the trick would no longer run the
hook with the fix in the commit.
However, the test did not check for the path the hook would
create; correct the path to the expected one if the bug were
still with us.
Signed-off-by: chenjianhu <chenjianhu@kylinos.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As I ended up wasting a few dozen minutes looking for the reason why
this is still here, help future developers by saving them from
wasting their time by documenting why this code that apparently is
not used by anybody is still here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This patch compliments the previous commit, where builtins that use
add-patch infrastructure now respect diff.context and
diff.interHunkContext file configurations.
In particular, this patch helps users who don't want to set persistent
context configurations or just want a way to override them on a one-time
basis, by allowing the relevant builtins to accept corresponding command
line options that override the file configurations.
This mimics commands such as diff and log, which allow for both context
file configuration and command line overrides.
Signed-off-by: Leon Michalak <leonmichalak6@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Various builtins that use add-patch infrastructure do not respect
the user's diff.context and diff.interHunkContext file configurations.
The user may be used to seeing their diffs with customized context size,
but not in the patches "git add -p" shows them to pick from.
Teach add-patch infrastructure to read these configuration variables and
pass their values when spawning the underlying plumbing commands as
their command line option.
Signed-off-by: Leon Michalak <leonmichalak6@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use the modern "test_config" test utility instead of manual"git config"
as the former provides clean up on test completion.
This is a prerequisite to the commits that follow which add to this test
file.
Signed-off-by: Leon Michalak <leonmichalak6@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As a preparatory clean-up, use the "test_grep" test utility instead of
regular "grep" which provides better debug information if tests fail.
Signed-off-by: Leon Michalak <leonmichalak6@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "clar-decls.h" header gets generated by us to extract prototypes of
unit test functions from our clar-based tests. This generated file is
then written into "t/unit-tests/" and included via "unit-test.h". The
intent of all this is that we can keep "-Wmissing-prototype" warnings
enabled. If we had that warning disabled, it would be easy to miss in
case any of the non-static functions had a typo in its name and thus
wasn't picked up by our test case extractor.
Including the file directly has a big downside though: if a source tree
was built both with our Makefile and with Meson, then the Meson build
would include the "clar-decls.h" file from our Makefile. And if those
are out of sync we get compiler errors.
We already fixed a similar issue in 4771501c0a (meson: ensure correct
version-def.h is used, 2025-01-14). Let's do the same and pass the
absolute path to "clar-decls.h" via a preprocessor define.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On Windows, $(pwd) will give us a Windows-style path like "D:/foo".
Putting that into $PATH confuses anybody parsing that variable, since
colon is a separator character in $PATH. Instead, we should use the
Unix-style value we get from $PWD ("/d/foo").
This is similar to the cases fixed by 71dd50472d (t0021, t5615: use $PWD
instead of $(pwd) in PATH-like shell variables, 2016-11-11).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The '--start-after' doesn't explicitly mention being compatible with the
'--exclude' flag, generally only incompatibility is explicitly called
out. However, it would be nice to test the compatibility between the
two to avoid future regressions. Let's do that.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git commit" that concludes a conflicted merge failed to notice and remove
existing comment added automatically (like "# Conflicts:") when the
core.commentstring is set to 'auto'.
* ac/auto-comment-char-fix:
config: set comment_line_str to "#" when core.commentChar=auto
commit: avoid scanning trailing comments when 'core.commentChar' is "auto"
|
|
The pop_most_recent_commit() function can have quite expensive
worst case performance characteristics, which has been optimized by
using prio-queue data structure.
* rs/pop-recent-commit-with-prio-queue:
commit: use prio_queue_replace() in pop_most_recent_commit()
prio-queue: add prio_queue_replace()
commit: convert pop_most_recent_commit() to prio_queue
|
|
A number of tests in "t9350-fast-export.sh" are using sub-shells to
redirect content to a number of commands instead of only
`git fast-import`.
This is confusing and possibly error-prone, so let's change those tests
so that no sub-shell is used and the content goes only to
`git fast-import`.
Reported-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
GIT_TEST_INSTALLED was not honored in the recent topic related to
SHA256 hashes, which has been corrected.
* kl/test-installed-fix:
test-lib: respect GIT_TEST_INSTALLED when querying default hash
|
|
|
|
configure_added_submodule always writes an explicit
submodule.<name>.active entry, even when the new
path is already matched by submodule.active
patterns. This leads to unnecessary and cluttered configuration.
change the logic to centralize wildmatch-based pattern lookup,
in configure_added_submodule. Wrap the active-entry write in a conditional
that only fires when that helper reports no existing pattern covers the
submodule’s path.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adding a submodule at a path that previously hosted
another submodule (e.g., 'child') reuses the submodule
name derived from the path. If the original submodule
was only moved (e.g., to 'child_old') and not renamed,
this silently overwrites its configuration in .gitmodules.
This behavior loses user configuration and causes
confusion when the original submodule is expected
to remain intact. It assumes that the path-derived
name is always safe to reuse, even though the name
might still be in use elsewhere in the repository.
Teach module_add() to check if the computed submodule
name already exists in the repository's submodule config,
and if so, refuse the operation unless the user explicitly
renames the submodule or uses the --force option,
which will automatically generate a unique name by
appending a number (e.g., child1).
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The old `lib-reftable.{c,h}` implemented helper functions for our
homegrown unit-testing framework. As part of migrating reftable-related
tests to the Clar framework, Clar-specific versions of these functions
in `lib-reftable-clar.{c,h}` were introduced.
Now that all test files using these helpers have been converted to Clar,
we can safely remove the original `lib-reftable.{c,h}` and rename the
Clar- specific versions back to `lib-reftable.{c,h}`. This restores a
clean and consistent naming scheme for shared test utilities.
Finally, update our build system to reflect the changes made and remove
redundant code related to the reftable tests and our old homegrown
unit-testing setup. `test-lib.{c,h}` remains unchanged in our build
system as some files particularly `t/helper/test-example-tap.c` depends
on it in order to run, and removing that would be beyond the scope of
this patch.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable stack test file to use clar by using clar assertions
where necessary.
This marks the end of all unit tests migrated away from the
`unit-tests/t-*.c` pattern, there are no longer any files matching that
glob. Remove the sanity check for `t-*.c` files to prevent Meson
configuration errors during CI and local builds.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable record test file to use clar by using clar assertions
where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable readwrite test file to use clar by using clar assertions
where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable table test file to use clar by using clar assertions
where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable priority queue test file to use clar by using clar
assertions where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable merged test file to use clar testing framework by using
clar assertions where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable block test file to use clar testing framework by using
clar assertions where necessary.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt reftable basics test file to clar by using clar assertions
where necessary.Break up test edge case to improve modularity and
clarity.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Helper functions defined in `t/unit-tests/lib-reftable.{c,h}` are
required for the reftable-related test files to run. In the current
implementation these functions are designed to conform with our
homegrown unit-testing structure. So in other to convert the reftable
test files, there is need for a clar specific implementation of these
helper functions.
Implement equivalent helper functions in `lib-reftable-clar.{c,h}` to
use clar. These functions conform with the clar testing framework and
become available for all reftable-related test files implemented using
the clar testing framework, which requires them. This will be used by
subsequent commits.
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After we write to the output file, the program exits. This naturally
closes the descriptor. But we should do an explicit close for two
reasons:
1. It's possible to hit an error on close(), which we should detect
and report via our exit code.
2. Leaking descriptors is a bad practice in general. Even if it isn't
meaningful here, it sets a bad example.
It is tempting to write:
if (write_in_full(fd, ...) < 0 || close(fd) < 0)
die_errno(...);
But that pattern contains a subtle problem that has resulted in
descriptor leaks before. If write_in_full() fails, we'll short-circuit
and never call close(), leaking the descriptor.
That's not a problem here, since our error path dies instead of
returning up the stack. But since we're trying to set a good example,
let's write it out as two separate conditions. As a bonus, that lets us
produce a slightly more specific error message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We want to read the whole contents of two files into memory. If we
switch from raw ptr/len pairs to strbufs, we can use strbuf_read_file()
to shorten the code.
This incidentally fixes two small bugs:
1. We stat() the files and allocate our buffers based on st.st_size.
But that is an off_t which may be larger than the size_t we'd use
to allocate. We should use xsize_t() to do a checked conversion.
Otherwise integer truncation (on a file >4GB) could cause us to
under-allocate (though in practice this does not result in a buffer
overflow because the same truncation happens when read_in_full()
also takes a size_t).
2. We get the size from st.st_size, and then try to read_in_full()
that many bytes. But it may return fewer bytes than expected (if
the file changed racily and we get an early EOF), leading us to
read uninitialized bytes in the allocated buffer. We don't notice
because we only check the value for error, not that we got the
expected number of bytes.
The strbuf code doesn't run into this, because it just reads to EOF,
expanding the buffer dynamically as necessary. Neither bug is a big deal
for a test helper, but fixing them is a nice bonus on top of simplifying
the code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is a short test helper that does all of its work in the main
function. When we encounter an error, we try to clean up memory and
descriptors and then jump to an error return, which exits the program.
We can get the same effect by just calling die(), which means we do not
have to bother with cleaning up. This simplifies the code, and also
removes some inconsistencies where a few code paths forgot to clean up
descriptors (though in practice it was not a big deal since we were
exiting anyway).
In addition to die() and die_errno(), we'll also use a few of our usual
helpers like xopen() and usage() that make things more ergonomic.
This does change the exit code in these cases from 1 to 128, but I
don't think it matters (and arguably is better, as we'd already exit 128
for other errors like xmalloc() failure).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Clean up the way how signature on commit objects are exported to
and imported from fast-import stream.
* cc/fast-import-export-signature-names:
fast-(import|export): improve on commit signature output format
|
|
Lift the limitation to use changed-path filter in "git log" so that
it can be used for a pathspec with multiple literal paths.
* ly/changed-paths-traversal:
bloom: optimize multiple pathspec items in revision
revision: make helper for pathspec to bloom keyvec
bloom: replace struct bloom_key * with struct bloom_keyvec
bloom: rename function operates on bloom_key
bloom: add test helper to return murmur3 hash
|
|
In "config.c" we host both the business logic to read and write config
files as well as the logic to parse specific Git-related variables. On
the one hand this is mixing concerns, but even more importantly it means
that we cannot easily remove the dependency on `the_repository` in our
config parsing logic.
Move the logic into "environment.c". This file is a grab bag of all
kinds of global state already, so it is quite a good fit. Furthermore,
it also hosts most of the global variables that we're parsing the config
values into, making this an even better fit.
Note that there is one hidden change: in `parse_fsync_components()` we
use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But
as -Wsign-compare warnings are enabled in this file this causes a
compiler warning. The issue is fixed by using a `size_t` instead.
This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE`
declaration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_bool()`. All
callsites are adjusted so that they use
`repo_config_get_bool(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_int()`. All
callsites are adjusted so that they use
`repo_config_get_int(the_repository, ...)` instead. While some callsites
might already have a repository available, this mechanical conversion is
the exact same as the current situation and thus cannot cause any
regression. Those sites should eventually be cleaned up in a later patch
series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_string()`.
All callsites are adjusted so that they use
`repo_config_get_string(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_value()`. All
callsites are adjusted so that they use
`repo_config_get_value(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_value()`. All
callsites are adjusted so that they use
`repo_config_get_value(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get()`. All
callsites are adjusted so that they use `repo_config_get(the_repository,
...)` instead. While some callsites might already have a repository
available, this mechanical conversion is the exact same as the current
situation and thus cannot cause any regression. Those sites should
eventually be cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config()`. All callsites
are adjusted so that they use `repo_config(the_repository, ...)`
instead. While some callsites might already have a repository available,
this mechanical conversion is the exact same as the current situation
and thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
find_cfg_ent() allocates a struct reflog_expire_entry_option via
FLEX_ALLOC_MEM and inserts it into a linked list in the
reflog_expire_options structure. The entries in this list are never
freed, resulting in a leak in cmd_reflog_expire and the gc reflog expire
maintenance task:
Direct leak of 39 byte(s) in 1 object(s) allocated from:
#0 0x7ff975ee6883 in calloc (/lib64/libasan.so.8+0xe6883)
#1 0x0000010edada in xcalloc ../wrapper.c:154
#2 0x000000df0898 in find_cfg_ent ../reflog.c:28
#3 0x000000df0898 in reflog_expire_config ../reflog.c:70
#4 0x00000095c451 in configset_iter ../config.c:2116
#5 0x0000006d29e7 in git_config ../config.h:724
#6 0x0000006d29e7 in cmd_reflog_expire ../builtin/reflog.c:205
#7 0x0000006d504c in cmd_reflog ../builtin/reflog.c:419
#8 0x0000007e4054 in run_builtin ../git.c:480
#9 0x0000007e4054 in handle_builtin ../git.c:746
#10 0x0000007e8a35 in run_argv ../git.c:813
#11 0x0000007e8a35 in cmd_main ../git.c:953
#12 0x000000441e8f in main ../common-main.c:9
#13 0x7ff9754115f4 in __libc_start_call_main (/lib64/libc.so.6+0x35f4)
#14 0x7ff9754116a7 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x36a7)
#15 0x000000444184 in _start (/home/jekeller/libexec/git-core/git+0x444184)
Close this leak by adding a reflog_clear_expire_config() function which
iterates the linked list and frees its elements. Call it upon exit of
cmd_reflog_expire() and reflog_expire_condition().
Add a basic test which covers this leak. While at it, cover the
functionality from commit commit 3cb22b8efe (Per-ref reflog expiry
configuration, 2008-06-15). We've had this support for years, but lacked
any tests.
Co-developed-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix a resource leak where the file descriptor was not closed after
truncating a file in t/helper/test-truncate.c.
Signed-off-by: Hoyoung Lee <lhywkd22@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
These cases cover scenarios where `gpg.program` is set as a program in
`$PATH` or as a path relative to the user's home directory.
Signed-off-by: Jonas Brandstötter <jonas.brandstoetter@gmx.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a function to replace the top element of the queue that basically
does the same as prio_queue_get() followed by prio_queue_put(), but
without the work by prio_queue_get() to rebalance the heap. It can be
used to optimize loops that get one element and then immediately add
another one. That's common e.g., with commit history traversal, where
we get out a commit and then put in its parents.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
pop_most_recent_commit() calls commit_list_insert_by_date() for parent
commits, which is itself called in a loop. This can lead to quadratic
complexity if there are many merges. Replace the commit_list with a
prio_queue to ensure logarithmic worst case complexity and convert all
three users.
Add a performance test that exercises one of them using a pathological
history that consists of 50% merges and 50% root commits to demonstrate
the speedup:
Test v2.50.1 HEAD
----------------------------------------------------------------------
1501.2: rev-parse ':/65535' 2.48(2.47+0.00) 0.20(0.19+0.00) -91.9%
Alas, sane histories don't benefit from the conversion much, and
traversing Git's own history takes a 1% performance hit on my machine:
$ hyperfine -w3 -L git ./git_2.50.1,./git '{git} rev-parse :/^Initial.revision'
Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision
Time (mean ± σ): 1.071 s ± 0.004 s [User: 1.052 s, System: 0.017 s]
Range (min … max): 1.067 s … 1.078 s 10 runs
Benchmark 2: ./git rev-parse :/^Initial.revision
Time (mean ± σ): 1.079 s ± 0.003 s [User: 1.060 s, System: 0.017 s]
Range (min … max): 1.074 s … 1.083 s 10 runs
Summary
./git_2.50.1 rev-parse :/^Initial.revision ran
1.01 ± 0.00 times faster than ./git rev-parse :/^Initial.revision
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git uses `rebase.autostash` or `merge.autostash` to determine whether a
dirty worktree is allowed during pull. However, this behavior is not
clearly documented, making it difficult for users to discover how to
enable autostash, or causing them to unknowingly enable it. Add new
config option `pull.autostash` along with its documentation and test
cases.
`pull.autostash` provides the same functionality as `rebase.autostash`
and `merge.autostash`, but overrides them when set. If `pull.autostash`
is not set, it falls back to `rebase.autostash` or `merge.autostash`,
depending on the value of `pull.rebase`.
Signed-off-by: Lidong Yan <yldhome2d2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git remote" now detects remote names that overlap with each other
(e.g., remote nickname "outer" and "outer/inner" are used at the
same time), as it will lead to overlapping remote-tracking
branches.
* jk/remote-avoid-overlapping-names:
remote: detect collisions in remote names
|