All the Perforce tests are free of memory leaks. This went unnoticed
because most folks do not have p4 and p4d installed on their computers.
Consequently, given that the prerequisites for running those tests
aren't fulfilled, `GIT_TEST_PASSING_SANITIZE_LEAK=check` won't notice that
those tests are indeed memory leak free.
Mark those tests accordingly.
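Concretely, the marking follows the test suite's usual convention of setting the variable before sourcing the test library; a minimal sketch of what each t98xx script gains (assuming the usual lib-git-p4.sh preamble):
TEST_PASSES_SANITIZE_LEAK=true
. ./lib-git-p4.sh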
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some of the tests in t98xx modify the Perforce depot in ways that the
tool wouldn't normally allow. This is done to test behaviour of git-p4
in certain edge cases that we have observed in the wild, but which
should in theory not be possible.
Naturally, modifying the depot on disk directly is quite intimate with
the tool and thus prone to breakage when Perforce updates the way that
data is stored. And indeed, those tests are broken nowadays with r23 of
Perforce. While a file revision was previously stored as a plain file
"depot/file,v", it is now stored in a directory "depot/file,d" with
compression.
Adapt those tests to handle both old- and new-style depot layouts.
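For illustration, a hedged sketch of how a test can cope with both layouts (the helper name and messages are made up, not taken from the actual tests):
# $1 is the depot path of the file whose storage we want to munge
locate_depot_rev () {
	if test -d "$1,d"
	then
		echo "$1,d"   # new-style: compressed revisions in a directory
	else
		echo "$1,v"   # old-style: a single RCS-like file
	fi
}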
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Test fix as a follow-up to an already graduated topic.
* jk/am-retry:
t4153: stop redirecting input from /dev/zero
|
|
"git var GIT_SHELL_PATH" should report the path to the shell used
to spawn external commands, but it didn't do so on Windows, which
has been corrected.
* js/var-git-shell-path:
var(win32): do report the GIT_SHELL_PATH that is actually used
run-command: declare the `git_shell_path()` function globally
run-command(win32): resolve the path to the Unix shell early
mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
win32: override `fspathcmp()` with a directory separator-aware version
strvec: declare the `strvec_push_nodup()` function globally
run-command: refactor getting the Unix shell path into its own function
|
|
"git push '' HEAD:there" used to hit a BUG(); it has been corrected
to die with "fatal: bad repository ''".
* kn/push-empty-fix:
builtin/push: call set_refspecs after validating remote
|
|
The test framework learned to take the test body not as a single
string but as a here-document.
* jk/test-body-in-here-doc:
t/.gitattributes: ignore whitespace in chainlint expect files
t: convert some here-doc test bodies
test-lib: allow test snippets as here-docs
chainlint.pl: add tests for test body in heredoc
chainlint.pl: recognize test bodies defined via heredoc
chainlint.pl: check line numbers in expected output
chainlint.pl: force CRLF conversion when opening input files
chainlint.pl: do not spawn more threads than we have scripts
chainlint.pl: only start threads if jobs > 1
chainlint.pl: add test_expect_success call to test snippets
|
|
Tests that use the GIT_TEST_SANITIZE_LEAK_LOG feature got their exit
status inverted, which has been corrected.
* rj/test-sanitize-leak-log-fix:
test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
|
|
Commit 852a171018 (am: let command-line options override saved options,
2015-08-04) redirected a few "git am" invocations from /dev/zero, even
though it did not expect "am" to read the input. This was necessary at
the time because those tests used test_terminal, and as described in
18d8c26930 (test_terminal: redirect child process' stdin to a pty,
2015-08-04):
Note that due to the way the code is structured, the child's stdin
pseudo-tty will be closed when we finish reading from our stdin. This
means that in the common case, where our stdin is attached to /dev/null,
the child's stdin pseudo-tty will be closed immediately. Some operations
like isatty(), which git-am uses, require the file descriptor to be
open, and hence if the success of the command depends on such functions,
test_terminal's stdin should be redirected to a source with large amount
of data to ensure that the child's stdin is not closed, e.g.
test_terminal git am --3way </dev/zero
But we later dropped the use of test_terminal in 53ce2e3f0a (am: add
explicit "--retry" option, 2024-06-06). That commit dropped one of the
redirections from /dev/zero but not the other.
In theory the remaining one should not cause any problems, but it turns
out that at least one platform (NonStop) does not have /dev/zero at all.
We never noticed before because it also did not pass the TTY prereq,
meaning these tests were not run at all there until 53ce2e3f0a.
So let's drop the useless /dev/zero mention. There are others in the
test suite, but they are run only for tests marked with EXPENSIVE (so
not typically run by default).
Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The http transport can now be told to send requests with
authentication material without first getting a 401 response.
* bc/http-proactive-auth:
http: allow authenticating proactively
|
|
A new warning message is issued when a command has to expand a
sparse index to handle working tree cruft that is outside of the
sparse checkout.
* ds/advice-sparse-index-expansion:
advice: warn when sparse index expands
|
|
Address-looking strings found in trailers are now placed on the
Cc: list after running through sanitize_address by "git send-email".
* cb/send-email-sanitize-trailer-addresses:
git-send-email: use sanitized address when reading mbox body
|
|
The "ort" merge backend saw one bugfix for a crash that happens
when the inner merge gets killed, and assorted code clean-ups.
* en/ort-inner-merge-error-fix:
merge-ort: fix missing early return
merge-ort: convert more error() cases to path_msg()
merge-ort: upon merge abort, only show messages causing the abort
merge-ort: loosen commented requirements
merge-ort: clearer propagation of failure-to-function from merge_submodule
merge-ort: fix type of local 'clean' var in handle_content_merge()
merge-ort: maintain expected invariant for priv member
merge-ort: extract handling of priv member into reusable function
|
|
A test in the reftable library has been rewritten using the unit test
framework.
* cp/unit-test-reftable-record:
t-reftable-record: add tests for reftable_log_record_compare_key()
t-reftable-record: add tests for reftable_ref_record_compare_name()
t-reftable-record: add index tests for reftable_record_is_deletion()
t-reftable-record: add obj tests for reftable_record_is_deletion()
t-reftable-record: add log tests for reftable_record_is_deletion()
t-reftable-record: add ref tests for reftable_record_is_deletion()
t-reftable-record: add comparison tests for obj records
t-reftable-record: add comparison tests for index records
t-reftable-record: add comparison tests for ref records
t-reftable-record: add reftable_record_cmp() tests for log records
t: move reftable/record_test.c to the unit testing framework
|
|
"git push" that pushes only deletion gave an unnecessary and
harmless error message when push negotiation is configured, which
has been corrected.
* jc/disable-push-nego-for-deletion:
push: avoid showing false negotiation errors
|
|
The test suite has been taught not to rely unnecessarily on DNS
failing to resolve a bogus external name.
* jk/tests-without-dns:
t/lib-bundle-uri: use local fake bundle URLs
t5551: do not confirm that bogus url cannot be used
t5553: use local url for invalid fetch
|
|
An existing test of the oidmap API has been rewritten with the
unit-test framework.
* gt/unit-test-oidmap:
t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
|
|
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any change ("git describe --dirty" correctly did
so), which has been corrected.
* as/describe-broken-refresh-index-fix:
describe: refresh the index when 'broken' flag is used
|
|
A test that no longer leaks has been marked as such.
* rj/t0613-no-longer-leaks:
t0613: mark as leak-free
|
|
A test that no longer leaks has been marked as such.
* rj/t0612-no-longer-leaks:
t0612: mark as leak-free
|
|
On Windows, Unix-like paths like `/bin/sh` make very little sense. In
the best case they simply don't work; in the worst case they are
misinterpreted as absolute paths that are relative to the drive
associated with the current directory.
To that end, Git does not actually use the path `/bin/sh` that is
recorded e.g. when `run_command()` is called with a Unix shell
command-line. Instead, as of 776297548e (Do not use SHELL_PATH from
build system in prepare_shell_cmd on Windows, 2012-04-17), it
re-interprets `/bin/sh` as "look up `sh` on the `PATH` and use the
result instead".
This is the logic users expect to be followed when running `git var
GIT_SHELL_PATH`.
However, when 1e65721227 (var: add support for listing the shell,
2023-06-27) introduced support for `git var GIT_SHELL_PATH`, Windows was
not special-cased as above, which is why it outputs `/bin/sh` even
though that disagrees with what Git actually uses.
Let's fix this by using the exact same logic as `prepare_shell_cmd()`,
adjusting the Windows-specific `git var GIT_SHELL_PATH` test case to
verify that it actually finds a working executable.
Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When an end-user runs "git push" with an empty string for the remote
repository name, e.g.
$ git push '' main
"git push" fails with a BUG(). Even though this is a nonsense request
that we want to fail, we shouldn't hit a BUG(). Instead we want to give
a sensible error message, e.g., "fatal: bad repository ''".
This is because since 9badf97c42 (remote: allow resetting url list,
2024-06-14), we reset the remote URL if the provided URL is empty. When
a user of 'remotes_remote_get' tries to fetch a remote with an empty
repo name, the function initializes the remote via 'make_remote'. But
the remote is still not a valid remote, since the URL is empty, so it
tries to add the URL alias using 'add_url_alias'. This in turn will call
'add_url', but since the URL is empty we call 'strvec_clear' on the
`remote->url`. Back in 'remotes_remote_get', we again check if the
remote is valid, which fails, so we return 'NULL' for the 'struct
remote *' value.
The 'builtin/push.c' code calls 'set_refspecs' before validating the
remote. This worked with empty repo names earlier since we would get a
remote, albeit with an empty URL. With the new changes, we get a 'NULL'
remote value, this causes the check for remote to fail and raises the
BUG in 'set_refspecs'.
Do a simple fix by doing remote validation first. Also add a test to
validate the bug fix. With this, we can also now directly pass the remote
to 'set_refspecs' instead of having it lazily obtain the remote itself.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As we currently describe in t/README, it can happen that:
Some tests run "git" (or "test-tool" etc.) without properly checking
the exit code, or git will invoke itself and fail to ferry the
abort() exit code to the original caller.
Therefore, GIT_TEST_SANITIZE_LEAK_LOG=true needs to be set to
capture all memory leaks triggered by our tests.
It seems unnecessary to force users to remember this option, as
forgetting it could lead to missed memory leaks.
We could solve the problem by making it "true" by default, but that
might suggest we think "false" makes sense, which isn't the case.
Therefore, the best approach is to remove the option entirely while
maintaining the capability to detect memory leaks in blind spots of our
tests.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The ".expect" files in t/chainlint/ are snippets of expected output from
the chainlint script, and do not necessarily conform to our usual code
style. Especially with the recent change to retain line numbers, blank
lines in the input script end up with trailing whitespace as we print
"3 " for line 3, for example. The point of these files is to match the
output verbatim, so let's not complain about the trailing spaces.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The t1404 script checks a lot of output from Git which contains single
quotes. Because the test snippets are themselves wrapped in the same
single-quotes, we have to resort to using $SQ to match them. This is
error-prone and makes the tests harder to read.
Instead, let's use the new here-doc feature added in the previous
commit, which lets us write anything in the test body we want (except
the here-doc end marker on a line by itself, of course).
Note that we do use "\" in our marker to avoid interpolation (which is
the whole point). But we don't use "<<-", as we want to preserve
whitespace in the snippet (and running with "-v" before and after shows
that we produce the exact same output, except with the ugly $SQ
references fixed).
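For illustration, a hedged sketch of the converted shape (the test name and body here are illustrative, not copied from t1404):
test_expect_success 'deletion reports quoted branch name' - <<\EOT
	test_must_fail git branch -d nonexistent 2>err &&
	grep "error: branch 'nonexistent' not found" err
EOT
Note how the single quotes inside grep's pattern need no $SQ workaround.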
I just converted every test here, even though only some of them use
$SQ. But it would be equally correct to mix-and-match styles if we don't
mind the inconsistency.
I've also converted a few tests in t0600 which were moved from t1404 (I
had written this patch before they were moved, but it seemed worth
porting over the changes rather than losing them).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Most test snippets are wrapped in single quotes, like:
test_expect_success 'some description' '
do_something
'
This sometimes makes the snippets awkward to write, because you can't
easily use single quotes within them. We sometimes work around this with
$SQ, or by loosening regexes to use "." instead of a literal quote, or
by using double quotes when we'd prefer to use single-quotes (and just
adding extra backslash-escapes to avoid interpolation).
This commit adds another option: feeding the snippet via the function's
stdin. This doesn't conflict with anything the snippet would want to do,
because we always redirect its stdin from /dev/null anyway (which we'll
continue to do).
A few notes on the implementation:
- it would be nice to push this down into test_run_, but we can't, as
test_expect_success and test_expect_failure want to see the actual
script content to report it for verbose-mode. A helper function
limits the amount of duplication in those callers here.
- The helper function is a little awkward to call, as you feed it the
name of the variable you want to set. The more natural thing in
shell would be command substitution like:
body=$(body_or_stdin "$2")
but that loses trailing whitespace. There are tricks around this,
like:
body=$(body_or_stdin "$2"; printf .)
body=${body%.}
but we'd prefer to keep such tricks in the helper, not in each
caller.
- I implemented the helper using a sequence of "read" calls. Together
with "-r" and unsetting the IFS, this preserves incoming whitespace.
An alternative is to use "cat" (which then requires the gross "."
trick above). But this saves us a process, which is probably a good
thing. The "read" builtin does use more read() syscalls than
necessary (one per byte), but that is almost certainly a win over a
separate process.
Both are probably slower than passing a single-quoted string, but
the difference is lost in the noise for a script that I converted as
an experiment. (A sketch of this approach follows this list.)
- I handle test_expect_success and test_expect_failure here. If we
like this style, we could easily extend it to other spots (e.g.,
lazy_prereq bodies) on top of this patch.
- even though we are using "local", we have to be careful about our
variable names. Within test_expect_success, any variable we declare
with local will be seen as local by the test snippets themselves (so
it wouldn't persist between tests like normal variables would).
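As promised above, a minimal sketch of the read-based capture, assuming the helper is named body_or_stdin as in the example invocations earlier (the real implementation may differ in its details):
body_or_stdin () {
	local _var _line _body
	_var=$1
	_body=
	while IFS= read -r _line
	do
		_body="$_body$_line
"
	done
	eval "$_var=\$_body"
}
The leading underscores keep the locals out of the way of variables that the test snippets themselves might use, per the last note above.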
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The chainlint.pl script recently learned about the upcoming:
test_expect_success 'some test' - <<\EOT
TEST_BODY
EOT
syntax, where TEST_BODY should be checked in the usual way. Let's make
sure this works by adding a few tests. The "here-doc-body" file tests
the basic syntax, including an embedded here-doc which we should still
be able to recognize.
Likewise the "here-doc-body-indent" checks the same thing, but using the
"<<-" operator. We wouldn't expect this to be used normally, but we
would not want to accidentally miss a body that uses it. The
"pathological" variant checks the opposite: we don't get confused by an
indented tag within the here-doc body.
The "here-doc-double" tests the handling of two here-doc tags on the
same line. This is not something we'd expect anybody to do in practice,
but the code was written defensively to handle this, so let's make sure
it works.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to check tests for semantic problems, chainlint.pl scans test
scripts, looking for tests defined as:
test_expect_success [prereq] title '
body
'
where `body` is a single string which is then treated as a standalone
chunk of code and "linted" to detect semantic issues. (The same happens
for `test_expect_failure` definitions.)
The introduction of test definitions in which the test body is instead
presented via a heredoc rather than as a single string creates a blind
spot in the linting process since such invocations are not recognized by
chainlint.pl.
Prepare for this new style by also recognizing tests defined as:
test_expect_success [prereq] title - <<\EOT
body
EOT
A minor complication is that chainlint.pl has never considered heredoc
bodies significant since it doesn't scan them for semantic problems,
thus it has always simply thrown them away. However, with the new
`test_expect_success` calling sequence, heredoc bodies become
meaningful, thus need to be captured.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While working on chainlint.pl recently, we introduced some bugs that
showed incorrect line numbers in the output. But it was hard to notice,
since we sanitize the output by removing all of the line numbers! It
would be nice to retain these so we can catch any regressions.
The main reason we sanitize is for maintainability: we concatenate all
of the test snippets into a single file, so it's hard for each ".expect"
file to know at which offset its test input will be found. We can handle
that by storing the per-test line numbers in the ".expect" files, and
then dynamically offsetting them as we build the concatenated test and
expect files together.
The changes to the ".expect" files look like tedious boilerplate, but it
actually makes adding new tests easier. You can now just run:
perl chainlint.pl chainlint/foo.test |
tail -n +2 >chainlint/foo.expect
to save the output of the script minus the comment headers (after
checking that it is correct, of course). Whereas before you had to strip
the line numbers. The conversions here were done mechanically using
something like the script above, and then spot-checked manually.
It would be possible to do all of this in shell via the Makefile, but it
gets a bit complicated (and requires a lot of extra processes). Instead,
I've written a short perl script that generates the concatenated files
(we already depend on perl, since chainlint.pl uses it). Incidentally,
this improves a few other things:
- we incorrectly used $(CHAINLINTTMP_SQ) inside a double-quoted
string. So if your test directory required quoting, like:
make "TEST_OUTPUT_DIRECTORY=/tmp/h'orrible"
we'd fail the chainlint tests.
- the shell in the Makefile didn't handle &&-chaining correctly in its
loops (though in practice the "sed" and "cat" invocations are not
likely to fail).
- likewise, the sed invocation to strip numbers was hiding the exit
code of chainlint.pl itself. In practice this isn't a big deal;
since there are linter violations in the test files, we expect it to
exit non-zero. But we could later use exit codes to distinguish
serious errors from expected ones.
- we now use a constant number of processes, instead of scaling with
the number of test scripts. So it should be a little faster (on my
machine, "make check-chainlint" goes from 133ms to 73ms).
There are some alternatives to this approach, but I think this is still
a good intermediate step:
1. We could invoke chainlint.pl individually on each test file, and
compare it to the expected output (and possibly using "make" to
avoid repeating already-done checks). This is a much bigger change
(and we'd have to figure out what to do with the "# LINT" lines in
the inputs). But in this case we'd still want the "expect" files to
be annotated with line numbers. So most of what's in this patch
would be needed anyway.
2. Likewise, we could run a single chainlint.pl and feed it all of the
scripts (with "--jobs=1" to get deterministic output). But we'd
still need to annotate the scripts as we did here, and we'd still
need to either assemble the "expect" file, or break apart the
script output to compare to each individual ".expect" file.
So we may pursue those in the long run, but this patch gives us more
robust tests without too much extra work or moving in a useless
direction.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The lexer in chainlint.pl can't handle CRLF line endings; it complains
about an internal error in scan_token() if we see one. For example, in
our Windows CI environment:
$ perl chainlint.pl chainlint/for-loop.test | cat -v
Thread 2 terminated abnormally: internal error scanning character '^M'
This doesn't break "make check-chainlint" (yet), because we assemble a
concatenated input by passing the contents of each file through "sed".
And the "sed" we use will strip out the CRLFs. But the next patch is
going to rework this a bit, which does break check-chainlint on Windows.
Plus it's probably nicer to folks on Windows who might work on chainlint
itself and write new tests.
In theory we could fix the parser to handle this, but it's not really
worth the trouble. We should be able to ask the input layer to translate
the line endings for us. In fact, I'd expect this to happen by default,
as perl's documentation claims Win32 uses the ":unix:crlf" PERLIO layer
by default ("unix" here just refers to using read/write syscalls, and
then "crlf" layers the translation on top). However, this doesn't seem
to be the case in our Windows CI environment. I didn't dig into the
exact reason, but it is perhaps because we are using an msys build of
perl rather than a "true" Win32 build.
At any rate, it is easy-ish to just ask explicitly for the conversion.
In the above example, setting PERLIO=crlf in the environment is enough
to make it work. Curiously, though, this doesn't work when invoking
chainlint via "make". Again, I didn't dig into it, but it may have to do
with msys programs calling Windows programs or vice versa.
We can make it work consistently by just explicitly asking for CRLF
translation when we open the files. This will even work on non-Windows
platforms, though we wouldn't really expect to find CRLF files there.
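In code, that amounts to requesting the layer explicitly at open time; a sketch (the ":unix:crlf" layer string follows the discussion above):
open(my $fh, '<:unix:crlf', $path)
	or die "cannot open $path: $!";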
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The chainlint.pl script spawns worker threads to check many scripts in
parallel. This is good if you feed it a lot of scripts. But if you give
it few (or one), then the overhead of spawning the threads dominates. We
can easily notice that we have fewer scripts than threads and scale back
as appropriate.
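The check itself can be as simple as this sketch (variable names assumed, not necessarily the script's actual ones):
my $jobs = $opts{jobs};
$jobs = @scripts if @scripts < $jobs; # no point spawning idle workers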
This patch reduces the time to run:
time for i in chainlint/*.test; do
perl chainlint.pl $i
done >/dev/null
on my system from ~4.1s to ~1.1s, where I have 8+8 cores.
As with the previous patch, this isn't the usual way we run chainlint
(we feed many scripts at once, which is why it supports threading in the
first place). So this won't make a big difference in the real world, but
it may help us out in the future, and it makes experimenting with and
debugging the chainlint tests a bit more pleasant.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If the system supports threads, chainlint.pl will always spawn worker
threads to do the real work. But when --jobs=1, this is pointless, since
we could just do the work in the main thread. And spawning even a single
thread has a high overhead. For example, on my Linux system, running:
for i in chainlint/*.test; do
perl chainlint.pl --jobs=1 $i
done >/dev/null
takes ~1.7s without this patch, and ~1.1s after. We don't usually spawn
a bunch of individual chainlint.pl processes (instead we feed several
scripts at once, and the parallelism outweighs the setup cost). But it's
something we've considered doing, and since we already have fallback
code for systems without thread support, it's pretty easy to make this
work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The chainlint tests are a series of individual files, each holding a
test body. The "make check-chainlint" target assembles them into a
single file, adding a "test_expect_success" function call around each.
Let's instead include that function call in the files themselves. This
is a little more boilerplate, but has several advantages:
1. You can now run chainlint manually on snippets with just "perl
chainlint.pl chainlint/foo.test". This can make developing and
debugging a little easier.
2. Many of the tests implicitly relied on the syntax of the lines
added by the Makefile (in particular the use of single-quotes).
This assumption is much easier to see when the single-quotes are
alongside the test body.
3. We had no way to test how the chainlint program handled
various test_expect_success lines themselves. Now we'll be able to
check variations.
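After this change a chainlint/*.test file might look like the following hedged sketch (the real snippets exercise specific lint rules and carry "# LINT" annotations):
test_expect_success 'self-contained snippet' '
	for i in a b c
	do
		echo $i
	done
'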
The change to the .test files was done mechanically, using the same
test names they would have been assigned by the Makefile (this is
important to match the expected output). The Makefile has the minimal
change to drop the extra lines; there are more cleanups possible but a
future patch in this series will rewrite this substantially anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
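Based on the description here, the configuration might look like this (treat the exact set of accepted values as an assumption; "auto" mode is discussed below):
[http]
	proactiveAuth = auto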
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When the bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.
* xx/bundle-uri-fixes:
unbundle: extend object verification for fetches
fetch-pack: expose fsckObjects configuration logic
bundle-uri: verify oid before writing refs
|
|
More memory leaks have been plugged.
* ps/leakfixes-more: (29 commits)
builtin/blame: fix leaking ignore revs files
builtin/blame: fix leaking prefixed paths
blame: fix leaking data for blame scoreboards
line-range: plug leaking find functions
merge: fix leaking merge bases
builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
sequencer: fix memory leaks in `make_script_with_merges()`
builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
apply: fix leaking string in `match_fragment()`
sequencer: fix leaking string buffer in `commit_staged_changes()`
commit: fix leaking parents when calling `commit_tree_extended()`
config: fix leaking "core.notesref" variable
rerere: fix various trivial leaks
builtin/stash: fix leak in `show_stash()`
revision: free diff options
builtin/log: fix leaking commit list in git-cherry(1)
merge-recursive: fix memory leak when finalizing merge
builtin/merge-recursive: fix leaking object ID bases
builtin/difftool: plug memory leaks in `run_dir_diff()`
object-name: free leaking object contexts
...
|
|
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* tb/path-filter-fix:
bloom: introduce `deinit_bloom_filters()`
commit-graph: reuse existing Bloom filters where possible
object.h: fix mis-aligned flag bits table
commit-graph: new Bloom filter version that fixes murmur3
commit-graph: unconditionally load Bloom filters
bloom: prepare to discard incompatible Bloom filters
bloom: annotate filters with hash version
repo-settings: introduce commitgraph.changedPathsVersion
t4216: test changed path filters with high bit paths
t/helper/test-read-graph: implement `bloom-filters` mode
bloom.h: make `load_bloom_filter_from_graph()` public
t/helper/test-read-graph.c: extract `dump_graph_info()`
gitformat-commit-graph: describe version 2 of BDAT
commit-graph: ensure Bloom filters are read with consistent settings
revision.c: consult Bloom filters for root commits
t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
|
|
Date parser updates to be more careful about underflowing epoch-based
timestamps.
* db/date-underflow-fix:
date: detect underflow/overflow when parsing dates with timezone offset
t0006: simplify prerequisites
|
|
When GIT_PAGER failed to spawn, depending on the code path taken,
we failed immediately (correct) or just spewed the payload to the
standard output (incorrect). The code now always fails immediately
when GIT_PAGER fails.
* rj/pager-die-upon-exec-failure:
pager: die when paging to non-existing command
|
|
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.
When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:
1. Users may not have a full understanding of which files are inside or
outside of their sparse-checkout. This is more common in monorepos
that manage the sparse-checkout using custom tools that map build
dependencies into sparse-checkout definitions.
2. In some cases, an empty directory could exist outside the
sparse-checkout and these empty directories are not reported by 'git
status' and friends.
3. If the user has '.gitignore' or 'exclude' files, then 'git status'
will squelch the warnings and not demonstrate any problems.
In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.
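Users who find the hint noisy can turn it off like any other advice message; assuming the advice key is named after this topic, something like:
git config advice.sparseIndexExpanded false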
The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.
The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.
Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration, also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.
In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiation-tip" parameters,
which stops the negotiate-only fetch from being run. That by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.
Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with. One fewer process spawned, one
fewer "alarming" message given the user.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git diff --no-ext-diff" when diff.external is configured ignored
the "--color-moved" option.
* rs/diff-color-moved-w-no-ext-diff-fix:
diff: allow --color-moved with --no-ext-diff
|
|
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.
* jk/remote-wo-url:
remote: drop checks for zero-url case
remote: always require at least one url in a remote
t5801: test remote.*.vcs config
t5801: make remote-testgit GIT_DIR setup more robust
remote: allow resetting url list
config: document remote.*.url/pushurl interaction
remote: simplify url/pushurl selection
remote: use strvecs to store remote url/pushurl
remote: transfer ownership of memory in add_url(), etc
remote: refactor alias_url() memory ownership
archive: fix check for missing url
|
|
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
|
|
The output from "git cat-file --batch-check" and "--batch-command
(info)" should not be unbuffered, for which some tests have been
added.
* ew/cat-file-unbuffered-tests:
t1006: ensure cat-file info isn't buffered by default
Git.pm: use array in command_bidi_pipe example
|
|
reftable_log_record_compare_key() is a function defined by
reftable/record.{c, h} and is used to compare the keys of two
log records when sorting multiple log records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable_ref_record_compare_name() is a function defined by
reftable/record.{c, h} and is used to compare the refname of two
ref records when sorting multiple ref records using 'qsort'.
In the current testing setup, this function is left unexercised.
Add a testing function for the same.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for index records.
Add tests for this function in the case of index records.
Note that since index records cannot be of type deletion, this function
must always return '0' when called on an index record.
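A hedged sketch of such a check in the unit-test style this series targets (field, constant, and macro names follow my reading of the reftable and unit-test headers):
struct reftable_record rec = {
	.type = BLOCK_TYPE_INDEX,
	.u.idx = {
		.last_key = STRBUF_INIT,
		.offset = 42,
	},
};
check(!reftable_record_is_deletion(&rec));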
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for two of the four record types (obj, index).
Add tests for this function in the case of obj records.
Note that since obj records cannot be of type deletion, this function
must always return '0' when called on an obj record.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for three of the four record types (log, obj, index).
Add tests for this function in the case of log records.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable_record_is_deletion() is a function defined in
reftable/record.{c, h} that determines whether a record is of
type deletion or not. In the current testing setup, this function
is left untested for all the four record types (ref, log, obj, index).
Add tests for this function in the case of ref records.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the current testing setup for obj records, the comparison
functions for obj records, reftable_obj_record_cmp_void() and
reftable_obj_record_equal_void() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_obj_record_cmp_void() and reftable_obj_record_equal_void()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the current testing setup for index records, the comparison
functions for index records, reftable_index_record_cmp() and
reftable_index_record_equal() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_index_record_cmp() and reftable_index_record_equal()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the current testing setup for ref records, the comparison
functions for ref records, reftable_ref_record_cmp_void() and
reftable_ref_record_equal() are left untested.
Add tests for the same by using the wrapper functions
reftable_record_cmp() and reftable_record_equal() for
reftable_ref_record_cmp_void() and reftable_ref_record_equal()
respectively.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the current testing setup for log records, only
reftable_log_record_equal() among log record's comparison functions
is tested.
Modify the existing tests to exercise reftable_log_record_cmp_void()
(using the wrapper function reftable_record_cmp()) alongside
reftable_log_record_equal().
Note that to achieve this, we'll need to replace instances of
reftable_log_record_equal() with the wrapper function
reftable_record_equal().
Rename the now modified test to reflect its nature of exercising
all comparison operations, not just equality.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
reftable/record_test.c exercises the functions defined in
reftable/record.{c, h}. Migrate reftable/record_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework, and renaming the tests to fit unit-tests' naming scheme.
While at it, change the type of index variable 'i' to 'size_t'
from 'int'. This is because 'i' is used in comparison against
'ARRAY_SIZE(x)' which is of type 'size_t'.
Also, use set_hash() which is defined locally in the test file
instead of set_test_hash() which is defined by
reftable/test_framework.{c, h}. This is fine to do as both these
functions are similarly implemented, and
reftable/test_framework.{c, h} is not #included in the ported test.
Get rid of reftable_record_print() from the tests as well, because
it clutters the test framework's output and we have no way of
verifying the output.
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A quick test tells us that t0612 does not trigger any leak:
$ make SANITIZE=leak test GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true GIT_TEST_OPTS=-i T=t0612-reftable-jgit-compatibility.sh
[...]
*** t0612-reftable-jgit-compatibility.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - CGit repository can be read by JGit
ok 2 - JGit repository can be read by CGit
ok 3 - mixed writes from JGit and CGit
ok 4 - JGit can read multi-level index
# passed all 4 test(s)
1..4
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:75: t0612-reftable-jgit-compatibility.sh] Error 1
Let's mark it as leak-free to silence the machinery activated by
`GIT_TEST_PASSING_SANITIZE_LEAK=check`.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When a test that leaks runs with GIT_TEST_SANITIZE_LEAK_LOG=true,
the test returns zero, which is not what we want.
In the if-else's chain we have in "check_test_results_san_file_", we
consider three variables: $passes_sanitize_leak, $sanitize_leak_check
and, implicitly, GIT_TEST_SANITIZE_LEAK_LOG (always set to "true" at
that point).
For the first two variables we have different considerations depending
on the value of $test_failure, which makes sense. However, for the
third, GIT_TEST_SANITIZE_LEAK_LOG, we don't; regardless of
$test_failure, we use "invert_exit_code=t" to produce a non-zero
return value.
That assumes "$test_failure" is always zero at that point. But it
may not be:
$ git checkout v2.40.1
$ make test SANITIZE=leak T=t3200-branch.sh # this fails
$ make test SANITIZE=leak GIT_TEST_SANITIZE_LEAK_LOG=true T=t3200-branch.sh # this succeeds
[...]
With GIT_TEST_SANITIZE_LEAK_LOG=true, our logs revealed a memory leak, exiting with a non-zero status!
# faked up failures as TODO & now exiting with 0 due to --invert-exit-code
We need to use "invert_exit_code=t" only when "$test_failure" is zero.
Let's add the missing conditions in the if-else chain to make it work
as expected.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We can mark t0613 as leak-free:
$ make test SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_SANITIZE_LEAK_LOG=true T=t0613-reftable-write-options.sh
[...]
*** t0613-reftable-write-options.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - default write options
ok 2 - disabled reflog writes no log blocks
ok 3 - many refs results in multiple blocks
ok 4 - tiny block size leads to error
ok 5 - small block size leads to multiple ref blocks
ok 6 - small block size fails with large reflog message
ok 7 - block size exceeding maximum supported size
ok 8 - restart interval at every single record
ok 9 - restart interval exceeding maximum supported interval
ok 10 - object index gets written by default with ref index
ok 11 - object index can be disabled
# passed all 11 test(s)
1..11
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:75: t0613-reftable-write-options.sh] Error 1
Do it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Addresses that are mentioned on the trailers in the commit log
messages (e.g., "Reviewed-by") are added to the "Cc:" list by "git
send-email". These hand-written addresses, however, may be
malformed (e.g., having unquoted "." and other punctuation marks in
the display-name part) and can upset the MTA.
The code does use the sanitize_address() helper on these
address-looking strings to turn them into valid addresses, but it is
used only to see if the address should be suppressed. The original
string taken from the message is added to the @cc list if the code
decides the address is not suppressed.
Because the addresses on trailer lines are hand-written and more
likely to contain malformed addresses, when adding to the @cc list,
use the result from sanitize_address, not the original. Note that
we do not modify the behaviour for addresses taken from the e-mail
headers, as they are more likely to be machine generated and
well-formed.
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix for a progress bar.
* ds/ahead-behind-fix:
commit-graph: increment progress indicator
|
|
Overly large ".gitignore" files are now rejected silently.
* jk/cap-exclude-file-size:
dir.c: reduce max pattern file size to 100MB
dir.c: skip .gitignore, etc larger than INT_MAX
|
|
The safe.directory configuration knob has been updated to
optionally allow leading path matches.
* jc/safe-directory-leading-path:
safe.directory: allow "lead/ing/path/*" match
|
|
"git init" in an already created directory, when the user
configuration has includeIf.onbranch, started to fail recently,
which has been corrected.
* ps/fix-reinit-includeif-onbranch:
setup: fix bug with "includeIf.onbranch" when initializing dir
|
|
The chainlint script (invoked during "make test") did nothing when
it failed to detect the number of available CPUs. It now falls
back to 1 CPU to avoid the problem.
* es/chainlint-ncores-fix:
chainlint.pl: latch CPU count directly reported by /proc/cpuinfo
chainlint.pl: fix incorrect CPU count on Linux SPARC
chainlint.pl: make CPU count computation more robust
|
|
Test fix.
* mt/t0211-typofix:
t/t0211-trace2-perf.sh: fix typo patern -> pattern
|
|
Scalar fix.
* ds/scalar-reconfigure-all-fix:
scalar: avoid segfault in reconfigure --all
|
|
The maximum size of attribute files is enforced more consistently.
* tb/attr-limits:
attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
|
|
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.
* bc/zsh-compatibility:
vimdiff: make script and tests work with zsh
t4046: avoid continue in &&-chain for zsh
|
|
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out. Now it keeps going.
* js/for-each-repo-keep-going:
maintenance: running maintenance should not stop on errors
for-each-repo: optionally keep going on an error
|
|
"git stash -S" did not handle binary files correctly, which has
been corrected.
* aj/stash-staged-fix:
stash: fix "--staged" with binary files
|
|
The procedure to build the multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.
* xx/disable-replace-when-building-midx:
midx: disable replace objects
|
|
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.
* pw/rebase-m-signoff-fix:
rebase -m: fix --signoff with conflicts
sequencer: store commit message in private context
sequencer: move current fixups to private context
sequencer: start removing private fields from public API
sequencer: always free "struct replay_opts"
|
|
"git fetch-pack -k -k" without passing "--lock-pack" (which we
never do ourselves) did not work at all, which has been corrected.
* jk/fetch-pack-fsck-wo-lock-pack:
fetch-pack: fix segfault when fscking without --lock-pack
|
|
A helper function shared between two tests had a copy-paste bug,
which has been corrected.
* jk/t5500-typofix:
t5500: fix mistaken $SERVER reference in helper function
|
|
When "git merge" sees that the index cannot be refreshed (e.g. due
to another process doing the same in the background), it died, but
only after writing MERGE_HEAD etc. files, which was useless for the
purpose of recovering from the failure.
* kz/merge-fail-early-upon-refresh-failure:
merge: avoid write merge state when unable to write index
|
|
A few of the bundle URI tests point config at a fake bundle; they care
only that the client has been configured with _some_ bundle, but it
doesn't have to actually contain objects.
For the file:// tests, we use "$BUNDLE_URI_REPO_URI/fake.bdl", a
non-existent file inside the actual remote repo. But for git:// and
http:// tests, we use "https://example.com/fake.bdl". This works OK in
practice, but it means we actually make a request to example.com (which
returns a placeholder HTML response). That can be annoying when running
the test suite on a spotty network (it doesn't produce a wrong result,
since we expect it to fail, but it may introduce delays).
We can reduce our dependency on the outside world by using a local URL.
It would work to just do "file://$PWD/fake.bdl" here, since the bundle
code does not care about the actual location. But in the long run I
suspect we may have more restrictions on which protocols can be passed
around as bundle URIs. So instead, let's stick with the file:// repo's
pattern and just point to a bogus name based on the remote repo's URL.
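A hedged sketch of what that looks like (clone invocations simplified; $HTTPD_URL and $GIT_DAEMON_URL are the test libraries' local server URLs):
# http://: a missing file on the local test server
git clone --bundle-uri="$HTTPD_URL/fake.bdl" "$HTTPD_URL/smart/repo.git" clone
# git://: likewise, relative to the local daemon
git clone --bundle-uri="$GIT_DAEMON_URL/fake.bdl" "$GIT_DAEMON_URL/repo.git" clone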
For http this makes perfect sense; we'll make a request to the local
http server and find that there's nothing there. For git:// it's a
little weird, as you wouldn't normally access a bundle file over git://
at all. But it's probably the most reasonable guess we can make for now,
and anybody who tightens protocol selection later will know better
what's the best path forward.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
t5551 tries to access a URL with a bogus hostname and confirms that
http.curloptResolve lets us use this otherwise unresolvable name.
Before doing so, though, we confirm that trying to access the bogus
hostname without http.curloptResolve fails as expected. This isn't
testing Git at all, but is confirming the test's assumptions. That's
often a good thing to do, but in this case it means that we'll actually
try to resolve the external name. Even though it's unlikely that
"gitbogusexamplehost.invalid" would ever resolve, the DNS lookup itself
may take time.
It's probably reasonable to just assume that this obviously-bogus name
would not actually resolve in practice, which lets us reduce our test
suite's dependency on the outside world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We test how "fetch --set-upstream" behaves when given an invalid URL,
using the bogus URL "http://nosuchdomain.example.com". But finding out
that it is invalid requires an actual DNS lookup.
Reduce our dependency on external factors by using an invalid local
filesystem URL, which works just as well for our purposes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When describe is run with the 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty. However, this is
not done for the codepath where the 'broken' flag is used.
This causes `git describe --broken --dirty` to false
positively report the worktree being dirty if a file has
different stat info than what is recorded in the index.
Running `git update-index -q --refresh` to refresh the index
before running diff-index fixes the problem.
Also add tests to deliberately update stat info of a
file before running describe to verify it behaves correctly.
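A rough illustration of the failure mode (hypothetical commands, not
the exact test):
    $ touch tracked-file            # change stat info, not content
    $ git describe --broken --dirty # pre-fix: wrongly appends "-dirty"
With the fix, describe refreshes the index first and reports the
worktree as clean.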
Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Overriding the date of a commit to be close to "1970-01-01 00:00:00"
with a large enough positive timezone offset for the equivalent GMT
time to be before the epoch is considered valid by `parse_date_basic`.
Similar behaviour occurs when using a date close to "2099-12-31
23:59:59" (the maximum date allowed by `tm_to_time_t`) with a large
enough negative timezone offset.
This leads to an integer underflow or overflow, respectively, in the
commit timestamp, which is not caught by `git-commit`, but will cause
other services to fail, such as `git-fsck`, which, for the first case,
reports "badDateOverflow: invalid author/committer line - date causes
integer overflow".
Instead check the timezone offset and fail if the resulting time comes
before the epoch "1970-01-01T00:00:00Z" or after the maximum date
"2099-12-31T23:59:59Z".
Using the REQUIRE_64BIT_TIME prerequisite, make sure that the tests
near the end of Git time (aka the end of year 2099) are not attempted
on purely 32-bit systems, as they cannot express timestamps beyond
2038 anyway.
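To illustrate the kind of input that should now be rejected (a
hypothetical invocation; the exact failure mode depends on how the
caller reacts to the parse failure):
    $ test_must_fail env GIT_COMMITTER_DATE='1970-01-01T00:00:00 +0100' \
        git commit --allow-empty -m before-epoch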
Signed-off-by: Darcy Burke <acednes@gmail.com>
[jc: fixups for 32-bit platforms]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The system must support 64-bit time and its time_t must be 64-bit
wide to pass these tests. Combine these two prerequisites together
to simplify the tests. In theory, they could be fulfilled
independently and tests could require only one without the other,
but in practice, these must come hand-in-hand.
Update the "check_parse" test helper to pay attention to the
REQUIRE_64BIT_TIME variable, which can be set to the HAVE_64BIT_TIME
prerequisite so that a parse test can be skipped on 32-bit systems.
This will be used in the next step to skip tests for timestamps near
the end of year 2099, as 32-bit systems will not be able to express
a timestamp beyond 2038 anyway.
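A sketch of how a guarded call might look in the test script
(illustrative values):
    REQUIRE_64BIT_TIME=HAVE_64BIT_TIME
    check_parse '2099-12-31 23:59:59 +0000' '2099-12-31 23:59:59 +0000'
    REQUIRE_64BIT_TIME=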
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In an earlier commit, a bug was described where it's possible for Git to
produce non-murmur3 hashes when the platform's "char" type is signed,
and there are paths with characters whose highest bit is set (i.e. all
characters >= 0x80).
That patch allows the caller to control which version of Bloom filters
are read and written. However, even on platforms with a signed "char"
type, it is possible to reuse existing Bloom filters if and only if
there are no changed paths in any commit's first parent tree-diff whose
characters have their highest bit set.
When this is the case, we can reuse the existing filter without having
to compute a new one. This is done by marking trees which are known to
have (or not have) any such paths. When a commit's root tree is verified
to not have any such paths, we mark it as such and declare that the
commit's Bloom filter is reusable.
Note that this heuristic only goes in one direction. If neither a commit
nor its first parent have any paths in their trees with non-ASCII
characters, then we know for certain that a path with non-ASCII
characters will not appear in a tree-diff against that commit's first
parent. The reverse isn't necessarily true: just because the tree-diff
doesn't contain any such paths does not imply that no such paths exist
in either tree.
So we end up recomputing some Bloom filters that we don't strictly have
to (i.e. their bits are the same no matter which version of murmur3 we
use). But culling these out is impossible, since we'd have to perform
the full tree-diff, which is the same effort as computing the Bloom
filter from scratch.
But because we can cache our results in each tree's flag bits, we can
often avoid recomputing many filters, thereby reducing the time it takes
to run
$ git commit-graph write --changed-paths --reachable
when upgrading from v1 to v2 Bloom filters.
To benchmark this, let's generate a commit-graph in linux.git with v1
changed-paths in generation order[^1]:
$ git clone git@github.com:torvalds/linux.git
$ cd linux
$ git commit-graph write --reachable --changed-paths
$ graph=".git/objects/info/commit-graph"
$ mv $graph{,.bak}
Then let's time how long it takes to go from v1 to v2 filters (with and
without the upgrade path enabled), resetting the state of the
commit-graph each time:
$ git config commitGraph.changedPathsVersion 2
$ hyperfine -p 'cp -f $graph.bak $graph' -L v 0,1 \
'GIT_TEST_UPGRADE_BLOOM_FILTERS={v} git.compile commit-graph write --reachable --changed-paths'
On linux.git (where there aren't any non-ASCII paths), the timings
indicate that this patch represents a speed-up over recomputing all
Bloom filters from scratch:
Benchmark 1: GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths
  Time (mean ± σ):     124.873 s ±  0.316 s    [User: 124.081 s, System: 0.643 s]
  Range (min … max):   124.621 s … 125.227 s    3 runs
Benchmark 2: GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths
  Time (mean ± σ):     79.271 s ±  0.163 s     [User: 74.611 s, System: 4.521 s]
  Range (min … max):   79.112 s … 79.437 s     3 runs
Summary
  'GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths' ran
    1.58 ± 0.01 times faster than 'GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths'
On git.git, we do have some non-ASCII paths, giving us a more modest
improvement from 4.163 seconds to 3.348 seconds, for a 1.24x speed-up.
On my machine, the stats for git.git are:
- 8,285 Bloom filters computed from scratch
- 10 Bloom filters generated as empty
- 4 Bloom filters generated as truncated due to too many changed paths
- 65,114 Bloom filters were reused when transitioning from v1 to v2.
[^1]: Note that this is important, since `--stdin-packs` or
`--stdin-commits` orders commits in the commit-graph by their pack
position (with `--stdin-packs`) or by their order in the raw input
(with `--stdin-commits`).
Since we compute Bloom filters in the same order that commits appear
in the graph, we must see a commit's (first) parent before we process
the commit itself. This is only guaranteed to happen when sorting
commits by their generation number.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The murmur3 implementation in bloom.c has a bug when converting series
of 4 bytes into network-order integers when char is signed (which is
controllable by a compiler option, and the default signedness of char is
platform-specific). When a string contains characters with the high bit
set, this bug causes results that, although internally consistent within
Git, do not accord with other implementations of murmur3 (thus,
the changed path filters wouldn't be readable by other off-the-shelf
implementations of murmur3), nor even with Git binaries that were
compiled with a different signedness of char. This bug affects both how Git writes
changed path filters to disk and how Git interprets changed path filters
on disk.
Therefore, introduce a new version (2) of changed path filters that
corrects this problem. The existing version (1) is still supported and
is still the default, but users should migrate away from it as soon
as possible.
Because this bug only manifests with characters that have the high bit
set, it may be possible that some (or all) commits in a given repo would
have the same changed path filter both before and after this fix is
applied. However, in order to determine whether this is the case, the
changed paths would first have to be computed, at which point it is not
much more expensive to just compute a new changed path filter.
So this patch does not include any mechanism to "salvage" changed path
filters from repositories. There is also no "mixed" mode - for each
invocation of Git, reading and writing changed path filters are done
with the same version number; this version number may be explicitly
stated (typically if the user knows which version they need) or
automatically determined from the version of the existing changed path
filters in the repository.
There is a change in write_commit_graph(). graph_read_bloom_data()
makes it possible for chunk_bloom_data to be non-NULL but
bloom_filter_settings to be NULL, which causes a segfault later on. I
produced such a segfault while developing this patch, but couldn't find
a way to reproduce it either after this patch was complete (or before
it). In any case, it seemed like a good safeguard to include that might help
future patch authors.
The value in t0095 was obtained from another murmur3 implementation
using the following Go source code:
    package main
    import "fmt"
    import "github.com/spaolacci/murmur3"
    func main() {
            fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!")))
            fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}))
    }
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Subsequent commits will teach Git another version of the changed path
filter that has different behavior with paths that contain at least
one character with its high bit set, so test the existing behavior as
a baseline.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Implement a mode of the "read-graph" test helper to dump out the
hexadecimal contents of the Bloom filter(s) contained in a commit-graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Prepare for the 'read-graph' test helper to perform other tasks besides
dumping high-level information about the commit-graph by extracting its
main routine into a separate function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The changed-path Bloom filter mechanism is parameterized by a couple of
variables, notably the number of bits per hash (typically "m" in Bloom
filter literature) and the number of hashes themselves (typically "k").
It is critically important that filters are read with the Bloom filter
settings that they were written with. Failing to do so would mean that
each query is liable to compute different fingerprints, meaning that the
filter itself could return a false negative. This goes against a basic
assumption of using Bloom filters (that they may return false positives,
but never false negatives) and can lead to incorrect results.
We have some existing logic to carry forward existing Bloom filter
settings from one layer to the next. In `write_commit_graph()`, we have
something like:
    if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) {
            struct commit_graph *g = ctx->r->objects->commit_graph;
            /* We have changed-paths already. Keep them in the next graph */
            if (g && g->chunk_bloom_data) {
                    ctx->changed_paths = 1;
                    ctx->bloom_settings = g->bloom_filter_settings;
            }
    }
, which drags forward Bloom filter settings across adjacent layers.
This doesn't quite address all cases, however, since it is possible for
intermediate layers to contain no Bloom filters at all. For example,
suppose we have two layers in a commit-graph chain, say, {G1, G2}. If G1
contains Bloom filters, but G2 doesn't, a new G3 (whose base graph is
G2) may be written with arbitrary Bloom filter settings, because we only
check the immediately adjacent layer's settings for compatibility.
This behavior has existed since the introduction of changed-path Bloom
filters. But in practice, this is not such a big deal, since the only
way up until this point to modify the Bloom filter settings at write
time is with the undocumented environment variables:
- GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY
- GIT_TEST_BLOOM_SETTINGS_NUM_HASHES
- GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS
(it is still possible to tweak MAX_CHANGED_PATHS between layers, but
this does not affect reads, so is allowed to differ across multiple
graph layers).
But in future commits, we will introduce another parameter to change the
hash algorithm used to compute Bloom fingerprints itself. This will be
exposed via a configuration setting, making this foot-gun easier to use.
To prevent this potential issue, validate that all layers of a split
commit-graph have compatible settings with the newest layer which
contains Bloom filters.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Original-test-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The commit-graph stores changed-path Bloom filters which represent the
set of paths included in a tree-level diff between a commit's root tree
and that of its parent.
When a commit has no parents, the tree-diff is computed against that
commit's root tree and the empty tree. In other words, every path in
that commit's tree is stored in the Bloom filter (since they all appear
in the diff).
Consult these filters during pathspec-limited traversals in the function
`rev_same_tree_as_empty()`. Doing so yields a performance improvement
where we can avoid enumerating the full set of paths in a parentless
commit's root tree when we know that the path(s) of interest were not
listed in that commit's changed-path Bloom filter.
Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Original-patch-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The existing implementation of test_bloom_filters_not_used() asserts
that the Bloom filter sub-system has not been initialized at all, by
checking for the absence of any data from it in the trace2 output.
In the following commit, it will become possible to load Bloom filters
without using them (e.g., because the `commitGraph.changedPathsVersion`
introduced later in this series is incompatible with the hash version
with which the commit-graph's Bloom filters were written).
When that is the case, it's possible to initialize the Bloom filter
sub-system while still not using any Bloom filters. Check in that
situation that the data dump from the Bloom sub-system is all zeros,
indicating that no filters were used.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When trying to execute a non-existent program from GIT_PAGER, we display
an error. However, we also send the complete text to the terminal
and return a successful exit code. This can be confusing for the user
and the displayed error can easily be obscured by lengthy output.
For example, here the error message would have scrolled far out of
view after sending 50 MB of text:
$ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
error: cannot run non-existent: No such file or directory
50314363
Let's make the error clear by aborting the process and returning an error
so that the user can easily correct their mistake.
This will be the result of the change:
$ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
error: cannot run non-existent: No such file or directory
fatal: unable to execute pager 'non-existent'
0
The behavior change we're introducing in this commit affects two tests
in t7006, which is a good sign regarding test coverage, and requires us
to adjust them.
The first test is 'git skips paging non-existing command'. This test
comes from f7991f01f2 (t7006: clean up SIGPIPE handling in trace2 tests,
2021-11-21,) where a modification was made to a test that was originally
introduced in c24b7f6736 (pager: test for exit code with and without
SIGPIPE, 2021-02-02). That original test was, IMHO, in the same
direction we're going in this commit.
At any rate, this test obviously needs to be adjusted to check the new
behavior we are introducing. Do it.
The second test being affected is: 'non-existent pager doesnt cause
crash', introduced in f917f57f40 (pager: fix crash when pager program
doesn't exist, 2021-11-24). As its name states, it has the intention of
checking that we don't introduce a regression that produces a crash when
GIT_PAGER points to a nonexistent program.
This test could be considered redundant nowadays, due to us already
having several tests checking implicitly what a non-existent command in
GIT_PAGER produces. However, let's maintain a good belt-and-suspenders
strategy; adapt it to the new world.
Finally, it's worth noting that we are not changing the behavior if the
command specified in GIT_PAGER is a shell command. In such cases, it
is:
$ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
:;non-existent: 1: non-existent: not found
died of signal 13 at t/test-terminal.perl line 33.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git update-server-info" and "git commit-graph --write" have been
updated to use the tempfile API to avoid leaving cruft after
failing.
* tb/commit-graph-use-tempfile:
server-info.c: remove temporary info files on exit
commit-graph.c: remove temporary graph layers on exit
|
|
For over a year, setting the add.interactive.useBuiltin configuration
variable did nothing but give a "this does not do anything"
warning. Finally remove it.
* jc/add-i-retire-usebuiltin-config:
add-i: finally retire add.interactive.useBuiltin
|
|
We forgot to normalize the result of getcwd() to NFC on macOS where
all other paths are normalized, which has been corrected. This still
does not address the case where core.precomposeUnicode configuration
is not defined globally.
* tb/precompose-getcwd:
macOS: ls-files path fails if path of workdir is NFD
|
|
The pseudo-merge reachability bitmap, which helps store the
reachability bitmap more efficiently in a repository with very many
refs, has been added.
* tb/pseudo-merge-reachability-bitmap: (26 commits)
pack-bitmap.c: ensure pseudo-merge offset reads are bounded
Documentation/technical/bitmap-format.txt: add missing position table
t/perf: implement performance tests for pseudo-merge bitmaps
pseudo-merge: implement support for finding existing merges
ewah: `bitmap_equals_ewah()`
pack-bitmap: extra trace2 information
pack-bitmap.c: use pseudo-merges during traversal
t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()`
pack-bitmap: implement test helpers for pseudo-merge
ewah: implement `ewah_bitmap_popcount()`
pseudo-merge: implement support for reading pseudo-merge commits
pack-bitmap.c: read pseudo-merge extension
pseudo-merge: scaffolding for reads
pack-bitmap: extract `read_bitmap()` function
pack-bitmap-write.c: write pseudo-merge table
pseudo-merge: implement support for selecting pseudo-merge commits
config: introduce `git_config_double()`
pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public
pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()`
pack-bitmap-write: support storing pseudo-merge commits
...
|
|
We ignore the option --color-moved if an external diff program is
configured, presumably because its overhead is unnecessary in that case.
Respect the option if we don't actually use the external diff, though.
Reported-by: lolligerhans@gmx.de
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "--heads" option of "ls-remote" and "show-ref" has been been
deprecated; "--branches" replaces "--heads".
* jc/heads-are-branches:
show-ref: introduce --branches and deprecate --heads
ls-remote: introduce --branches and deprecate --heads
refs: call branches branches
|
|
When the user adds an instruction to "pick" a merge commit to a
"git rebase -i" todo list, the error experience is not pleasant.
Such an error is now caught earlier in the process that parses the
todo list.
* pw/rebase-i-error-message:
rebase -i: improve error message when picking merge
rebase -i: pass struct replay_opts to parse_insn_line()
|
|
Fix for a progress bar.
* ds/ahead-behind-fix:
commit-graph: increment progress indicator
|
|
Setting core.abbrev too early, before the repository set-up
(typically in "git clone"), caused a segfault, which has been
corrected.
* ps/abbrev-length-before-setup-fix:
object-name: don't try to abbreviate to lengths greater than hexsz
parse-options-cb: stop clamping "--abbrev=" to hash length
config: fix segfault when parsing "core.abbrev" without repo
|
|
"git format-patch --interdiff" for multi-patch series learned to
turn on cover letters automatically (unless told never to enable
cover letter with "--no-cover-letter" and such).
* rj/format-patch-auto-cover-with-interdiff:
format-patch: assume --cover-letter for diff in multi-patch series
t4014: cleanups in a few tests
|
|
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.
* kn/update-ref-symref:
update-ref: add support for 'symref-update' command
reftable: pick either 'oid' or 'target' for new updates
update-ref: add support for 'symref-create' command
update-ref: add support for 'symref-delete' command
update-ref: add support for 'symref-verify' command
refs: specify error for regular refs with `old_target`
refs: create and use `ref_update_expects_existing_old_ref()`
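For example, a symbolic ref can now be repointed as part of a
transaction; a minimal sketch (ref names illustrative):
    $ printf 'symref-update refs/heads/SYM refs/heads/main\n' |
        git update-ref --stdin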
|
|
"oidtree" tests were rewritten to use the unit test framework.
* gt/unit-test-oidtree:
t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
|
|
Assorted fixes to multi-pack-index code paths.
* tb/multi-pack-reuse-fix:
pack-revindex.c: guard against out-of-bounds pack lookups
pack-bitmap.c: avoid uninitialized `pack_int_id` during reuse
midx-write.c: do not read existing MIDX with `packs_to_include`
|
|
"git diff --exit-code --ext-diff" learned to take the exit status
of the external diff driver into account when deciding the exit
status of the overall "git diff" invocation when configured to do
so.
* rs/diff-exit-code-with-external-diff:
diff: let external diffs report that changes are uninteresting
userdiff: add and use struct external_diff
t4020: test exit code with external diffs
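A minimal sketch of the opt-in, assuming the diff.trustExitCode knob
this topic introduces (program name illustrative):
    $ git -c diff.external=my-ext-diff \
          -c diff.trustExitCode=true \
          diff --exit-code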
|
|
The end of t5500 contains two tests which use a single helper function,
fetch_filter_blob_limit_zero(). It takes a parameter to point to the
path of the server repository, which we store locally as $SERVER. The
first caller uses the relative path "server", while the second points
into the httpd document root.
Commit 07ef3c6604 (fetch test: use more robust test for filtered
objects, 2019-12-23) refactored some lines, but accidentally switched
"$SERVER" to "server" in one spot. That means the second caller is
looking at the server directory from the previous test rather than its
own.
This happens to work out because the "server" directory from the first
test is still hanging around, and the contents of the two are identical.
But it was clearly not the intended behavior, and it would break if
the leftovers from the first test were ever cleaned up.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The fetch-pack internals have multiple options related to creating
".keep" lock-files for the received pack:
- if args.lock_pack is set, then we tell index-pack to create a .keep
file. In the fetch-pack plumbing command, this is triggered by
passing "-k" twice.
- if the caller passes in a pack_lockfiles string list, then we use it
to record the path of the keep-file created by index-pack. We get
that name by reading the stdout of index-pack. In the fetch-pack
command, this is triggered by passing the (undocumented) --lock-pack
option; without it, we pass in a NULL string list.
So it's possible to ask index-pack to create the lock-file (using "-k
-k") but not ask to record it (by avoiding "--lock-pack"). This worked
fine until 5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22), but now it causes a segfault.
Before that commit, if pack_lockfiles was NULL, we wouldn't bother
reading the output from index-pack at all. But since that commit,
index-pack may produce extra output if we asked it to fsck. So even if
nobody cares about the lockfile path, we still need to read it to skip
to the output we do care about.
We correctly check that we didn't get a NULL lockfile path (which can
happen if we did not ask it to create a .keep file at all), but we
missed the case where the lockfile path is not NULL (due to "-k -k") but
the pack_lockfiles string_list is NULL (because nobody passed
"--lock-pack"), and segfault trying to add to the NULL string-list.
We can fix this by skipping the append to the string list when either
the value or the list is NULL. In that case we must also free the
lockfile path to avoid leaking it when it's non-NULL.
Nobody noticed the bug for so long because the transport code used by
"git fetch" always passes in a pack_lockfiles pointer, and remote-curl
(the main user of the fetch-pack plumbing command) always passes
--lock-pack.
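Before the fix, the segfault could be triggered with the plumbing
command alone; a hypothetical reproduction (paths illustrative):
    $ git init --bare dst.git
    $ git -C dst.git fetch-pack -k -k ../src.git refs/heads/main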
Reported-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
merge_submodule() stores errors using path_msg(), whereas other call
sites make use of the error() function. This is inconsistent, and
moving towards path_msg() seems more friendly for libification efforts
since it will allow the caller to determine whether the error messages
need to be printed.
Note that this deferred handling of error messages changes the error
message in a recursive merge from
error: failed to execute internal merge
to
From inner merge: error: failed to execute internal merge
which provides a little more information about the error which may be
useful. Since the recursive merge strategy still only shows the older
error, we had to adjust the new testcase introduced a few commits ago to
just search for the older message somewhere in the output.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The calling convention for the merge machinery is
    One call to init_merge_options()
    One or more calls to merge_incore_[non]recursive()
    One call to merge_finalize()
      (possibly indirectly via merge_switch_to_result())
Both merge_switch_to_result() and merge_finalize() expect
    opt->priv == NULL && result->priv != NULL
which is supposed to be set up by our move_opt_priv_to_result_priv()
function. However, two codepaths dealing with error cases did not
execute this necessary logic, which could result in assertion failures
(or, if assertions were compiled out, could result in segfaults). Fix
the oversight and add a test that would have caught one of these
problems.
While at it, also tighten an existing test for a non-recursive merge
to verify that it fails with appropriate status. Most merge tests in
the testsuite check either for success or conflicts; those testing for
neither are rare and it is good to ensure they support the invariant
assumed by builtin/merge.c in this comment:
    /*
     * The backend exits with 1 when conflicts are
     * left to be resolved, with 2 when it does not
     * handle the given merge at all.
     */
So, explicitly check for the exit status of 2 in these cases.
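In test terms, the tightened check is roughly (a sketch; strategy and
branch name illustrative):
    test_expect_code 2 git merge -s ort side-branch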
Reported-by: Matt Cree <matt.cree@gearset.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The existing fetch.fsckObjects and transfer.fsckObjects configurations
were not fully applied to bundle-involved fetches, including direct
bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
object verification support for unbundle.
This commit extends object verification support in `bundle.c:unbundle`
by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
this option is enabled, we append the `--fsck-objects` flag to
`git-index-pack`.
The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
to enable this option for `bundle.c:unbundle`, specifically in:
- `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
- `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.
This addition ensures a consistent logic for object verification during
fetches. Tests have been added to confirm functionality in the scenarios
mentioned above.
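A sketch of the newly covered path (repository and file names
illustrative):
    $ git bundle create all.bundle --all
    $ git -c transfer.fsckObjects=true clone all.bundle dst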
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using the bundle-uri mechanism with a bundle list containing
multiple interrelated bundles, we encountered a bug where tips from
downloaded bundles were not discovered, thus resulting in rather slow
clones. This was particularly problematic when employing the
"creationTokens" heuristic.
To reproduce this issue, consider a repository with a single branch
"main" pointing to commit "A". Firstly, create a base bundle with:
git bundle create base.bundle main
Then, add a new commit "B" on top of "A", and create an incremental
bundle for "main":
git bundle create incr.bundle A..main
Now, generate a bundle list with the following content:
    [bundle]
        version = 1
        mode = all
        heuristic = creationToken
    [bundle "base"]
        uri = base.bundle
        creationToken = 1
    [bundle "incr"]
        uri = incr.bundle
        creationToken = 2
A fresh clone with the bundle list above should result in a reference
"refs/bundles/main" pointing to "B" in the new repository. However, git
would still download everything from the server, as if it had fetched
nothing locally.
So why the "refs/bundles/main" is not discovered? After some digging I
found that:
1. Bundles in bundle list are downloaded to local files via
`bundle-uri.c:download_bundle_list` or via
`bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
heuristic.
2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which
is called by `bundle-uri.c:unbundle_all_bundles` or called within
`bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
heuristic.
3. To get all prerequisites of the bundle, the bundle header is read
inside `bundle-uri.c:unbundle_from_file` by calling
`bundle.c:read_bundle_header`.
4. Then it calls `bundle.c:unbundle`, which calls
`bundle.c:verify_bundle` to ensure the repository contains all the
prerequisites.
5. `bundle.c:verify_bundle` calls `parse_object`, which eventually
invokes `packfile.c:prepare_packed_git` or
`packfile.c:reprepare_packed_git`, filling
`raw_object_store->packed_git` and setting `packed_git_initialized`.
6. If `bundle.c:unbundle` succeeds, it writes refs via
`refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here
bundle refs which can target arbitrary objects are written to the
repository.
7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions
`fetch-pack.c:mark_complete_and_common_ref` and
`fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to
find local tips for negotiation. The `OBJECT_INFO_QUICK` flag
prevents `packfile.c:reprepare_packed_git` from being called,
resulting in failures to parse OIDs that reside only in the latest
bundle.
In the example above, when unbundling "incr.bundle", "base.pack" is added
to `packed_git` due to prerequisites verification. However, "B" cannot
be found for negotiation because it exists in "incr.pack", which is not
included in `packed_git`.
Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing
bundle refs. When `refs.c:refs_update_ref` is called to write the
corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`.
This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls
`transaction_prepare` of the refs storage backend. For files backend, it
is `files-backend.c:files_transaction_prepare`, and for reftable
backend, it is `reftable-backend.c:reftable_be_transaction_prepare`.
Both functions eventually call `object.c:parse_object`, which can invoke
`packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
that bundle refs point to valid objects and that all tips from bundle
refs are correctly parsed during subsequent negotiations.
A set of negotiation-related tests for cloning with bundle-uri has been
included to demonstrate that downloaded bundles are utilized to
accelerate fetching.
Additionally, another test has been added to show that bundles with
incorrect headers, where refs point to non-existent objects, do not
result in any bundle refs being created in the repository.
Reviewed-by: Karthik Nayak <karthik.188@gmail.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While working on buffering changes to `git cat-file' in a
separate patch, I inadvertently made the output of --batch-check
and the `info' command of --batch-command buffered as if
opt->buffer_output is turned on by default.
Buffering by default breaks some 3rd-party Perl scripts using
cat-file, but this breakage was not detected anywhere in our
test suite. Add a small Perl snippet to test this problem since
(AFAIK) other equivalent ways to test this behavior from Bourne
shell and/or awk would require racy sleeps, non-portable FIFOs
or tedious C code.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Writing the merge state after the index write fails is meaningless and
could potentially cause Git to lose changes.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Building with "-Werror -Wwrite-strings" is now supported.
* ps/no-writable-strings: (27 commits)
config.mak.dev: enable `-Wwrite-strings` warning
builtin/merge: always store allocated strings in `pull_twohead`
builtin/rebase: always store allocated string in `options.strategy`
builtin/rebase: do not assign default backend to non-constant field
imap-send: fix leaking memory in `imap_server_conf`
imap-send: drop global `imap_server_conf` variable
mailmap: always store allocated strings in mailmap blob
revision: always store allocated strings in output encoding
remote-curl: avoid assigning string constant to non-const variable
send-pack: always allocate receive status
parse-options: cast long name for OPTION_ALIAS
http: do not assign string constant to non-const field
compat/win32: fix const-correctness with string constants
pretty: add casts for decoration option pointers
object-file: make `buf` parameter of `index_mem()` a constant
object-file: mark cached object buffers as const
ident: add casts for fallback name and GECOS
entry: refactor how we remove items for delayed checkouts
line-log: always allocate the output prefix
line-log: stop assigning string constant to file parent buffer
...
|
|
"git am" has a safety feature to prevent it from starting a new
session when there already is a session going. It reliably
triggers when a mbox is given on the command line, but it has to
rely on the tty-ness of the standard input. Add an explicit way to
opt out of this safety with a command line option.
* jk/am-retry:
test-terminal: drop stdin handling
am: add explicit "--retry" option
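A sketch of the new escape hatch (illustrative):
    $ git am mbox            # fails midway; a session is in progress
    $ git am --retry --3way  # explicitly resume instead of guessing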
|
|
A new command has been added to migrate a repository that uses the
files backend for its ref storage to use the reftable backend, with
limitations.
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
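A minimal sketch of the new command (illustrative):
    $ git refs migrate --ref-format=reftable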
|
|
The inter/range-diff output has been moved to the end of the patch
when format-patch adds it to a single patch, instead of writing it
before the patch text, to be consistent with what is done for a
cover letter for a multi-patch series.
* jc/format-patch-with-range-diff:
format-patch: move range/inter diff at the end of a single patch output
show_log: factor out interdiff/range-diff generation
|
|
The "proc-receive" test helper implicitly relies on `the_repository` via
`parse_oid_hex()`. This isn't necessary though, and in fact the whole
command does not depend on `the_repository` at all.
Stop setting up `the_repository` and use `parse_oid_hex_any()` to parse
object IDs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "oid-array" test helper can supposedly work without a Git
repository, but will in fact crash because `the_repository->hash_algo`
is not initialized. This is because `oid_pos()`, which is used by
`oid_array_lookup()`, depends on `the_hash_algo->rawsz`.
Ideally, we'd adapt `oid_pos()` to not depend on `the_hash_algo`
anymore. That is a bigger undertaking though, so instead we fall back to
SHA1 when there is no repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `object_info()` function of the partial-clone helper is responsible
for checking the object ID of a repository other than `the_repository`.
We use `parse_oid_hex()` in this function though, which means that we
still depend on `the_repository->hash_algo`.
Fix this by using the object hash of the function-local repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The git-http-fetch(1) command accepts a `--packfile=` option, which
allows the user to specify that it shall fetch a specific packfile,
only. The parameter here is the hash of the packfile, which is specific
to the object hash used by the repository. This requirement is implicit
though via our use of `parse_oid_hex()`, which internally uses
`the_repository`.
The git-http-fetch(1) command can be run without a repository though;
that mode only exists so that we can show usage via the "-h" option.
In that case, starting with c8aed5e8da (repository: stop
setting SHA1 as the default object hash, 2024-05-07), `the_repository`
does not have its object hash initialized anymore and thus we would
crash when trying to parse the object ID outside of a repository.
Fix this issue by dying immediately when we see a "--packfile="
parameter when outside a Git repository. This is not a functional
regression as we would die later on with the same error anyway.
Add a test to detect the segfault. We use the "nongit" function to do
so, which we need to allow-list in `test_must_fail ()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split
out of hash.h to remove dependency on repository.h, 2023-04-22) to make
explicit the split between hash-related functions that rely on the
global `the_repository`, and those that don't. This split is no longer
necessary now that we have removed the reliance on `the_repository`.
Merge "hash-ll.h" back into "hash.h". This causes some code units to not
include "repository.h" anymore, which requires us to add some forward
declarations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it free of new dependencies on it in future
changes, be they explicit or implicit.
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "builtin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that the previous commit removed the possibility that a "struct
remote" will ever have zero url fields, we can drop a number of
redundant checks and untriggerable code paths.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we return a struct from remote_get(), the result _almost_ always
has at least one url. In remotes_remote_get_1(), we do this:
    if (name_given && !valid_remote(ret))
            add_url_alias(remote_state, ret, name);
    if (!valid_remote(ret))
            return NULL;
So if the remote doesn't have a url, we give it one based on the name
(this is how unconfigured urls are used as remotes). And if that doesn't
work, we return NULL.
But there's a catch: valid_remote() checks that we have at least one url
_unless_ the remote.*.vcs field is set. This comes from c578f51d52 (Add
a config option for remotes to specify a foreign vcs, 2009-11-18), and
the whole idea was to support remote helpers that don't have their own
url.
However, that mode has been broken since 25d5cc488a (Pass unknown
protocols to external protocol handlers, 2009-12-09)! That commit
unconditionally looks at the url in get_helper(), causing a segfault
with something like:
git -c remote.foo.vcs=bar fetch foo
We could fix that now, of course. But given that it has been broken for
almost 15 years and nobody noticed, there's a better option. This weird
"there might not be a url" special case requires checks all over the
code base, and it's not clear if there are other similar segfaults
lurking. It would be nice if we could drop that special case.
So instead, let's let the "the remote name is the url" code kick in. If
you have "remote.foo.vcs", then your url (unless otherwise configured)
is "foo". This does have a visible effect compared to what 25d5cc488a
was trying to do. The idea back then is that for a remote without a url,
we'd run:
# only one command-line option!
git-remote-bar foo
whereas with our default url, now we'll run:
git-remote-bar foo foo
Again, in practice nobody can be relying on this because it has been
segfaulting for 15 years. We should consider just removing this "vcs"
config option entirely, but that would be a user-visible breakage. So by
fixing it this way, we can keep things working that have been working,
and simplify away one special case inside our code.
This fixes the segfault from 25d5cc488a (demonstrated by the test), and
we can build further cleanups on top.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The usual way to trigger a remote helper is to use the "::" syntax from:
87422439d1 (Allow specifying the remote helper in the url, 2009-11-18).
Doing:
git config remote.origin.url hg::https://example.com/repo
will run "git-remote-hg origin https://example.com/repo". Or you can
use the fallback handling from 25d5cc488a (Pass unknown protocols to
external protocol handlers, 2009-12-09):
git config remote.origin.url "foo://bar"
which will run "git-remote-foo origin foo://bar".
But there's a third way, from c578f51d52 (Add a config option for
remotes to specify a foreign vcs, 2009-11-18):
git config remote.origin.vcs foo
git config remote.origin.url bar
which will run "git-remote-foo origin bar". This is mostly redundant
with the other methods, except that it is supposed to allow you to run
without a URL at all. So:
git config remote.origin.vcs foo
would run "git-remote-foo origin" with no extra URL parameter (under the
assumption that the helper somehow knows how to access the remote repo).
However, this mode has been broken since 25d5cc488a, shortly after it
was added! That commit taught the transport code to always look at the
URL string to parse off the "foo::" bits, meaning it would always
segfault in the no-url case. You can see that with:
git -c remote.foo.vcs=bar fetch foo
Nobody seems to have noticed in the almost 15 years since, so presumably
it's not a well-used feature. And without that, arguably the whole
remote.*.vcs feature could be removed entirely, as it isn't offering
anything you couldn't do with the "helper::" syntax. But it _does_ work
if you have a URL, and it has been advertised in the documentation for
all that time. So we shouldn't just remove it without warning.
Likewise, even if we were going to deprecate it, we should avoid
breaking it in the meantime. Since there are no tests for it at all,
let's add a few basic ones:
- this syntax doesn't work well with "git clone" (another point
against it versus "helper::"). But we can use "clone -c" to set up
the config manually, passing the URL as usual to clone. This does
work, though note that I had to use --no-local in the test to avoid
broken interactions between the local code and the helper. In the
real world this would be a non-issue, since the remote URL would
generally not also be a local Git repo!
- likewise, we should be able to set up the config manually and fetch
into a repository. This also works.
- we can simulate a vcs that has no URL support by stuffing the remote
path into another environment variable. This should work, but
doesn't (it hits the segfault mentioned above).
In the first two cases, I took the extra step of checking GIT_TRACE
output to confirm that we actually ran the helper (since the URL is a
valid Git repo, the clone/fetch would appear to work even if we
didn't use the helper at all!).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Our tests use a fake helper that just imports from an existing Git
repository. We're fed the path to that repo on the command line, and
derive the GIT_DIR by tacking on "/.git".
This is wrong if the path is a bare repository, but that's OK since this
is just a limited test. But it's also wrong if the transport code feeds
us the actual .git directory itself (i.e., we expect "/path/to/repo" but
it gives us "/path/to/repo/.git"). None of the current tests do that,
but let's future-proof ourselves against adding a test that does.
We can instead ask "rev-parse" to set our GIT_DIR. Note that we have to
first unset other git variables from our environment. Coming into this
script, we'll have GIT_DIR set to the fetching repository, and we need
to "switch" to the remote one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Because remote.*.url is treated as a multi-valued key, there is no way
to override previous config. So for example if you have
remote.origin.url set to some wrong value, doing:
git -c remote.origin.url=right fetch
would not work. It would append "right" to the list, which means we'd
still fetch from "wrong" (since subsequent values are used only as push
urls).
Let's provide a mechanism to reset the list, like we do for other
multi-valued keys (e.g., credential.helper, http.extraheaders, and
merge.suppressDest all use this "empty string means reset" pattern).
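With this mechanism, the override above becomes possible; a sketch:
    $ git -c remote.origin.url= -c remote.origin.url=right fetch
The empty value clears the configured list, leaving "right" as the
only fetch URL.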
Reported-by: Mathew George <mathewegeorge@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:
- push/clear are now one-liners
- likewise the free+assigns in alias_all_urls() can use
strvec_replace()
- we now use size_t for storage, avoiding possible overflow
- this will enable some further cleanups in future patches
There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* gt/unit-test-oidtree:
t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c
|
|
* ps/ref-storage-migration:
builtin/refs: new command to migrate ref storage formats
refs: implement logic to migrate between ref storage formats
refs: implement removal of ref storages
worktree: don't store main worktree twice
reftable: inline `merged_table_release()`
refs/files: fix NULL pointer deref when releasing ref store
refs/files: extract function to iterate through root refs
refs/files: refactor `add_pseudoref_and_head_entries()`
refs: allow to skip creation of reflog entries
refs: pass storage format to `ref_store_init()` explicitly
refs: convert ref storage format to an enum
setup: unset ref storage when reinitializing repository version
|
|
This fixes a bug that was introduced by 368d19b0b7 (commit-graph:
refactor compute_topological_levels(), 2023-03-20): Previously, the
progress indicator was updated from `i + 1` where `i` is the loop
variable of the enclosing `for` loop. After that commit, the update
used `info->progress_cnt + 1` instead; however, unlike `i`, the
`progress_cnt` attribute was not incremented. Let's increment it.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: squashed in a test update from Patrick Steinhardt]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A test helper that essentially is unit tests on the "decorate"
logic has been rewritten using the unit-tests framework.
* gt/decorate-unit-test:
t/: migrate helper/test-example-decorate to the unit testing framework
|
|
Many memory leaks in the sparse-checkout code paths have been
plugged.
* jk/sparse-leakfix:
sparse-checkout: free duplicate hashmap entries
sparse-checkout: free string list after displaying
sparse-checkout: free pattern list in sparse_checkout_list()
sparse-checkout: free sparse_filename after use
sparse-checkout: refactor temporary sparse_checkout_patterns
sparse-checkout: always free "line" strbuf after reading input
sparse-checkout: reuse --stdin buffer when reading patterns
dir.c: always copy input to add_pattern()
dir.c: free removed sparse-pattern hashmap entries
sparse-checkout: clear patterns when init() sees existing sparse file
dir.c: free strings in sparse cone pattern hashmaps
sparse-checkout: pass string literals directly to add_pattern()
sparse-checkout: free string list in write_cone_to_file()
|
|
Overly large ".gitignore" files are now rejected silently.
* jk/cap-exclude-file-size:
dir.c: reduce max pattern file size to 100MB
dir.c: skip .gitignore, etc larger than INT_MAX
|
|
The safe.directory configuration knob has been updated to
optionally allow leading path matches.
* jc/safe-directory-leading-path:
safe.directory: allow "lead/ing/path/*" match
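A sketch of the new form (path illustrative):
    $ git config --global --add safe.directory '/home/me/projects/*'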
|
|
A pair of test helpers that essentially are unit tests on hash
algorithms have been rewritten using the unit-tests framework.
* gt/t-hash-unit-test:
t/: migrate helper/test-{sha1, sha256} to unit-tests/t-hash
strbuf: introduce strbuf_addstrings() to repeatedly add a string
|
|
Basic unit tests for reftable have been reimplemented under the
unit test framework.
* cp/reftable-unit-test:
t: improve the test-case for parse_names()
t: add test for put_be16()
t: move tests from reftable/record_test.c to the new unit test
t: move tests from reftable/stack_test.c to the new unit test
t: move reftable/basics_test.c to the unit testing framework
|
|
A new test was added to ensure git commands that are designed to
run outside repositories do work.
* jc/t1517-more:
imap-send: minimum leakfix
t1517: more coverage for commands that work without repository
|
|
helper/test-oidtree.c along with t0069-oidtree.sh test the oidtree.h
library, which is a wrapper around a crit-bit tree. Migrate them to
the unit testing framework for better debugging and runtime
performance. Along with the migration, add an extra check for the
oidtree_each() test, which showcases how multiple expected matches can
be given to the check_each() helper.
To achieve this, introduce a new library called 'lib-oid.h'
exclusively for the unit tests to use. It currently mainly includes
utility to generate object_id from an arbitrary hex string
(i.e. '12a' -> '12a0000000000000000000000000000000000000'). This also
handles the hash algo selection based on GIT_TEST_DEFAULT_HASH.
This library will also be helpful when we port other unit tests such
as oid-array, oidset etc.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
[jc: small fixlets squashed in]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `OPT__ABBREV()` option allows the user to specify the length that
object hashes shall be abbreviated to. This length needs to be in the
range of `(MIN_ABBREV, the_hash_algo->hexsz)`, which is why we clamp the
value as required. While this makes sense in the case of `MIN_ABBREV`,
it is unnecessary for the upper boundary as the value is eventually
passed down to `repo_find_unique_abbrev_r()`, which handles values
larger than the current hash length just fine.
In the preceding commit, we have changed parsing of the "core.abbrev"
config to stop clamping to the upper boundary. Let's do the same here so
that the code becomes simpler, we are consistent with how we treat the
"core.abbrev" config and so that we stop depending on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "core.abbrev" config allows the user to specify the minimum length
when abbreviating object hashes. Next to the values "auto" and "no",
this config also accepts a concrete length that needs to be bigger or
equal to the minimum length and smaller or equal to the hash algorithm's
hex length. While the former condition is trivial, the latter depends on
the object format used by the current repository. It is thus a variable
upper boundary that may either be 40 (SHA-1) or 64 (SHA-256).
This has two major downsides. First, the user that specifies this config
must be aware of the object hashes that their repositories use. If they want
to configure the value globally, then they cannot pick any value in the
range `[41, 64]` if they have any repository that uses SHA-1. If they
did, Git would error out when parsing the config.
Second, and more importantly, parsing "core.abbrev" crashes when outside
of a Git repository because we dereference `the_hash_algo` to figure out
its hex length. Starting with c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) though, we stopped initializing
`the_hash_algo` outside of Git repositories.
Fix both of these issues by not making it an error anymore when the
given length exceeds the hash length. Instead, leave the abbreviated
length intact. `repo_find_unique_abbrev_r()` handles this just fine
except for a performance penalty which we will fix in a subsequent
commit.
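A small illustration of the relaxed behavior (the invocation is only a
sketch):

    # in a SHA-1 repository, a configured length in [41, 64] no longer
    # errors out; it simply yields the full 40-hex object name:
    git -c core.abbrev=64 rev-parse --short HEAD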
Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When performing multi-pack reuse, reuse_partial_packfile_from_bitmap()
is responsible for generating an array of bitmapped_pack structs from
which to perform reuse.
In the multi-pack case, we loop over the MIDX's packs and copy the result
of calling `nth_bitmapped_pack()` to construct the list of reusable
packs.
But we may also want to do pack-reuse over a single pack, either because
we only had one pack to perform reuse over (in the case of single-pack
bitmaps), or because we explicitly asked to do single pack reuse even
with a MIDX[^1].
When this is the case, the array we generate of reusable packs contains
only a single element, which is either (a) the pack attached to the
single-pack bitmap, or (b) the MIDX's preferred pack.
In 795006fff4 (pack-bitmap: gracefully handle missing BTMP chunks,
2024-04-15), we refactored the reuse_partial_packfile_from_bitmap()
function and stopped assigning the pack_int_id field when reusing only
the MIDX's preferred pack. This results in an uninitialized read down in
try_partial_reuse() like so:
==7474==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8
#1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8
#2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3
#3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3
#4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27
#5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3
#6 0x55c5ccc8fac8 in run_builtin git.c:474:11
which happens when try_partial_reuse() calls midx_pair_to_pack_pos()
while trying to reject cross-pack deltas.
Avoid the uninitialized read by ensuring that the pack_int_id field is
set in the single-pack reuse case by setting it to either the MIDX
preferred pack's pack_int_id, or '-1', in the case of single-pack
bitmaps. In the latter case, we never read the pack_int_id field, so
the choice of '-1' is intentional as a "garbage in, garbage out"
measure.
Guard against further regressions in this area by adding a test which
ensures that we do not throw out deltas from the preferred pack as
"cross-pack" due to an uninitialized pack_int_id.
[^1]: This can happen for a couple of reasons, either because the
repository is configured with 'pack.allowPackReuse=(true|single)', or
because the MIDX was generated prior to the introduction of the BTMP
chunk, which contains information necessary to perform multi-pack
reuse.
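As a reference, the single-pack mode from footnote [^1] can be requested
explicitly like so:

    # opt into single-pack reuse even when a MIDX is present:
    git config pack.allowPackReuse single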
Reported-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit d6a8c58675 (midx-write.c: support reading an existing MIDX with
`packs_to_include`, 2024-05-29) changed the MIDX generation machinery to
support reading from an existing MIDX when writing a new one.
Unfortunately, the rest of the MIDX generation machinery is not prepared
to deal with such a change. For instance, the function responsible for
adding to the object ID fanout table from a MIDX source
(midx_fanout_add_midx_fanout()) will gladly add objects from an existing
MIDX for some fanout level regardless of whether or not those objects
came from packs that are to be included in the subsequent MIDX write.
This results in broken pseudo-pack object order (leading to incorrect
object traversal results) and segmentation faults, like so (generated by
running the added test prior to the changes in midx-write.c):
#0 0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590
#1 0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects",
packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0,
refs_snapshot=0x0, flags=15) at midx-write.c:1171
#2 0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects",
packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15)
at midx-write.c:1274
[...]
In stack frame #0, the code on midx-write.c:590 is using the new pack ID
corresponding to some object which was added from the existing MIDX.
Importantly, the pack from which that object was selected in the
existing MIDX does not appear in the new MIDX as it was excluded via
`--stdin-packs`.
In this instance, the pack in question had pack ID "1" in the existing
MIDX, but since it was excluded from the new MIDX, we never filled in
that entry in the pack_perm table, resulting in:
(gdb) p *ctx->pack_perm@2
$1 = {0, 1515870810}
Which is what causes the segfault above when we try to read:
struct pack_info *pack = &ctx->info[ctx->pack_perm[i]];
if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
pack->bitmap_pos = 0;
Fundamentally, we should be able to read information from an existing
MIDX when generating a new one. But in practice the midx-write.c code
assumes that we won't run into issues like the above with incongruent
pack IDs, and often makes those assumptions in extremely subtle and
fragile ways.
Instead, let's avoid reading from an existing MIDX altogether, and stick
with the pre-d6a8c58675 implementation. Harden against any regressions
in this area by adding a test which demonstrates these issues.
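A rough reproduction sketch of the failure mode (pack names are
placeholders; the exact repository shape required is more involved):

    # write a MIDX covering several packs, then rewrite it while
    # excluding one of them via --stdin-packs:
    git multi-pack-index write --bitmap
    echo pack-A.idx | git multi-pack-index write --stdin-packs --bitmap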
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When parsing the blame configuration we add "blame.ignoreRevsFile"
configs to a string list. This string list is declared with `NODUP`,
and thus we hand over the allocated string to that list. We eventually
end up calling `string_list_clear()` on that list, but due to it being
declared as `NODUP` we will not release the associated strings and thus
leak memory.
Fix this issue by setting up the list as `DUP` instead and free the
config string after insertion.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `cmd_blame()` we compute prefixed paths by calling `add_prefix()`,
which itself calls `prefix_path()`. While `prefix_path()` returns an
allocated string, `add_prefix()` pretends to return a constant string.
Consequently, this path never gets freed.
Fix the return type to be `char *` and free the path to plug the memory
leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are some memory leaks when cleaning up blame scoreboards. Fix
those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When calling either the recursive or the ORT merge machineries we need
to provide a list of merge bases. The ownership of that parameter is
then implicitly transferred to the callee, which is somewhat fishy.
Furthermore, that list may leak in some cases where the merge machinery
runs into an error, thus causing a memory leak.
Refactor the code such that we stop transferring ownership. Instead, the
merge machinery will now create its own local copy of the passed-in
list when it needs to modify it. Free the list at the
callsites as required.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "builtin/merge.c" we use the helper infrastructure to figure out what
merge strategies there are. We never free contents of the `cmdnames`
structures though and thus leak their memory.
Fix this by exposing the already existing `clean_cmdnames()` function to
release their memory. As this name isn't quite idiomatic, rename it to
`cmdnames_release()` while at it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix some trivial memory leaks in `make_script_with_merges()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `wanted_peer_refs()` we first create a copy of the "HEAD" ref. This
copy may not actually be passed back to the caller, but is not getting
freed in this case. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before calling `update_pre_post_images()`, we call `strbuf_detach()` to
put its buffer into a new string variable that we then pass to that
function. Besides being rather pointless, it also causes us to leak
memory of that variable because we never free it.
Get rid of the variable altogether and instead reach into the `strbuf`
directly. While at it, refactor the code to have a common exit path and
mark strings that do not contain allocated memory as constant.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When creating commits via `commit_tree_extended()`, the caller passes in
a string list of parents. This call implicitly transfers ownership of
that list to the function, which is quite surprising to begin with. But
to make matters worse, `commit_tree_extended()` doesn't even bother to
free the list of parents in error cases. The result is a memory leak,
and one that the caller cannot fix by themselves because they do not
know whether parts of the string list have already been released.
Refactor the code such that callers can keep ownership of the list of
parents, which is indicated by the parameter now being a constant
pointer. Free the lists at the calling site and add a common exit
path to those sites as required.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The variable used to track the "core.notesref" config is not getting
freed before we assign to it and thus leaks. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We leak various different string lists in the rerere code. Free those to
plug them.
Note that the `merge_rr` variable is intentionally being free'd with the
`free_util` parameter set to 1. The `util` field is used there to store
the IDs of every rerere item and thus needs to be freed, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We leak the `revision_args` variable. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There is a todo comment in `release_revisions()` that mentions that we
need to free the diff options, which was added via 54c8a7c379 (revisions
API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). Releasing
the diff options wasn't quite feasible at that time because some call
sites rely on its contents to remain even after the revisions have been
released.
In fact, there really only are a couple of callsites that misbehave
here:
- `cmd_shortlog()` releases the revisions, but continues to access its
file pointer.
- `do_diff_cache()` creates a shallow copy of `struct diff_options`,
but does not set the `no_free` member. Consequently, we end up
releasing resources of the caller-provided diff options.
- `diff_free()` and friends do not play nice when being called
multiple times as they don't unset data structures that they have
just released.
Fix all of those cases and enable the call to `diff_free()`, which plugs
a bunch of memory leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're storing the list of commits that git-cherry(1) is about to print
into a temporary list. This list is never getting free'd and thus leaks.
Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We do not free some members of `struct merge_options`' private data.
Fix this to plug those leaks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `cmd_merge_recursive()` we have a static array of object ID bases
that we pass to `merge_recursive_generic()`. This interface is somewhat
weird though because the latter function accepts a pointer to a pointer
of object IDs, which requires us to allocate the object IDs on the heap.
And as we never free those object IDs, the end result is a leak.
While we can easily solve this leak by just freeing the respective
object IDs, the whole calling convention is somewhat weird. Instead,
refactor `merge_recursive_generic()` to accept a plain pointer to object
IDs so that we can avoid allocating them altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While it is documented in `struct object_context::path` that this
variable needs to be released by the caller, this fact is rather easy to
miss given that we do not ever provide a function to release the object
context. And of course, while some callers dutifully release the path,
many others don't.
Introduce a new `object_context_release()` function that releases the
path. Convert callsites that used to free the path to use that new
function and add missing calls to callsites that were leaking memory.
Refactor those callsites as required to have a single return path only.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
git-rev-list(1) can speed up its object size calculations for reachable
objects via a bitmap walk, if there is any bitmap. This is done in
`try_bitmap_disk_usage()`, which tries to optimistically load the bitmap
and then use it, if available. It never frees it though, leading to a
memory leak. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `prune_notes()` we first store the notes that are to be deleted in a
local list, and then iterate through that list to delete those notes one
by one. We never free the list though and thus leak its memory. Fix
this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We never free the display notes options embedded into `struct rev_info`.
Implement a new function `release_display_notes()` that we can call in
`release_revisions()` to fix this.
There is another gotcha here though: we play some games with the string
list used to track extra notes refs, where we sometimes set the bit that
indicates that strings should be strdup'd and sometimes unset it. This
dance is done to avoid a copy of an already-allocated string when we
call `enable_ref_display_notes()`. But this dance is rather pointless as
we can instead call `string_list_append_nodup()` to transfer ownership
of the allocated string to the list.
Refactor the code to do so and drop the `strdup_strings` dance.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When computing rename conflicts in our recursive merge algorithm we set
up `struct rename_conflict_info`s to track that information. We never
free those data structures though and thus leak memory.
We need to be a bit more careful here though because the same rename
conflict info can be assigned to multiple structures. Accommodate
this by introducing a `rename_conflict_info_owned` bit that we can use
to steer whether or not the rename conflict info shall be free'd.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have a bunch of memory leaks in git-rev-parse(1)'s `--parseopt` mode.
Refactor the code to use `struct strvec`s to make it easier for us to
track the lifecycle of those leaking variables and then free them.
While at it, remove the unneeded static lifetime for some of the
variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When creating a bundle, we set up a revision walk, but never release
data associated with it. Furthermore, we create a mostly-shallow copy of
that revision walk where we only adapt its pending objects such that we
can reuse the walk. While that copy must not be released, the pending
objects array needs to be.
Plug those memory leaks by releasing the revision walk and the pending
objects of the copied revision walk.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we clear most of the members of `struct notes_rewrite_cfg` in
`finish_copy_notes_for_rewrite()`, we do not clear the notes tree. Fix
this to plug this memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `OPT_FILENAME()` option will, if set, put an allocated string into
the user-provided variable. Consequently, that variable needs to be
free'd by the caller of `parse_options()`. Some callsites don't though
and thus leak memory. Fix those.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When reversing revisions in a rev walk, `get_revision()` will allocate a
new commit list and assign it to `revs->commits`. It does not free the
old list though, which makes it leak. Fix this.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Memory leaks in "git mv" has been plugged.
* jk/leakfixes:
mv: replace src_dir with a strvec
mv: factor out empty src_dir removal
mv: move src_dir cleanup to end of cmd_mv()
t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL
t-strvec: use va_end() to match va_start()
|
|
The alias-expanded command lines are logged to the trace output.
* iw/trace-argv-on-alias:
run-command: show prepared command
Documentation: alias: add notes on shell expansion
Documentation: alias: rework notes into points
|
|
The options --exit-code and --quiet instruct git diff to indicate
whether it found any significant changes by exiting with code 1 if it
did and 0 if there were none. Currently this doesn't work if external
diff programs are involved, as we have no way to learn what they found.
Add that ability in the form of the new configuration options
diff.trustExitCode and diff.<driver>.trustExitCode and the environment
variable GIT_EXTERNAL_DIFF_TRUST_EXIT_CODE. They pair with the config
options diff.external and diff.<driver>.command and the environment
variable GIT_EXTERNAL_DIFF, respectively.
The new options are off by default, keeping the old behavior. Enabling
them indicates that the external diff returns exit code 1 if it finds
significant changes and 0 if it doesn't, like diff(1).
The name of the new options is taken from the git difftool and mergetool
options of similar purpose. (There they enable passing on the exit code
of a diff tool and to infer whether a merge done by a merge tool is
successful.)
The new feature sets the diff flag diff_from_contents in
diff_setup_done() if we need the exit code and are allowed to call
external diffs. This disables the optimization that avoids calling the
program with --quiet. Add it back by skipping the call if the external
diff is not able to report empty diffs. We can only do that check after
evaluating the file-specific attributes in run_external_diff().
If we do run the external diff with --quiet, send its output to
/dev/null.
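To make the setup concrete, a usage sketch (the wrapper script is
hypothetical; diff(1) exits with code 1 when it finds differences):

    # external diff drivers receive seven arguments:
    # path old-file old-hex old-mode new-file new-hex new-mode
    printf '%s\n' '#!/bin/sh' 'exec diff -u "$2" "$5"' >ext-diff
    chmod +x ext-diff
    git config diff.external ./ext-diff
    git config diff.trustExitCode true
    git diff --quiet; echo $?   # 1 if the wrapper saw changes, 0 if none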
I considered checking the output of the external diff to see whether
it's empty. That approach was tried in 11be65cfa4 (diff: fix --exit-code with
external diff, 2024-05-05) and quickly reverted, as it does not work
with external diffs that do not write to stdout. There's no reason why
a graphical diff tool would even need to write anything there at all.
I also considered using a non-zero exit code for empty diffs, which
could be done without adding new configuration options. We'd need to
disable the optimization that allows git diff --quiet to skip calling
external diffs, though -- that might be quite surprising if graphical
diff programs are involved. And assigning the opposite meaning of the
exit codes compared to diff(1) and git diff --exit-code to the external
diff can cause unnecessary confusion.
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add tests to check the exit code of git diff with its options --quiet
and --exit-code when using an external diff program. Currently we
cannot tell whether it found significant changes or not.
While at it, document briefly that --quiet turns off execution of
external diff programs because that behavior surprised me for a moment
while writing the tests.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we deal with a multi-patch series in git-format-patch(1), if we see
`--interdiff` or `--range-diff` but no `--cover-letter`, we return with
an error, saying:
fatal: --range-diff requires --cover-letter or single patch
or:
fatal: --interdiff requires --cover-letter or single patch
This makes sense because the cover-letter is where we place the diff
from the previous version.
However, considering that `format-patch` generates a multi-patch series as
needed, let's adopt a similar "cover as necessary" approach when using
`--interdiff` or `--range-diff`.
Therefore, relax the requirement for an explicit `--cover-letter` in a
multi-patch series when the user says `--interdiff` or `--range-diff`.
Still, if only to return the error, respect "format.coverLetter=no" and
`--no-cover-letter`.
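With this change, a sketch of the resulting workflow (the revision
arguments are illustrative):

    # a multi-patch series no longer needs an explicit --cover-letter
    # for --range-diff; the cover letter is generated as needed:
    git format-patch -v2 --range-diff=v1 main..topic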
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Arrange things we are going to create to be removed at the end, and then
start creating them. That way, we will clean them up even if we fail
after creating some but before the end of the command.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up around writing the .midx files.
* tb/midx-write-cleanup:
pack-bitmap.c: reimplement `midx_bitmap_filename()` with helper
midx: replace `get_midx_rev_filename()` with a generic helper
midx-write.c: support reading an existing MIDX with `packs_to_include`
midx-write.c: extract `fill_packs_from_midx()`
midx-write.c: extract `should_include_pack()`
midx-write.c: pass `start_pack` to `compute_sorted_entries()`
midx-write.c: reduce argument count for `get_sorted_entries()`
midx-write.c: tolerate `--preferred-pack` without bitmaps
|
|
The `git_log_output_encoding` variable can be set via the `--encoding=`
option. When doing so, we conditionally either assign it the passed
value or, if the value is "none", we assign it the empty string.
Depending on which of the two code paths we pick though, the variable
may end up being assigned either an allocated string or a string
constant.
This is somewhat risky and may easily lead to bugs when a different code
path wants to reassign a new value to it, freeing the previous value.
We already do this when parsing the "i18n.logoutputencoding" config in
`git_default_i18n_config()`. But because the config is typically parsed
before we parse command line options this has been fine so far.
Regardless of that, safeguard the code such that the variable always
contains an allocated string. While at it, also free the old value in
case there was any, to plug a potential memory leak.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add 'symref-update' command to the '--stdin' mode of 'git-update-ref' to
allow updates of symbolic refs. The 'symref-update' command takes in a
<new-target>, which the <ref> will be updated to. If the <ref> doesn't
exist it will be created.
It also optionally takes either a `ref <old-target>` or an `oid
<old-oid>`. If the <old-target> is provided, it checks to see if the
<ref> targets the <old-target> before the update. If <old-oid> is provided
it checks <ref> to ensure that it is a regular ref and <old-oid> is the
OID before the update. This, by extension, also means that when a
zero <old-oid> is provided, it ensures that the ref didn't exist before.
The divergence in syntax from the regular `update` command is because if
we don't use a `(ref | oid)` prefix for the old_value, then there is
ambiguity around whether the value provided should be treated as an oid
or a reference. This is especially so because we allow anything
committish to be provided as an oid. While 'symref-verify' and
'symref-delete' also take in `<old-target>`, we do not have this
divergence there as those commands only work with symrefs. Whereas
'symref-update' also works with regular refs and allows users to convert
regular refs to symrefs.
The command allows users to perform symbolic ref updates within a
transaction. This provides atomicity and allows users to perform a set
of operations together.
This command supports deref mode, to ensure that we can update
dereferenced regular refs to symrefs.
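A sketch of the resulting syntax (the ref names are illustrative):

    # repoint HEAD to refs/heads/topic, but only if it currently
    # points at refs/heads/main:
    echo 'symref-update HEAD refs/heads/topic ref refs/heads/main' |
    git update-ref --stdin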
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add 'symref-create' command to the '--stdin' mode 'git-update-ref' to
allow creation of symbolic refs in a transaction. The 'symref-create'
command takes in a <new-target>, which the created <ref> will point to.
Also, support the 'core.prefersymlinkrefs' config, wherein if the config
is set and the filesystem supports symlinks, we create the symbolic ref
as a symlink. We fall back to creating a regular symref if creating the
symlink is unsuccessful.
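For example (the ref names are illustrative):

    # create refs/heads/latest as a symref pointing at refs/heads/main:
    echo 'symref-create refs/heads/latest refs/heads/main' |
    git update-ref --stdin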
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new command 'symref-delete' to allow deletions of symbolic refs in
a transaction via the '--stdin' mode of the 'git-update-ref' command.
The 'symref-delete' command can, when given an <old-target>, delete the
provided <ref> only when it points to <old-target>.
This command is only compatible with the 'no-deref' mode because we
optionally want to check the 'old_target' of the ref being deleted.
De-referencing a symbolic ref would provide a regular ref and we already
have the 'delete' command for regular refs.
While users can also use 'git symbolic-ref -d' to delete symbolic refs,
the 'symref-delete' command in 'git-update-ref' allows users to do so
within a transaction, which promises atomicity of the operation and can
be batched with other commands.
When no 'old_target' is provided it can also delete regular refs,
similar to how the 'delete' command can delete symrefs when no 'old_oid'
is provided.
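For example (the ref names are illustrative):

    # delete the symref, but only if it still points at refs/heads/main:
    echo 'symref-delete refs/heads/latest refs/heads/main' |
    git update-ref --stdin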
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'symref-verify' command allows users to verify if a provided <ref>
contains the provided <old-target> without changing the <ref>. If
<old-target> is not provided, the command will verify that the <ref>
doesn't exist.
The command allows users to verify symbolic refs within a transaction,
and this means users can perform a set of changes in a transaction only
when the verification holds good.
Since we're checking for symbolic refs, this command will only work with
the 'no-deref' mode. This is because any dereferenced symbolic ref will
point to an object and not a ref and the regular 'verify' command can be
used in such situations.
Add required tests for symref support in 'verify'. Since we're here,
also add reflog checks for the pre-existing 'verify' tests; there is no
divergence in behavior, but we never tested to ensure that the reflog
wasn't affected by the 'verify' command.
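For example (the ref names are illustrative):

    # succeed only if HEAD currently points at refs/heads/main:
    echo 'symref-verify HEAD refs/heads/main' |
    git update-ref --stdin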
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since the introduction of split commit graph layers in 92b1ea66b9a
(Merge branch 'ds/commit-graph-incremental', 2019-07-19), the function
write_commit_graph_file() has done the following when writing an
incremental commit-graph layer:
- used a lock_file to control access to the commit-graph-chain file
- used an auxiliary file (whose descriptor was stored in 'fd') to
write the new commit-graph layer itself
Using a lock_file to control access to the commit-graph-chain is
sensible, since only one writer may modify it at a time. Likewise, when
the commit-graph machinery is writing out a single layer, the lock_file
structure is used to modify the commit-graph itself. This is also
sensible, since the non-incremental commit-graph may also have at most
one writer.
However, using an auxiliary temporary file without using the tempfile.h
API means that writes that fail after the temporary graph layer has been
created will leave around a file in
$GIT_DIR/objects/info/commit-graphs/tmp_graph_XXXXXX
The commit-graph-chain file and non-incremental commit-graph do not
suffer from this problem as the lockfile.h API uses the tempfile.h API
transparently, so processes that died before moving those files into
their final location cleaned up after themselves.
Ensure that the temporary file used to write incremental commit-graphs
is also managed with the tempfile.h API, to ensure that we do not ever
leave tmp_graph_XXXXXX files lying around.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The promisor.quiet configuration knob can be set to true to make
lazy fetching from promisor remotes silent.
* th/quiet-lazy-fetch-from-promisor:
promisor-remote: add promisor.quiet configuration option
|
|
Leakfixes.
* ps/leakfixes:
builtin/mv: fix leaks for submodule gitfile paths
builtin/mv: refactor to use `struct strvec`
builtin/mv: duplicate string list memory
builtin/mv: refactor `add_slash()` to always return allocated strings
strvec: add functions to replace and remove strings
submodule: fix leaking memory for submodule entries
commit-reach: fix memory leak in `ahead_behind()`
builtin/credential: clear credential before exit
config: plug various memory leaks
config: clarify memory ownership in `git_config_string()`
builtin/log: stop using globals for format config
builtin/log: stop using globals for log config
convert: refactor code to clarify ownership of check_roundtrip_encoding
diff: refactor code to clarify memory ownership of prefixes
config: clarify memory ownership in `git_config_pathname()`
http: refactor code to clarify memory ownership
checkout: clarify memory ownership in `unique_tracking_name()`
strbuf: fix leak when `appendwholeline()` fails with EOF
transport-helper: fix leaking helper name
|
|
Since 18d8c26930 (test_terminal: redirect child process' stdin to a pty,
2015-08-04), we set up a pty and copy stdin to the child program. But
this ends up being racy; once we send all of the bytes and close the
descriptor, the child program will no longer see a terminal! isatty()
will return 0, and trying to read may return EIO, even if we didn't yet
get all of the bytes.
This was mentioned even in the commit message of 18d8c26930, but we
hacked around it by just sending an infinite input from /dev/zero (in
the intended case, we only cared about isatty(0), not reading actual
input).
And it came up again recently in:
https://lore.kernel.org/git/d42a55b1-1ba9-4cfb-9c3d-98ea4d86da33@gmail.com/
where we tried to actually send bytes, but they don't always all come
through. So this interface is somewhat of an accident waiting to happen;
a caller might not even care about stdin being a tty, but will get bit
by the flaky behavior.
One solution would probably be to avoid closing test_terminal's end of
the pty altogether. But then the other side would never see EOF on its
stdin. That may be OK for some cases, but it's another gotcha that
might cause races or deadlocks, depending on what the child expects to
read.
Let's instead just drop test_terminal's stdin feature completely. Since
the previous commit dropped the two cases from t4153 for which the
feature was originally added, there are no callers left that need it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After a patch fails, you can ask "git am" to try applying it again with
new options by running without any of the resume options. E.g.:
git am <patch
# oops, it failed; let's try again
git am --3way
But since this second command has no explicit resume option (like
"--continue"), it looks just like an invocation to read a fresh patch
from stdin. To avoid confusing the two cases, there are some heuristics,
courtesy of 8d18550318 (builtin-am: reject patches when there's a
session in progress, 2015-08-04):
if (in_progress) {
/*
* Catch user error to feed us patches when there is a session
* in progress:
*
* 1. mbox path(s) are provided on the command-line.
* 2. stdin is not a tty: the user is trying to feed us a patch
* from standard input. This is somewhat unreliable -- stdin
* could be /dev/null for example and the caller did not
* intend to feed us a patch but wanted to continue
* unattended.
*/
if (argc || (resume_mode == RESUME_FALSE && !isatty(0)))
die(_("previous rebase directory %s still exists but mbox given."),
state.dir);
if (resume_mode == RESUME_FALSE)
resume_mode = RESUME_APPLY;
[...]
So if no resume command is given, then we require that stdin be a tty,
and otherwise complain about (potentially) receiving an mbox on stdin.
But of course you might not actually have a terminal available! And
sadly there is no explicit way to hit this same code path; this is the
only place that sets RESUME_APPLY. So you're stuck, and scripts like our
test suite have to bend over backwards to create a pseudo-tty.
Let's provide an explicit option to trigger this mode. The code turns
out to be quite simple; just setting "resume_mode" to RESUME_FALSE is
enough to dodge the tty check, and then our state is the same as it
would be with the heuristic case (which we'll continue to allow).
When we don't have a session in progress, there's already code to
complain when resume_mode is set (but we'll add a new test to cover
that).
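A usage sketch (assuming the new option is spelled "--retry", which the
text above does not state):

    git am <patch           # fails to apply
    git am --retry --3way   # explicitly retry with new options,
                            # no tty heuristic needed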
To test the new option, we'll convert the existing tests that rely on
the fake stdin tty. That lets us test them on more platforms, and will
let us simplify test_terminal a bit in a future patch.
It does, however, mean we're not testing the tty heuristic at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Introduce a new command that allows the user to migrate a repository
between ref storage formats. This new command is implemented as part of
a new git-refs(1) executable. This is for two reasons:
- There is no good place to put the migration logic in existing
commands. git-maintenance(1) felt unwieldy, and git-pack-refs(1) is
not the correct place to put it, either.
- I had it in my mind to create a new low-level command for accessing
refs for quite a while already. git-refs(1) is that command and can
over time grow more functionality relating to refs. This should help
discoverability by consolidating low-level access to refs into a
single executable.
As mentioned in the preceding commit that introduces the ref storage
format migration logic, the new `git refs migrate` command still has a
bunch of restrictions. These restrictions are documented accordingly.
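A usage sketch (the flag name is an assumption about the final
interface):

    # migrate the repository's ref storage from "files" to "reftable":
    git refs migrate --ref-format=reftable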
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.
Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The configuration variable stopped doing anything (other than
announcing itself as a variable that does not do anything useful,
when it is used) in Git 2.40.
At this point, it is not even worth giving the warning, which was
meant to be a way to help users notice they are carrying unused
cruft in their configuration files and give them a chance to
clean-up.
Let's remove the warning and documentation for it, and truly stop
paying attention to it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
Documentation/config/add.txt | 6 ------
builtin/add.c | 6 +-----
t/t3701-add-interactive.sh | 15 ---------------
3 files changed, 1 insertion(+), 26 deletions(-)
|
|
In insert_recursive_pattern(), we create a new pattern_entry to insert
into the parent_hashmap. If we find that the same entry already exists
in the hashmap, we skip adding the new one. But we forget to free the new
one, creating a leak.
We can fix it by cleaning up the discarded entry. It would probably be
possible to avoid creating it in the first place, but it's non-trivial.
We'd have to define a "keydata" struct that lets us compare the existing
entries to the broken-out fields. It's probably not worth the
complexity, so we'll punt on that for now.
There is one subtlety here: our insertion is happening in a loop, with
each iteration looking at the pattern we just inserted (hence the
"recursive" in the name). So if we skip insertion, what do we look at?
The obvious answer is that we should remember the existing duplicate we
found and use that. But I _think_ in that case, we probably already
have all of the recursive bits (from when the original entry was
added). And so just breaking out of the loop would be correct. But I'm
not 100% sure on that; after all, the original leaky code could have
done the same break, but it didn't.
So I went with the "obvious answer" above, which has no chance of
changing the behavior aside from fixing the leak.
With this patch, t1091 can now be marked leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a2bc523e1e (dir.c: skip .gitignore, etc larger than INT_MAX,
2024-05-31) we capped the size of some files whose parsing code and
data structures used ints. Setting the limit to INT_MAX was a natural
spot, since we know the parsing code would misbehave above that.
But it also leaves the possibility of overflow errors when we multiply
that limit to allocate memory. For instance, a file consisting only of
"a\na\n..." could have INT_MAX/2 entries. Allocating an array of
pointers for each would need INT_MAX*4 bytes on a 64-bit system, enough
to overflow a 32-bit int.
So let's give ourselves a bit more safety margin by giving a much
smaller limit. The size 100MB is somewhat arbitrary, but is based on the
similar value for attribute files added by 3c50032ff5 (attr: ignore
overly large gitattributes files, 2022-12-01).
There's no particular reason these have to be the same, but the idea is
that they are in the ballpark of "so huge that nobody would care, but
small enough to avoid malicious overflow". So lacking a better guess, it
makes sense to use the same value. The implementation here doesn't share
the same constant, but we could change that later (or even give it a
runtime config knob, though nobody has complained yet about the
attribute limit).
And likewise, let's add a few tests that exercise the limits, based on
the attr ones. In this case, though, we never read .gitignore from the
index; the blob code is exercised only for sparse filters. So we'll
trigger it that way.
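A sketch of how such a limit can be exercised (the boundary behavior is
as described above; the exact warning text is not shown here):

    # write a .gitignore just over 100MB; the oversized file should be
    # skipped rather than parsed:
    perl -e 'print "a\n" x (50*1024*1024 + 1)' >.gitignore
    git status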
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.
Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.
We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.
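For example (the affected command is not named above; git show-ref is
one of the commands that grew this option around the same time):

    git show-ref --branches   # list branch tips
    git show-ref --heads      # deprecated spelling of the same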
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.
Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.
We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In add_pattern_to_hashsets(), we remove entries from the
recursive_hashmap when adding similar ones to the parent_hashmap. I
won't pretend to understand all of what's going on here, but there's an
obvious leak: whatever we removed from recursive_hashmap is not
referenced anywhere else, and is never free()d.
We can easily fix this by asking the hashmap to return a pointer to the
old entry. This makes t7002 now completely leak-free.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The add_pattern() function takes a pattern string, but neither makes a
copy of it nor takes ownership of the memory. So it is the caller's
responsibility to make sure the string hangs around as long as the
pattern_list which references it.
There are a few cases in sparse-checkout where we use string literal
patterns by stuffing them into a strbuf, detaching the buffer, and then
passing the result into add_pattern(). This creates a leak when the
pattern_list is eventually cleared, since we don't retain a copy of the
detached buffer to free.
But we can observe that the whole strbuf dance is unnecessary. The point
was presumably[1] to satisfy the lifetime requirement of the string. But
string literals have static duration; we can count on them lasting for
the whole program.
So we can fix the leak by just passing them directly. And as a bonus,
that simplifies the code. The leaks can be seen in t7002, which drops
from 25 leaks to 22 with this patch. It also makes t3602 and t1090
leak-free.
In the long run, we will also want to clean up this (undocumented!)
memory lifetime requirement of add_pattern(). But that can come in a
later patch; passing the string literals directly will be the right
thing either way.
[1] The code in question comes from 416adc8711 (sparse-checkout: update
working directory in-process for 'init', 2019-11-21) and 99dfa6f970
(sparse-checkout: use in-process update for disable subcommand,
2019-11-21), but I didn't see anything in their commit messages or
on the list explaining the strbufs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git push" notices that the commit at the tip of the ref on
the other side it is about to overwrite does not exist locally, it
used to first try fetching it if the local repository is a partial
clone. The command has been taught not to do so and immediately
fail instead.
* th/push-local-ff-check-without-lazy-fetch:
push: don't fetch commit object when checking existence
|
|
"git init" in an already created directory, when the user
configuration has includeif.onbranch, started to fail recently,
which has been corrected.
* ps/fix-reinit-includeif-onbranch:
setup: fix bug with "includeIf.onbranch" when initializing dir
|