| Age | Commit message (Collapse) | Author | Files | Lines |
|
"git blame" learns "--diff-algorithm=<algo>" option.
* ad/blame-diff-algorithm:
blame: make diff algorithm configurable
xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
|
|
At GitHub, we had a repository that was triggering
git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
during git replay.
This sounds similar to the somewhat recent f6ecb603ff8a (merge-ort: fix
directory rename on top of source of other rename/delete, 2025-08-06),
but the cause is different. Unlike that case, there are no
rename-to-self situations arising in this case, and new to this case it
can only be triggered during a replay operation on the 2nd or later
commit being replayed, never on the first merge in the sequence.
To trigger, the repository needs:
* an upstream which:
* renames a file to a different directory, e.g.
old/file -> new/file
* leaves other files remaining in the original directory (so that
e.g. "old/" still exists upstream even though file has been
removed from it and placed elsewhere)
* a topic branch being rebased where:
* a commit in the sequence:
* modifies old/file
* a subsequent commit in the sequence being replayed:
* does NOT touch *anything* under new/
* does NOT touch old/file
* DOES modify other paths under old/
* does NOT have any relevant renames that we need to detect
_anywhere_ elsewhere in the tree (meaning this interacts
interestingly with both directory renames and cached renames)
In such a case, the assertion will trigger. The fix turns out to be
surprisingly simple. I have a very vague recollection that I actually
considered whether to add such an if-check years ago when I added the
very similar one for oldinfo in 1b6b902d95a5 (merge-ort:
process_renames() now needs more defensiveness, 2021-01-19), but I think
I couldn't figure out a possible way to trigger it and was worried at
the time that if I didn't know how to trigger it then I wasn't so sure
that simply skipping it was correct. Waiting did give me a chance to
put more thorough tests and checks into place for the rename-to-self
cases a few months back, which I might not have found as easily
otherwise. Anyway, put the check in place now and add a test that
demonstrates the fix.
Note that this bug, as demonstrated by the conditions listed above,
runs at the intersection of relevant renames, trivial directory
resolutions, and cached renames. All three of those optimizations are
ones that unfortunately make the code (and testcases!) a bit more
complex, and threading all three makes it a bit more so. However, the
testcase isn't crazy enough that I'd expect no one to ever hit it in
practice, and was confused why we didn't see it before. After some
digging, I discovered that merge.directoryRenames=false is a workaround
to this bug, and GitHub used that setting until recently (it was a
"temporary" match-what-libgit2-does piece of code that lasted years
longer than intended). Since the conditions I gave above for triggering
this bug rule out the possibility of there being directory renames, one
might assume that it shouldn't matter whether you try to detect such
renames if there aren't any. However, due to commit a16e8efe5c2b
(merge-ort: fix merge.directoryRenames=false, 2025-03-13), the heavy
hammer used there means that merge.directoryRenames=false ALSO turns off
rename caching, which is critical to triggering the bug. This becomes
a bit more than an aside since...
Re-reading that old commit, a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), it appears that the solution
to this latest bug might have been at least a partial alternative
solution to that old commit. And it may have been an improved
alternative (or at least help implement one), since it may be able to
avoid the heavy-handed disabling of rename cache. That might be an
interesting future thing to investigate, but is not critical for the
current fix. However, since I spent time digging it all up, at least
leave a small comment tweak breadcrumb to help some future reader
(myself or others) who wants to dig further to connect the dots a little
quicker.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While developing commit a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), I was testing things out and
had an extra condition on one of the if-blocks that I occasionally
swapped between '&& 0' and '&& 1' to see the effects of the changes. I
forgot to remove it before submitting and it wasn't caught in review.
Remove it now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Various bugs about rename handling in "ort" merge strategy have
been fixed.
* en/ort-rename-fixes:
merge-ort: fix directory rename on top of source of other rename/delete
merge-ort: fix incorrect file handling
merge-ort: clarify the interning of strings in opt->priv->path
t6423: fix missed staging of file in testcases 12i,12j,12k
t6423: document two bugs with rename-to-self testcases
merge-ort: drop unnecessary temporary in check_for_directory_rename()
merge-ort: update comments to modern testfile location
|
|
Various bugs about rename handling in "ort" merge strategy have
been fixed.
* en/ort-rename-fixes:
merge-ort: fix directory rename on top of source of other rename/delete
merge-ort: fix incorrect file handling
merge-ort: clarify the interning of strings in opt->priv->path
t6423: fix missed staging of file in testcases 12i,12j,12k
t6423: document two bugs with rename-to-self testcases
merge-ort: drop unnecessary temporary in check_for_directory_rename()
merge-ort: update comments to modern testfile location
|
|
At GitHub, we've got a real-world repository that has been triggering
failures of the form:
git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
which comes from the line:
VERIFY_CI(newinfo);
Unfortunately, this one has been quite complex to unravel, and is a
bit complex to explain. So, I'm going to carefully try to explain each
relevant piece needed to understand the fix, then carefully build up
from a simple testcase to some of the relevant testcases.
== New special case we need to consider ==
Rename pairs in the diffcore machinery connect the source path of a
rename with the destination path of a rename. Since we have rename
pairs to consider on both sides of history since the merge base,
merging has to consider a few special cases of possible overlap:
A) two rename pairs having the same target path
B) two rename pairs having the same source path
C) the source path of one rename pair being the target path of a
different rename pair
Some of these came up often enough that we gave them names:
A) a rename/rename(2to1) conflict (looks similar to an add/add conflict)
B) a rename/rename(1to2) conflict, which represents the same path being
renamed differently on the two sides of history
C) not yet named
merge-ort is well-prepared to handle cases (A) and (B), as was
merge-recursive (which was merge-ort's predecessor). Case (C) was
briefly considered during the years of merge-recursive maintenance,
but the full extent of support it got was a few FIXME/TODO comments
littered around the code highlighting some of the places that would
probably need to be fixed to support it. When I wrote merge-ort I
ignored case (C) entirely, since I believed that case (C) was only
possible if we were to support break detection during merges. Not
only had break detection never been supported by any merge algorithm,
I thought break detection wasn't worth the effort to support in a
merge algorithm. However, it turns out that case (C) can be triggered
without break detection, if there's enough moving pieces.
Before I dive into how to trigger case (C) with directory renames plus
other renames, it might be helpful to use a simpler example with break
detection first. And before we get to that it may help to explain
some more basics of handling renames in the merge algorithm. So, let
me first backup and provide a quick refresher on each of
* handling renames
* what break detection would mean, if supported in merging
* handling directory renames
From there, I'll build up from a basic directory rename detection case
to one that triggers a failure currently.
== Handling renames ==
In the merge machinery when we have a rename of a path A -> B,
processing that rename needs to remove path A, and make sure that path B
has the relevant information. Note that if the content was also
modified on both sides, this may mean that we have 3 different stages
that need to be stored at path B instead of having some stored at path
A.
Having all stages stored at path B makes it much easier for users to
investigate and resolve the content conflict associated with a renamed
path. For example:
* "git status" doesn't have to figure out how to list paths A & B and
attempt to connect them for users; it can just list path B.
* Users can use "git ls-files -u B" (instead of trying to find the
previous name of the file so they can list both, i.e. "git ls-files
-u A B")
* Users can resolve via "git add B" (without needing to "git rm A")
== What break detection would mean ==
If break detection were supported, we might have cases where A -> B
*and* C -> A, meaning that both rename pairs might believe they need to
update A. In particular, the processing of A -> B would need to be
careful to not clear out all stages of A and mark it resolved, while
both renames would need to figure out which stages of A belong with A
and which belong with B, so that both paths have the right stages
associated with them.
merge-ort (like merge-recursive before it) makes no attempt to handle
break detection; it runs with break detection turned off. It would
need to be retrofitted to handle such cases.
== Directory rename detection ==
If one side of history renames directory D/ -> E/, and the other side of
history adds new files to D/, then directory rename detection notices
and suggests moving those new files to E/. A similar thing is done for
paths renamed into D/, causing them to be transitively renamed into E/.
The default in the merge machinery is to report a conflict whenever a
directory rename might modify the location of a path, so that users can
decide whether they wanted the original path or the
directory-rename-induced location. However, that means the default
codepath still runs through all the directory rename detection logic, it
just supplements it with providing conflict notices when it is done.
== Building up increasingly complex testcases ==
I'll start with a really simple directory rename example, and then
slowly add twists that explain new pieces until we get to the
problematic cases:
=== Testcase 1 ===
Let's start with a concrete example, where particular files/directories of
interest that exist or are changed on each side are called out:
Original: <nothing of note>
our side: rename B/file -> C/file
their side: rename C/ -> A/
For this case, we'd expect to see the original B/file appear not at
C/file but at A/file.
(We would also expect a conflict notice that the user will want to
choose between C/file and A/file, but I'm going to ignore conflict
notices from here on by assuming merge.directoryRenames is set to
`true` rather than `conflict`; the only difference that assumption
makes is whether that makes the merge be considered to be conflicted
and whether it prints a conflict notice; what is written to the index
or working directory is unchanged.)
=== Testcase 2 ===
Modify testcase 1 by having A/file exist from the start:
Original: A/file exists
our side: rename B/file -> C/file
their side: rename C/ -> A/
In such a case, to avoid user confusion at what looks kind of like an
add/add conflict (even though the original path at A/file was not added
by either side of the merge), we turn off directory rename detection for
this path and print a "in the way" warning to the user:
CONFLICT (implicit dir rename): Existing file/dir ... in the way ...
The testcases in section 5 of t6423 explore these in more detail.
=== Testcase 3 ===
Let's modify testcase 1 in a slightly different way: have A/file be
added by their side rather than it already existing.
Original: <nothing of note>
our side: rename B/file -> C/file
their side: rename C/ -> A/
add A/file
In this case, the directory rename detection basically transforms our
side's original B/file -> C/file into a B/file -> A/file, and so we
get a rename/add conflict, with one version of A/file coming from the
renamed file, and another coming from the new A/file, each stored as
stages 2 and 3 in conflicts. This kind of add/add conflict is perhaps
slightly more complex than a regular add/add conflict, but with the
printed messages it makes sense where it came from and we have
different stages of the file to work with to resolve the conflict.
=== Testcase 4 ===
Let's do something similar to testcase 3, but have the opposite side of
history add A/file:
Original: <nothing of note>
our side: rename B/file -> C/file
add A/file
their side: rename C/ -> A/
Now if we allow directory rename detection to modify C/file to A/file,
then we also get a rename/add conflict, but in this case we'd need both
higher order stages being recorded on side 2, which makes no sense. The
index can't store multiple stage 2 entries, and even if we could, it
would probably be confusing for users to work with. So, similar to what
we do when there was an A/file in the original version, we simply turn
off directory rename detection for cases like this and provide the "in
the way" CONFLICT notice to the user.
=== Testcase 5 ===
We're slowly getting closer. Let's mix it up by having A/file exist at
the beginning but not exist on their side:
original: A/file exists
our side: rename B/file -> C/file
their side: rename C/ -> A/
rename A/file -> D/file
For this case, you could say that since A/file -> D/file, it's no longer
in the way of C/file being moved by directory rename detection to
A/file. But that would give us a case where A/file is both the source
and the target of a rename, similar to break detection, which the code
isn't currently equipped to handle.
This is not yet the case that causes current failures; to the current
code, this kind of looks like testcase 4 in that A/file is in the way
on our side (since A/file was in the original and was umodified by our
side). So, it results in a "in the way" notification with directory
rename detection being turned off for A/file so that B/file ends up at
C/file.
Perhaps the resolution could be improved in the future, but our "in
the way" checks prevented such problems by noticing that A/file exists
on our side and thus turns off directory rename detection from
affecting C/file's location. So, while the merge result could be
perhaps improved, the fact that this is currently handled by giving
the user an "in the way" message gives the user a chance to resolve
and prevents the code from tripping itself up.
=== Testcase 6 ===
Let's modify testcase 5 a bit more, to also delete A/file on our side:
original: A/file exists
our side: rename B/file -> C/file
delete A/file
their side: rename C/ -> A/
rename A/file -> D/file
Now the "in the way" logic doesn't detect that there's an A/file in
the way (neither side has an A/file anymore), so it's fine to
transitively rename C/file further to A/file...except that we end up
with A/file being both the source of one rename, and the target of a
different rename. Each rename pair tries to handle the resolution of
the source and target paths of its own rename. But when we go to
process the second rename pair in process_renames(), we do not expect
either the source or the destination to be marked as handled already;
so, when we hit the sanity checks that these are not handled:
VERIFY_CI(oldinfo);
VERIFY_CI(newinfo);
then one of these is going to throw an assertion failure since the
previous rename pair already marked both of its paths as handled.
This will give us an error of the form:
git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
This is the failure we're currently triggering, and it fundamentally
depends on:
* a path existing in the original
* that original path being removed or renamed on *both* sides
* some kind of directory rename moving some *other* path into that
original path
This was added as testcase 12q in t6423.
=== Testcase 7 ===
Bonus bug found while investigating!
Let's go back to the comparison between testcases 2 & 3, and set up a
file present on their side that we need to consider:
Original: A/file exists
our side: rename B/file -> C/file
rename A/file -> D/file
their side: rename C/ -> A/
Here, there is no A/file in the way on our side like testcase 4.
There is an A/file present on their side like testcase 3, which was an
add/add conflict, but that's associated with the file be renamed to
D/file. So, that really shouldn't be an add/add conflict because we
instead want all modes of the original A/file to be transported to
D/file.
Unfortunately, the current code kind of treats it like an add/add
conflict instead...but even worse. There is also a valid mode for
A/file in the original, which normally goes to stage 1. However, an
add/add conflict should be represented in the index with no mode at
stage 1 (for the original side), only modes at stages 2 and 3 (for our
and their side), so for an add/add we'd expect that mode for A/file in
the original version to be cleared out (or be transported to D/file).
Unfortunately, the code currently leaves not only the stage 3 entry
for A/file intact, it also leaves the stage 1 entry for A/file. This
results in `git ls-files -u A/file` output of the form:
100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 1 A/file
100644 0cfbf08886fca9a91cb753ec8734c84fcbe52c9f 2 A/file
100644 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 3 A/file
This would likely cause users to believe this isn't an add/add
conflict; rather, this would lead them to believe that A/file was only
modified on our side and that therefore it should not have been a
conflict in the first place. And while resolving the conflict in
favor of our side is the correct resolution (because stages 1 and 3
should have been cleared out in the first place), this is certainly
likely to cause confusion for anyone attempting to investigate why
this path was marked as conflicted.
This was added as testcase 12p in t6423.
== Attempted solutions that I discarded ==
1) For each side of history, create a strset of the sources of each
rename on the other side of history. Then when using directory
renames to modify existing renames, verify that we aren't renaming
to a source of another rename.
Unfortunately, the "relevant renames" optimization in merge-ort
means we often don't detect renames -- we just see a delete and an
add -- which is easy to forget and makes debugging testcases harder,
but it also turns out that this solution in insufficient to solve
the related problems in the area (more on that below).
2) Modify the code to be aware of the possibility of renaming to
the source of another side's rename, and make all the conflict
resolution logic for each case (including existing
rename/rename(2to1) and rename/rename(1to2) cases) handle the
additional complexity. It turns out there was much more code to
audit than I wanted, for a really niche case. I didn't like how
many changes were needed, and aborted.
== Solution ==
We do not want the stages of unrelated files appearing at the same path
in the index except when dealing with an add/add conflict. While we
previously handled this for stages 2 & 3, we also need to worry about
stage 1. So check for a stage 1 index entry being in the way of a
directory rename.
However, if we can detect that the stage 1 index entry is actually from
a related file due to a directory-rename-causes-rename-to-self
situation, then we can allow the stage 1 entry to remain.
From this wording, you may note that it's not just rename cases that
are a problem; bugs could be triggered with directory renames vs simple
adds. That leads us to...
== Testcases 8+ ==
Another bonus bug, found via understanding our final solutions (and the
failure of our first attempted solution)!
Let's tweak testcase 7 a bit:
Original: A/file exists
our side: delete A/file
add -> C/file
their side: delete A/file
rename C/ -> A/
Here, there doesn't seem to be a big problem. Sure C/file gets modified
via the directory rename of C/ -> A/ so that it becomes A/file, but
there's no file in the way, right? Actually, here we have a problem
that the stage 1 entry of A/file would be combined with the stage 2
entry of C/file, and make it look like a modify/delete conflict.
Perhaps there is some extra checking that could be added to the code to
make it attempt to clear out the stage 1 entry of A/file, but the
various rename-to-self-via-directory-rename testcases make that a bit
more difficult. For now, it's easier to just treat this as a
path-in-the-way situation and not allow the directory rename to modify
C/file.
That sounds all well and good, but it does have an interesting side
effect. Due to the "relevant renames" optimizations in merge-ort (i.e.
only detect the renames you need), 100% renames whose files weren't
modified on the other side often go undetected. This means that if we
modify this testcase slightly to:
Original: A/file exists
our side: A/file -> C/file
their side: rename C/ -> A/
Then although this looks like where the directory rename just moves
C/file back to A/file and there's no problem, we may not detect the
A/file -> C/file rename. Instead it will look like a deletion of A/file
and an addition of C/file. The directory rename then appears to be
moving C/file to A/file, which is on top of an "unrelated" file (or at
least a file it doesn't know is related). So, we will report
path-in-the-way conflicts now in cases where we didn't before. That's
better than silently and accidentally combining stages of unrelated
files and making them look like a modify/delete; users can investigate
the reported conflict and simply resolve it.
This means we tweak the expected solution for testcases 12i, 12j, and
12k. (Those three tests are basically the same test repeated three
times, but I was worried when I added those that subtle differences in
parent/child, sibling/sibling, and toplevel directories might mess up
how rename-to-self testcases actually get handled.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have multiple bugs here -- accidental silent file deletion,
accidental silent file retention for files that should be deleted,
and incorrect number of entries left in the index.
The series merged at commit d3b88be1b450 (Merge branch
'en/merge-dir-rename-corner-case-fix', 2021-07-16) introduced testcase
12i-12k in t6423 which checked for rename-to-self cases, and fixed bugs
that merge-ort and merge-recursive had with these testcases. At the
time, I noted that merge-ort had one bug for these cases, while
merge-recursive had two. It turns out that merge-ort did in fact have
another bug, but the "relevant renames" optimizations were masking it.
If we modify testcase 12i from t6423 to modify the file in the commit
that renames it (but only modify it enough that it can still be detected
as a rename), then we can trigger silent deletion of the file.
Tweak testcase 12i slightly to make the file in question have more than
one line in it. This leaves the testcase intact other than changing the
initial contents of this one file. The purpose of this tweak is to
minimize the changes between this testcase and a new one that we want to
add. Then duplicate testcase 12i as 12i2, changing it so that it adds a
single line to the file in question when it is renamed; testcase 12i2
then serves as a testcase for this merge-ort bug that I previously
overlooked.
Further, commit 98a1a00d5301 (t6423: add a testcase causing a failed
assertion in process_renames, 2025-03-06), fixed an issue with
rename-to-self but added a new testcase, 12n, that only checked for
whether the merge ran to completion. A few commits ago, we modified
this test to check for the number of entries in the index -- but noted
that the number was wrong. And we also noted a
silently-keep-instead-of-delete bug at the same time in the new testcase
12n2.
In summary, we have the following bugs with rename-to-self cases:
* silent deletion of file expected to be kept (t6423 testcase 12i2)
* silent retention of file expected to be removed (t6423 testcase 12n2)
* wrong number of extries left in the index (t6423 testcase 12n)
All of these bugs arise because in a rename-to-self case, when we have a
rename A->B, both A and B name the same file. The code in
process_renames() assumes A & B are different, and tries to move the
higher order stages and file contents so that they are associated just
with the new path, but the assumptions of A & B being different can
cause A to be deleted when it's not supposed to be or mark B as resolved
and kept in place when it's supposed to be deleted. Since A & B are
already the same path in the rename-to-self case, simply skip the steps
in process_renames() for such files to fix these bugs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Because merge-ort is dealing with potentially all the pathnames in the
repository, it sometimes needs to do an awful lot of string comparisons.
Because of this, struct merge_options_internal's path member was
envisioned from the beginning to contain an interned value for every
path in order to allow us to compare strings via pointer comparison
instead of using strcmp. See
* 5b59c3db059d (merge-ort: setup basic internal data structures,
2020-12-13)
* f591c4724615 (merge-ort: copy and adapt merge_3way() from
merge-recursive.c, 2021-01-01)
for some of the early comments.
However, the original comment was slightly misleading when it switched
from mentioning paths to only mentioning directories. Fix that, and
while at it also point to an example in the code which applies the extra
needed care to permit the pointer comparison optimization.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
check_for_directory_rename() had a weirdly coded check for whether a
strmap contained a certain key. Replace the temporary variable and call
to strmap_get_entry() with the more natural strmap_contains() call.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In commit 919df3195553 (Collect merge-related tests to t64xx,
2020-08-10), merge related tests were moved from t60xx to t64xx. Some
comments in merge-ort relating to some tricky code referenced specific
testcases within certain testfiles for additional information, but
referred to their historical testfile names; update the testfile names
to mention their modern location.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Reduce implicit assumption and dependence on the_repository in the
object-file subsystem.
* ps/object-file-wo-the-repository:
object-file: get rid of `the_repository` in index-related functions
object-file: get rid of `the_repository` in `force_object_loose()`
object-file: get rid of `the_repository` in `read_loose_object()`
object-file: get rid of `the_repository` in loose object iterators
object-file: remove declaration for `for_each_file_in_obj_subdir()`
object-file: inline `for_each_loose_file_in_objdir_buf()`
object-file: get rid of `the_repository` when writing objects
odb: introduce `odb_write_object()`
loose: write loose objects map via their source
object-file: get rid of `the_repository` in `finalize_object_file()`
object-file: get rid of `the_repository` in `loose_object_info()`
object-file: get rid of `the_repository` when freshening objects
object-file: inline `check_and_freshen()` functions
object-file: get rid of `the_repository` in `has_loose_object()`
object-file: stop using `the_hash_algo`
object-file: fix -Wsign-compare warnings
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_bool()`. All
callsites are adjusted so that they use
`repo_config_get_bool(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_int()`. All
callsites are adjusted so that they use
`repo_config_get_int(the_repository, ...)` instead. While some callsites
might already have a repository available, this mechanical conversion is
the exact same as the current situation and thus cannot cause any
regression. Those sites should eventually be cleaned up in a later patch
series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config_get_string()`.
All callsites are adjusted so that they use
`repo_config_get_string(the_repository, ...)` instead. While some
callsites might already have a repository available, this mechanical
conversion is the exact same as the current situation and thus cannot
cause any regression. Those sites should eventually be cleaned up in a
later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config()`. All callsites
are adjusted so that they use `repo_config(the_repository, ...)`
instead. While some callsites might already have a repository available,
this mechanical conversion is the exact same as the current situation
and thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We do not have a backend-agnostic way to write objects into an object
database. While there is `write_object_file()`, this function is rather
specific to the loose object format.
Introduce `odb_write_object()` to plug this gap. For now, this function
is a simple wrapper around `write_object_file()` and doesn't even use
the passed-in object database yet. This will change in subsequent
commits, where `write_object_file()` is converted so that it works on
top of an `odb_source`. `odb_write_object()` will then become
responsible for deciding which source an object shall be written to.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename `repo_read_object_file()` to `odb_read_object()` to match other
functions related to the object database and our modern coding
guidelines.
Introduce a compatibility wrapper so that any in-flight topics will
continue to compile.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename `oid_object_info()` to `odb_read_object_info()` as well as their
`_extended()` variant to match other functions related to the object
database and our modern coding guidelines.
Introduce compatibility wrappers so that any in-flight topics will
continue to compile.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git Forges may be interested in whether two branches can be merged while
not being interested in what the resulting merge tree is nor which files
conflicted. For such cases, add a new mergeability_only option. This
option allows the merge machinery to, in the "outer layer" of the merge:
* exit upon first[-ish] conflict
* avoid (not prevent) writing merged blobs/trees to the object store
I have a number of qualifiers there, so let me explain each:
"outer layer":
Note that since the recursive merge of merge bases (corresponding to
call_depth > 0) can conflict without the outer final merge
(corresponding to call_depth == 0) conflicting, we can't short-circuit
nor avoid writing merged blobs/trees to the object store during those
inner merges.
"first-ish conflict":
The current patch only exits early from process_entries() on the first
conflict it detects, but conflicts could have been detected in a
previous function call, namely detect_and_process_renames(). However:
* conflicts detected by detect_and_process_renames() are quite rare
conflict types
* the detection would still come after regular rename detection
(which is the expensive part of detect_and_process_renames()), so
it is not saving us much in computation time given that
process_entries() directly follows detect_and_process_renames()
* [this overlaps with the next bullet point] process_entries() is the
place where virtually all object writing occurs (object writing is
sometimes more of a concern for Forges than computation time), so
exiting early here isn't saving us much in object writes either
* the code changes needed to handle an earlier exit are slightly
more invasive in detect_and_process_renames() than for
process_entries().
Given the rareness of the even earlier conflicts, the limited savings
we'd get from exiting even earlier, and in an attempt to keep this
patch simpler, we don't guarantee that we actually exit on the first
conflict detected. We can always revisit this decision later if we
decide that a further micro-optimization to exit slightly earlier in
rare cases is worthwhile.
"avoid (not prevent) writing objects":
The detect_and_process_renames() call can also write objects to the
object store, when rename/rename conflicts involve one (or more) files
that have also been modified on both sides. Because of this alternate
call path leading to handle_content_merges(), our "early exit" does not
prevent writing objects entirely, even within the "outer layer"
(i.e. even within call_depth == 0). I figure that's fine though, since
we're already writing objects for the inner merges (i.e. for call_depth
> 0), which are likely going to represent vastly more objects than files
involved in rename/rename+modify/modify cases in the outer merge, on
average.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* ps/object-file-cleanup:
object-store: merge "object-store-ll.h" and "object-store.h"
object-store: remove global array of cached objects
object: split out functions relating to object store subsystem
object-file: drop `index_blob_stream()`
object-file: split up concerns of `HASH_*` flags
object-file: split out functions relating to object store subsystem
object-file: move `xmmap()` into "wrapper.c"
object-file: move `git_open_cloexec()` to "compat/open.c"
object-file: move `safe_create_leading_directories()` into "path.c"
object-file: move `mkdir_in_gitdir()` into "path.c"
|
|
Remove remnants of the recursive merge strategy backend, which was
superseded by the ort merge strategy.
* en/merge-recursive-debug:
builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM
tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm
merge-recursive.[ch]: thoroughly debug these
merge, sequencer: switch recursive merges over to ort
sequencer: switch non-recursive merges over to ort
merge-ort: enable diff-algorithms other than histogram
builtin/merge-recursive: switch to using merge_ort_generic()
checkout: replace merge_trees() with merge_ort_nonrecursive()
|
|
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.
* ps/object-wo-the-repository:
hash: stop depending on `the_repository` in `null_oid()`
hash: fix "-Wsign-compare" warnings
object-file: split out logic regarding hash algorithms
delta-islands: stop depending on `the_repository`
object-file-convert: stop depending on `the_repository`
pack-bitmap-write: stop depending on `the_repository`
pack-revindex: stop depending on `the_repository`
pack-check: stop depending on `the_repository`
environment: move access to "core.bigFileThreshold" into repo settings
pack-write: stop depending on `the_repository` and `the_hash_algo`
object: stop depending on `the_repository`
csum-file: stop depending on `the_repository`
|
|
While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.
Split out functions relating to the object store subsystem into a new
"object-store.c" file.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ps/object-wo-the-repository:
hash: stop depending on `the_repository` in `null_oid()`
hash: fix "-Wsign-compare" warnings
object-file: split out logic regarding hash algorithms
delta-islands: stop depending on `the_repository`
object-file-convert: stop depending on `the_repository`
pack-bitmap-write: stop depending on `the_repository`
pack-revindex: stop depending on `the_repository`
pack-check: stop depending on `the_repository`
environment: move access to "core.bigFileThreshold" into repo settings
pack-write: stop depending on `the_repository` and `the_hash_algo`
object: stop depending on `the_repository`
csum-file: stop depending on `the_repository`
|
|
As a wise man once told me, "Deleted code is debugged code!" So, move
the functions that are shared between merge-recursive and merge-ort from
the former to the latter, and then debug the remainder of
merge-recursive.[ch].
Joking aside, merge-ort was always intended to replace merge-recursive.
It has numerous advantages over merge-recursive (operates much faster,
can operate without a worktree or index, and fixes a number of known
bugs and suboptimal merges). Since we have now replaced all callers of
merge-recursive with equivalent functions from merge-ort, move the
shared functions from the former to the latter, and delete the remainder
of merge-recursive.[ch].
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The ort merge strategy has always used the histogram diff algorithm.
The recursive merge strategy, in contrast, defaults to the myers
diff algorithm, while allowing it to be changed.
Change the ort merge strategy to allow different diff algorithms, by
removing the hard coded value in merge_start() and instead just making
it a default in init_merge_options(). Technically, this also changes
the default diff algorithm for the recursive backend too, but we're
going to remove the final callers of the recursive backend in the next
two commits.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Ensure what we write in assert() does not have side effects,
and introduce ASSERT() macro to mark those that cannot be
mechanically checked for lack of side effects.
* en/assert-wo-side-effects:
treewide: replace assert() with ASSERT() in special cases
ci: add build checking for side-effects in assert() calls
git-compat-util: introduce ASSERT() macro
|
|
Miscellaneous code clean-ups.
* en/random-cleanups:
merge-ort: remove extraneous word in comment
merge-ort: fix accidental strset<->strintmap
t7615: be more explicit about diff algorithm used
t6423: fix a comment that accidentally reversed two commits
stash: remove merge-recursive.h include
|
|
First step of deprecating and removing merge-recursive.
* en/merge-ort-prepare-to-remove-recursive:
am: switch from merge_recursive_generic() to merge_ort_generic()
merge-ort: fix merge.directoryRenames=false
t3650: document bug when directory renames are turned off
merge-ort: support having merge verbosity be set to 0
merge-ort: allow rename detection to be disabled
merge-ort: add new merge_ort_generic() function
|
|
When the compiler/linker cannot verify that an assert() invocation is
free of side effects for us (e.g. because the assertion includes some
kind of function call), replace the use of assert() with ASSERT().
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are two issues here.
First, when merge.directoryRenames is set to false, there are a few code
paths that should be turned off. I missed one; collect_renames() was
still doing some directory rename detection logic unconditionally. It
ended up not having much effect because
get_provisional_directory_renames() was skipped earlier and not setting
up renames->dir_renames, but the code should still be skipped.
Second, the larger issue is that sometimes we get a cached_pair rename
from a previous commit being replayed mapping A->B, but in a subsequent
commit but collect_merge_info() doesn't even recurse into the
directory containing B because there are no source pairings for that
rename that are relevant; we can merge that commit fine without knowing
the rename. But since the cached renames are added to the normal
renames, when we go to process it and find that B is not part of
opt->priv->paths, we hit the assertion error
process_renames: Assertion `newinfo && ~newinfo->merged.clean` failed.
I think we could fix this at the beginning of detect_regular_renames() by
pruning from cached_pairs any entry whose destination isn't in
opt->priv->paths, but it's suboptimal in that we'd kind of like the
cached_pair to be restored afterwards so that it can help the subsequent
commit, but more importantly since it sits at the intersection of
the caching renames optimization and the relevant renames optimization,
and the trivial directory resolution optimization, and I don't currently
have Documentation/technical/remembering-renames.txt fully paged in, I'm
not sure if that's a full solution or a bandaid for the current
testcase. However, since the remembering renames optimization was the
weakest of the set, and the optimization is far less important when
directory rename detection is off (as that implies far fewer potential
renames), let's just use a bigger hammer to ensure this special case is
fixed: turn off the rename caching. We do the same thing already when
we encounter rename/rename(1to1) cases (as per `git grep -3
disabling.the.optimization`, though it uses a slightly different
triggering mechanism since it's trying to affect the next time that
merge_check_renames_reusable() is called), and I think it makes sense
to do the same here.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When merge-ort was written, I did not at first allow rename detection to
be disabled, because I suspected that most folks disabling rename
detection were doing so solely for performance reasons. Since I put a
lot of working into providing dramatic speedups for rename detection
performance as used by the merge machinery, I wanted to know if there
were still real world repositories where rename detection was
problematic from a performance perspective. We have had years now to
collect such information, and while we never received one, waiting
longer with the option disabled seems unlikely to help surface such
issues at this point. Also, there has been at least one request to
allow rename detection to be disabled for behavioral rather than
performance reasons (see the thread including
https://lore.kernel.org/git/CABPp-BG-Nx6SCxxkGXn_Fwd2wseifMFND8eddvWxiZVZk0zRaA@mail.gmail.com/
), so let's start heeding the config and command line settings.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
merge-recursive.[ch] have three entry points:
* merge_trees()
* merge_recursive()
* merge_recursive_generic()
merge-ort*.[ch] only has equivalents for the first two. Add an
equivalent for the final entry point, so we can switch callers to
use it and remove merge-recursive.[ch].
While porting it over, finally fix the issue with the label for the
ancestor (used when merge.conflictStyle=diff3 as a conflict label).
merge-recursive.c has traditionally not allowed callers to set that
label, but I have found that problematic for years.
(Side note: This function was initially part of the merge-ort rewrite,
but reviewers questioned the ancestor label funnyness which I was
never really happy with anyway. It resulted in me jettisoning it and
hoping at the time that I would eventually be able to force the existing
callers to use some other API. That worked with `git stash`, as per
874cf2a60444 (stash: apply stash using 'merge_ort_nonrecursive()',
2022-05-10), but this API is the most reasonable one for `git am` and
`git merge-recursive`, if we can just allow them some freedom over the
ancestor label.)
The merge_recursive_generic() function did not know whether it was being
invoked by `git stash`, `git merge-recursive`, or `git am`, and the
choice of meaningful ancestor label, when there is a unique ancestor,
varies for these different callers:
* git am: ancestor is a constructed "fake ancestor" that user knows
nothing about and has no access to. (And is different than
the normal thing we mean by a "virtual merge base" which is
the merging of merge bases.)
* git merge-recursive: ancestor might be a tree, but at least it
was one specified by the user (if they invoked
merge-recursive directly)
* git stash: ancestor was the commit serving as the stash base
Thus, using a label like "constructed merge base" (as
merge_recursive_generic() does) presupposes that `git am` is the only
caller; it is incorrect for other callers. This label has thrown me off
more than once. Allow the caller to override when there is a unique
merge base.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"is was" -> "was"
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both strset_for_each_entry and strintmap_for_each_entry are macros that
evaluate to the same thing, so they are technically interchangeable.
However, the intent is that we use the one matching the variable type we
are passing. Unfortunately, I somehow mistakenly got one of these wrong
in 7bee6c100431 (merge-ort: avoid recursing into directories when we
don't need to, 2021-07-16) -- possibly related to the fact that
relevant_sources was initially a strset and later refactored into a
strintmap. Correct which macro we use.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.
This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.
There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:
- "builtin/grep.c"
- "grep.c"
- "refs/debug.c"
There are also two non-trivial exceptions:
- "diff-no-index.c": Here we know that we may not have a repository
initialized at all, so we cannot rely on `the_repository`. Instead,
we adapt `diff_no_index()` to get a `struct git_hash_algo` as
parameter. The only caller is located in "builtin/diff.c", where we
know to call `repo_set_hash_algo()` in case we're running outside of
a Git repository. Consequently, it is fine to continue passing
`the_repository->hash_algo` even in this case.
- "builtin/ls-files.c": There is an in-flight patch series that drops
`USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
conflict because we use `null_oid()` in `show_submodule()`. The
value is passed to `repo_submodule_init()`, which may use the object
ID to resolve a tree-ish in the superproject from which we want to
read the submodule config. As such, the object ID should refer to an
object in the superproject, and consequently we need to use its hash
algorithm.
This means that we could in theory just not bother about this edge
case at all and just use `the_repository` in "diff-no-index.c". But
doing so would feel misdesigned.
Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
merge-ort has a number of sanity checks on the file it is processing in
process_renames(). One of these sanity checks was slightly overzealous
because it indirectly assumed that a renamed file always ended up at a
different path than where it started. That is normally an entirely fair
assumption, but directory rename detection can make things interesting.
As a quick refresher, if one side of history renames directory A/ -> B/,
and the other side of history adds new files to A/, then directory
rename detection notices and suggests moving those new files to B/. A
similar thing is done for paths renamed into A/, causing them to be
transitively renamed into B/. But, if the file originally came from B/,
then this can end up causing a file to be renamed back to itself.
It turns out the rest of the code following this assertion handled the
case fine; the assertion was just an extra sanity check, not a rigid
precondition. Therefore, simply adjust the assertion to pass under this
special case as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The lifecycle management of diff queues is somewhat confusing:
- For most of the part this can be attributed to `DIFF_QUEUE_CLEAR()`,
which does not release any memory but rather initializes the queue,
only. This is in contrast to our common naming schema, where
"clearing" means that we release underlying memory and then
re-initialize the data structure such that it is ready to use.
- A second offender is `diff_free_queue()`, which does not free the
queue structure itself. It is rather a release-style function.
Refactor the code to make things less confusing. `DIFF_QUEUE_CLEAR()` is
replaced by `DIFF_QUEUE_INIT` and `diff_queue_init()`, while
`diff_free_queue()` is replaced by `diff_queue_release()`. While on it,
adapt callsites where we call `DIFF_QUEUE_CLEAR()` with the intent to
release underlying memory to instead call `diff_queue_clear()` to fix
memory leaks.
This memory leak is exposed by t4211, but plugging it alone does not
make the whole test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Typofix.
* ak/typofix-2.46-maint:
upload-pack: fix a typo
sideband: fix a typo
setup: fix a typo
run-command: fix a typo
revision: fix a typo
refs: fix typos
rebase: fix a typo
read-cache-ll: fix a typo
pretty: fix a typo
object-file: fix a typo
merge-ort: fix typos
merge-ll: fix a typo
http: fix a typo
gpg-interface: fix a typo
git-p4: fix typos
git-instaweb: fix a typo
fsmonitor-settings: fix a typo
diffcore-rename: fix typos
config.mak.dev: fix a typo
|
|
Fix typos in comments.
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are two leaks in `apply_directory_rename_modifications()`:
- We do not release the `dirs_to_insert` string list.
- We do not release some `conflict_info` we put into the
`opt->priv->paths` string map.
The former is trivial to fix. The latter is a bit less straight forward:
the `util` pointer of the string map may sometimes point to data that
has been allocated via `CALLOC()`, while at other times it may point to
data that has been allocated via a `mem_pool`.
It very much seems like an oversight that we didn't also allocate the
conflict info in this code path via the memory pool, though. So let's
fix that, which will also plug the memory leak for us.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We conditionally release the index used for reading gitattributes in
merge-ort based on whether or the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.
Fix this issue by unconditionally releasing the index.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "ort" merge backend saw one bugfix for a crash that happens
when inner merge gets killed, and assorted code clean-ups.
* en/ort-inner-merge-error-fix:
merge-ort: fix missing early return
merge-ort: convert more error() cases to path_msg()
merge-ort: upon merge abort, only show messages causing the abort
merge-ort: loosen commented requirements
merge-ort: clearer propagation of failure-to-function from merge_submodule
merge-ort: fix type of local 'clean' var in handle_content_merge ()
merge-ort: maintain expected invariant for priv member
merge-ort: extract handling of priv member into reusable function
|
|
More memory leaks have been plugged.
* ps/leakfixes-more: (29 commits)
builtin/blame: fix leaking ignore revs files
builtin/blame: fix leaking prefixed paths
blame: fix leaking data for blame scoreboards
line-range: plug leaking find functions
merge: fix leaking merge bases
builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
sequencer: fix memory leaks in `make_script_with_merges()`
builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
apply: fix leaking string in `match_fragment()`
sequencer: fix leaking string buffer in `commit_staged_changes()`
commit: fix leaking parents when calling `commit_tree_extended()`
config: fix leaking "core.notesref" variable
rerere: fix various trivial leaks
builtin/stash: fix leak in `show_stash()`
revision: free diff options
builtin/log: fix leaking commit list in git-cherry(1)
merge-recursive: fix memory leak when finalizing merge
builtin/merge-recursive: fix leaking object ID bases
builtin/difftool: plug memory leaks in `run_dir_diff()`
object-name: free leaking object contexts
...
|
|
One of the conversions in f19b9165 (merge-ort: convert more error()
cases to path_msg(), 2024-06-19) accidentally lost the early return.
Restore it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
merge_submodule() stores errors using path_msg(), whereas other call
sites make use of the error() function. This is inconsistent, and
moving towards path_msg() seems more friendly for libification efforts
since it will allow the caller to determine whether the error messages
need to be printed.
Note that this deferred handling of error messages changes the error
message in a recursive merge from
error: failed to execute internal merge
to
From inner merge: error: failed to execute internal merge
which provides a little more information about the error which may be
useful. Since the recursive merge strategy still only shows the older
error, we had to adjust the new testcase introduced a few commits ago to
just search for the older message somewhere in the output.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When something goes wrong enough that we need to abort early and not
even attempt merging the remaining files, it probably does not make
sense to report conflicts messages for the subset of files we processed
before hitting the fatal error. Instead, only show the messages
associated with paths where we hit the fatal error. Also, print these
messages to stderr rather than stdout.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The comment above type_short_descriptions claimed that the order had to
match what was found in the conflict_info_and_types enum. Since
type_short_descriptions uses designated initializers, the order should
not actually matter; I am guessing that positional initializers may have
been under consideration when that comment was added, but the comment
was not updated when designated initializers were chosen.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'clean' member variable is somewhat of a tri-state (1 = clean, 0 =
conflicted, -1 = failure-to-determine), but we often like to think of
it as binary (ignoring the possibility of a negative value) and use
constructs like '!clean' to reflect this. However, these constructs
can make codepaths more difficult to understand, unless we handle the
negative case early and return pre-emptively; do that in
handle_content_merge() to make the code a bit easier to read.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
handle_content_merge() returns an int. Every caller of
handle_content_merge() expects an int. However, we declare a local
variable 'clean' that we use for the return value to be unsigned. To
make matters worse, we also assign 'clean' the return value of
merge_submodule() in one codepath, which is defined to return an int.
It seems that the only reason to have 'clean' be unsigned was to allow a
cutesy bit manipulation operation to be well-defined. Fix the type of
the 'clean' local in handle_content_merge().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The calling convention for the merge machinery is
One call to init_merge_options()
One or more calls to merge_incore_[non]recursive()
One call to merge_finalize()
(possibly indirectly via merge_switch_to_result())
Both merge_switch_to_result() and merge_finalize() expect
opt->priv == NULL && result->priv != NULL
which is supposed to be set up by our move_opt_priv_to_result_priv()
function. However, two codepaths dealing with error cases did not
execute this necessary logic, which could result in assertion failures
(or, if assertions were compiled out, could result in segfaults). Fix
the oversight and add a test that would have caught one of these
problems.
While at it, also tighten an existing test for a non-recursive merge
to verify that it fails with appropriate status. Most merge tests in
the testsuite check either for success or conflicts; those testing for
neither are rare and it is good to ensure they support the invariant
assumed by builtin/merge.c in this comment:
/*
* The backend exits with 1 when conflicts are
* left to be resolved, with 2 when it does not
* handle the given merge at all.
*/
So, explicitly check for the exit status of 2 in these cases.
Reported-by: Matt Cree <matt.cree@gearset.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for a subsequent commit which will ensure we do not
forget to maintain our invariants for the priv member in error
codepaths, extract the necessary functionality out into a separate
function. This change is cosmetic at this point, and introduces no
changes beyond an extra assertion sanity check.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When calling either the recursive or the ORT merge machineries we need
to provide a list of merge bases. The ownership of that parameter is
then implicitly transferred to the callee, which is somewhat fishy.
Furthermore, that list may leak in some cases where the merge machinery
runs into an error, thus causing a memory leak.
Refactor the code such that we stop transferring ownership. Instead, the
merge machinery will now create its own local copies of the passed in
list as required if they need to modify the list. Free the list at the
callsites as required.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git checkout --conflict=bad" reported a bad conflictStyle as if it
were given to a configuration variable; it has been corrected to
report that the command line option is bad.
* pw/checkout-conflict-errorfix:
checkout: fix interaction between --conflict and --merge
checkout: cleanup --conflict=<style> parsing
merge options: add a conflict style member
merge-ll: introduce LL_MERGE_OPTIONS_INIT
xdiff-interface: refactor parsing of merge.conflictstyle
|
|
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.
* eb/hash-transition: (30 commits)
t1016-compatObjectFormat: add tests to verify the conversion between objects
t1006: test oid compatibility with cat-file
t1006: rename sha1 to oid
test-lib: compute the compatibility hash so tests may use it
builtin/ls-tree: let the oid determine the output algorithm
object-file: handle compat objects in check_object_signature
tree-walk: init_tree_desc take an oid to get the hash algorithm
builtin/cat-file: let the oid determine the output algorithm
rev-parse: add an --output-object-format parameter
repository: implement extensions.compatObjectFormat
object-file: update object_info_extended to reencode objects
object-file-convert: convert commits that embed signed tags
object-file-convert: convert commit objects when writing
object-file-convert: don't leak when converting tag objects
object-file-convert: convert tag objects when writing
object-file-convert: add a function to convert trees between algorithms
object: factor out parse_mode out of fast-import and tree-walk into in object.h
cache: add a function to read an OID of a specific algorithm
tag: sign both hashes
commit: export add_header_signature to support handling signatures on tags
...
|
|
Add a conflict_style member to `struct merge_options` and `struct
ll_merge_options` to allow callers to override the default conflict
style. This will be used in the next commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Introduce a macro to initialize `struct ll_merge_options` in preparation
for the next commit that will add a new member that needs to be
initialized to a non-zero value.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make sure failure return from merge_bases_many() is properly caught.
* js/merge-base-with-missing-commit:
merge-ort/merge-recursive: do report errors in `merge_submodule()`
merge-recursive: prepare for `merge_submodule()` to report errors
commit-reach(repo_get_merge_bases_many_dirty): pass on errors
commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
commit-reach(repo_get_merge_bases): pass on "missing commits" errors
commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
commit-reach(merge_bases_many): pass on "missing commits" errors
commit-reach(paint_down_to_common): start reporting errors
commit-reach(paint_down_to_common): prepare for handling shallow commits
commit-reach(repo_in_merge_bases_many): report missing commits
commit-reach(repo_in_merge_bases_many): optionally expect missing commits
commit-reach(paint_down_to_common): plug two memory leaks
|
|
In 24876ebf68b (commit-reach(repo_in_merge_bases_many): report missing
commits, 2024-02-28), I taught `merge_submodule()` to handle errors
reported by `repo_in_merge_bases_many()`.
However, those errors were not passed through to the callers. That was
unintentional, and this commit remedies that.
Note that `find_first_merges()` can now also return -1 (because it
passes through that return value from `repo_in_merge_bases()`), and this
commit also adds the forgotten handling for that scenario.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.
* js/merge-tree-3-trees:
fill_tree_descriptor(): mark error message for translation
cache-tree: avoid an unnecessary check
Always check `parse_tree*()`'s return value
t4301: verify that merge-tree fails on missing blob objects
merge-ort: do check `parse_tree()`'s return value
merge-tree: fail with a non-zero exit code on missing tree objects
merge-tree: accept 3 trees as arguments
|
|
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.
Naturally, there are a lot of callers that need to be adjusted now, too.
Next step: adjust the callers of `get_octopus_merge_bases()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.
Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).
Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.
Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new advice type 'submoduleMergeConflict' for the error message
shown when a non-trivial submodule conflict is encountered, which
was added in 4057523a40 (submodule merge: update conflict error
message, 2022-08-04). That commit mentions making this message an
advice as possible future work. The message can now be disabled
with the advice mechanism.
Update the tests as the expected message now appears on stderr instead
of stdout.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Otherwise we may easily run into serious crashes: For example, if we run
`init_tree_desc()` directly after a failed `parse_tree()`, we are
accessing uninitialized data or trying to dereference `NULL`.
Note that the `parse_tree()` function already takes care of showing an
error message. The `parse_tree_indirectly()` and
`repo_get_commit_tree()` functions do not, therefore those latter call
sites need to show a useful error message while the former do not.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous commit fixed a bug where a missing tree was reported, but
not treated as an error.
This patch addresses the same issue for the remaining two callers of
`parse_tree()`.
This change is not accompanied by a regression test because the code in
question is only reached at the `checkout` stage, i.e. after the merge
has happened (and therefore the tree objects could only be missing if
the disk had gone bad in that short time window, or something similarly
tricky to recreate in the test suite).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When `git merge-tree` encounters a missing tree object, it should error
out and not continue quietly as if nothing had happened.
However, as of time of writing, `git merge-tree` _does_ continue, and
then offers the empty tree as result.
Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rewrite //-comments to /* comments */ in files whose comments
prevalently use the latter.
* jc/comment-style-fixes:
reftable/pq_test: comment style fix
merge-ort.c: comment style fix
builtin/worktree: comment style fixes
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 70c70de616 (refs: complete list of special refs, 2023-12-14) we have
inrtoduced a new `is_special_ref()` function that classifies some refs
as being special. The rule is that special refs are exclusively read and
written via the filesystem directly, whereas normal refs exclucsively go
via the refs API.
The intent of that commit was to record the status quo so that we know
to route reads of such special refs consistently. Eventually, the list
should be reduced to its bare minimum of refs which really are special,
namely FETCH_HEAD and MERGE_HEAD.
Follow up on this promise and convert the AUTO_MERGE ref to become a
normal pseudo-ref by using the refs API to both read and write it
instead of accessing the filesystem directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Typofix.
* wx/merge-ort-comment-typofix:
merge-ort.c: fix typo 'neeed' to 'needed'
|
|
Signed-off-by: 王常新 <wchangxin824@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix "git merge-tree" to stop segfaulting when the --attr-source
option is used.
* jc/merge-ort-attr-index-fix:
merge-ort: initialize repo in index state
|
|
initialize_attr_index() does not initialize the repo member of
attr_index. Starting in 44451a2e5e (attr: teach "--attr-source=<tree>"
global option to "git", 2023-05-06), this became a problem because
istate->repo gets passed down the call chain starting in
git_check_attr(). This gets passed all the way down to
replace_refs_enabled(), which segfaults when accessing r->gitdir.
Fix this by initializing the repository in the index state.
Signed-off-by: John Cai <johncai86@gmail.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To make it possible for git ls-tree to display the tree encoded
in the hash algorithm of the oid specified to git ls-tree, update
init_tree_desc to take as a parameter the oid of the tree object.
Update all callers of init_tree_desc and init_tree_desc_gently
to pass the oid of the tree object.
Use the oid of the tree object to discover the hash algorithm
of the oid and store that hash algorithm in struct tree_desc.
Use the hash algorithm in decode_tree_entry and
update_tree_entry_internal to handle reading a tree object encoded in
a hash algorithm that differs from the repositories hash algorithm.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As noted in CodingGuidelines, error messages should not be capitalized.
Fix up a few of these that were copied verbatim from merge-recursive to
match our modern style.
We'll likewise fix up the matching ones from merge-recursive. We care a
bit less there, since the hope is that it will eventually go away. But
besides being the right thing to do in the meantime, it is necessary for
t6406 to pass both with and without GIT_TEST_MERGE_ALGORITHM set (one of
our CI jobs sets it to "recursive", which will use the merge-recursive.c
code). An alternative would be to use "grep -i" in the test to check
the message, but it's nice for the test suite to be be more exact (we'd
notice if the capitalization fix regressed).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The merge_options parameter has never been used since the function was
introduced in 64aceb6d73 (merge-ort: add code to check for whether
cached renames can be reused, 2021-05-20). In theory some merge options
might impact our decisions here, but that has never been the case so
far.
Let's drop it to appease -Wunused-parameter; it would be easy to add
back later if we need to (there is only one caller).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function takes three trees representing the merge base and both
sides of the merge, but never looks at any of them. This is due to
f78cf97617 (merge-ort: call diffcore_rename() directly, 2021-02-14).
Prior to that commit, we passed pairs of trees to diff_tree_oid(). But
after that commit, we collect a custom diff_queue for each pair in the
merge_options struct, and just run diffcore_rename() on the result. So
the function does not need to know about the original trees at all
anymore.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function doesn't look at its merge_options parameter. It used to
pass it down to err(), but that function no longer exists (and didn't
look at "opt" anyway). We can drop it here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The merge-ort code has an err() function, but it's really just error()
in disguise. It differs in two ways:
1. It takes a "struct merge_options" argument. But the function
completely ignores it! We can simply remove it.
2. It formats the error string into a strbuf, prepending "error: ",
and then feeds the result into error(). But this is wrong! The
error() function already adds the prefix, so we end up with:
error: error: Failed to execute internal merge
So let's just drop this function entirely and call error() directly, as
the functions are otherwise identical (note that they both always return
-1).
Presumably nobody noticed the bogus messages because they are quite hard
to trigger (they are mostly internal errors reading and writing
objects). However, one easy trigger is a custom merge driver which dies
by signal; we have a test already here, but we were not checking the
contents of stderr.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch]. I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.
Further, back in d1cbe1e6d8a ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix. Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.
Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group. But giving them the common "ll-" prefix would group them. Using
"-ll" as a suffix thus seems just much more logical to me. Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.
[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.
Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first). This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h. Also move some related inline
functions from cache.h to read-cache.h. The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Note in particular that this reverses the decision made in 118a2e8bde0
("cache: move ensure_full_index() to cache.h", 2021-04-01).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
More header clean-up.
* en/header-split-cache-h-part-2: (22 commits)
reftable: ensure git-compat-util.h is the first (indirect) include
diff.h: reduce unnecessary includes
object-store.h: reduce unnecessary includes
commit.h: reduce unnecessary includes
fsmonitor: reduce includes of cache.h
cache.h: remove unnecessary headers
treewide: remove cache.h inclusion due to previous changes
cache,tree: move basic name compare functions from read-cache to tree
cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
hash-ll.h: split out of hash.h to remove dependency on repository.h
tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
dir.h: move DTYPE defines from cache.h
versioncmp.h: move declarations for versioncmp.c functions from cache.h
ws.h: move declarations for ws.c functions from cache.h
match-trees.h: move declarations for match-trees.c functions from cache.h
pkt-line.h: move declarations for pkt-line.c functions from cache.h
base85.h: move declarations for base85.c functions from cache.h
copy.h: move declarations for copy.c functions from cache.h
server-info.h: move declarations for server-info.c functions from cache.h
packfile.h: move pack_window and pack_entry from cache.h
...
|
|
A small API fix to the ort merge strategy backend.
* en/ort-finalize-after-0-merges-fix:
merge-ort: fix calling merge_finalize() with no intermediate merge
|
|
If some code sets up the data structures for a merge, but then never
actually performs one before calling merge_finalize(), then
merge_finalize() wouldn't notice that result->priv was NULL and
return early, resulting in following that NULL pointer and getting
a segfault. There is currently no code in the git codebase that does
this, but this issue was found during testing of some proposed patches
that had the following structure:
struct merge_options merge_opt;
struct merge_result result;
init_merge_options(&merge_opt, the_repository);
memset(&result, 0, sizeof(result));
<do N merges, for some value of N>
merge_finalize(&merge_opt, &result);
where some flags could cause the code to have N=0, i.e. doing no merges.
Add a check for result->priv being NULL and return early to avoid a
segfault in these kinds of cases.
While at it, ensure the FREE_AND_NULL() in the function does something
useful with the nulling aspect, namely sets result->priv to NULL rather
than a mere temporary.
Reported-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h. This made it more difficult
to find which files could remove a dependence on cache.h. Make C files
explicitly include trace.h or trace2.h if they are using them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
libs: use "struct repository *" argument, not "the_repository"
post-cocci: adjust comments for recent repo_* migration
cocci: apply the "revision.h" part of "the_repository.pending"
cocci: apply the "rerere.h" part of "the_repository.pending"
cocci: apply the "refs.h" part of "the_repository.pending"
cocci: apply the "promisor-remote.h" part of "the_repository.pending"
cocci: apply the "packfile.h" part of "the_repository.pending"
cocci: apply the "pretty.h" part of "the_repository.pending"
cocci: apply the "object-store.h" part of "the_repository.pending"
cocci: apply the "diff.h" part of "the_repository.pending"
cocci: apply the "commit.h" part of "the_repository.pending"
cocci: apply the "commit-reach.h" part of "the_repository.pending"
cocci: apply the "cache.h" part of "the_repository.pending"
cocci: add missing "the_repository" macros to "pending"
cocci: sort "the_repository" rules by header
cocci: fix incorrect & verbose "the_repository" rules
cocci: remove dead rule from "the_repository.pending.cocci"
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"promisor-remote.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"commit-reach.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Merging a branch with directory renames into a branch that changes
the directory to a symlink was mishandled by the ort merge
strategy, which has been corrected.
* en/ort-dir-rename-and-symlink-fix:
merge-ort: fix bug with dir rename vs change dir to symlink
|
|
More UNUSED annotation to help using -Wunused option with the
compiler.
* jk/unused-anno-more:
ll-merge: mark unused parameters in callbacks
diffcore-pickaxe: mark unused parameters in pickaxe functions
convert: mark unused parameter in null stream filter
apply: mark unused parameters in noop error/warning routine
apply: mark unused parameters in handlers
date: mark unused parameters in handler functions
string-list: mark unused callback parameters
object-file: mark unused parameters in hash_unknown functions
mark unused parameters in trivial compat functions
update-index: drop unused argc from do_reupdate()
submodule--helper: drop unused argc from module_list_compute()
diffstat_consume(): assert non-zero length
|
|
When changing a directory to a symlink on one side of history, and
renaming the parent of that directory to a different directory name
on the other side, e.g. with this kind of setup:
Base commit: Has a file named dir/subdir/file
Side1: Rename dir/ -> renamed-dir/
Side2: delete dir/subdir/file, add dir/subdir as symlink
Then merge-ort was running into an assertion failure:
git: merge-ort.c:2622: apply_directory_rename_modifications: Assertion `ci->dirmask == 0' failed
merge-recursive did not have as obvious an issue handling this case,
likely because we never fixed it to handle the case from commit
902c521a35 ("t6423: more involved directory rename test", 2020-10-15)
where we need to be careful about nested renames when a directory rename
occurs (dir/ -> renamed-dir/ implies dir/subdir/ ->
renamed-dir/subdir/). However, merge-recursive does have multiple
problems with this testcase:
* Incorrect stages for the file: merge-recursive omits the stage in
the index corresponding to the base stage, making `git status`
report "added by us" for renamed-dir/subdir/file instead of the
expected "deleted by them".
* Poor directory/file conflict handling: For the renamed-dir/subdir
symlink, instead of reporting a file/directory conflict as
expected, it reports "Error: Refusing to lose untracked file at
renamed-dir/subdir". This is a lie because there is no untracked
file at that location. It then does the normal suboptimal
merge-recursive thing of having the symlink be tracked in the index
at a location where it can't be written due to D/F conflicts
(namely, renamed-dir/subdir), but writes it to the working tree at
a different location as a new untracked file (namely,
renamed-dir/subdir~B^0)
Technically, these problems don't prevent the user from resolving the
merge if they can figure out to ignore the confusion, but because both
pieces of output are quite confusing I don't want to modify the test
to claim the recursive also passes it even if it doesn't have the bug
that ort did.
So, fix the bug in ort by splitting the conflict_info for "dir/subdir"
into two, one for the directory part, one for the file (i.e. symlink)
part, since the symlink is being renamed by directory rename detection.
The directory part is needed for proper nesting, since there are still
conflict_info fields for files underneath it (though those are marked
as is_null, they are still present until the entries are processed,
and the entry processing wants every non-toplevel entry to have a
parent directory).
Reported-by: Stefano Rivera <stefano@rivera.za.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
String-lists may be used with callbacks for clearing or iteration. These
callbacks need to conform to a particular interface, even though not
every callback needs all of its parameters. Mark the unused ones to make
-Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the previous commit, we fixed a segmentation fault when a tree object
could not be written.
However, before the tree object is written, `merge-ort` wants to write
out a blob object (except in cases where the merge results in a blob
that already exists in the database). And this can fail, too, but we
ignore that write failure so far.
Let's pay close attention and error out early if the blob could not be
written. This reduces the error output of t4301.25 ("merge-ort fails
gracefully in a read-only repository") from:
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add numbers to database
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add greeting to database
error: insufficient permission for adding an object to repository database ./objects
fatal: failure to merge
to:
error: insufficient permission for adding an object to repository database ./objects
error: error: Unable to add numbers to database
fatal: failure to merge
This is _not_ just a cosmetic change: Even though one might assume that
the operation would have failed anyway at the point when the new tree
object is written (and the corresponding tree object _will_ be new if it
contains a blob that is new), but that is not so: As pointed out by
Elijah Newren, when Git has previously been allowed to add loose objects
via `sudo` calls, it is very possible that the blob object cannot be
written (because the corresponding `.git/objects/??/` directory may be
owned by `root`) but the tree object can be written (because the
corresponding objects directory is owned by the current user). This
would result in a corrupt repository because it is missing the blob
object, and with this here patch we prevent that.
Note: This patch adjusts two variable declarations from `unsigned` to
`int` because their purpose is to hold the return value of
`handle_content_merge()`, which is of type `int`. The existing users of
those variables are only interested whether that variable is zero or
non-zero, therefore this type change does not affect the existing code.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).
Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* en/ort-unused-code-removal:
merge-ort: remove code obsoleted by other changes
|
|
Further update the help messages given while merging submodules.
* en/submodule-merge-messages-fixes:
merge-ort: provide helpful submodule update message when possible
merge-ort: avoid surprise with new sub_flag variable
merge-ort: remove translator lego in new "submodule conflict suggestion"
submodule merge: update conflict error message
|
|
Commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE handling with
conflicted entries", 2021-03-20) added some code for merge-ort to handle
conflicted and skip_worktree entries in general. Included in this was
an ugly hack for dealing with present-despite-skipped entries and a
testcase (t6428.2) specific to that hack, since at that time users could
accidentally get files into that state when using a sparse checkout.
However, with the merging of 82386b4496 ("Merge branch
'en/present-despite-skipped'", 2022-03-09), that class of problems was
addressed globally and in a much cleaner way. As such, the
present-despite-skipped hack in merge-ort is no longer needed and can
simply be removed.
No additional testcase is needed here; t6428.2 was written to test the
necessary functionality and is being kept. The fact that this test
continues to pass despite the code being removed shows that the extra
code is no longer necessary.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04), a more detailed message was provided when submodules
conflict, in order to help users know how to resolve those conflicts.
There were a couple situations for which a different message would be
more appropriate, but that commit left handling those for future work.
Unfortunately, that commit would check if any submodules were of the
type that it didn't know how to explain, and, if so, would avoid
providing the more detailed explanation even for the submodules it did
know how to explain.
Change this to have the code print the helpful messages for the subset
of submodules it knows how to explain.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04) added a sub_flag variable that is used to store a value from
enum conflict_and_info_types, but initializes it with a value of -1 that
does not correspond to any of the conflict_and_info_types. The code may
never set it to a valid value and yet still use it, which can be
surprising when reading over the code at first. Initialize it instead
to the generic CONFLICT_SUBMODULE_FAILED_TO_MERGE value, which is still
distinct from the two values we need to special case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In commit 4057523a40 ("submodule merge: update conflict error message",
2022-08-04), the new "submodule conflict suggestion" code was
translating 6 different pieces of the new message and then used
carefully crafted logic to allow stitching it back together with special
formatting. Keep the components of the message together as much as
possible, so that:
* we reduce the number of things translators have to translate
* translators have more control over the format of the output
* the code is much easier for developers to understand too
Also, reformat some comments running beyond the 80th column while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Plug memory leaks in the failure code path in the "merge-ort" merge
strategy backend.
* js/ort-clean-up-after-failed-merge:
merge-ort: do leave trace2 region even if checkout fails
merge-ort: clean up after failed merge
|
|
When attempting to merge in a superproject with conflicting submodule
pointers that cannot be fast-forwarded or trivially resolved, the merge
fails and Git prints an error message that accurately describes the
failure, but does not provide steps for the user to resolve the error.
Git is left in a conflicted state, which requires the user to:
1. merge submodules or update submodules to an already existing
commit that reflects the merge
2. add submodules changes to the superproject
3. finish merging superproject
These steps are non-obvious for newer submodule users to figure out
based on the error message and neither `git submodule status` nor `git
status` provide any useful pointers.
Update error message to provide steps to resolve submodule merge
conflict. Future work could involve adding an advice flag to the
message. Although the message is long, it also has the id of the
submodule commit that needs to be merged, which could be useful
information for the user.
Additionally, 5 merge failures that resulted in an early return have
been updated to reflect the status of the merge.
1. Null merge base (null o): CONFLICT_SUBMODULE_NULL_MERGE_BASE added
as a new conflict type and will print updated error message.
2. Null merge side a (null a): BUG(). See [1] for discussion
3. Null merge side b (null b): BUG(). See [1] for discussion
4. Submodule not checked out: added NEEDSWORK bit
5. Submodule commits not present: added NEEDSWORK bit
The errors with a NEEDSWORK bit deserve a more detailed explanation of
how to resolve them. See [2] for more context.
[1] https://lore.kernel.org/git/CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com/
[2] https://lore.kernel.org/git/xmqqpmhjjwo9.fsf@gitster.g/
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 557ac0350d9 (merge-ort: begin performance work; instrument with
trace2_region_* calls, 2021-01-23), we added Trace2 instrumentation, but
in the error path that returns early, we forgot to tell Trace2 that
we're leaving the region. Let's fix that.
Pointed-out-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 9fefce68dc8 (merge-ort: basic outline for merge_switch_to_result(),
2020-12-13), we added functionality to lay down the result of a merge on
disk. But we forgot to release the data structures in case
`unpack_trees()` failed to run properly.
This was pointed out by the `linux-leaks` job in our CI runs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fixes a long-standing corner case bug around directory renames in
the merge-ort strategy.
* en/merge-dual-dir-renames-fix:
merge-ort: fix issue with dual rename and add/add conflict
merge-ort: shuffle the computation and cleanup of potential collisions
merge-ort: make a separate function for freeing struct collisions
merge-ort: small cleanups of check_for_directory_rename
t6423: add tests of dual directory rename plus add/add conflict
|
|
There is code in both merge-recursive and merge-ort for avoiding doubly
transitive renames (i.e. one side renames directory A/ -> B/, and the
other side renames directory B/ -> C/), because this combination would
otherwise make a mess for new files added to A/ on the first side and
wondering which directory they end up in -- especially if there were
even more renames such as the first side renaming C/ -> D/. In such
cases, it just turns "off" directory rename detection for the higher
order transitive cases.
The testcases added in t6423 a couple commits ago are slightly different
but similar in principle. They involve a similar case of paired
renaming but instead of A/ -> B/ and B/ -> C/, the second side renames
a leading directory of B/ to C/. And both sides add a new file
somewhere under the directory that the other side will rename. While
the new files added start within different directories and thus could
logically end up within different directories, it is weird for a file
on one side to end up where the other one started and not move along
with it. So, let's just turn off directory rename detection in this
case as well.
Another way to look at this is that if the source name involved in a
directory rename on one side is the target name of a directory rename
operation for a file from the other side, then we avoid the doubly
transitive rename. (More concretely, if a directory rename on side D
wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D
already had a file named NEW_NAME, and a directory rename on side E
wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the
directory rename detection for NEW_NAME to prevent the
NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add
conflict on NEW_NAME.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Run compute_collisions() for renames on both sides of history before
any calls to collect_renames(), and do not free the computed collisions
until after both calls to collect_renames(). This is just a code
reorganization at this point that doesn't make sense on its own, but
will permit us to use the computed collision info from both sides
within each call to collect_renames() in a subsequent commit.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This commit makes no functional changes, it's just some code movement in
preparation for later changes.
Signed-off-by: Elijah Newren <newren@palantir.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
No functional changes, just some preparatory cleanups.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@palantir.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
With the new `detailed` parameter, a new mode can be triggered when
displaying the merge messages: The `detailed` mode prints NUL-delimited
fields of the following form:
<path-count> NUL <path>... NUL <conflict-type> NUL <message>
The `<path-count>` field determines how many `<path>` fields there are.
The intention of this mode is to support server-side operations, where
worktree-less merges can lead to conflicts and depending on the type
and/or path count, the caller might know how to handle said conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It is all fine and dandy for a regular Git command that is intended to
be run interactively to produce a bunch of messages upon an error.
However, in `merge-ort`'s case, we want to call the command e.g. in
server-side software, where the actual error messages are not quite as
interesting as machine-readable, immutable terms that describe the exact
nature of any given conflict.
With this patch, the `merge-ort` machinery records the exact type (as
specified via an `enum` value) as well as the involved path(s) together
with the conflict's message.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This allows us once again to get away with less data copying.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To prepare for using the `merge-ort` machinery in server operations, we
cannot simply produce a free-form string that combines a variable-length
list of messages.
Instead, we need to list them one by one. The natural fit for this is a
`string_list`.
We will subsequently add even more information in the `util` attribute
of the string list items.
Based-on-a-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There was one case in merge-ort that would call path_msg() multiple
times for the same logical conflict, and it was in order to give advice
about how to resolve a conflict. This advice does not make as much
sense with remerge-diff, or with merge-tree being invoked by a GitHub
GUI for resolution of messages, and is making it hard to provide
which-logical-conflict-affects-which-paths information in a machine
parseable way to a higher level caller of merge-tree. Let's simply
remove this informational message.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After a merge, this function allows the user to extract the same
information that would be printed by `ls-files -u`, which means
files with their mode, oid, and stage.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This patch includes no new code; it simply moves a bunch of lines into a
new function. As such, there are no functional changes. This is just a
preparatory step to allow the printed messages to be handled differently
by other callers, such as in `git merge-tree --write-tree`.
(Patch best viewed with
--color-moved --color-moved-ws=allow-indentation-change
to see that it is a simple code movement.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
|
|
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
|
|
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Count string_list items in size_t, not "unsigned int".
* ab/string-list-count-in-size-t:
string-list API: change "nr" and "alloc" to "size_t"
gettext API users: don't explicitly cast ngettext()'s "n"
|
|
Object-file API shuffling.
* ab/object-file-api-updates:
object-file API: pass an enum to read_object_with_reference()
object-file.c: add a literal version of write_object_file_prepare()
object-file API: have hash_object_file() take "enum object_type"
object API: rename hash_object_file_literally() to write_*()
object-file API: split up and simplify check_object_signature()
object API users + docs: check <0, not !0 with check_object_signature()
object API docs: move check_object_signature() docs to cache.h
object API: correct "buf" v.s. "map" mismatch in *.c and *.h
object-file API: have write_object_file() take "enum object_type"
object-file API: add a format_object_header() function
object-file API: return "void", not "int" from hash_object_file()
object-file.c: split up declaration of unrelated variables
|
|
Align the level of verbose output from the ort backend during inner
merge to that of the recursive backend.
* en/merge-ort-align-verbosity-with-recursive:
merge-ort: exclude messages from inner merges by default
|
|
Change the "nr" and "alloc" members of "struct string_list" to use
"size_t" instead of "nr". On some platforms the size of an "unsigned
int" will be smaller than a "size_t", e.g. a 32 bit unsigned v.s. 64
bit unsigned. As "struct string_list" is a generic API we use in a lot
of places this might cause overflows.
As one example: code in "refs.c" keeps track of the number of refs
with a "size_t", and auxiliary code in builtin/remote.c in
get_ref_states() appends those to a "struct string_list".
While we're at it split the "nr" and "alloc" in string-list.h across
two lines, which is the case for most such struct member
declarations (e.g. in "strbuf.h" and "strvec.h").
Changing e.g. "int i" to "size_t i" in run_and_feed_hook() isn't
strictly necessary, and there are a lot more cases where we'll use a
local "int", "unsigned int" etc. variable derived from the "nr" in the
"struct string_list". But in that case as well as
add_wrapped_shortlog_msg() in builtin/shortlog.c we need to adjust the
printf format referring to "nr" anyway, so let's also change the other
variables referring to it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Leakfix.
* en/merge-ort-plug-leaks:
merge-ort: fix small memory leak in unique_path()
merge-ort: fix small memory leak in detect_and_process_renames()
|
|
merge-recursive would only report messages from inner merges when the
GIT_MERGE_VERBOSITY was set to 5. Do the same for merge-ort.
Note that somewhat reverts 0d83d8240d ("merge-ort: mark conflict/warning
messages from inner merges as omittable", 2022-02-02) based on two
facts:
* This commit basically removes the showing of messages from inner
merges as well, at least by default. The only difference is that
users can request to get them back by turning up the verbosity.
* Messages from inner merges are specially annotated since 4a3d86e1bb
("merge-ort: make informational messages from recursive merges
clearer", 2022-02-17). The ability to distinguish them from outer
merge comments make them less problematic to include, and easier
for humans to parse.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change the write_object_file() function to take an "enum object_type"
instead of a "const char *type". Its callers either passed
{commit,tree,blob,tag}_type and can pass the corresponding OBJ_* type
instead, or were hardcoding strings like "blob".
This avoids the back & forth fragility where the callers of
write_object_file() would have the enum type, and convert it
themselves via type_name(). We do have to now do that conversion
ourselves before calling write_object_file_prepare(), but those
codepaths will be similarly adjusted in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The struct strmap paths member of merge_options_internal is perhaps the
most central data structure to all of merge-ort. Because all the paths
involved in the merge need to be kept until the merge is complete, this
"paths" data structure traditionally took responsibility for owning all
the allocated paths. When the merge is over, those paths were free()d
as part of free()ing this strmap.
In commit 6697ee01b5d3 (merge-ort: switch our strmaps over to using
memory pools, 2021-07-30), we changed the allocations for pathnames to
come from a memory pool. That meant the ownership changed slightly;
there were no individual free() calls to make, instead the memory pool
owned all those paths and they were free()d all at once.
Unfortunately unique_path() was written presuming the pre-memory-pool
model, and allocated a path on the heap and left it in the strmap for
later free()ing. Modify it to return a path allocated from the memory
pool instead.
Note that there's one instance -- in record_conflicted_index_entries()
-- where the returned string from unique_path() was only used very
temporarily and thus had been immediately free()'d. This codepath was
associated with an ugly skip-worktree workaround that has since been
better fixed by the in-flight en/present-despite-skipped topic. This
workaround probably makes sense to excise once that topic merges down,
but for now, just remove the immediate free() and allow the returned
string to be free()d when the memory pool is released.
This fixes the following memory leak as reported by valgrind:
==PID== 65 bytes in 1 blocks are definitely lost in loss record 79 of 134
==PID== at 0xADDRESS: malloc
==PID== by 0xADDRESS: realloc
==PID== by 0xADDRESS: xrealloc (wrapper.c:126)
==PID== by 0xADDRESS: strbuf_grow (strbuf.c:98)
==PID== by 0xADDRESS: strbuf_vaddf (strbuf.c:394)
==PID== by 0xADDRESS: strbuf_addf (strbuf.c:335)
==PID== by 0xADDRESS: unique_path (merge-ort.c:733)
==PID== by 0xADDRESS: process_entry (merge-ort.c:3678)
==PID== by 0xADDRESS: process_entries (merge-ort.c:4037)
==PID== by 0xADDRESS: merge_ort_nonrecursive_internal (merge-ort.c:4621)
==PID== by 0xADDRESS: merge_ort_internal (merge-ort.c:4709)
==PID== by 0xADDRESS: merge_incore_recursive (merge-ort.c:4760)
==PID== by 0xADDRESS: merge_ort_recursive (merge-ort-wrappers.c:57)
==PID== by 0xADDRESS: try_merge_strategy (merge.c:753)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
detect_and_process_renames() detects renames on both sides of history
and then combines these into a single diff_queue_struct. The combined
diff_queue_struct needs to be able to hold the renames found on either
side, and since it knows the (maximum) size it needs, it pre-emptively
grows the array to the appropriate size:
ALLOC_GROW(combined.queue,
renames->pairs[1].nr + renames->pairs[2].nr,
combined.alloc);
It then collects the items from each side:
collect_renames(opt, &combined, MERGE_SIDE1, ...)
collect_renames(opt, &combined, MERGE_SIDE2, ...)
Note, though, that collect_renames() sometimes determines that some
pairs are unnecessary and does not include them in the combined array.
When it is done, detect_and_process_renames() frees this memory:
if (combined.nr) {
...
free(combined.queue);
}
The problem is that sometimes even when there are pairs, none of them
are necessary. Instead of checking combined.nr, just remove the
if-check; free() knows to skip NULL pointers. This change fixes the
following memory leak, as reported by valgrind:
==PID== 192 bytes in 1 blocks are definitely lost in loss record 107 of 134
==PID== at 0xADDRESS: malloc
==PID== by 0xADDRESS: realloc
==PID== by 0xADDRESS: xrealloc (wrapper.c:126)
==PID== by 0xADDRESS: detect_and_process_renames (merge-ort.c:3134)
==PID== by 0xADDRESS: merge_ort_nonrecursive_internal (merge-ort.c:4610)
==PID== by 0xADDRESS: merge_ort_internal (merge-ort.c:4709)
==PID== by 0xADDRESS: merge_incore_recursive (merge-ort.c:4760)
==PID== by 0xADDRESS: merge_ort_recursive (merge-ort-wrappers.c:57)
==PID== by 0xADDRESS: try_merge_strategy (merge.c:753)
==PID== by 0xADDRESS: cmd_merge (merge.c:1676)
==PID== by 0xADDRESS: run_builtin (git.c:461)
==PID== by 0xADDRESS: handle_builtin (git.c:713)
==PID== by 0xADDRESS: run_argv (git.c:780)
==PID== by 0xADDRESS: cmd_main (git.c:911)
==PID== by 0xADDRESS: main (common-main.c:52)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is another simple change with a long explanation...
merge-recursive and merge-ort are both based on the same recursive idea:
if there is more than one merge base, merge the merge bases (which may
require first merging the merge bases of the merges bases, etc.). The
depth of the inner merge is recorded via a variable called "call_depth",
which we'll bring up again later. Naturally, the inner merges
themselves can have conflicts and various messages generated about those
files.
merge-recursive immediately prints to stdout as it goes, at the risk of
printing multiple conflict notices for the same path separated far apart
from each other with many intervenining conflict notices for other paths
between them. And this is true even if there are no inner merges
involved. An example of this was given in [1] and apparently caused
some confusion:
CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
...dozens of conflicts for OTHER paths...
CONFLICT (content): Merge conflicts in B
In contrast, merge-ort collects messages and stores them by path so that
it can print them grouped by path. Thus, the same case handled by
merge-ort would have output of the form:
CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
CONFLICT (content): Merge conflicts in B
...dozens of conflicts for OTHER paths...
This is generally helpful, but does make a separate bug more
problematic. In particular, while merge-recursive might report the
following for a recursive merge:
Auto-merging dir.c
Auto-merging midx.c
CONFLICT (content): Merge conflict in midx.c
Auto-merging diff.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
merge-ort would instead report:
Auto-merging diff.c
Auto-merging dir.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
Auto-merging midx.c
CONFLICT (content): Merge conflict in midx.c
The fact that messages for the same file are together is probably
helpful in general, but with the indentation missing for the inner
merge it unfortunately serves to confuse. This probably would lead
users to wonder:
* Why is Git reporting that "dir.c" is being merged twice?
* If midx.c has conflicts, why do I not see any when I open up the
file and why are no conflicts shown in the index?
Fix this output confusion by changing the output to clearly
differentiate the messages for outer merges from the ones for inner
merges, changing the above output from merge-ort to:
Auto-merging diff.c
From inner merge: Auto-merging dir.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
From inner merge: Auto-merging midx.c
From inner merge: CONFLICT (content): Merge conflict in midx.c
(Note: the number of spaces after the 'From inner merge:' is
2*call_depth).
One other thing to note here, that I didn't notice until typing up this
commit message, is that merge-recursive does not print any messages from
the inner merges by default; the extra verbosity has to be requested.
merge-ort currently has no verbosity controls and always prints these.
We may also want to change that, but for now, just make the output
clearer with these extra markings and indentation.
[1] https://lore.kernel.org/git/CAGyf7-He4in8JWUh9dpAwvoPkQz9hr8nCBpxOxhZEd8+jtqTpg@mail.gmail.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git log --remerge-diff" shows the difference from mechanical merge
result and the result that is actually recorded in a merge commit.
* en/remerge-diff:
diff-merges: avoid history simplifications when diffing merges
merge-ort: mark conflict/warning messages from inner merges as omittable
show, log: include conflict/warning messages in --remerge-diff headers
diff: add ability to insert additional headers for paths
merge-ort: format messages slightly different for use in headers
merge-ort: mark a few more conflict messages as omittable
merge-ort: capture and print ll-merge warnings in our preferred fashion
ll-merge: make callers responsible for showing warnings
log: clean unneeded objects during `log --remerge-diff`
show, log: provide a --remerge-diff capability
|
|
Leakfix.
* en/plug-leaks-in-merge:
merge: fix memory leaks in cmd_merge()
merge-ort: fix memory leak in merge_ort_internal()
|
|
The merge-ort misbehaved when merge.renameLimit configuration is
set too low and failed to find all renames.
* en/merge-ort-restart-optim-fix:
merge-ort: avoid assuming all renames detected
|
|
A recursive merge involves merging the merge bases of the two branches
being merged. Such an inner merge can itself generate conflict notices.
While such notices may be useful when initially trying to create a
merge, they seem to just be noise when investigating merges later with
--remerge-diff. (Especially when both sides of the outer merge resolved
the conflict the same way leading to no overall conflict.) Remove them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When users run
git show --remerge-diff $MERGE_COMMIT
or
git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users. We will include them in the diff
headers instead...but for that to work, we need for any multiline
messages to replace newlines with both a newline and a space. Add a new
flag to signal when we want these messages modified in such a fashion,
and use it in path_msg() to modify these messages this way. Also, allow
a special prefix to be specified for these headers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff. While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability), suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit. Mark them as omittable.
Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements. See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details. For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints. This
commit just adds two more messages that should also be omittable.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf. Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:
* The deferral of messages ensures we can print all messages about
any given path together (merge-recursive was known to sometimes
intersperse messages about other paths, particularly when renames
were involved).
* The deferral of messages means we can avoid printing spurious
conflict messages when we just end up aborting due to local user
modifications in the way. (In contrast to merge-recursive.c which
prematurely checks for local modifications in the way via
unpack_trees() and gets the check wrong both in terms of false
positives and false negatives relative to renames, merge-ort does
not perform the local modifications in the way check until the
checkout() step after the full merge has been computed.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred. Message printing
probably does not belong in a "low-level merge" anyway.
This commit continues printing the message as-is, just from the callers
instead of within ll_merge(). Future changes will start handling the
message differently in the merge-ort codepath.
There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.
Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them. After modifying all of them, I then
changed the struct to an enum.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The documentation for merge_incore_recursive(), modelled after
merge_recursive(), notes that
merge_bases will be consumed (emptied) so make a copy if you need it
However, in merge_ort_internal() (which merge_incore_recursive() calls),
it runs
merged_merge_bases = pop_commit(&merge_bases);
...
for (iter = merge_bases; iter; iter = iter->next) {
...
}
In other words, it only consumes the *first* entry of merge_bases, and
the rest it iterates through. If it iterated through all of them, the
caller could be responsible for free'ing the memory. If it consumed all
of them, the current documentation would be correct and the callers
would need to do nothing. The current middle ground makes it impossible
for callers to avoid memory leaks, since any attempt to use the
merge_bases it passes in would result in a use-after-free.
It turns out this part of the code was copied from merge-recursive.c,
which has had the same bug for 15.5 years. However, since we are trying
to keep merge-recursive.c stable as we sunset it, let's just fix the
leak in in merge_ort_internal() by having it actually consume all the
elements of the merge_bases commit_list.
Testing this commit against t6404 (the first testcase specifically
about recursive merges) under valgrind shows that this patch fixes
the following leak:
32 (16 direct, 16 indirect) bytes in 1 blocks are definitely lost \
in loss record 49 of 126
at 0x484086F: malloc (vg_replace_malloc.c:380)
by 0x69FFEB: do_xmalloc (wrapper.c:41)
by 0x6A0073: xmalloc (wrapper.c:62)
by 0x52A72D: commit_list_insert (commit.c:556)
by 0x47EC86: try_merge_strategy (merge.c:751)
by 0x48143B: cmd_merge (merge.c:1679)
by 0x40686E: run_builtin (git.c:464)
by 0x406C51: handle_builtin (git.c:716)
by 0x406E96: run_argv (git.c:783)
by 0x40730A: cmd_main (git.c:914)
by 0x4E7DFA: main (common-main.c:56)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In commit 8b09a900a1 ("merge-ort: restart merge with cached renames to
reduce process entry cost", 2021-07-16), we noted that in the merge-ort
steps of
collect_merge_info()
detect_and_process_renames()
process_entries()
that process_entries() was expensive, and we could often make it cheaper
by changing this to
collect_merge_info()
detect_and_process_renames()
<cache all the renames, and restart>
collect_merge_info()
detect_and_process_renames()
process_entries()
because the second collect_merge_info() would be cheaper (we could avoid
traversing into some directories), the second
detect_and_process_renames() would be free since we had already detected
all renames, and then process_entries() has far fewer entries to handle.
However, this was built on the assumption that the first
detect_and_process_renames() actually detected all potential renames.
If someone has merge.renameLimit set to some small value, that
assumption is violated which manifests later with the following message:
$ git -c merge.renameLimit=1 rebase upstream
...
git: merge-ort.c:546: clear_or_reinit_internal_opts: Assertion
`renames->cached_pairs_valid_side == 0' failed.
Turn off this cache-renames-and-restart whenever we cannot detect all
renames, and add a testcase that would have caught this problem.
Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Tested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Ever since commit a492d5331c ("merge-ort: ensure we consult df_conflict
and path_conflicts", 2021-06-30), when renormalization is active AND a
file is involved in a rename/delete conflict BUT the file is unmodified
(either before or after renormalization), merge-ort was running into an
assertion failure. Prior to that commit (or if assertions were compiled
out), merge-ort would mis-merge instead, ignoring the rename/delete
conflict and just deleting the file.
Remove the assertions, fix the code appropriately, leave some good
comments in the code, and add a testcase for this situation.
Reported-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Follow through the work to use the repo interface to access
submodule objects in-process, instead of abusing the alternate
object database interface.
* jt/no-abuse-alternate-odb-for-submodules:
submodule: trace adding submodule ODB as alternate
submodule: pass repo to check_has_commit()
object-file: only register submodule ODB if needed
merge-{ort,recursive}: remove add_submodule_odb()
refs: peeling non-the_repository iterators is BUG
refs: teach arbitrary repo support to iterators
refs: plumb repo into ref stores
|
|
Various fixes in code paths that move untracked files away to make room.
* en/removing-untracked-fixes:
Documentation: call out commands that nuke untracked files/directories
Comment important codepaths regarding nuking untracked files/dirs
unpack-trees: avoid nuking untracked dir in way of locally deleted file
unpack-trees: avoid nuking untracked dir in way of unmerged file
Change unpack_trees' 'reset' flag into an enum
Remove ignored files by default when they are in the way
unpack-trees: make dir an internal-only struct
unpack-trees: introduce preserve_ignored to unpack_trees_options
read-tree, merge-recursive: overwrite ignored files by default
checkout, read-tree: fix leak of unpack_trees_options.dir
t2500: add various tests for nuking untracked files
|
|
After the parent commit and some of its ancestors, the only place
commits are being accessed through alternates is in the user-facing
message formatting code. Fix those, and remove the add_submodule_odb()
calls.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
More code paths that use the hack to add submodule's object
database to the set of alternate object store have been cleaned up.
* jt/add-submodule-odb-clean-up:
revision: remove "submodule" from opt struct
repository: support unabsorbed in repo_submodule_init
submodule: remove unnecessary unabsorbed fallback
|
|
Typofixes.
* en/typofixes:
merge-ort: fix completely wrong comment
trace2.h: fix trivial comment typo
|
|
Change several commands to remove ignored files by default when they are
in the way. Since some commands (checkout, merge) take a
--no-overwrite-ignore option to allow the user to configure this, and it
may make sense to add that option to more commands (and in the case of
merge, actually plumb that configuration option through to more of the
backends than just the fast-forwarding special case), add little
comments about where such flags would be used.
Incidentally, this fixes a test failure in t7112.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Currently, every caller of unpack_trees() that wants to ensure ignored
files are overwritten by default needs to:
* allocate unpack_trees_options.dir
* flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
* call setup_standard_excludes
AND then after the call to unpack_trees() needs to
* call dir_clear()
* deallocate unpack_trees_options.dir
That's a fair amount of boilerplate, and every caller uses identical
code. Make this easier by instead introducing a new boolean value where
the default value (0) does what we want so that new callers of
unpack_trees() automatically get the appropriate behavior. And move all
the handling of unpack_trees_options.dir into unpack_trees() itself.
While preserve_ignored = 0 is the behavior we feel is the appropriate
default, we defer fixing commands to use the appropriate default until a
later commit. So, this commit introduces several locations where we
manually set preserve_ignored=1. This makes it clear where code paths
were previously preserving ignored files when they should not have been;
a future commit will flip these to instead use a value of 0 to get the
behavior we want.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
jt/no-abuse-alternate-odb-for-submodules
* jt/add-submodule-odb-clean-up:
revision: remove "submodule" from opt struct
repository: support unabsorbed in repo_submodule_init
submodule: remove unnecessary unabsorbed fallback
|
|
Various mergy operations have been prepared to work efficiently
with the sparse index.
* ds/mergies-with-sparse-index:
sparse-index: integrate with cherry-pick and rebase
sequencer: ensure full index if not ORT strategy
t1092: add cherry-pick, rebase tests
merge-ort: expand only for out-of-cone conflicts
merge: make sparse-aware with ORT
diff: ignore sparse paths in diffstat
|
|
Not sure what happened, but the comment is describing code elsewhere in
the file. Fix the comment to actually discuss the code that follows.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.
Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.
Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:
1. If the merge strategy is 'recursive', then we enable
command_requires_full_index at the start of the merge_recursive()
method. We expect sparse-index users to also have the 'ort' strategy
enabled.
2. With the 'ort' strategy, if the merge results in a conflicted file,
then we expand the index before updating the working tree. The loop
that iterates over the worktree replaces index entries and tracks
'origintal_cache_nr' which can become completely wrong if the index
expands in the middle of the operation. This safety valve is
important before that loop starts. A later change will focus this
to only expand if we indeed have a conflict outside of the
sparse-checkout cone.
3. Other merge strategies are executed as a 'git merge-X' subcommand,
and those strategies are currently protected with the
'command_requires_full_index' guard.
Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Clean up a TODO in revision.h by removing the "submodule" field from
struct setup_revision_opt. This field is only used to specify the ref
store to use, so use rev_info->repo to determine the ref store instead.
The only users of this field are merge-ort.c and merge-recursive.c.
However, both these files specify the superproject as rev_info->repo and
the submodule as setup_revision_opt->submodule. In order to be able to
pass the submodule as rev_info->repo, all commits must be parsed with
the submodule explicitly specified; this patch does that as well. (An
incremental solution in which only some commits are parsed with explicit
submodule will not work, because if the same commit is parsed twice in
different repositories, there will be 2 heap-allocated object structs
corresponding to that commit, and any flag set by the revision walking
mechanism on one of them will not be reflected onto the other.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Final batch for "merge -sort" optimization.
* en/ort-perf-batch-15:
merge-ort: remove compile-time ability to turn off usage of memory pools
merge-ort: reuse path strings in pool_alloc_filespec
merge-ort: store filepairs and filespecs in our mem_pool
diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc
merge-ort: switch our strmaps over to using memory pools
merge-ort: set up a memory pool
merge-ort: add pool_alloc, pool_calloc, and pool_strndup wrappers
diffcore-rename: use a mem_pool for exact rename detection's hashmap
merge-ort: rename str{map,intmap,set}_func()
|
|
Further optimization on "merge -sort" backend.
* en/ort-perf-batch-14:
merge-ort: restart merge with cached renames to reduce process entry cost
merge-ort: avoid recursing into directories when we don't need to
merge-ort: defer recursing into directories when merge base is matched
merge-ort: add a handle_deferred_entries() helper function
merge-ort: add data structures for allowable trivial directory resolves
merge-ort: add some more explanations in collect_merge_info_callback()
merge-ort: resolve paths early when we have sufficient information
|
|
Simplify code maintenance by removing the ability to toggle between
usage of memory pools and direct allocations. This allows us to also
remove paths_to_free since it was solely about bookkeeping to make sure
we freed the necessary paths, and allows us to remove some auxiliary
functions.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
pool_alloc_filespec() was written so that the code when pool != NULL
mimicked the code from alloc_filespec(), which including allocating
enough extra space for the path and then copying it. However, the path
passed to pool_alloc_filespec() is always going to already be in the
same memory pool, so we may as well reuse it instead of copying it.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 198.5 ms ± 3.4 ms 198.3 ms ± 2.9 ms
mega-renames: 679.1 ms ± 5.6 ms 661.8 ms ± 5.9 ms
just-one-mega: 271.9 ms ± 2.8 ms 264.6 ms ± 2.5 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 198.1 ms ± 2.6 ms 198.5 ms ± 3.4 ms
mega-renames: 715.8 ms ± 4.0 ms 679.1 ms ± 5.6 ms
just-one-mega: 276.8 ms ± 4.2 ms 271.9 ms ± 2.8 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We want to be able to allocate filespecs and filepairs using a mem_pool.
However, filespec data will still remain outside the pool (perhaps in
the future we could plumb the pool through the various diff APIs to
allocate the filespec data too, but for now we are limiting the scope).
Add some extra functions to allocate these appropriately based on the
non-NULL-ness of opt->priv->pool, as well as some extra functions to
handle correctly deallocating the relevant parts of them. A future
commit will make use of these new functions.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For all the strmaps (including strintmaps and strsets) whose memory is
unconditionally freed as part of clear_or_reinit_internal_opts(), switch
them over to using our new memory pool.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 202.5 ms ± 3.2 ms 198.1 ms ± 2.6 ms
mega-renames: 1.072 s ± 0.012 s 715.8 ms ± 4.0 ms
just-one-mega: 357.3 ms ± 3.9 ms 276.8 ms ± 4.2 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
merge-ort has a lot of data structures, and they all tend to be freed
together in clear_or_reinit_internal_opts(). Set up a memory pool to
allow us to make these allocations and deallocations faster. Future
commits will adjust various callers to make use of this memory pool.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make the code more flexible so that it can handle both being run with or
without a memory pool by adding utility functions which will either call
xmalloc, xcalloc, xstrndup
or
mem_pool_alloc, mem_pool_calloc, mem_pool_strndup
depending on whether we have a non-NULL memory pool. A subsequent
commit will make use of these.
(We will actually be dropping these functions soon and just assuming we
always have a memory pool, but the flexibility was very useful during
development of merge-ort so I want to be able to restore it if needed.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to make it clearer that these three variables holding a
function refer to functions that will clear the strmap/strintmap/strset,
rename them to str{map,intmap,set}_clear_func().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Documentation on "git diff -l<n>" and diff.renameLimit have been
updated, and the defaults for these limits have been raised.
* en/rename-limits-doc:
rename: bump limit defaults yet again
diffcore-rename: treat a rename_limit of 0 as unlimited
doc: clarify documentation for rename/copy limits
diff: correct warning message when renameLimit exceeded
|
|
Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.
* ab/attribute-format:
advice.h: add missing __attribute__((format)) & fix usage
*.h: add a few missing __attribute__((format))
*.c static functions: add missing __attribute__((format))
sequencer.c: move static function to avoid forward decl
*.c static functions: don't forward-declare __attribute__
|
|
The merge algorithm mostly consists of the following three functions:
collect_merge_info()
detect_and_process_renames()
process_entries()
Prior to the trivial directory resolution optimization of the last half
dozen commits, process_entries() was consistently the slowest, followed
by collect_merge_info(), then detect_and_process_renames(). When the
trivial directory resolution applies, it often dramatically decreases
the amount of time spent in the two slower functions.
Looking at the performance results in the previous commit, the trivial
directory resolution optimization helps amazingly well when there are no
relevant renames. It also helps really well when reapplying a long
series of linear commits (such as in a rebase or cherry-pick), since the
relevant renames may well be cached from the first reapplied commit.
But when there are any relevant renames that are not cached (represented
by the just-one-mega testcase), then the optimization does not help at
all.
Often, I noticed that when the optimization does not apply, it is
because there are a handful of relevant sources -- maybe even only one.
It felt frustrating to need to recurse into potentially hundreds or even
thousands of directories just for a single rename, but it was needed for
correctness.
However, staring at this list of functions and noticing that
process_entries() is the most expensive and knowing I could avoid it if
I had cached renames suggested a simple idea: change
collect_merge_info()
detect_and_process_renames()
process_entries()
into
collect_merge_info()
detect_and_process_renames()
<cache all the renames, and restart>
collect_merge_info()
detect_and_process_renames()
process_entries()
This may seem odd and look like more work. However, note that although
we run collect_merge_info() twice, the second time we get to employ
trivial directory resolves, which makes it much faster, so the increased
time in collect_merge_info() is small. While we run
detect_and_process_renames() again, all renames are cached so it's
nearly a no-op (we don't call into diffcore_rename_extended() but we do
have a little bit of data structure checking and fixing up). And the
big payoff comes from the fact that process_entries(), will be much
faster due to having far fewer entries to process.
This restarting only makes sense if we can save recursing into enough
directories to make it worth our while. Introduce a simple heuristic to
guide this. Note that this heuristic uses a "wanted_factor" that I have
virtually no actual real world data for, just some back-of-the-envelope
quasi-scientific calculations that I included in some comments and then
plucked a simple round number out of thin air. It could be that
tweaking this number to make it either higher or lower improves the
optimization. (There's slightly more here; when I first introduced this
optimization, I used a factor of 10, because I was completely confident
it was big enough to not cause slowdowns in special cases. I was
certain it was higher than needed. Several months later, I added the
rough calculations which make me think the optimal number is close to 2;
but instead of pushing to the limit, I just bumped it to 3 to reduce the
risk that there are special cases where this optimization can result in
slowing down the code a little. If the ratio of path counts is below 3,
we probably will only see minor performance improvements at best
anyway.)
Also, note that while the diffstat looks kind of long (nearly 100
lines), more than half of it is in two comments explaining how things
work.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 205.1 ms ± 3.8 ms 204.2 ms ± 3.0 ms
mega-renames: 1.564 s ± 0.010 s 1.076 s ± 0.015 s
just-one-mega: 479.5 ms ± 3.9 ms 364.1 ms ± 7.0 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This combines the work of the last several patches, and implements the
conditions when we don't need to recurse into directories. It's perhaps
easiest to see the logic by separating the fact that a directory might
have both rename sources and rename destinations:
* rename sources: only files present in the merge base can serve as
rename sources, and only when one side deletes that file. When the
tree on one side matches the merge base, that means every file
within the subtree matches the merge base. This means that the
skip-irrelevant-rename-detection optimization from before kicks in
and we don't need any of these files as rename sources.
* rename destinations: the tree that does not match the merge base
might have newly added and hence unmatched destination files.
This is what usually prevents us from doing trivial directory
resolutions in the merge machinery. However, the fact that we have
deferred recursing into this directory until the end means we know
whether there are any unmatched relevant potential rename sources
elsewhere in this merge. If there are no unmatched such relevant
sources anywhere, then there is no need to look for unmatched
potential rename destinations to match them with.
This informs our algorithm:
* Search through relevant_sources; if we have entries, they better all
be reflected in cached_pairs or cached_irrelevant, otherwise they
represent an unmatched potential rename source (causing the
optimization to be disallowed).
* For any relevant_source represented in cached_pairs, we do need to
to make sure to get the destination for each source, meaning we need
to recurse into any ancestor directories of those destinations.
* Once we've recursed into all the rename destinations for any
relevant_sources in cached_pairs, we can then do the trivial
directory resolution for the remaining directories.
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:
Before After
no-renames: 5.235 s ± 0.042 s 205.1 ms ± 3.8 ms
mega-renames: 9.419 s ± 0.107 s 1.564 s ± 0.010 s
just-one-mega: 480.1 ms ± 3.9 ms 479.5 ms ± 3.9 ms
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When one side of history matches the merge base (including when the
merge base has no entry for the given directory), have
collect_merge_info_callback() defer recursing into the directory. To
ensure those entries are eventually handled, add a call to
handled_deferred_entries() in collect_merge_info() after
traverse_trees() returns.
Note that the condition in collect_merge_info_callback() may look more
complicated than necessary at first glance;
renames->trivial_merges_okay[side] is always true until
handle_deferred_entries() is called, and possible_trivial_merges[side]
is always empty right now (and in the future won't be filled until
handle_deferred_entries() is called). However, when
handle_deferred_entries() calls traverse_trees() for the relevant
deferred directories, those traverse_trees() calls will once again end
up in collect_merge_info_callback() for all the entries under those
subdirectories. The extra conditions are there for such deferred cases
and will be used more as we do more with those variables.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to allow trivial directory resolution, we first need to be able
to gather more information to determine if the optimization is safe. To
enable that, we need a way of deferring the recursion into the directory
until a later time. Naturally, deferring the entry into a subtree means
that we need some function that will later recurse into the subdirectory
exactly the same way that collect_merge_info_callback() would have done.
Add a helper function that does this. For now this function is not used
but a subsequent commit will change that. Future commits will also make
the function sometimes resolve directories instead of traversing inside.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As noted a few commits ago, we can resolve individual files early if all
three sides of the merge have a file at the path and two of the three
sides match. We would really like to do the same thing with
directories, because being able to do a trivial directory resolve means
we don't have to recurse into the directory, potentially saving us a
huge amount of time in both collect_merge_info() and process_entries().
Unfortunately, resolving directories early would mean missing any
renames whose source or destination is underneath that directory.
If we somehow knew there weren't any renames under the directory in
question, then we could resolve it early. Sadly, it is impossible to
determine whether there are renames under the directory in question
without recursing into it, and this has traditionally kept us from ever
implementing such an optimization.
In commit f89b4f2bee ("merge-ort: skip rename detection entirely if
possible", 2021-03-11), we added an additional reason that rename
detection could be skipped entirely -- namely, if no *relevant* sources
were present. Without completing collect_merge_info_callback(), we do
not yet know if there are no relevant sources. However, we do know that
if the current directory on one side matches the merge base, then every
source file within that directory will not be RELEVANT_CONTENT, and a
few simple checks can often let us rule out RELEVANT_LOCATION as well.
This suggests we can just defer recursing into such directories until
the end of collect_merge_info.
Since the deferred directories are known to not add any relevant sources
due to the above properties, then if there are no relevant sources after
we've traversed all paths other than the deferred ones, then we know
there are not any relevant sources. Under those conditions, rename
detection is unnecessary, and that means we can resolve the deferred
directories without recursing into them.
Note that the logic for skipping rename detection was also modified
further in commit 76e253793c ("merge-ort, diffcore-rename: employ cached
renames when possible", 2021-01-30); in particular rename detection can
be skipped if we already have cached renames for each relevant source.
We can take advantage of this information as well with our deferral of
recursing into directories where one side matches the merge base.
Add some data structures that we will use to do these deferrals, with
some lengthy comments explaining their purpose.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The previous patch possibly raises some questions about whether
additional cases in collect_merge_info_callback() can be handled early.
Add some explanations in the form of comments to help explain these
better. While we're at it, add a few comments to denote what a few
boolean '0' or '1' values stand for.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When there are no directories involved at a given path, and all three
sides have a file at that path, and two of the three sides of history
match, we can immediately resolve the merge of that path in
collect_merge_info() and do not need to wait until process_entries().
This is actually a very minor improvement: half the time when I run it,
I see an improvement; the other half a slowdown. It seems to be in the
range of noise. However, this idea serves as the beginning of some
bigger optimizations coming in the following patches.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code cleanup around struct_type_init() functions.
* ab/struct-init:
string-list.h users: change to use *_{nodup,dup}()
string-list.[ch]: add a string_list_init_{nodup,dup}()
dir.[ch]: replace dir_init() with DIR_INIT
*.c *_init(): define in terms of corresponding *_INIT macro
*.h: move some *_INIT to designated initializers
|
|
The merge code had funny interactions between content based rename
detection and directory rename detection.
* en/merge-dir-rename-corner-case-fix:
merge-recursive: handle rename-to-self case
merge-ort: ensure we consult df_conflict and path_conflicts
t6423: test directory renames causing rename-to-self
|
|
Performance tweaks of "git merge -sort" around lazy fetching of objects.
* en/ort-perf-batch-13:
merge-ort: add prefetching for content merges
diffcore-rename: use a different prefetch for basename comparisons
diffcore-rename: allow different missing_object_cb functions
t6421: add tests checking for excessive object downloads during merge
promisor-remote: output trace2 statistics for number of objects fetched
|
|
More fix-ups and optimization to "merge -sort".
* en/ort-perf-batch-12:
merge-ort: miscellaneous touch-ups
Fix various issues found in comments
diffcore-rename: avoid unnecessary strdup'ing in break_idx
merge-ort: replace string_list_df_name_compare with faster alternative
|
|
These were last bumped in commit 92c57e5c1d29 (bump rename limit
defaults (again), 2011-02-19), and were bumped both because processors
had gotten faster, and because people were getting ugly merges that
caused problems and reporting it to the mailing list (suggesting that
folks were willing to spend more time waiting).
Since that time:
* Linus has continued recommending kernel folks to set
diff.renameLimit=0 (maps to 32767, currently)
* Folks with repositories with lots of renames were happy to set
merge.renameLimit above 32767, once the code supported that, to
get correct cherry-picks
* Processors have gotten faster
* It has been discovered that the timing methodology used last time
probably used too large example files.
The last point is probably worth explaining a bit more:
* The "average" file size used appears to have been average blob size
in the linux kernel history at the time (probably v2.6.25 or
something close to it).
* Since bigger files are modified more frequently, such a computation
weights towards larger files.
* Larger files may be more likely to be modified over time, but are
not more likely to be renamed -- the mean and median blob size
within a tree are a bit higher than the mean and median of blob
sizes in the history leading up to that version for the linux
kernel.
* The mean blob size in v2.6.25 was half the average blob size in
history leading to that point
* The median blob size in v2.6.25 was about 40% of the mean blob size
in v2.6.25.
* Since the mean blob size is more than double the median blob size,
any file as big as the mean will not be compared to any files of
median size or less (because they'd be more than 50% dissimilar).
* Since it is the number of files compared that provides the O(n^2)
behavior, median-sized files should matter more than mean-sized
ones.
The combined effect of the above is that the file size used in past
calculations was likely about 5x too large. Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by
a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated
time limits.
Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly. In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/
for details in that area.) For now, let's just bump the approximate
time limit from 10s to 1m.
(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've
put in a lot of work rather than just upfront telling the user there are
too many files involved.)
Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):
N Timing
1300 1.995s
7100 59.973s
So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.
Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/
in particular to avoid triggering the linear handling from
basename-guided rename detection):
#!/bin/bash
n=$1; shift
rm -rf repo
mkdir repo && cd repo
git init -q -b main
mkdata() {
mkdir $1
for i in `seq 1 $2`; do
(sed "s/^/$i /" <../sample
echo tag: $1
) >$1/$i
done
}
mkdata initial $n
git add .
git commit -q -m initial
mkdata new $n
git add .
cd new
for i in *; do git mv $i $i.renamed; done
cd ..
git rm -q -rf initial
git commit -q -m new
time git diff-tree -M -l0 --summary HEAD^ HEAD
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add missing __attribute__((format)) function attributes to various
"static" functions that take printf arguments.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change all in-tree users of the string_list_init(LIST, BOOL) API to
use string_list_init_{nodup,dup}(LIST) instead.
As noted in the preceding commit let's leave the now-unused
string_list_init() wrapper in-place for any in-flight users, it can be
removed at some later date.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|