| Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The code to walk revision graph to compute merge base has been
optimized.
* rs/merge-base-optim:
commit-reach: avoid commit_list_insert_by_date()
|
|
Finishing touches to fixes to the recent regression in "git diff -w
--quiet" and anything that needs to internally generate patch to
see if it turns empty.
* jk/diff-patch-dry-run-cleanup:
diff: simplify run_external_diff() quiet logic
diff: drop dry-run redirection to /dev/null
diff: replace diff_options.dry_run flag with NULL file
diff: drop save/restore of color_moved in dry-run mode
diff: send external diff output to diff_options.file
|
|
"git maintenance" command learns the "geometric" strategy where it
avoids doing maintenance tasks that rebuild everything from
scratch.
* ps/maintenance-geometric:
t7900: fix a flaky test due to git-repack always regenerating MIDX
builtin/maintenance: introduce "geometric" strategy
builtin/maintenance: make "gc" strategy accessible
builtin/maintenance: extend "maintenance.strategy" to manual maintenance
builtin/maintenance: run maintenance tasks depending on type
builtin/maintenance: improve readability of strategies
builtin/maintenance: don't silently ignore invalid strategy
builtin/maintenance: make the geometric factor configurable
builtin/maintenance: introduce "geometric-repack" task
builtin/gc: make `too_many_loose_objects()` reusable without GC config
builtin/gc: remove global `repack` variable
|
|
The wildmatch code had a corner case bug that mistakenly makes
"foo**/bar" match with "foobar", which has been corrected.
* jk/match-pathname-fix:
match_pathname(): give fnmatch one char of prefix context
match_pathname(): reorder prefix-match check
|
|
* kh/doc-patch-id-1:
doc: patch-id: convert to the modern synopsis style
|
|
The 'q'(uit) command in "git add -p" has been improved to quit
without doing any meaningless work before leaving, and giving EOF
(typically control-D) to the prompt is made to behave the same way.
* rs/add-patch-quit:
add-patch: quit on EOF
add-patch: quit without skipping undecided hunks
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git bisect" command did not react correctly to "git bisect help"
and "git bisect unknown", which has been corrected.
* rz/bisect-help-unknown:
bisect: fix handling of `help` and invalid subcommands
|
|
"git shortlog" knows "--committer" and "--author" options, which
the command line completion (in contrib/) did not handle well,
which has been corrected.
* kf/log-shortlog-completion-fix:
completion: complete some 'git log' options
|
|
Regression fixes for a topic that has already been merged.
* ly/diff-name-only-with-diff-from-content:
diff: stop output garbled message in dry run mode
|
|
Two slightly different ways to get at "all the packfiles" in the API
have been cleaned up.
* ps/remove-packfile-store-get-packs:
packfile: rename `packfile_store_get_all_packs()`
packfile: introduce macro to iterate through packs
packfile: drop `packfile_store_get_packs()`
builtin/grep: simplify how we preload packs
builtin/gc: convert to use `packfile_store_get_all_packs()`
object-name: convert to use `packfile_store_get_all_packs()`
|
|
strbuf_split*() to split a string into multiple strbufs is often a
wrong API to use. A few uses of it have been removed by
simplifying the code.
* ob/gpg-interface-cleanup:
gpg-interface: do not use misdesigned strbuf_split*()
gpg-interface: do not use misdesigned strbuf_split*()
|
|
"Symlink symref" has been added to the list of things that will
disappear at Git 3.0 boundary.
* ps/symlink-symref-deprecation:
refs/files: deprecate writing symrefs as symbolic links
|
|
A new configuration variable commitGraph.changedPaths allows
turning "--changed-paths" on by default for "git commit-graph".
* ey/commit-graph-changed-paths-config:
commit-graph: add new config for changed-paths & recommend it in scalar
|
|
We track packfiles via two different lists:
- `struct packfile_store::packs` is a list that sorts local packs
first. In addition, these packs are sorted so that younger packs are
sorted towards the front.
- `struct packfile_store::mru` is a list that sorts packs so that
most-recently used packs are at the front.
The reasoning behind the ordering in the `packs` list is that younger
objects stored in the local object store tend to be accessed more
frequently, and that is certainly true for some cases. But there are
going to be lots of cases where that isn't true. Especially when
traversing history it is likely that one needs to access many older
objects, and due to our housekeeping it is very likely that almost all
of those older objects will be contained in one large pack that is
oldest.
So whether or not the ordering makes sense really depends on the use
case at hand. A flexible approach like our MRU list addresses that need,
as it will sort packs towards the front that are accessed all the time.
Intuitively, this approach is thus able to satisfy more use cases more
efficiently.
This reasoning casts some doubt on whether or not it really makes sense
to track packs via two different lists. It causes confusion, and it is
not clear whether there are use cases where the `packs` list really is
such an obvious choice.
Merge these two lists into one most-recently-used list.
Note that there is one important edge case: `for_each_packed_object()`
uses the MRU list to iterate through packs, and then it lists each
object in those packs. This would have the effect that we now sort the
current pack towards the front, thus modifying the list of packfiles we
are iterating over, with the consequence that we'll see an infinite
loop. This edge case is worked around by introducing a new field that
allows us to skip updating the MRU.
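The edge case can be illustrated with a minimal sketch. The structures and names below (`struct pack`, `struct pack_list`, `skip_mru_updates`, `pack_list_touch()`) are hypothetical simplifications and not Git's actual definitions. Without the flag, touching a pack while walking the list would move it to the front and the walk would keep revisiting the same entries.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not Git's real structures. */
struct pack {
	const char *name;
	struct pack *next;
};

struct pack_list {
	struct pack *head;
	int skip_mru_updates; /* hypothetical "do not reorder" flag */
};

/* Move `p` to the front of the list unless updates are suppressed. */
static void pack_list_touch(struct pack_list *list, struct pack *p)
{
	struct pack **link;

	if (list->skip_mru_updates || list->head == p)
		return;
	for (link = &list->head; *link; link = &(*link)->next) {
		if (*link == p) {
			*link = p->next;      /* unlink from current spot */
			p->next = list->head; /* relink at the front */
			list->head = p;
			return;
		}
	}
}

/* Visit every pack once; visiting a pack counts as "using" it. */
static size_t pack_list_count_visits(struct pack_list *list)
{
	size_t visits = 0;
	struct pack *p;

	list->skip_mru_updates = 1;
	for (p = list->head; p; p = p->next) {
		pack_list_touch(list, p); /* no-op while the flag is set */
		visits++;
	}
	list->skip_mru_updates = 0;
	return visits;
}
```

In this sketch the flag is set only for the duration of the walk, so ordinary lookups outside the iteration still reorder the list.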
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When preparing the packfile store we know to also prepare the MRU list
of packfiles with all packs that are currently loaded in the store via
`packfile_store_prepare_mru()`. So we know that the list of packs in the
MRU list should match the list of packs in the non-MRU list.
But there are some direct or indirect callsites that add a packfile to
the store via `packfile_store_add_pack()` without adding the pack to the
MRU. And while functions that access the MRU (e.g. `find_pack_entry()`)
know to call `packfile_store_prepare()`, which knows to prepare the MRU
via `packfile_store_prepare_mru()`, that operation will be turned into a
no-op because the packfile store is already prepared. So this will not
cause us to add the packfile to the MRU, and consequently we won't be
able to find the packfile in our MRU list.
There are only a handful of callers outside of "packfile.c" that add a
packfile to the store:
- "builtin/fast-import.c" adds multiple packs of imported objects, but
it knows to look up objects via `packfile_store_get_packs()`. This
function does not use the MRU, so we're good.
- "builtin/index-pack.c" adds the indexed pack to the store in case it
needs to perform consistency checks on its objects.
- "http.c" adds the fetched pack to the store so that we can access
its objects.
In all of these cases we actually want to access the contained objects.
And luckily, reading these objects works as expected:
1. We eventually end up in `do_oid_object_info_extended()`.
2. Calling `find_pack_entry()` fails because the MRU list doesn't
contain the newly added packfile.
3. The callers don't pass `OBJECT_INFO_QUICK`, so we end up
repreparing the object database. This will also cause us to
reprepare the MRU list.
4. We now retry reading the object via `find_pack_entry()`, and now we
succeed because the MRU list got populated.
This logic feels quite fragile: we intentionally add the packfile to the
store, but we then ultimately rely on repreparing the entire store only
to make the packfile accessible. While we do the correct thing in
`do_oid_object_info_extended()`, other sites that access the MRU may not
know to reprepare.
But besides being fragile it's also a waste of resources: repreparing
the object database requires us to re-read the alternates file and
discard any caches.
Refactor the code so that we unconditionally add packfiles to the MRU
when adding them to a packfile store. This makes the logic less fragile
and ensures that we don't have to reprepare the store to make the pack
accessible.
Note that this does not allow us to drop `packfile_store_prepare_mru()`
just yet: while the MRU list is already populated with all packs now,
the order in which we add these packs is nondeterministic for the most
part. So by first calling `sort_pack()` on the other packfile list and
then re-preparing the MRU list we inherit its sorting.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move the list of packs into the packfile store. This follows the same
logic as in a previous commit, where we moved the most-recently-used
list of packs, as well.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `has_sha1_pack_kept_or_nonlocal()` takes an object ID and
then searches through packed objects to figure out whether the object
exists in a kept or non-local pack. As a performance optimization we
remember the packfile that contains a given object ID so that the next
call to the function first checks that same packfile again.
The way this is written is rather hard to follow though, as the caching
mechanism is intertwined with the loop that iterates through the packs.
Consequently, we need to do some gymnastics to re-start the iteration if
the cached pack does not contain the objects.
Refactor this so that we check the cached packfile at the beginning. We
don't have to re-verify whether the packfile meets the properties as we
have already verified those when storing the pack in `last_found` in the
first place. So all we need to do is to use `find_pack_entry_one()` to
check whether the pack contains the object ID, and to skip the cached
pack in the loop so that we don't search it twice.
Furthermore, stop using the `(void *)1` sentinel value and instead use a
simple `NULL` pointer to indicate that we don't have a last-found pack
yet.
This refactoring significantly simplifies the logic and makes it much
easier to follow.
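The shape of the refactored lookup might look roughly like the sketch below. It is a simplified model: object IDs are plain integers, `struct pack` and `pack_contains()` are hypothetical, and the kept/non-local filtering that the real function performs is omitted.

```c
#include <stddef.h>

/* Hypothetical, simplified pack: object IDs are plain ints here. */
struct pack {
	const int *ids;
	size_t nr;
	struct pack *next;
};

static int pack_contains(const struct pack *p, int id)
{
	size_t i;

	for (i = 0; i < p->nr; i++)
		if (p->ids[i] == id)
			return 1;
	return 0;
}

/* NULL means "no pack cached yet"; no sentinel value is needed. */
static struct pack *last_found;

static int find_in_packs(struct pack *packs, int id)
{
	struct pack *p;

	/* Check the cached pack first. */
	if (last_found && pack_contains(last_found, id))
		return 1;

	for (p = packs; p; p = p->next) {
		if (p == last_found)
			continue; /* already searched above */
		if (pack_contains(p, id)) {
			last_found = p;
			return 1;
		}
	}
	return 0;
}
```

Checking the cache up front means the loop never has to restart, which is the simplification the message describes.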
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When approximating the number of objects in a repository we only take
into account two data sources, the multi-pack index and the packfile
indices, as both of these data structures allow us to easily figure out
how many objects they contain.
But the way we currently approximate the number of objects is broken in
presence of a multi-pack index. This is due to two separate reasons:
- We have recently introduced initial infrastructure for incremental
multi-pack indices. Starting with that series, `num_objects` only
counts the number of objects of a specific layer of the MIDX chain,
so we do not take into account objects from parent layers.
This issue is fixed by adding `num_objects_in_base`, which contains
the sum of all objects in previous layers.
- When using the multi-pack index we may count objects contained in
packfiles twice: once via the multi-pack index, but then we again
count them via the packfile itself.
This issue is fixed by skipping any packfiles that have an MIDX.
Overall, given that we _always_ count the packs, we can only end up
overestimating the number of objects, and the overestimation is limited
to a factor of two at most.
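As a rough sketch of the two fixes, the counting could be modeled as follows. Only the field names `num_objects` and `num_objects_in_base` come from the description above; the structures, the `covered_by_midx` field, and the function name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified view of the tip layer of a MIDX chain. */
struct midx_layer {
	unsigned long num_objects;         /* objects in this layer only */
	unsigned long num_objects_in_base; /* sum over all parent layers */
};

/* Hypothetical, simplified packfile. */
struct pack {
	unsigned long num_objects;
	int covered_by_midx; /* nonzero if the MIDX already indexes it */
};

static unsigned long approximate_count(const struct midx_layer *tip,
				       const struct pack *packs, size_t nr)
{
	unsigned long count = 0;
	size_t i;

	/* Fix 1: include objects from parent layers of the MIDX chain. */
	if (tip)
		count += tip->num_objects + tip->num_objects_in_base;

	/* Fix 2: skip packs that were already counted via the MIDX. */
	for (i = 0; i < nr; i++)
		if (!packs[i].covered_by_midx)
			count += packs[i].num_objects;

	return count;
}
```

With a tip layer of 100 objects on top of 400 base objects and one extra pack of 50 objects outside the MIDX, this sketch yields 550, whereas counting the covered packs again would inflate the result.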
The consequences of those issues are very limited though, as we only
approximate object counts in a small number of cases:
- When writing a commit-graph we use the approximate object count to
display the upper limit of a progress display.
- In `repo_find_unique_abbrev_r()` we use it to specify a lower limit
of how many hex digits we want to abbreviate to. Given that we use
power-of-two here to derive the lower limit we may end up with an
abbreviated hash that is one digit longer than required.
- In `estimate_repack_memory()` we may end up overestimating how much
memory a repack needs to pack objects. Consequently, we may end up
dropping some packfiles from a repack.
None of these are really game-changing. But it's nice to fix those
issues regardless.
While at it, convert the code to use `repo_for_each_pack()`.
Furthermore, use `odb_prepare_alternates()` instead of explicitly
preparing the packfile store. We really only want to prepare the object
database sources, and `get_multi_pack_index()` already knows to prepare
the packfile store for us.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The dumb HTTP protocol directly fetches packfiles from the remote server
and temporarily stores them in a list of packfiles. Those packfiles are
not yet added to the repository's packfile store until we finalize the
whole fetch.
Refactor the code to instead use a `struct packfile_list` to store those
packs. This prepares us for a subsequent change where the `->next`
pointer of `struct packed_git` will go away.
Note that this refactoring creates some temporary duplication of code,
as we now have both `packfile_list_find_oid()` and `find_oid_pack()`.
The latter function will be removed in a subsequent commit though.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Packfiles have two lists associated to them:
- A list that keeps track of packfiles in the order that they were
added to a packfile store.
- A list that keeps track of packfiles in most-recently-used order so
that packfiles that are more likely to contain a specific object are
ordered towards the front.
Both of these lists are hosted by `struct packed_git` itself. So to
identify all packfiles in a repository you simply need to grab the first
packfile and then iterate the `->next` pointers or the MRU list. This
pattern has the problem that all packfiles are part of the same list,
regardless of whether or not they belong to the same object source.
With the upcoming pluggable object database effort this needs to change:
packfiles should be contained by a single object source, and reading an
object from any such packfile should use that source to look up the
object. Consequently, we need to break up the global lists of packfiles
into per-object-source lists.
A first step towards this goal is to move those lists out of `struct
packed_git` and into the packfile store. While the packfile store is
currently sitting on the `struct object_database` level, the intent is
to push it down one level into the `struct odb_source` in a subsequent
patch series.
Introduce a new `struct packfile_list` that is used to manage lists of
packfiles and use it to store the list of most-recently-used packfiles
in `struct packfile_store`. For now, the new list type is only used in a
single spot, but we'll expand its usage in subsequent patches.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To allow fast lookups of a packfile by name we use a hashmap that has
the packfile name as key and the pack itself as value. But while this is
the perfect use case for a `strmap`, we instead use `struct hashmap` and
store the hashmap entry in the packfile itself.
Simplify the code by using a `strmap` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Previous commits have marked a number of error or warning messages in
"builtin/fast-export.c" and "builtin/fast-import.c" for translation.
As "gpg-interface.c" code is used by the fast-export and fast-import
code, we should make sure that error or warning messages are also all
marked for translation in "gpg-interface.c".
To ensure that, let's mark for translation an error message in a
die() function.
With this, all the error and warning messages emitted by fast-export
and fast-import can be properly translated.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some error or warning messages in "builtin/fast-import.c" are marked
for translation, but many are not.
To be more consistent and provide a better experience to people using a
translated version, let's mark all the remaining error or warning
messages for translation.
While at it, let's make the following small changes:
- replace "GIT" or "git" in a few error messages to just "Git",
- replace "Expected from command, got %s" to "expected 'from'
command, got '%s'", which makes it clearer that "from" is a command
and should not be translated,
- downcase error and warning messages that start with an uppercase,
- fix test cases in "t9300-fast-import.sh" that broke because an
error or warning message was downcased,
- split error and warning messages that are too long,
- adjust the indentation of some arguments of the error functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some error or warning messages in "builtin/fast-export.c" are marked
for translation, but many are not.
To be more consistent and provide a better experience to people using a
translated version, let's mark all the remaining error or warning
messages for translation.
While at it:
- improve how some arguments to some error functions are indented,
- remove "Error:" at the start of an error message,
- downcase error and warning messages that start with an uppercase.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "gpg-interface.h", the definitions of the GPG_VERIFY_* boolean flags
are currently using 1, 2 and 4 while we often prefer the bitwise left
shift operator, `<<`, for that purpose to make it clearer that they are
boolean.
Let's use the left shift operator here too. Let's also fix an indent
issue with "4" while at it.
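For illustration, the style in question looks like the sketch below. The macro names are hypothetical placeholders; only the values 1, 2 and 4 are taken from the description.

```c
/* Hypothetical names; only the values 1, 2 and 4 come from the text. */
#define EXAMPLE_VERIFY_A (1 << 0) /* previously written as 1 */
#define EXAMPLE_VERIFY_B (1 << 1) /* previously written as 2 */
#define EXAMPLE_VERIFY_C (1 << 2) /* previously written as 4 */

/* Returns nonzero if `flag` is set in the combined `flags` word. */
static int flag_is_set(unsigned flags, unsigned flag)
{
	return (flags & flag) != 0;
}
```

The shift form makes it visible at a glance that each constant occupies its own bit and that the next free value is `(1 << 3)`.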
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "gpg-interface.c", the 'parse_ssh_output()' function takes a
'struct signature_check *sigc' argument and populates many members of
this 'sigc' using information parsed from 'sigc->output' which
contains the output of an `ssh-keygen -Y ...` command that was used to
verify an SSH signature.
When it populates 'sigc->fingerprint' though, it uses
`xstrdup(strstr(line, "key ") + 4)` while `strstr(line, "key ")` has
already been computed a few lines above and is already available in
the `key` variable.
Let's simplify this.
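The pattern being simplified can be sketched in isolation as follows. The helper name `copy_after_key()` is hypothetical, and plain `malloc()` stands in for Git's `xstrdup()`; the point is only that the result of `strstr()` is computed once and reused.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of the text following "key " in `line`,
 * or NULL if the marker is absent or allocation fails.
 */
static char *copy_after_key(const char *line)
{
	const char *key = strstr(line, "key "); /* computed once */
	size_t len;
	char *copy;

	if (!key)
		return NULL;
	key += 4; /* skip over "key " */
	len = strlen(key);
	copy = malloc(len + 1);
	if (copy)
		memcpy(copy, key, len + 1); /* includes the NUL terminator */
	return copy;
}
```

The caller owns the returned string and is expected to `free()` it.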
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Clean-up "git repack" machinery to prepare for incremental update
of midx files.
* tb/incremental-midx-part-3.1: (49 commits)
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
builtin/repack.c: reorder `remove_redundant_bitmaps()`
repack: keep track of MIDX pack names using existing_packs
builtin/repack.c: use a string_list for 'midx_pack_names'
builtin/repack.c: extract opts struct for 'write_midx_included_packs()'
builtin/repack.c: remove ref snapshotting from builtin
repack: remove pack_geometry API from the builtin
...
|
|
We read the input into a strbuf, so we must free it. Without this, t1016
complains in SANITIZE=leak mode.
The bug was introduced in 7673ecd2dc (t1016-compatObjectFormat: add
tests to verify the conversion between objects, 2023-10-01). But nobody
seems to have noticed, probably because CI did not run these tests until
the fix in 6cd8369ef3 (t/lib-gpg: call prepare_gnupghome() in GPG2
prereq, 2024-07-03).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Because gitignore patterns are passed to fnmatch, the handling of
backslashes is the same as it is there: it can be used to escape
metacharacters. We do reference fnmatch(3) for more details, but it may
be friendlier to point out this implication explicitly (especially for
people who want to know about backslash handling and search the
documentation for that word). There are also two cases that I've seen
some other backslash-escaping systems handle differently, so let's
describe those:
1. A backslash before any character treats that character literally,
even if it's not otherwise a meta-character. As opposed to
including the backslash itself (like "foo\bar" in shell expands to
"foo\bar") or forbidding it ("foo\zar" is required to produce a
diagnostic in C).
2. A backslash at the end of the string is an invalid pattern (and not
a literal backslash).
This second one in particular was a point of confusion between our
implementation and the one in JGit. Our wildmatch behavior matches what
POSIX specifies for fnmatch, so the code and documentation are in line.
But let's add a test to cover this case. Note that the behavior here
differs between wildmatch itself (which is what gitignore will use) and
pathspec matching (which will only turn to wildmatch if a literal match
fails). So we match "foo\" to "foo\" in pathspecs, but not via
gitignore.
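The two rules can be modeled with a small checker. This is only a sketch of the escaping rule itself, not wildmatch's code, and the function name is hypothetical; it ignores every other reason a pattern might be invalid.

```c
#include <stddef.h>

/*
 * Returns 1 if every backslash in `pattern` is followed by a character
 * to escape, and 0 if the pattern ends in a dangling backslash.
 */
static int backslashes_well_formed(const char *pattern)
{
	size_t i;

	for (i = 0; pattern[i]; i++) {
		if (pattern[i] != '\\')
			continue;
		if (!pattern[i + 1])
			return 0; /* trailing backslash: nothing to escape */
		i++; /* skip the escaped character, whatever it is */
	}
	return 1;
}
```

Under this model a pattern ending in a single backslash is rejected, while one ending in two backslashes is accepted because the second backslash is the escaped character.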
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The strategy in t1016-compatObjectFormat is to build two trees with
identical commits, one tree encoded in sha1 the other tree encoded
in sha256 and to use the compatibility code to test and see if
the two trees are identical.
GPG signatures include the current time as part of the signature.
To make gpg deterministic I forced the use of gpg --faked-system-time.
Unfortunately I did not look closely enough.
By default gpg still allows time to move forward with --faked-system-time.
So in those rare instances when the system is heavily loaded and gpg runs
slower than other times, signatures over the exact same data differ
due to timestamps with a minuscule difference.
Reading through the gpg documentation with a close eye, time can be
frozen by including an exclamation point at the end of the argument to
--faked-system-time.
Add the exclamation point so gpg really runs with a fixed notion of time,
resulting in the exact same data having identical gpg signatures.
That is enough that I can run "t1016-compatObjectFormat.sh --stress"
and I don't see any failures.
It is possible a future change to gpg will make replay protection more
robust and not provide a way to allow two separate runs of gpg to
produce exactly the same signature for exactly the same data. If that
happens a deeper comparison of the two repositories will need to be
performed. A comparison that simply verifies the signatures and
compares the data for equality. For now that is a lot of work
for no gain so I am just documenting the possibility.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update the usage string of `git bisect` and documentation to match each
other. While at it, also:
1. Move the synopsis of `git bisect` subcommands to the synopsis
section, so that the test `t0450-txt-doc-vs-help.sh` can pass.
2. Document the `git bisect next` subcommand, which exists in the code
but is missing from the documentation.
See also: [1].
[1]: https://lore.kernel.org/git/3DA38465-7636-4EEF-B074-53E4628F5355@gmail.com/
Suggested-by: Ben Knoble <ben.knoble@gmail.com>
Signed-off-by: Ruoyu Zhong <zhongruoyu@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The placeholder markup is underscore (_), not backtick (`) as well.
The inline-verbatim markup (backticks) handles interior formatting. This
means in this case that it applies HTML `<code>` to the underscores and
`<em>` to the placeholder.
That is the effect, anyway; we can see from the rest of 042d6f34 (doc:
git-checkout: clarify `-b` and `-B`, 2025-09-10) that this was probably
an unintended mix-up.
Acked-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
CI improvements to handle the recent Rust integration better.
* ps/ci-rust:
rust: support for Windows
ci: verify minimum supported Rust version
ci: check for common Rust mistakes via Clippy
rust/varint: add safety comments
ci: check formatting of our Rust code
ci: deduplicate calls to `apt-get update`
|
|
"git fast-import" is taught to handle signed tags, just like it
recently learned to handle signed commits, in different ways.
* cc/fast-import-strip-signed-tags:
fast-import: add '--signed-tags=<mode>' option
fast-export: handle all kinds of tag signatures
t9350: properly count annotated tags
lib-gpg: allow tests with GPGSM or GPGSSH prereq first
doc: git-tag: stop focusing on GPG signed tags
|
|
"git sparse-checkout" subcommand learned a new "clean" action to
prune otherwise unused working-tree files that are outside the
areas of interest.
* ds/sparse-checkout-clean:
sparse-index: improve advice message instructions
t: expand tests around sparse merges and clean
sparse-index: point users to new 'clean' action
sparse-checkout: add --verbose option to 'clean'
dir: add generic "walk all files" helper
sparse-checkout: match some 'clean' behavior
sparse-checkout: add basics of 'clean' command
sparse-checkout: remove use of the_repository
|
|
ps/packed-git-in-object-store
* ps/remove-packfile-store-get-packs: (55 commits)
packfile: rename `packfile_store_get_all_packs()`
packfile: introduce macro to iterate through packs
packfile: drop `packfile_store_get_packs()`
builtin/grep: simplify how we preload packs
builtin/gc: convert to use `packfile_store_get_all_packs()`
object-name: convert to use `packfile_store_get_all_packs()`
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
...
|
|
When a supposedly no-op "git repack" runs across a second boundary,
because the command always touches the MIDX file and updates its
timestamp, "ls -l $GIT_DIR/objects/pack/" before and after the
operation can change, which causes such a test to fail. Only
compare the *.pack files in the directory before and after the
operation to work around this flakiness.
Arguably, git-repack(1) should learn to not rewrite the MIDX in case
we know it is already up-to-date. But this is not a new problem
introduced via the new geometric maintenance task, so for now it
should be good enough to paper over the issue.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: taken from diff to v4 from v3 that was already merged to 'next']
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a note after the `git send-email` section explaining how
contributors can confirm that their patches reached the mailing
list by checking https://lore.kernel.org/git/. This helps
contributors verify that their emails were successfully delivered.
Signed-off-by: Queen Ediri Jessa <qjessa662@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The debug ref backend (refs_be_debug) was missing the remove_on_disk
function pointer, which caused a segmentation fault when running
'GIT_TRACE_REFS=1 git refs migrate --ref-format=reftable' commands.
Signed-off-by: Xinyu Ruan <r200981113@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Test modernization.
* so/t2401-use-test-path-helpers:
t2401: update path checks using test_path helpers
|
|
CI update.
* js/ci-github-actions-update:
build(deps): bump actions/github-script from 7 to 8
build(deps): bump actions/setup-python from 5 to 6
build(deps): bump actions/checkout from 4 to 5
build(deps): bump actions/download-artifact from 4 to 5
|
|
Doc mark-up fixes.
* kh/doc-continued-paragraph-fix:
doc: fix accidental literal blocks
|
|
Code clean-up.
* js/unreachable-workaround-for-no-symlink-head:
refs: forbid clang to complain about unreachable code
|
|
Recent OpenSSH creates the Unix domain socket to communicate with
ssh-agent under $HOME instead of /tmp, which causes our test to
fail due to an overly long pathname in our test environment, which has
been worked around by using "ssh-agent -T".
* ps/t7528-ssh-agent-uds-workaround:
t7528: work around ETOOMANY in OpenSSH 10.1 and newer
|
|
Unicode width table update.
* tb/unicode-width-table-17:
unicode: update the width tables to Unicode 17
|
|
The "--short" option of "git status" that meant output for humans
and "-z" option to show NUL delimited output format did not mix
well, and colored some but not all things. The command has been
updated to color all elements consistently in such a case.
* jk/status-z-short-fix:
status: make coloring of "-z --short" consistent
|
|
An earlier addition to "git diff --no-index A B" to limit the
output with pathspec after the two directories misbehaved when
these directories were given with a trailing slash, which has been
corrected.
* jk/diff-no-index-with-pathspec-fix:
diff --no-index: fix logic for paths ending in '/'
|
|
Windows "real-time monitoring" interferes with the execution of
tests and negatively affects both correctness and performance, so
it has been disabled in GitLab CI.
* ps/gitlab-ci-disable-windows-monitoring:
gitlab-ci: disable realtime monitoring to unbreak Windows jobs
|
|
The code to squelch output from "git diff -w --name-status"
etc. for paths on which "git diff -w -p" would have stayed silent
leaked output from dry-run patch generation, which has been
corrected.
* jc/diff-from-contents-fix:
diff: make sure the other caller of diff_flush_patch_quietly() is silent
|
|
Recently we attempted to improve "git diff -w" and friends to
handle cases where patch output would be suppressed, but it
introduced a bug that emits unnecessary output, which has been
corrected.
* jk/diff-from-contents-fix:
diff: restore redirection to /dev/null for diff_from_contents
|
|
If we reach the end of the input, e.g. because the user pressed ctrl-D
on Linux, there is no point in showing any more prompts, as we won't get
any reply. Do the same as option 'q' would: Quit.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In match_pathname(), which we use for matching .gitignore and
.gitattributes patterns, we are comparing paths with fnmatch patterns
(actually our extended wildmatch, which will be important). There's an
extra optimization there: we pre-compute the number of non-wildcard
characters at the beginning of the pattern and do an fspathncmp() on
that prefix.
That lets us avoid fnmatch entirely on patterns without wildcards, and
shrinks the amount of work we hand off to fnmatch. For a pattern like
"foo*.txt" and a path "foobar.txt", we'd cut away the matching "foo"
prefix and just pass "*.txt" and "bar.txt" to fnmatch().
But this misses a subtle corner case. In fnmatch(), we'll think
"bar.txt" is the start of the path, but it's not. This doesn't matter
for the pattern above, but consider the wildmatch pattern "foo**/bar"
and the path "foobar". These two should not match, because there is no
file named "bar", and the "**" applies only to the containing directory
name. But after removing the "foo" prefix, fnmatch will get "**/bar" and
"bar", which it does consider a match, because "**/" can match zero
directories.
We can solve this by giving fnmatch a bit more context. As long as it
has one byte of the matched prefix, then it will know that "bar" is not
the start of the path. In this example it would get "o**/bar" and
"obar", and realize that they cannot match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
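The corner case from the commit message above can be reproduced with a small Python model. This is a sketch under stated assumptions, not git's C code: `wildmatch()` here is a toy matcher that only understands literals, "*" and a "**/" that starts a path component, and `match_buggy()` / `match_fixed()` are hypothetical stand-ins for the two versions of the prefix handling.

```python
import re


def wildmatch(pattern, path):
    """Toy pathname matcher: literals, '*', and '**/' at a component start."""
    out = []
    i = 0
    while i < len(pattern):
        at_component_start = i == 0 or pattern[i - 1] == "/"
        if at_component_start and pattern.startswith("**/", i):
            out.append("(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern[i] == "*":
            while i < len(pattern) and pattern[i] == "*":
                i += 1
            out.append("[^/]*")  # never crosses a slash
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.fullmatch("".join(out), path) is not None


def literal_prefix_len(pattern):
    """Number of leading characters that are not wildcard syntax."""
    for i, ch in enumerate(pattern):
        if ch in "*?[\\":
            return i
    return len(pattern)


def match_buggy(pattern, path):
    n = literal_prefix_len(pattern)
    if pattern[:n] != path[:n]:
        return False
    # The matcher now believes path[n:] is the start of the path.
    return wildmatch(pattern[n:], path[n:])


def match_fixed(pattern, path):
    n = literal_prefix_len(pattern)
    if pattern[:n] != path[:n]:
        return False
    if n == len(pattern) and n == len(path):
        return True  # the literal prefix was the whole thing
    if n > 0:
        n -= 1  # keep one character of already-matched context
    return wildmatch(pattern[n:], path[n:])


print(match_buggy("foo**/bar", "foobar"))  # the false positive
print(match_fixed("foo**/bar", "foobar"))
```

With the extra character of context the toy matcher sees "o**/bar" against "obar", so "**/" no longer sits at the start of the pattern and cannot match zero directories.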
|
|
As an optimization, we use fspathncmp() to match a prefix of the pattern
that does not contain any wildcards, and then pass the remainder to
fnmatch(). If it has matched the whole thing, we can return early.
Let's shift this early-return check to before we tweak the pattern and
name strings. That will give us more flexibility with that tweaking.
It might also save a few instructions, but I couldn't measure any
improvement in doing so (and I wouldn't be surprised if an optimizing
compiler could figure that out itself).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add an install target rule to the Makefiles in contrib/credential in the
same manner as in other Makefiles in contrib such as for contacts or
subtree.
Signed-off-by: Thomas Uhle <thomas.uhle@mailbox.tu-dresden.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Option q implies d, i.e., it marks any undecided hunks towards the
bottom of the hunk array as skipped. This is unnecessary; later code
treats undecided and skipped hunks the same: The only functions that
use UNDECIDED_HUNK and SKIP_HUNK are patch_update_file() itself (but
not after its big for loop) and its helpers get_first_undecided() and
display_hunks().
Streamline the handling of option q by quitting immediately.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Recent OpenSSH creates the Unix domain socket to communicate with
ssh-agent under $HOME instead of /tmp, which causes our test to
fail due to an overly long pathname in our test environment, which has
been worked around by using "ssh-agent -T".
* ps/t7528-ssh-agent-uds-workaround:
t7528: work around ETOOMANY in OpenSSH 10.1 and newer
|
|
Show 'P'ipe command in "git add -p".
* rs/add-patch-document-p-for-pager:
add-patch: fully document option P
|
|
GPG signing test set-up has been broken for a year, which has been
corrected.
* jc/t1016-setup-fix:
t1016: make sure to use specified GPG
|
|
Unicode width table update.
* tb/unicode-width-table-17:
unicode: update the width tables to Unicode 17
|
|
Build procedure for a few credential helpers (in contrib/) has
been updated.
* tu/credential-makefile-updates:
contrib/credential: harmonize Makefiles
|
|
The "--short" option of "git status" that meant output for humans
and "-z" option to show NUL delimited output format did not mix
well, and colored some but not all things. The command has been
updated to color all elements consistently in such a case.
* jk/status-z-short-fix:
status: make coloring of "-z --short" consistent
|
|
Test fix.
* js/t7500-pwd-windows-fix:
t7500: fix tests with absolute path following ":(optional)" on Windows
|
|
Documentation mark-up fixes.
* rj/doc-technical-fixes:
doc: add large-object-promisors.adoc to the docs build
doc: commit-graph.adoc: fix up some formatting
doc: sparse-checkout.adoc: fix asciidoc warnings
doc: remembering-renames.adoc: fix asciidoc warnings
|
|
We have two different repacking strategies in Git:
- The "gc" strategy uses git-gc(1).
- The "incremental" strategy uses multi-pack indices and `git
multi-pack-index repack` to merge together smaller packfiles as
determined by a specific batch size.
The former strategy is our old and trusted default, whereas the latter
has historically been used for our scheduled maintenance. But both
strategies have their shortcomings:
- The "gc" strategy performs regular all-into-one repacks. Furthermore
it is rather inflexible, as it is not easily possible for a user to
enable or disable specific subtasks.
- The "incremental" strategy is not a full replacement for the "gc"
strategy as it doesn't know to prune stale data.
So today, we don't have a strategy that is well-suited for large repos
while being a full replacement for the "gc" strategy.
Introduce a new "geometric" strategy that aims to fill this gap. This
strategy invokes all the usual cleanup tasks that git-gc(1) does like
pruning reflogs and rerere caches as well as stale worktrees. But where
it differs from both the "gc" and "incremental" strategy is that it uses
our geometric repacking infrastructure exposed by git-repack(1) to
repack packfiles. The advantage of geometric repacking is that we only
need to perform an all-into-one repack when the object count in a repo
has grown significantly.
One downside of this strategy is that pruning of unreferenced objects is
not going to happen regularly anymore. Every geometric repack knows to
soak up all loose objects regardless of their reachability, and merging
two or more packs doesn't consider reachability, either. Consequently,
the number of unreachable objects will grow over time.
This is remedied by doing an all-into-one repack instead of a geometric
repack whenever we determine that the geometric repack would end up
merging all packfiles anyway. This all-into-one repack then performs our
usual reachability checks and writes unreachable objects into a cruft
pack. As cruft packs won't ever be merged during geometric repacks we
can thus phase out these objects over time.
Of course, this still means that we retain unreachable objects for far
longer than with the "gc" strategy. But the maintenance strategy is
intended especially for large repositories, where the basic assumption
is that the set of unreachable objects will be significantly dwarfed by
the number of reachable objects.
If this assumption is ever proven to be too disadvantageous we could for
example introduce a time-based strategy: if the largest packfile has not
been touched for longer than $T, we perform an all-into-one repack. But
for now, such a mechanism is deferred into the future as it is not clear
yet whether it is needed in the first place.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While the user can pick the "incremental" maintenance strategy, it is
not possible to explicitly use the "gc" strategy. This has two
downsides:
- It is impossible to use the default "gc" strategy for a specific
repository when the strategy was globally set to a different strategy.
- It is not possible to use git-gc(1) for scheduled maintenance.
Address these issues by making the "gc" strategy configurable.
Furthermore, extend the strategy so that git-gc(1) runs for both manual
and scheduled maintenance.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "maintenance.strategy" configuration allows users to configure how
Git is supposed to perform repository maintenance. The idea is that we
provide a set of high-level strategies that may be useful in different
contexts, like for example when handling a large monorepo. Furthermore,
the strategy can be tweaked by the user by overriding specific tasks.
In its current form though, the strategy only applies to scheduled
maintenance. This creates something of a gap, as scheduled and manual
maintenance will now use _different_ strategies as the latter would
continue to use git-gc(1) by default. This makes the strategies way less
useful than they could be on the one hand. But even more importantly,
the two different strategies might clash with one another, where one of
the strategies performs maintenance in such a way that it discards
benefits from the other strategy.
So ideally, it should be possible to pick one strategy that then applies
globally to all the different ways that we perform maintenance. This
doesn't necessarily mean that the strategy always does the _same_ thing
for every maintenance type. But it means that the strategy can configure
the different types to work in tandem with each other.
Change the meaning of "maintenance.strategy" accordingly so that the
strategy is applied to both types, manual and scheduled. As preceding
commits have introduced logic to run maintenance tasks depending on this
type we can tweak strategies so that they perform those tasks depending
on the context.
Note that this raises the question of backwards compatibility: when the
user has configured the "incremental" strategy we would have ignored
that strategy beforehand. Instead, repository maintenance would have
continued to use git-gc(1) by default.
But luckily, we can match that behaviour by:
- Keeping all current tasks of the incremental strategy as
`MAINTENANCE_TYPE_SCHEDULED`. This ensures that those tasks will not
run during manual maintenance.
- Configuring the "gc" task so that it is invoked during manual
maintenance.
Like this, the user shouldn't observe any difference in behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We basically have three different ways to execute repository
maintenance:
1. Manual maintenance via `git maintenance run`.
2. Automatic maintenance via `git maintenance run --auto`.
3. Scheduled maintenance via `git maintenance run --schedule=`.
At the moment, maintenance strategies only have an effect for the last
type of maintenance. This is about to change in subsequent commits, but
to do so we need to be able to skip some tasks depending on how exactly
maintenance was invoked.
Introduce a new maintenance type that discerns between manual (1 & 2) and
scheduled (3) maintenance. Convert the `enabled` field into a bitset so
that it becomes possible to specify which tasks exactly should run in a
specific context.
The types picked for existing strategies match the status quo:
- The default strategy is only ever executed as part of a manual
maintenance run. It is not possible to use it for scheduled
maintenance.
- The incremental strategy is only ever executed as part of a
scheduled maintenance run. It is not possible to use it for manual
maintenance.
The strategies will be tweaked in subsequent commits to make use of this
new infrastructure.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
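The bitset idea from the commit message above can be modelled in a few lines of Python. This is only a sketch: the real code is C, and the `MaintenanceType` enum, the `STRATEGIES` table, the "default" key and the selection of tasks shown are assumptions made for illustration.

```python
from enum import Flag, auto


class MaintenanceType(Flag):
    NONE = 0
    MANUAL = auto()     # `git maintenance run`, with or without --auto
    SCHEDULED = auto()  # `git maintenance run --schedule=...`


# Each task records the contexts in which it is enabled (hypothetical table).
STRATEGIES = {
    "default": {
        "gc": MaintenanceType.MANUAL,
    },
    "incremental": {
        "commit-graph": MaintenanceType.SCHEDULED,
        "incremental-repack": MaintenanceType.SCHEDULED,
    },
}


def tasks_to_run(strategy, run_type):
    """Return the tasks whose bitset includes the given run type."""
    return [
        task
        for task, enabled in STRATEGIES[strategy].items()
        if enabled & run_type
    ]


print(tasks_to_run("default", MaintenanceType.MANUAL))
print(tasks_to_run("incremental", MaintenanceType.MANUAL))
```

Because the field is a set of flags rather than a boolean, a later strategy could mark a task as `MANUAL | SCHEDULED` without any change to the selection logic.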
|
|
Our maintenance strategies are essentially a large array of structures,
where each of the tasks can be enabled and scheduled individually. With
the current layout though all the configuration sits on the same nesting
layer, which makes it a bit hard to discern which initialized fields
belong to what task.
Improve readability of the individual tasks by using nested designated
initializers instead.
Suggested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When parsing maintenance strategies we completely ignore the
user-configured value in case it is unknown to us. This makes it
basically undiscoverable to the user that scheduled maintenance is
devolving into a no-op.
Change this to instead die when seeing an unknown maintenance strategy.
While at it, pull out the parsing logic into a separate function so that
we can reuse it in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The geometric repacking task uses a factor of two for its geometric
sequence, meaning that each next pack must contain at least twice as
many objects as the next-smaller one. In some cases it may be helpful to
configure this factor though to reduce the number of packfile merges
even further, e.g. in very big repositories. But while git-repack(1)
itself supports doing this, the maintenance task does not give us a way
to tune it.
Introduce a new "maintenance.geometric-repack.splitFactor" configuration
to plug this gap.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Introduce a new "geometric-repack" task. This task uses our geometric
repack infrastructure as provided by git-repack(1) itself, which is a
strategy that especially hosting providers tend to use to amortize the
costs of repacking objects.
There is one issue though with geometric repacks, namely that they
unconditionally pack all loose objects, regardless of whether or not
they are reachable. This is done because it means that we can completely
skip the reachability step, which significantly speeds up the operation.
But it has the big downside that we are unable to expire objects over
time.
To address this issue we thus use a split strategy in this new task:
whenever a geometric repack would merge together all packs, we instead
do an all-into-one repack. By default, these all-into-one repacks have
cruft packs enabled, so unreachable objects would now be written into
their own pack. Consequently, they won't be soaked up during geometric
repacking anymore and can be expired with the next full repack, assuming
that their expiry date has passed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
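The split strategy described above can be sketched as follows. This is a simplified Python model, not the code of git-repack(1) or of the maintenance task: `geometric_split()` and `choose_repack()` are hypothetical names, packs are reduced to their object counts, and loose objects are ignored entirely.

```python
def geometric_split(weights, factor=2):
    """Return how many of the smallest packs would be merged.

    `weights` holds the object count of each pack, sorted ascending.
    """
    n = len(weights)
    if n == 0:
        return 0
    # Walk down from the largest pack while each pack holds at least
    # `factor` times as many objects as the next-smaller one.
    i = n - 1
    while i > 0 and weights[i] >= factor * weights[i - 1]:
        i -= 1
    split = i + 1 if i > 0 else 0
    # Merging the light packs creates one new pack; pull in heavier
    # packs until the progression holds again.
    total = sum(weights[:split])
    while split < n and weights[split] < factor * total:
        total += weights[split]
        split += 1
    return split


def choose_repack(weights, factor=2):
    """Decide between no repack, a geometric repack and all-into-one."""
    split = geometric_split(weights, factor)
    if split == 0:
        return "nothing"
    if split == len(weights):
        return "all-into-one"  # geometric repack would merge everything
    return "geometric"


print(choose_repack([1, 1, 2, 100]))
print(choose_repack([3, 4, 5]))
```

In the first call only the three small packs are merged and the large pack is left alone; in the second call every pack would be merged, which is the situation where the task switches to an all-into-one repack.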
|
|
To decide whether or not a repository needs to be repacked we estimate
the number of loose objects. If the number exceeds a certain threshold
we perform the repack, otherwise we don't.
This is done via `too_many_loose_objects()`, which takes as parameter
the `struct gc_config`. This configuration is only used to determine the
threshold. In a subsequent commit we'll add another caller of this
function that wants to pass a different limit than the one stored in
that structure.
Refactor the function accordingly so that we only take the limit as
parameter instead of the whole structure.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The global `repack` variable is used to store all command line arguments
that we eventually want to pass to git-repack(1). It is being appended
to from multiple different functions, which makes it hard to follow the
logic. Besides being hard to follow, it also makes it unnecessarily hard
to reuse this infrastructure in new code.
Refactor the code so that we store this variable on the stack and pass
a pointer to it around as needed. This is done so that we can reuse
`add_repack_all_options()` in a subsequent commit.
The refactoring itself is straight-forward. One function that deserves
attention though is `need_to_gc()`: this function determines whether or
not we need to execute garbage collection for `git gc --auto`, but also
for `git maintenance run --auto`. But besides figuring out whether we
have to perform GC, the function also sets up the `repack` arguments.
For `git gc --auto` it's trivial to adapt, as we already have the
on-stack variable at our fingertips. But for the maintenance condition
it's less obvious what to do.
As it turns out, we can just use another temporary variable there that
we then immediately discard. If we need to perform GC we execute a child
git-gc(1) process to repack objects for us, and that process will have
to recompute the arguments anyway.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We'd sometimes end up in run_external_diff() to do a dry-run diff (e.g.,
to find content-level changes for --quiet). We recognize this quiet mode
by seeing the lack of DIFF_FORMAT_PATCH in the output format.
But since introducing an explicit dry-run check via 3ed5d8bd73 (diff:
stop output garbled message in dry run mode, 2025-10-20), this logic can
never trigger. We can only get to this function by calling
diff_flush_patch(), and that comes from only two places:
1. A dry-run flush comes from diff_flush_patch_quietly(), which is
always in dry-run mode (so the other half of our "||" is true
anyway).
2. A regular flush comes from diff_flush_patch_all_file_pairs(),
which is only called when output_format has DIFF_FORMAT_PATCH in
it.
So we can simplify our "quiet" condition to just checking dry-run mode
(which used to be a specific flag, but recently became just a NULL
"file" pointer). And since it's so simple, we can just do that inline.
This makes the logic about o->file more obvious, since we handle the
NULL and non-stdout cases next to each other.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As an added protection against dry-run diffs accidentally producing
output, we redirect diff_options.file to /dev/null. But as of the
previous patch, this now does nothing, since dry-run diffs are
implemented by setting "file" to NULL.
So we can drop this extra code with no change in behavior. This is
effectively a revert of 623f7af284 (diff: restore redirection to
/dev/null for diff_from_contents, 2025-10-17) and 3da4413dbc (diff: make
sure the other caller of diff_flush_patch_quietly() is silent,
2025-10-22), but:
1. We get a conflict because we already dropped the color_moved
handling in an earlier patch. But we just resolve the conflicts to
"theirs" (removing all of the code).
2. We retain the test from 623f7af284.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We introduced a dry_run flag to diff_options in b55e6d36eb (diff: ensure
consistent diff behavior with ignore options, 2025-08-08), with the idea
that the lower-level diff code could skip output when it is set.
As we saw with the bugs fixed by 3ed5d8bd73 (diff: stop output garbled
message in dry run mode, 2025-10-20), it is easy to miss spots. In the
end, we located all of them by checking where diff_options.file is used.
That suggests another possible approach: we can replace the dry_run
boolean with a NULL pointer for "file", as we know that using "file" in
dry_run mode would always be an error. This turns any missed spots from
producing extra output[1] into a segfault. Which is less forgiving, but
that is the point: this is indicative of a programming error, and
complaining loudly and immediately is good.
[1] We protect ourselves against garbled output as a separate step,
courtesy of 623f7af284 (diff: restore redirection to /dev/null for
diff_from_contents, 2025-10-17). So in that sense this patch can
only introduce user-visible errors (since any "bugs" were going to
/dev/null before), but the idea is to catch them rather than quietly
send garbage to /dev/null.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
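A rough Python analogy for the design choice above, with `None` standing in for the NULL "file" pointer. The class and function names are made up for illustration, and Python raises an exception where the C code would segfault, but the effect is the same: a code path that forgets the dry-run check fails loudly instead of writing somewhere unnoticed.

```python
import io


class DiffOptions:
    def __init__(self, file):
        # None plays the role of the NULL "file" pointer (dry-run mode).
        self.file = file


def emit_careful(options, text):
    """An output path that knows about dry-run mode."""
    if options.file is None:
        return  # dry run: produce nothing
    options.file.write(text)


def emit_forgetful(options, text):
    """A 'missed spot': writes without checking for dry-run mode."""
    options.file.write(text)


dry_run = DiffOptions(None)
emit_careful(dry_run, "diff --git a/x b/x\n")  # silently skipped
try:
    emit_forgetful(dry_run, "diff --git a/x b/x\n")
except AttributeError as err:
    print("programming error caught:", err)
```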
|
|
When running a dry-run content-level diff to check whether a "--quiet"
diff has any changes, we have always unset the color_moved variable
since the feature was added in 2e2d5ac184 (diff.c: color moved lines
differently, 2017-06-30). The reasoning is not given explicitly there,
but presumably the idea is that since color_moved requires a lot of
extra computation to match lines but does not actually affect the
found_changes flag, we want to skip it.
Later, in 3da4413dbc (diff: make sure the other caller of
diff_flush_patch_quietly() is silent, 2025-10-22) we copied the same
idea for other dry-run diffs.
But neither spot actually needs to reset this flag at all, because
diff_flush_patch() will not ever compute color_moved. Nor could it, as
it is only looking at a single file-pair, and we detect moves across
files. So color_moved is checked only when we are actually doing real
DIFF_FORMAT_PATCH output, and call diff_flush_patch_all_file_pairs().
So we can get rid of these extra lines to save and restore the
color_moved flag without changing the behavior at all. (Note that there
is no "restore" to drop for the second caller, as we know at that point
we are not generating any output and can just leave the feature
disabled).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Diff output usually goes to the process stdout, but it can be redirected
with the "--output" option. We store this in the "file" pointer of
diff_options, and all of the diff code should write there instead of to
stdout.
But there's one spot we missed: running an external diff cmd. We don't
redirect its output at all, so it just defaults to the stdout of the
parent process. We should instead point its stdout at our output file.
There are a few caveats to watch out for when doing so:
- The stdout field takes a descriptor, not a FILE pointer. We can pull
out the descriptor with fileno().
- The run-command API always closes the stdout descriptor we pass to
it. So we must duplicate it (otherwise we break the FILE pointer,
since it now points to a closed descriptor).
- We don't need to worry about closing our dup'd descriptor, since the
point is that run-command will do it for us (even in the case of an
error). But we do need to make sure we skip the dup() if we set
no_stdout (because then run-command will not look at it at all).
- When the output is going to stdout, it would not be wrong to dup()
the descriptor, but we don't need to. We can skip that extra work
with a simple pointer comparison.
- It seems like you'd need to fflush() the descriptor before handing
off a copy to the child process to prevent out-of-order writes. But
that was true even before this patch! It works because run-command
always calls fflush(NULL) before running the child.
The new test shows the breakage (and fix). The need for duplicating the
descriptor doesn't need a new test; that is covered by the later test
"GIT_EXTERNAL_DIFF with more than one changed files".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
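The caveats listed above translate loosely into the following Python sketch. It is an analogy only: `run_external()` is a hypothetical helper, and unlike the run-command API described in the commit message, Python's subprocess module neither flushes our buffers nor closes a descriptor we pass in, so the sketch does both itself.

```python
import os
import subprocess
import sys


def run_external(cmd, out):
    """Run `cmd` with its stdout pointed at the open file object `out`."""
    if out is sys.stdout:
        # Output already goes to stdout: the child can simply inherit it.
        subprocess.run(cmd, check=True)
        return
    # Flush buffered data first so the child's output cannot overtake it.
    out.flush()
    # Hand the child a duplicate so that closing it leaves `out` usable.
    fd = os.dup(out.fileno())
    try:
        subprocess.run(cmd, stdout=fd, check=True)
    finally:
        os.close(fd)  # subprocess leaves caller-provided descriptors open


if __name__ == "__main__":
    with open("diff-output.txt", "w") as out:
        out.write("header written by the parent\n")
        run_external([sys.executable, "-c", "print('from the child')"], out)
```

The duplicate shares its file offset with the original descriptor, which is why the child's output lands right after the parent's flushed data.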
|
|
jk/diff-patch-dry-run-cleanup
* ly/diff-name-only-with-diff-from-content:
diff: stop output garbled message in dry run mode
|
|
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity. Avoid it by just appending in the loop and sorting at
the end.
The number of merge bases is usually small, so don't expect speedups in
normal repositories. It has no limit, though. The added perf test
shows a nice improvement when dealing with 16384 merge bases:
Test                     v2.51.1           HEAD
-----------------------------------------------------------------
6010.2: git merge-base   0.55(0.54+0.00)   0.03(0.02+0.00) -94.5%
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
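The change in complexity can be illustrated with a small Python model. It is a sketch only: `Commit`, `insert_by_date()` and the two collector functions are illustrative names, and a Python list stands in for git's linked commit list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    date: int


def insert_by_date(commits, commit):
    """Keep `commits` sorted newest-first; costs O(n) per insertion."""
    i = 0
    while i < len(commits) and commits[i].date >= commit.date:
        i += 1
    commits.insert(i, commit)


def collect_by_insertion(found):
    """Old shape: sorted insertion inside the loop, O(n^2) worst case."""
    result = []
    for commit in found:
        insert_by_date(result, commit)
    return result


def collect_then_sort(found):
    """New shape: append everything, then sort once, O(n log n)."""
    result = list(found)
    result.sort(key=lambda c: c.date, reverse=True)
    return result


found = [Commit("a", 3), Commit("b", 1), Commit("c", 3), Commit("d", 2)]
print([c.name for c in collect_then_sort(found)])
```

Both functions return the same order in this model, including for commits with equal dates, because Python's sort is stable and the insertion walks past entries of the same date.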
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The code to squelch output from "git diff -w --name-status"
etc. for paths on which "git diff -w -p" would have stayed silent
leaked output from dry-run patch generation, which has been
corrected.
* jc/diff-from-contents-fix:
diff: make sure the other caller of diff_flush_patch_quietly() is silent
|
|
Recently we attempted to improve "git diff -w" and friends to
handle cases where patch output would be suppressed, but it
introduced a bug that emits unnecessary output, which has been
corrected.
* jk/diff-from-contents-fix:
diff: restore redirection to /dev/null for diff_from_contents
|
|
In t7528 we spawn an SSH agent to verify that we can sign a commit via
it. This test has started to fail on some machines:
+++ ssh-agent
unix_listener_tmp: path "/home/pks/Development/git/build/test-output/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.UTulegefEg.agent.UrPHumMXPq" too long for Unix domain socket
main: Couldn't prepare agent socket
As it turns out this is caused by a change in OpenSSH 10.1 [1]:
* ssh-agent(1), sshd(8): move agent listener sockets from /tmp to
under ~/.ssh/agent for both ssh-agent(1) and forwarded sockets
in sshd(8).
Instead of creating the socket in "/tmp", OpenSSH now creates the socket
in our home directory. And as the home directory gets modified to be
located in our test output directory we end up with paths that are
somewhat long. But Linux has a rather short limit of 108 characters for
socket paths, and other systems have even lower limits, so it is very
easy now to exceed the limit and run into the above error.
Work around the issue by using `ssh-agent -T`, which instructs it to
use the old behaviour and create the socket in "/tmp" again. This switch
has only been introduced with 10.1 though, so for older versions we have
to fall back to not using it. That's fine though, as older versions know
to put the socket into "/tmp" already.
An alternative approach would be to abbreviate the socket name itself so
that we create it as e.g. "sshsock" in the trash directory. But taking
the above example we'd still end up with a path that is 91 characters
long. So we wouldn't really have a lot of headroom, and it is quite
likely that some developers would see the issue on their machines.
[1]: https://www.openssh.com/txt/release-10.1
Reported-by: Xi Ruoyao <xry111@xry111.site>
Suggested-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Lauri Tirkkonen <lauri@hacktheplanet.fi>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
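The path lengths discussed above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: it uses the 108-character figure quoted in the commit message and glosses over whether a terminating NUL byte counts against it. `fits_unix_socket()` is a hypothetical helper.

```python
SOCKET_PATH_LIMIT = 108  # Linux figure quoted in the commit message

TRASH_DIR = (
    "/home/pks/Development/git/build/test-output/"
    "trash directory.t7528-signed-commit-ssh"
)


def fits_unix_socket(path, limit=SOCKET_PATH_LIMIT):
    """True if `path` is within the assumed socket path limit."""
    return len(path) <= limit


# Path taken from the error message above.
agent_socket = TRASH_DIR + "/.ssh/agent/s.UTulegefEg.agent.UrPHumMXPq"
# The alternative considered above: a short socket name in the trash directory.
short_socket = TRASH_DIR + "/sshsock"

for path in (agent_socket, short_socket):
    print(len(path), fits_unix_socket(path))
```

The short name fits, but with only 17 characters to spare on this particular machine, which is the lack of headroom the message refers to.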
|
|
In get_default_ssh_signing_key(), the default ssh signing key is
retrieved in `key_stdout` buf, which is then split using
strbuf_split_max() into up to two strbufs at a new line and the first
strbuf is returned as a `char *` and not a strbuf.
This makes the function lack the use of strbuf API as no edits are
performed on the split tokens.
Simplify the process of retrieving and returning the desired line by
using strchr() to isolate the line and xmemdupz() to return a copy of the
line. This removes the roundabout way of splitting the string into
strbufs, just to return the line.
Reported-by: Junio Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In get_ssh_finger_print(), the output of the `ssh-keygen` command is
put into `fingerprint_stdout` strbuf. The string in `fingerprint_stdout`
is then split into up to 3 strbufs using strbuf_split_max(). However they
are not modified after the split thereby not making use of the strbuf API
as the fingerprint token is merely returned as a char * and not a strbuf.
Hence they do not need to be strbufs.
Simplify the process of retrieving and returning the desired token by
using strchr() to isolate the token and xmemdupz() to return a copy of the
token. This removes the roundabout way of splitting the string into
strbufs just to return the token.
Reported-by: Junio Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
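In Python terms, the simplification above amounts to locating the delimiters and copying one slice instead of splitting the whole string into pieces. The sketch below assumes the fingerprint is the second space-separated token of the `ssh-keygen` output line; the sample line is made up, `second_token()` is a hypothetical helper, and returning `None` for a line without a space is a choice made here, not necessarily what the C code does.

```python
def second_token(line):
    """Return a copy of the second space-separated token, or None."""
    first_space = line.find(" ")  # plays the role of strchr()
    if first_space < 0:
        return None
    start = first_space + 1
    end = line.find(" ", start)
    if end < 0:
        end = len(line)
    return line[start:end]  # the slice is a copy, like xmemdupz()


# Made-up sample in the shape "<bits> <fingerprint> <comment> (<type>)".
sample = "256 SHA256:EXAMPLEEXAMPLE user@example.com (ED25519)"
print(second_token(sample))
```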
|
|
Earlier, b55e6d36 (diff: ensure consistent diff behavior with
ignore options, 2025-08-08) introduced "dry-run" mode to the
diff machinery so that content-based diff filtering (like
ignoring space changes or those that match -I<regex>) can first
try to produce a patch without emitting any output to see if
under the given diff filtering condition we would get any output
lines, and a new helper function diff_flush_patch_quietly() was
introduced to use the mode to see if an individual filepair needs
to be shown.
However, the solution was not complete. When files are deleted,
file modes change, or there are unmerged entries in the index,
dry-run mode still produces output because we overlooked these
conditions, and as a result, dry-run mode was not quiet.
To fix this, return early in emit_diff_symbol_from_struct() if
we are in dry-run mode. This function will be called by all the
emit functions to output the results. Returning early can avoid
diff output when files are deleted or file modes are changed.
Stop printing the message in dry-run mode if we have unmerged entries
in the index. Discard the output of an external diff tool in dry-run mode.
Signed-off-by: Lidong Yan <yldhome2d2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
ly/diff-name-only-with-diff-from-content
* jc/diff-from-contents-fix:
diff: make sure the other caller of diff_flush_patch_quietly() is silent
|
|
Earlier, we added a protection for the loop that computes "git
diff --quiet -w" to ensure calls to the diff_flush_patch_quietly()
helper stay quiet. Do the same for another loop that deals with
options like "--name-status", which also calls the same helper.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
ly/diff-name-only-with-diff-from-content
* jk/diff-from-contents-fix:
diff: restore redirection to /dev/null for diff_from_contents
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Test modernization.
* so/t2401-use-test-path-helpers:
t2401: update path checks using test_path helpers
|
|
Documentation updates.
* je/doc-pull:
doc: git-pull: clarify how to exit a conflicted merge
doc: git-pull: delete the example
doc: git-pull: clarify options for integrating remote branch
doc: git-pull: move <repository> and <refspec> params
|
|
The beginning of SHA1-SHA256 interoperability work.
* bc/sha1-256-interop-01:
t1010: use BROKEN_OBJECTS prerequisite
t: allow specifying compatibility hash
fsck: consider gpgsig headers expected in tags
rev-parse: allow printing compatibility hash
docs: add documentation for loose objects
docs: improve ambiguous areas of pack format documentation
docs: reflect actual double signature for tags
docs: update offset order for pack index v3
docs: update pack index v3 format
|
|
CI update.
* js/ci-github-actions-update:
build(deps): bump actions/github-script from 7 to 8
build(deps): bump actions/setup-python from 5 to 6
build(deps): bump actions/checkout from 4 to 5
build(deps): bump actions/download-artifact from 4 to 5
|
|
As documented in git-bisect(1), `git bisect help` should display usage
information. However, since the migration of `git bisect` to a full
builtin command in 73fce29427 (Turn `git bisect` into a full built-in,
2022-11-10), this behavior was broken. Running `git bisect help` would,
instead of showing usage, either fail silently if already in a bisect
session, or otherwise trigger an interactive autostart prompt asking "Do
you want me to do it for you [Y/n]?".
Similarly, since df63421be9 (bisect--helper: handle states directly,
2022-11-10), running invalid subcommands like `git bisect foobar` also
led to the same behavior.
This occurred because `help` and other unrecognized subcommands were
being unconditionally passed to `bisect_state`, which then called
`bisect_autostart`, triggering the interactive prompt.
Fix this by:
1. Adding explicit handling for the `help` subcommand to show usage;
2. Validating that unrecognized commands are actually valid state
commands before calling `bisect_state`;
3. Showing an error with usage for truly invalid commands.
This ensures that `git bisect help` displays the usage as documented,
and invalid commands fail cleanly without entering interactive mode.
Alternate terms are still handled correctly through
`check_and_set_terms`.
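As a rough illustration of the three-step fix above, the fallback dispatch can be modelled as follows. This is a minimal sketch and not the code in the bisect builtin: the function name `dispatch`, the return labels, and the `DEFAULT_TERMS` set are hypothetical, and subcommands such as `start` or `reset` are assumed to have been handled before this point.

```python
# Hypothetical model of the fallback dispatch described above.
# None of these names are taken from Git's bisect implementation.
DEFAULT_TERMS = {"bad", "good", "new", "old"}


def dispatch(subcommand, alternate_terms=frozenset()):
    """Return the action taken for `git bisect <subcommand>`."""
    # 1. `help` is handled explicitly and only shows usage.
    if subcommand == "help":
        return "usage"
    # 2. Only genuine state commands (default or alternate terms) may
    #    reach the state handler, which can autostart a session.
    if subcommand in DEFAULT_TERMS or subcommand in alternate_terms:
        return "state"
    # 3. Anything else is an error reported together with usage.
    return "error+usage"
```

With this shape, `dispatch("foobar")` never reaches the state handler, so no interactive prompt can be triggered by a typo.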
Signed-off-by: Ruoyu Zhong <zhongruoyu@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The changed-path Bloom filters feature has proven stable and reliable
over several years of use, delivering significant performance
improvement for file history computation in large monorepos. Currently
a user can opt-in to writing the changed-path Bloom filters using the
"--changed-paths" option to "git commit-graph write". The filters will
be persisted until the user drops the filters using the
"--no-changed-paths" option. For this functionality, refer to 0087a87ba8
(commit-graph: persist existence of changed-paths, 2020-07-01).
Large monorepos using Git's background maintenance to build and update
commit-graph files could use an easy switch to enable this feature
without a foreground computation. In this commit, we're proposing a new
config option "commitGraph.changedPaths":
* If "true", "git commit-graph write" will write Bloom filters,
equivalent to passing "--changed-paths";
* If "false" or "unset", Bloom filters will be written during "git
commit-graph write" only if the filters already exist in the current
commit-graph file. This matches the default behaviour of "git
commit-graph write" without any "--[no-]changed-paths" option. Note
"false" can disable a previous "true" config value but doesn't imply
"--no-changed-paths".
This config will always respect the precedence of command line option
"--[no-]changed-paths".
We also set this new config as an optional recommended config in scalar
to turn on this feature for large repos.
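The precedence rules described above can be summarized in a small decision function. This is only a sketch of the documented behaviour; the function and parameter names are made up for illustration and do not appear in Git.

```python
def should_write_bloom_filters(cli_option, config_value, filters_exist):
    """Decide whether `git commit-graph write` writes changed-path filters.

    cli_option:    True for --changed-paths, False for --no-changed-paths,
                   None when neither was given.
    config_value:  True, False, or None (unset) for commitGraph.changedPaths.
    filters_exist: whether the current commit-graph file already has filters.
    """
    # The command-line option always takes precedence over the config.
    if cli_option is not None:
        return cli_option
    # A "true" config is equivalent to passing --changed-paths.
    if config_value is True:
        return True
    # "false" or unset: keep writing filters only if they already exist.
    return filters_exist
```

Note that a "false" config with existing filters still returns True here, which reflects the statement that "false" does not imply "--no-changed-paths".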
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Emily Yang <emilyyang.git@gmail.com>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* jt/repo-structure:
builtin/repo: add progress meter for structure stats
builtin/repo: add keyvalue and nul format for structure stats
builtin/repo: add object counts in structure output
builtin/repo: introduce structure subcommand
ref-filter: export ref_kind_from_refname()
ref-filter: allow NULL filter pattern
builtin/repo: rename repo_info() to cmd_repo_info()
|
|
* tb/incremental-midx-part-3.1: (49 commits)
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
builtin/repack.c: reorder `remove_redundant_bitmaps()`
repack: keep track of MIDX pack names using existing_packs
builtin/repack.c: use a string_list for 'midx_pack_names'
builtin/repack.c: extract opts struct for 'write_midx_included_packs()'
builtin/repack.c: remove ref snapshotting from builtin
repack: remove pack_geometry API from the builtin
...
|
|
When using the structure subcommand for git-repo(1), evaluating a
repository may take some time depending on its shape. Add a progress
meter to provide feedback to the user about what is happening. The
progress meter is enabled by default when the command is executed from a
tty. It can also be explicitly enabled/disabled via the --[no-]progress
option.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
All repository structure stats are output in a human-friendly table
form. This format is not suitable for machine parsing. Add a --format
option that supports three output modes: `table`, `keyvalue`, and `nul`.
The `table` mode is the default format and prints the same table output
as before.
With the `keyvalue` mode, each line of output contains a key-value pair
of a repository stat. The '=' character is used to delimit between keys
and values. The `nul` mode is similar to `keyvalue`, but key-values are
delimited by a NUL character instead of a newline. Also, instead of a
'=' character to delimit between keys and values, a newline character is
used. This allows stat values to support special characters without
having to cquote them. These two new modes provide output that is more
machine-friendly.
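To show how a consumer might read the two machine-friendly modes, here is a minimal parsing sketch. It assumes only the delimiters described above; the key names used in the usage note are hypothetical, and the `keyvalue` parser does not undo any cquoting of values.

```python
def parse_keyvalue(text):
    """Parse `keyvalue` output: one `key=value` pair per line."""
    stats = {}
    for line in text.split("\n"):
        if line:
            # Split on the first '=' only; values may contain '='.
            key, value = line.split("=", 1)
            stats[key] = value
    return stats


def parse_nul(data):
    """Parse `nul` output: NUL-delimited records, key and value split by a newline."""
    stats = {}
    for record in data.split("\0"):
        if record:
            # The first newline separates the key from the value.
            key, value = record.split("\n", 1)
            stats[key] = value
    return stats
```

For example, `parse_nul("some.key\n3\0other.key\n1\0")` yields a dictionary with two entries, and a value containing '=' or a newline survives unchanged in the `nul` form.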
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The number of objects in a repository can provide insight regarding its
shape. To surface this information, use the path-walk API to count the
number of reachable objects in the repository by object type. All
regular references are used to determine the reachable set of objects.
The object counts are appended to the same table containing the
reference information.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The structure of a repository's history can have huge impacts on the
performance and health of the repository itself. Currently, Git lacks a
means to surface repository metrics regarding its structure/shape via a
single command. Acquiring this information requires users to be familiar
with the relevant data points and the various Git commands required to
surface them. To fill this gap, supplemental tools such as git-sizer(1)
have been developed.
To allow users to more readily identify repository structure related
information, introduce the "structure" subcommand in git-repo(1). The
goal of this subcommand is to eventually provide similar functionality
to git-sizer(1), but natively in Git.
The initial version of this command only iterates through all references
in the repository and tracks the count of branches, tags, remote refs,
and other reference types. The corresponding information is displayed in
a human-friendly table formatted in a very similar manner to
git-sizer(1). The width of each table column is adjusted automatically
to satisfy the requirements of the widest row contained.
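The column sizing can be pictured with a short sketch. This is not the builtin's implementation and the exact table layout is an assumption; it only shows the idea of padding every cell to the width of the widest entry in its column.

```python
def format_table(rows):
    """Return table lines with each column padded to its widest cell."""
    columns = len(rows[0])
    # The widest cell in each column determines that column's width.
    widths = [max(len(row[i]) for row in rows) for i in range(columns)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("| " + " | ".join(cells) + " |")
    return lines
```

Calling `format_table([("Branches", "12"), ("Tags", "3")])` produces two lines of equal length, with the shorter cells padded on the right.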
Subsequent commits will surface additional relevant data points to
output and also provide other more machine-friendly output formats.
Based-on-patch-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When filtering refs, `ref_kind_from_refname()` is used to determine the
ref type. In a subsequent commit, this same logic is reused when
counting refs by type. Export the function to prepare for this change.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When setting up `struct ref_filter` for filter_refs(), the
`name_patterns` field must point to an array of pattern strings even if
no patterns are required. To improve this interface, treat a NULL
`name_patterns` field the same as when it points to an empty array.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Subcommand functions are often prefixed with `cmd_` to denote that they
are an entrypoint. Rename repo_info() to cmd_repo_info() accordingly.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Show option P in the prompt and explain it properly on a dedicated line
in online help and documentation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* tb/incremental-midx-part-3.1: (64 commits)
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
builtin/repack.c: reorder `remove_redundant_bitmaps()`
repack: keep track of MIDX pack names using existing_packs
builtin/repack.c: use a string_list for 'midx_pack_names'
builtin/repack.c: extract opts struct for 'write_midx_included_packs()'
builtin/repack.c: remove ref snapshotting from builtin
repack: remove pack_geometry API from the builtin
...
|
|
Unicode 17 is out. Update the Unicode width table.
https://blog.unicode.org/2025/09/unicode-170-release-announcement.html
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* tb/cat-file-objectmode-update:
builtin/cat-file.c: simplify calling `report_object_status()`
|
|
Doc mark-up fixes.
* kh/doc-continued-paragraph-fix:
doc: fix accidental literal blocks
|
|
Code clean-up.
* js/unreachable-workaround-for-no-symlink-head:
refs: forbid clang to complain about unreachable code
|
|
Code clean-up.
* js/mingw-includes-cleanup:
mingw: order `#include`s alphabetically
mingw: avoid relative `#include`s
|
|
Doc update.
* dk/stash-apply-index:
doc: explain the impact of stash.index on --autostash options
|
|
Update these Makefiles to be in line with other Makefiles from contrib
such as for contacts or subtree by making the following changes:
* Make the default settings after including config.mak.autogen and
config.mak.
* Add the missing $(CPPFLAGS) to the compiler command as well as the
missing $(CFLAGS) to the linker command.
* Use a pattern rule for compilation instead of a dedicated rule for
each compile unit.
* Get rid of $(MAIN), $(SRCS) and $(OBJS) and simply use their values
such as git-credential-libsecret and git-credential-libsecret.o.
* Strip @ from $(RM) to let the clean target rule be verbose.
* Define .PHONY for all special targets (all, clean).
Signed-off-by: Thomas Uhle <thomas.uhle@mailbox.tu-dresden.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
1. The '--exclude=' option to 'git log' and 'git shortlog' is missing. Add the
option to __git_log_shortlog_options.
2. The `--committer` option in `git log` requires a pattern, such as
`--committer=ba`, but in `git shortlog`, specifying a pattern results in
an error: “error: option `committer' takes no value.” Handle them as
separate options for completion rather than a shared one.
Signed-off-by: KIYOTA Fumiya <aimluck.kiyota@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When an on-disk sparse index is expanded to a full one, it could be
due to some worktree state that requires looking at file entries
hidden within sparse tree entries. This can be avoided if the
worktree is cleaned up and some other issues related to the index
state are resolved.
Expand the advice message to include all of these cases, since 'git
sparse-checkout clean' is not currently capable of handling all
cases.
In the future, we may improve the behavior of 'git sparse-checkout
clean' to handle all of the cases.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On Windows, the MSYS layer translates absolute path names generated by
a shell script from the POSIX style /c/dir/file to the Windows style
C:/dir/file form that is understood by git.exe. This happens only when
the absolute path stands on its own as a program argument or a value of
an environment variable.
The earlier commits 749d6d166d (config: values of pathname type can be
prefixed with :(optional), 2025-09-28) and ccfcaf399f (parseopt: values
of pathname type can be prefixed with :(optional), 2025-09-28) added
test cases where ":(optional)" is inserted before an absolute path.
$PWD is used to construct the absolute paths, which gives the POSIX
form, and the result is ":(optional)/c/dir/template". Such command line
arguments are no longer recognized as absolute paths and do not undergo
translation.
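The effect can be reproduced with a toy model of the translation. The real MSYS rules are considerably more involved (they also cover path lists, the MSYS root, and more), so the function below is only an assumption-laden sketch that handles the single case discussed here: a drive-style POSIX path standing on its own.

```python
import re

# A standalone POSIX-style absolute path such as /c/dir/file.
_DRIVE_PATH = re.compile(r"^/([A-Za-z])(/.*)?$")


def msys_translate(arg):
    """Toy model: translate /c/dir/file to C:/dir/file, leave the rest alone."""
    match = _DRIVE_PATH.match(arg)
    if match is None:
        # Not an absolute path on its own: passed through unchanged.
        return arg
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return f"{drive}:{rest}"
```

Here `msys_translate("/c/dir/template")` gives `C:/dir/template`, whereas `msys_translate(":(optional)/c/dir/template")` returns its argument untouched, which is the form git.exe cannot resolve.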
Existing test cases that expect that the specified file does not exist
are not incorrect (after all, git.exe will not find /c/dir/template).
Yet, they are conceptually incorrect. That the use of $PWD is erroneous
is revealed by a test case that expects that the optional file exists.
Since no such test case is present, add one. Use "$(pwd)" to generate
the absolute paths, so that the command line arguments become
":(optional)C:/dir/template".
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When running "git status -z --short", the marker on modified index
entries (e.g., "M") is colorized, but the "??" marker for untracked
entries is not. Let's fix the "??" entries to show color here.
At first glance you might think that neither should be colorized, as
usually one would use "-z" to get machine-readable output. But this is a
tricky and unusual case. We have two output formats, "--short" and
"--porcelain" which are substantially similar, but differ in that
"--short" is for humans who want something short and "--porcelain" is
for machines. And "-z" by itself, without any other output option, does
default to "--porcelain", so "git status -z" will not colorize anything.
But if you explicitly ask for "-z" and "--short" together, then that is
asking for the human-readable output, but separated by NULs. This is
unlikely to be useful directly, but could for example be used if the
output will be shown to a human outside of the terminal. At any rate,
the current behavior is clearly wrong (since we colorize some things but
not others), and I think colorizing everything is the least-surprising
thing we can do here.
Reported-by: Langbart <Langbart@protonmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
An earlier addition to "git diff --no-index A B" to limit the
output with pathspec after the two directories misbehaved when
these directories were given with a trailing slash, which has been
corrected.
* jk/diff-no-index-with-pathspec-fix:
diff --no-index: fix logic for paths ending in '/'
|
|
A few more things that patch authors can do to help the maintainer
keep track of their topics better.
* tb/doc-submitting-patches:
SubmittingPatches: guidance for multi-series efforts
SubmittingPatches: extend release-notes experiment to topic names
|
|
The code in "git add -p" and friends to iterate over hunks was
riddled with bugs, which has been corrected.
* rs/add-patch-options-fix:
add-patch: reset "permitted" at loop start
add-patch: let options a and d roll over like y and n
add-patch: let options k and K roll over like j and J
add-patch: let options y, n, j, and e roll over to next undecided
add-patch: document that option J rolls over
add-patch: improve help for options j, J, k, and K
|
|
Instead of three library archives (one for git, one for reftable,
and one for xdiff), roll everything into a single libgit.a archive.
This should help later efforts to FFI into Rust.
* en/make-libgit-a:
make: delete REFTABLE_LIB, add reftable to LIB_OBJS
make: delete XDIFF_LIB, add xdiff to LIB_OBJS
|
|
In --quiet mode, since we produce only an exit code for "something was
changed" and no actual output, we can often get by with just a
tree-level diff. However, certain options require us to actually look at
the file contents (e.g., if we are ignoring whitespace changes). We have
a flag "diff_from_contents" for that, and if it is set we call
diff_flush() on each path.
To avoid producing any output (since we were asked to be --quiet), we
traditionally just redirected the output to /dev/null. That changed in
b55e6d36eb (diff: ensure consistent diff behavior with ignore options,
2025-08-08), which replaced that with a "dry_run" flag. In theory, with
dry_run set, we should produce no output. But it carries a risk of
regression: if we forget to respect dry_run in any of the output paths,
we'll accidentally produce output.
And indeed, there is at least one such regression in that commit, as it
covered only the case where we actually call into xdiff, and not
creation or deletion diffs, where we manually generate the headers. We
even test this case in t4035, but only with diff-tree, which does not
show the bug by default because it does not require diff_from_contents.
But git-diff does, because it allows external diff programs by default
(so we must dig into each diff filepair to decide if it requires running
an external diff that may declare two distinct blobs to actually be the
same).
We should fix all of those code paths to respect dry_run correctly, but
in the meantime we can protect ourselves more fully by restoring the
redirection to /dev/null. This gives us an extra layer of protection
against regressions due to other code paths we've missed.
Though the original issue was reported with "git diff" (and due to its
default of --ext-diff), I've used "diff-tree -w" in the new test. It
triggers the same issue, but I think the fact that "-w" implies
diff_from_contents is a bit more obvious, and fits in with the rest of
t4035.
Reported-by: Jake Zimmerman <jake@zimmerman.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
jk/diff-from-contents-fix
* ly/diff-name-only-with-diff-from-content:
diff: ensure consistent diff behavior with ignore options
|
|
The Tags and Heads window always opens at a default position and size,
requiring users to reposition it each time. Remember its geometry
between sessions in the config file as `geometry(showrefs)`.
Note that the existing configuration is sourced in proc savestuff
right before new settings are written. This makes the old settings
available as local variables(!) and does not overwrite the current
settings. Since we need access to the global geometry(showrefs), it
is necessary to unset the local variable.
Helped-by: Michael Rappazzo <rappazzo@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
|
|
This reverts commit b9bee11526ec (gitk: Only restore window size from
~/.gitk, not position, 2008-03-10).
The earlier commit e9937d2a03a4 (Make gitk work reasonably well on
Cygwin, 2007-02-01) reworked the window layout considerably. Much of
this became irrelevant around 2011 after Cygwin gained an X11 server
and switched to a supportable port of the Unix/X11 Tcl/Tk (it is now
on the current 8.6 code base).
Part of the necessary change was to restore the window size across
sessions, but the position was also restored. This raised complaints
on the mailing list[*], because Gitk was opened on the wrong monitor.
b9bee11526ec was the compromise, because it was only the size that
mattered for the Cygwin layout engine to work.
I personally find it annoying when Gitk pops up at a random location
on the screen, in particular, since many other applications restore
the window positions across sessions, so why not Gitk as well? (I do
not operate multi-monitor setups, so I cannot test the case.)
[*] https://lore.kernel.org/git/47AAA254.2020008@thorn.ws/
Helped-by: Mark Levedahl <mlevedahl@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
|
|
In a preceding commit we have removed `packfile_store_get_packs()`. With
this function removed it's somewhat useless to still have the "all"
infix in `packfile_store_get_all_packs()`. Rename the latter to drop
that infix.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have a bunch of different sites that want to iterate through all
packs of a given `struct packfile_store`. This pattern is verbose and
repetitive, which makes it cumbersome to write.
Introduce a new macro `repo_for_each_pack()` that removes some of the
boilerplate.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the preceding commits we have removed all remaining callers of
`packfile_store_get_packs()`, the function is thus unused now. Remove
it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using multiple threads in git-grep(1) we eagerly preload both the
gitmodules file as well as the packfiles so that the threads won't race
with one another to initialize these data structures.
For packfiles, this is done by calling `packfile_store_get_packs()`,
which first loads our packfiles and then returns a pointer to the first
such packfile. This pointer is ignored though, as all we really care
about is that `packfile_store_prepare()` was called.
Historically, that function was file-local to "packfile.c", but that
changed with 4188332569 (packfile: move `get_multi_pack_index()` into
"midx.c", 2025-09-02). We can thus simplify the code by calling that
function directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When running maintenance tasks via git-maintenance(1) we have a couple
of auto-conditions that check whether or not a specific task should be
running. One such check is for incremental repacks, which essentially
use `git multi-pack-index repack` to repack a set of smaller packfiles
into one larger packfile.
The auto-condition for this task checks how many packfiles there are
that aren't indexed by any multi-pack index. If there is a sufficient
number then we execute the above command to combine those into a single
pack and add that pack to the MIDX.
As we don't care about MIDX'd packs we use `packfile_store_get_packs()`,
which knows to not load any packs that are indexed by a MIDX. But as
explained in the preceding commit, we want to get rid of that function.
We already handle packfiles that have a MIDX by the very nature of this
function, as we explicitly count non-MIDX'd packs. As such, we can
trivially switch over to use `packfile_store_get_all_packs()` instead.
Do so.
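Why the switch is safe can be seen in a small model of the auto-condition. The function below is a sketch only: its name, the `limit` parameter, and the `>=` comparison are assumptions, not the code in the maintenance builtin.

```python
def should_run_incremental_repack(all_packs, midx_packs, limit):
    """Return True if enough packs are not covered by any multi-pack index."""
    # Iterating over *all* packs is fine because MIDX'd packs are
    # explicitly skipped when counting.
    non_midx = [pack for pack in all_packs if pack not in midx_packs]
    return len(non_midx) >= limit
```

Since the count filters out MIDX'd packs itself, it gives the same answer whether or not those packs were loaded into the list it iterates over.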
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When searching for abbreviated object IDs or when trying to disambiguate
them, we do this in two steps:
1. We search through the multi-pack index.
2. We search through all packfiles not part of any multi-pack index.
The second step uses `packfile_store_get_packs()`, which knows to skip
loading any packfiles that are indexed by an MIDX; this is exactly what
we want.
But that function is somewhat problematic, as its behaviour is stateful
and is influenced by `packfile_store_get_all_packs()`. This function
basically does the same as `packfile_store_get_packs()`, but in addition
it also loads all packfiles indexed by an MIDX. The problem here is that
both of these functions act on the same linked list of packfiles, and
thus depending on whether or not `get_all_packs()` was called the result
returned by `get_packs()` will be different. Consequently, all callers
of `get_packs()` need to be prepared to see MIDX'd packs even though
these should in theory be excluded.
This interface is confusing and thus potentially dangerous, which is why
we're converting all callers of `get_packs()` to use `get_all_packs()`
instead.
Do so for the above functions in "object-name.c". As explained, we
already know to skip any MIDX'd packs in both `find_abbrev_len_packed()`
and `find_short_packed_object()`, so it's fine to start loading MIDX'd
packfiles.
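The statefulness described above can be demonstrated with a toy model. The class below is not Git's data structure; `PackStore` and its fields are hypothetical and only mimic two getters that share one list.

```python
class PackStore:
    """Toy model of two getters acting on one shared pack list."""

    def __init__(self, non_midx_packs, midx_packs):
        self._non_midx = list(non_midx_packs)  # packs not covered by a MIDX
        self._midx = list(midx_packs)          # packs indexed by a MIDX
        self._loaded = []                      # list shared by both getters

    def _load(self, packs):
        for pack in packs:
            if pack not in self._loaded:
                self._loaded.append(pack)

    def get_packs(self):
        # Loads only non-MIDX packs, but returns the shared list as-is.
        self._load(self._non_midx)
        return list(self._loaded)

    def get_all_packs(self):
        # Additionally loads the MIDX'd packs into the same list.
        self._load(self._non_midx)
        self._load(self._midx)
        return list(self._loaded)
```

In this model, `get_packs()` returns only the non-MIDX pack at first, but once `get_all_packs()` has been called, a later `get_packs()` also returns the MIDX'd pack. Callers therefore have to filter MIDX'd packs themselves in either case.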
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
ps/remove-packfile-store-get-packs
* tb/incremental-midx-part-3.1: (64 commits)
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
builtin/repack.c: reorder `remove_redundant_bitmaps()`
repack: keep track of MIDX pack names using existing_packs
builtin/repack.c: use a string_list for 'midx_pack_names'
builtin/repack.c: extract opts struct for 'write_midx_included_packs()'
builtin/repack.c: remove ref snapshotting from builtin
repack: remove pack_geometry API from the builtin
...
|
|
Commit 5040f9f164 ("doc: add technical design doc for large object
promisors", 2025-02-18) added the large object promisors document
as a technical document (with a '.txt' extension). The merge commit
2c6fd30198 ("Merge branch 'cc/lop-remote'", 2025-03-05) seems to
have renamed the file with an '.adoc' extension.
Despite the '.adoc' extension, this document was not being formatted
by asciidoc(tor) as part of the docs build. In order to do so, add
the document to the make and meson build files.
Having added the document to the build, asciidoc and asciidoctor find
(slightly different) problems with the syntax of the input document.
The first set of warnings (only issued by asciidoc) relate to some
'section title out of sequence: expected level 3, got level 4'. This
document uses 'setext' style of section headers, using a series of
underline characters, where the character used denotes the level of
the title. From document title to level 5 (see [1]), these characters
are =, -, ~, ^, +. This does not seem to fit the error message, which
implies that those characters denote levels 0 -> 4. Replacing the headings
underlined with '+' by the '^' character eliminates these warnings.
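The warning can be reproduced with a simplified model of the nesting check. This is not asciidoc's actual implementation; it assumes the underline characters map to levels 0 to 4 as the error message implies, and that a title may be at most one level deeper than the title before it.

```python
# Assumed mapping of setext underline characters to title levels.
UNDERLINE_LEVELS = {"=": 0, "-": 1, "~": 2, "^": 3, "+": 4}


def out_of_sequence(underlines):
    """Return (position, expected, got) for each title that skips a level."""
    problems = []
    previous = 0
    for position, underline in enumerate(underlines):
        level = UNDERLINE_LEVELS[underline[0]]
        # A title may nest at most one level below the previous title.
        if level > previous + 1:
            problems.append((position, previous + 1, level))
        previous = level
    return problems
```

A '+' underline directly below a '~' section is reported as "expected level 3, got level 4" in this model, and switching it to '^' makes the report go away.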
The second set of warnings (only issued by asciidoctor) relate to some
headings which seem to use both arabic and roman numerals as part of
a single 'list' sequence. This elicited either 'unterminated listing
block' or (for example) 'list item index: expected I, got II' warnings.
In order not to mix arabic and roman numerals, remove the numeral from
the '0) Non goals' heading. Similarly, the remaining roman numeral
entries had the ')' removed and turned into regular headings with I, II,
III ... at the beginning.
[1] https://asciidoctor.org/docs/asciidoc-recommended-practices/
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The formatting markup syntax used in this document (markdown?) is not
interpreted correctly by asciidoc or asciidoctor. The main problem is
the use of a '## ' prefix markup for some sub-headings, along with the
use of '```' code markup and some missing literal blocks.
In order to improve the (html) document formatting:
- replace the '## ' prefix sub-title syntax with the '~~' underlining
syntax for the relevant sub-headings.
- replace the '```' code markup, which causes asciidoc(tor) to simply
remove the marked up text, with a literal block '----' markup.
- the second ascii diagram, in the 'Merging commit-graph files'
section, is not rendered correctly by asciidoctor (asciidoc is fine)
so enclose it in a '....' block.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both asciidoc and asciidoctor issue warnings about 'list item index:
expected n got n-1' for n=1->7 on lines 928, 931, 951, 974, 980, 1033
and 1049. In asciidoc, numbered lists must start at one, whereas this
file has a list starting at zero. Also, asciidoc and asciidoctor warn
about 'section title out of sequence: expected level 1, got level 2'
on line 17. (asciidoc only complains about the first instance of this,
while asciidoctor complains about them all, on lines 95, 258, 303, 316,
545, 612, 752, 824, 895, 923 and 1053). These warnings stem from the
section titles not being correctly nested within a document/chapter
title.
In order to address the first set of warnings, simply renumber the list
from one to seven, rather than zero to six. Fortunately, this does not
require altering additional text, since the enumeration of 'Known Bugs'
is not referred to anywhere else in the document.
In order to address the second set of warnings, change the section title
syntax from '=== title ===' to '== title ==', effectively reducing the
nesting level of the title by one. Also, some apparent (sub-)titles are
not marked up with sub-title syntax, so add '=== ' prefixes to the
relevant headings.
In addition to the warnings, address some other formatting issues:
- the use of heavily nested unordered lists is not reflected in the
output (making the file totally unreadable) because each level of
nesting requires a different syntax. (i.e. replace '*' with '**'
for the second level, '*' with '***' for the third level, etc.)
- make use of literal blocks and manual indentation to get asciidoc
and asciidoctor to display even remotely similar output.
- make use of labelled lists, in some places, to get a similar looking
output to the input, for both asciidoc and asciidoctor.
- replace the trailing space in: `git grep ${SEARCH_TERM} OLDREV `
otherwise the entire line in which that appears is removed from
the output.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both asciidoc and asciidoctor issue warnings about 'list item index:
expected n got n-1' for n=1->9 on lines 13, 15, 17, 20, 23, 25, 29,
31 and 33. In asciidoc, numbered lists must start at one, whereas this
file has a list starting at zero. Also, asciidoc and asciidoctor warn
about 'section title out of sequence: expected level 1, got level 2'
on line 38. (asciidoc only complains about the first instance of this,
while asciidoctor complains about them all, on lines 94, 141, 142,
184, 185, 257, 288, 289, 290, 397, 424, 485, 486 and 487). These
warnings stem from the section titles not being correctly nested within
a document/chapter title.
In order to address the first set of warnings, simply renumber the list
from one to nine, rather than zero to eight. This also requires altering
the text which refers to the section numbers, including other section
titles.
In order to address the second set of warnings, change the section title
syntax from '=== title ===' to '== title ==', effectively reducing the
nesting level of the title by one. Also, some of the titles are given
over multiple lines (they are very long), with a title '===' prefix
on each line. This leads to them being treated as separate sections
with no body text (as you can see from the line numbers given for the
asciidoctor warnings, above). So, for these titles, turn them into a
single (long) line of text.
In addition to the warnings, address some other formatting issues:
- the ascii branch diagrams didn't format correctly on asciidoctor
so include them in a literal block.
- several blocks of text were intended to be formatted 'as is' but
were not included in a literal block.
- in section 8, format the (A)->(D) in the text description as a
literal with `` marks, since (C) is rendered as a copyright
symbol in html otherwise.
- in section 9, a sub-list of two items is not formatted as such.
change the '*' introducer to '**' to correct the sub-list format.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Over the past several dozen commits, we have moved a large amount of
functionality out of the repack builtin and into other files like
repack.c, repack-cruft.c, repack-filtered.c, repack-midx.c, and
repack-promisor.c.
These files specify the minimal set of `#include`s that they need to
compile successfully, but we did not change the set of `#include`s in
the repack builtin itself.
Now that the code movement is complete, let's clean up that set of
`#include`s and trim down the builtin to include the minimal amount of
external headers necessary to compile.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In an identical fashion as the previous commit, move the function
`write_cruft_pack()` into its own compilation unit, and make the
function visible through the repack.h API.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar fashion as in previous commits, move the function
`write_filtered_pack()` out of the builtin and into its own compilation
unit.
This function is now part of the repack.h API, but implemented in its
own "repack-filtered.c" unit as it is a separate component from other
kinds of repacking operations.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "pack_kept_objects" variable is defined as static to the repack
builtin, but is inherently related to the pack-objects arguments that
the builtin uses when generating new packs.
Move that field into the "struct pack_objects_args", and shuffle around
where we append the corresponding command-line option when preparing a
pack-objects process. Specifically:
- `write_cruft_pack()` always wants to pass "--honor-pack-keep", so
explicitly set the `pack_kept_objects` field to "0" when initializing
the `write_pack_opts` struct before calling `write_cruft_pack()`.
- `write_filtered_pack()` no longer needs to handle writing the
command-line option "--honor-pack-keep" when preparing a pack-objects
process, since its call to `prepare_pack_objects()` will have already
taken care of that.
`write_filtered_pack()` also reads the `pack_kept_objects` field to
determine whether to write the existing kept packs with a leading "^"
character, so update that to read through the `po_args` pointer
instead.
- `cmd_repack()` also no longer has to write the "--honor-pack-keep"
flag explicitly, since this is also handled via its call to
`prepare_pack_objects()`.
Since there is a default value for "pack_kept_objects" that relies on
whether or not we are writing a bitmap (and not writing a MIDX), extract
a default initializer for `struct pack_objects_args` that keeps this
conditional default behavior.
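
The conditional default described above could be sketched roughly as follows. This is only an illustration: the trimmed-down struct, the "-1 means unset" sentinel, and the helper name `pack_objects_args_resolve_defaults()` are assumptions, not the code from this commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down version of the struct named in the message. */
struct pack_objects_args {
	int pack_kept_objects; /* assumed convention: -1 means "not configured" */
};

/*
 * Fill in the default only when the user did not configure a value:
 * pack kept objects when writing a bitmap but not a MIDX.
 */
void pack_objects_args_resolve_defaults(struct pack_objects_args *args,
					bool write_bitmaps, bool write_midx)
{
	assert(args);
	if (args->pack_kept_objects < 0)
		args->pack_kept_objects = write_bitmaps && !write_midx;
}
```

An explicitly configured value (0 or 1) is left untouched; only the unset sentinel is replaced.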
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as the previous commit(s), now that the function
`finish_pack_objects_cmd()` has no explicit dependencies within the
repack builtin, let's extract it.
This prepares us to extract the remaining two functions within the
repack builtin that explicitly write packfiles, which are
`write_cruft_pack()` and `write_filtered_pack()`, which will be done in
future commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To prepare to move the `finish_pack_objects_cmd()` function out of the
builtin and into the repack.h API, there are a couple of things we need
to do first:
- First, let's take advantage of `write_pack_opts_is_local()` function
introduced in the previous commit instead of passing "local"
explicitly.
- Let's also avoid referring to the static 'packtmp' field within
builtin/repack.c by instead accessing it through the write_pack_opts
argument.
There are three callers which need to adjust themselves in order to
account for this change. The callers which reside in write_cruft_pack()
and write_filtered_pack() both already have an "opts" in scope, so they
can pass it through transparently.
The other call (at the bottom of `cmd_repack()`) needs to initialize its
own write_pack_opts to pass the necessary fields over to the direct call
to `finish_pack_objects_cmd()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Similar to the previous commit, the functions `write_cruft_pack()` and
`write_filtered_pack()` both compute a "local" variable via the exact
same mechanism:
    const char *scratch;
    int local = skip_prefix(opts->destination, opts->packdir, &scratch);
Not only does this cause us to repeat the same pair of lines, it also
introduces an unnecessary "scratch" variable that is common between both
functions.
Instead of repeating ourselves, let's extract that functionality into a
new function in the repack.h API called "write_pack_opts_is_local()".
That function takes a pointer to a "struct write_pack_opts" (which has
as fields both "destination" and "packdir"), and can encapsulate the
dangling "scratch" field.
Extract that function and make it visible within the repack.h API, and
use it within both `write_cruft_pack()` and `write_filtered_pack()`.
While we're at it, match our modern conventions by returning a "bool"
instead of "int", and use `starts_with()` instead of `skip_prefix()` to
avoid storing the dummy "scratch" variable.
The remaining duplication (that is, that both `write_cruft_pack()` and
`write_filtered_pack()` still both call `write_pack_opts_is_local()`)
will be addressed in the following commit.
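
A minimal sketch of such a helper, assuming a struct that carries only the two fields mentioned above. The local `starts_with()` is a stand-in written for this example so that it compiles on its own; it is not the project's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the struct named in the message. */
struct write_pack_opts {
	const char *destination;
	const char *packdir;
};

/* Stand-in helper: true if "str" begins with "prefix". */
static bool starts_with(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* A pack is "local" when its destination lives under packdir. */
bool write_pack_opts_is_local(const struct write_pack_opts *opts)
{
	assert(opts && opts->destination && opts->packdir);
	return starts_with(opts->destination, opts->packdir);
}
```

Because only a yes/no answer is needed, no "scratch" output pointer has to be declared at the call sites.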
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both callers within the repack builtin which call functions that take a
'write_pack_opts' structure have the following pattern:
    struct write_pack_opts opts = {
        .packdir = packdir,
        .packtmp = packtmp,
        .pack_prefix = find_pack_prefix(packdir, packtmp),
        /* ... */
    };
    int ret = write_some_kind_of_pack(&opts, /* ... */);
, but both "packdir" and "packtmp" are fields within the write_pack_opts
struct itself!
Instead of also computing the pack_prefix ahead of time, let's have the
callees compute it themselves by moving `find_pack_prefix()` out of the
repack builtin, and have it take a write_pack_opts pointer instead of
the "packdir" and "packtmp" fields directly.
This avoids the callers having to do some prep work that is common
between the two of them, but also avoids the potential pitfall of
accidentally writing:
    .pack_prefix = find_pack_prefix(packtmp, packdir),
(which is well-typed) when the caller meant to instead write:
    .pack_prefix = find_pack_prefix(packdir, packtmp),
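
To illustrate why taking the struct removes the argument-order pitfall, here is a rough sketch. The prefix rule used (strip "packdir" and one leading '/' from "packtmp") and the NULL return on mismatch are assumptions made for the example, not a description of the actual function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the struct named in the message. */
struct write_pack_opts {
	const char *packdir;
	const char *packtmp;
};

/*
 * Assumed behaviour: the prefix is what remains of packtmp after
 * stripping packdir and one leading '/'. Returns NULL when packtmp
 * does not begin with packdir.
 */
const char *find_pack_prefix(const struct write_pack_opts *opts)
{
	size_t len;
	const char *rest;

	assert(opts && opts->packdir && opts->packtmp);
	len = strlen(opts->packdir);
	if (strncmp(opts->packtmp, opts->packdir, len) != 0)
		return NULL;
	rest = opts->packtmp + len;
	if (*rest == '/')
		rest++;
	return rest;
}
```

With a single struct argument, the fields are named at initialization time, so there is no pair of same-typed positional parameters to swap.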
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Similar to the changes made in the previous commit to
`write_filtered_pack()`, teach `write_cruft_pack()` to take a
`write_pack_opts` struct and use that where possible.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are various functions within the 'repack' builtin which are
responsible for writing different kinds of packs. They include:
- `static int write_filtered_pack(...)`
- `static int write_cruft_pack(...)`
as well as the function `finish_pack_objects_cmd()`, which is
responsible for finalizing a new pack write, and recording the checksum
of its contents in the 'names' list.
Both of these `write_` functions have a few things in common. They both
take a pointer to the 'pack_objects_args' struct, as well as a pair of
character pointers for `destination` and `pack_prefix`.
Instead of repeating those arguments for each function, let's extract an
options struct called "write_pack_opts" which has these three parameters
as member fields. While we're at it, add fields for "packdir" and
"packtmp", both of which are static variables within the builtin, and
need to be read from within these two functions.
This will shorten the list of parameters that callers have to provide to
`write_filtered_pack()`, avoid ambiguity when passing multiple variables
of the same type, and provide a unified interface for the two functions
mentioned earlier.
(Note that "pack_prefix" can be derived on the fly as a function of
"packdir" and "packtmp", making it unnecessary to store "pack_prefix"
explicitly. This commit ignores that potential cleanup in the name of
doing as few things as possible, but a later commit will make that
change.)
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that we have sufficiently cleaned up the write_midx_included_packs()
function, we can move it (along with the struct repack_write_midx_opts)
out of the builtin, and into the repack.h header.
Since this function (and the static ones that it depends on) are
MIDX-specific details of the repacking process, move them to the
repack-midx.c compilation unit instead of the general repack.c one.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To write a MIDX at the end of a repack operation, 'git repack' presently
computes the set of packs to write into the MIDX, before invoking
`write_midx_included_packs()` with a `string_list` containing those
packs.
The logic for computing which packs are supposed to appear in the
resulting MIDX is within `midx_included_packs()`, where it is aware of
details like which cruft pack(s) were written/combined, if/how we did a
geometric repack, etc.
Computing this list ourselves before providing it to the sole function
that makes use of it, `write_midx_included_packs()`, is somewhat
awkward. In the future, repack will learn how to write incremental
MIDXs, which will use a very different pack selection routine.
Instead of doing something like:
    struct string_list included_packs = STRING_LIST_INIT_DUP;
    if (incremental) {
        midx_incremental_included_packs(&included_packs, ...);
        write_midx_incremental_included_packs(&included_packs, ...);
    } else {
        midx_included_packs(&included_packs, ...);
        write_midx_included_packs(&included_packs, ...);
    }
in the future, let's have each function that writes a MIDX be
responsible for computing the list of included packs itself. Inline the
declaration and initialization of `included_packs` into the
`write_midx_included_packs()` function itself, and repeat that pattern
in the future when we introduce new ways to write MIDXs.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of passing individual parameters (in this case, "existing",
"names", and "geometry") to `midx_included_packs()`, pass a pointer to a
`repack_write_midx_opts` structure instead.
Besides reducing the number of parameters necessary to call the
`midx_included_packs` function, this refactoring sets us up nicely to
inline the call to `midx_included_packs()` into
`write_midx_included_packs()`, thus making the caller (in this case,
`cmd_repack()`) oblivious to the set of packs being written into the
MIDX.
In order to do this, `repack_write_midx_opts` has to keep track of the
set of existing packs, so add an additional field to point to that set.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After writing a new MIDX, the repack command removes any bitmaps
belonging to packs which were written into the MIDX.
This is currently done in a separate function outside of
`write_midx_included_packs()`, which forces the caller to keep track of
the set of packs written into the MIDX.
Prepare to no longer require the caller to keep track of such
information by inlining the clean-up into `write_midx_included_packs()`.
Future commits will make the caller oblivious to the set of packs
included in the MIDX altogether.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The next commit will inline the call to `remove_redundant_bitmaps()`
into `write_midx_included_packs()`. Reorder these two functions to avoid
a forward declaration to `remove_redundant_bitmaps()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of storing the list of MIDX pack names separately, let's inline
it into the existing_packs struct, further reducing the number of
parameters we have to pass around.
This amounts to adding a new string_list to the existing_packs struct,
and populating it via `existing_packs_collect()`. This is fairly
straightforward to do since we are already looping over all packs; all
we need to do is:
    if (p->multi_pack_index)
        string_list_append(&existing->midx_packs, pack_basename(p));
Note, however, that this check *must* come before other conditions where
we discard and do not keep track of a pack, including the condition "if
(!p->pack_local)" immediately below. This is because the existing
routine which collects MIDX pack names does so blindly, and does not
discard, for example, non-local packs.
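
The ordering constraint can be illustrated with a small, self-contained loop. The types below (`struct pack`, `struct collected`) and the fixed-size arrays are hypothetical simplifications; only the relative order of the two checks reflects the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PACKS 8

/* Hypothetical stand-in for a packed_git entry. */
struct pack {
	const char *name;
	bool in_midx; /* stands in for p->multi_pack_index */
	bool local;   /* stands in for p->pack_local */
};

struct collected {
	const char *midx_packs[MAX_PACKS];
	size_t midx_nr;
	const char *local_packs[MAX_PACKS];
	size_t local_nr;
};

void collect_packs(const struct pack *packs, size_t nr, struct collected *out)
{
	size_t i;

	assert(packs && out && nr <= MAX_PACKS);
	out->midx_nr = out->local_nr = 0;
	for (i = 0; i < nr; i++) {
		const struct pack *p = &packs[i];

		/* Record MIDX membership first, before any early "continue". */
		if (p->in_midx)
			out->midx_packs[out->midx_nr++] = p->name;
		/* Only now discard packs we do not otherwise track. */
		if (!p->local)
			continue;
		out->local_packs[out->local_nr++] = p->name;
	}
}
```

If the two checks were swapped, a non-local pack that is part of the MIDX would silently drop out of the MIDX name list.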
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When writing a new MIDX, repack must determine whether or not there are
any packs in the MIDX it is replacing (if one exists) that are not
somehow represented in the new MIDX (e.g., either by preserving the pack
verbatim, or rolling it up as part of a geometric repack, etc.).
In order to do this, it keeps track of a list of pack names from the
MIDX present in the repository at the start of the repack operation.
Since we manipulate and close the object store, we cannot rely on the
repository's in-core representation of the MIDX, since this is subject
to change and/or go away.
When this behavior was introduced in 5ee86c273b (repack: exclude cruft
pack(s) from the MIDX where possible, 2025-06-23), we maintained an
array of character pointers instead of using a convenience API, such as
string-list.h.
Store the list of MIDX pack names in a string_list, thereby reducing the
number of parameters we have to pass to `midx_has_unknown_packs()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function 'write_midx_included_packs()', which is responsible for
writing a new MIDX with a given set of included packs, currently takes a
list of six arguments.
In order to extract this function out of the builtin, we have to pass
in a few additional parameters, like 'midx_must_contain_cruft' and
'packdir', which are currently declared as static variables within the
builtin/repack.c compilation unit.
Instead of adding additional parameters to `write_midx_included_packs()`
extract out an "opts" struct that names these parameters, and pass a
pointer to that, making it less cumbersome to add additional parameters.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When writing a MIDX, 'git repack' takes a snapshot of the repository's
references and writes the result out to a file, which it then passes to
'git multi-pack-index write' via the '--refs-snapshot' option.
This is done in order to make bitmap selections with respect to what we
are packing, thus avoiding a race where an incoming reference update
causes us to try and write a bitmap for a commit not present in the
MIDX.
Extract this functionality out into a new repack-midx.c compilation
unit, and expose the necessary functions via the repack.h API.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that the pack_geometry API is fully factored and isolated from the
rest of the builtin, declare it within repack.h and move its
implementation to "repack-geometry.c" as a separate component.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For similar reasons as the preceding commit, pass the "packdir" variable
directly to `pack_geometry_remove_redundant()` as a parameter to the
function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Prepare to move pack_geometry-related APIs to their own compilation unit
by passing in the static "pack_kept_objects" variable directly as a
parameter to the 'pack_geometry_init()' function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename functions which work with 'struct pack_geometry' to begin with
"pack_geometry_". While we're at it, change `free_pack_geometry()` to
instead be named `pack_geometry_release()` to match our conventions, and
make clear that that function frees the contents of the struct, not the
memory allocated to hold the struct itself.
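
As a sketch of the "_release()" convention, assuming a much-simplified struct (the real 'struct pack_geometry' holds different members), the function frees what the struct owns and leaves the struct itself to its owner. The init helper here is likewise only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified contents; only the ownership pattern matters. */
struct pack_geometry {
	int *pack;
	size_t pack_nr;
};

void pack_geometry_init(struct pack_geometry *g, size_t nr)
{
	assert(g);
	g->pack = calloc(nr, sizeof(*g->pack));
	g->pack_nr = g->pack ? nr : 0;
}

/*
 * Frees the contents of the struct, not the struct itself, and leaves
 * it in a state where calling release again is harmless.
 */
void pack_geometry_release(struct pack_geometry *g)
{
	if (!g)
		return;
	free(g->pack);
	g->pack = NULL;
	g->pack_nr = 0;
}
```

This lets the struct live on the stack or inside another struct without the release function having to know how it was allocated.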
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that we have properly factored the portion of the builtin which is
responsible for repacking promisor objects, we can move that function
(and associated dependencies) out of the builtin entirely.
Similar to previous extractions, this function is declared in repack.h,
but implemented in a separate repack-promisor.c file. This is done to
separate promisor-specific repacking functionality from generic repack
utilities (like "existing_packs", and "generated_pack" APIs).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commit(s), pass the "packtmp" variable
to "repack_promisor_objects()" as an explicit parameter of the function,
preparing us to move this function in a following commit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that we have factored the "generated_pack" API, we can move it to
repack.[ch], further slimming down builtin/repack.c.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Repeat what was done in the preceding commit for the
`generated_pack_install()` function, which needs both "packdir" and
"packtmp".
(As an aside, it is somewhat unfortunate that the final three parameters
to this function are all "const char *", making errors like passing
"packdir" and "packtmp" in the wrong order easy. We could define a new
structure here, but that may be too heavy-handed.)
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, this function needs to know the
temporary pack prefix, which it currently accesses through the static
"packtmp" variable within builtin/repack.c.
Pass it explicitly as a function parameter to facilitate moving this
function out of builtin/repack.c entirely.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Once all new packs are known to exist, 'repack' installs their contents
from their temporary location into their permanent one. This is a
semi-involved procedure for each pack, since for each extension (e.g.,
".idx", ".pack", ".mtimes", and so on) we have to either:
- adjust the filemode of the temporary file before renaming it into
place, or
- die() if we are missing a non-optional extension, or
- unlink() any existing file for extensions that we did not generate
(e.g., if a non-cruft pack we generated was identical to, say, a
cruft pack which existed at the beginning of the process, we have to
remove the ".mtimes" file).
Extract this procedure into its own function, and call it
"generated_pack_install()". This will set us up for pulling this
function out of the builtin entirely and making it part of the repack.h
API, which will be done in a future commit.
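
The three-way decision per extension can be modelled as a small pure function. This is a sketch under stated assumptions: the enum, the `struct pack_ext` table, and which extensions are marked optional are invented for illustration, and the real function performs the filesystem operations rather than returning a verdict.

```c
#include <assert.h>
#include <stdbool.h>

enum ext_action {
	EXT_RENAME,          /* chmod the temporary file, then rename into place */
	EXT_DIE,             /* a required extension was not generated */
	EXT_UNLINK_EXISTING  /* remove a stale file we did not regenerate */
};

/* Hypothetical table entry; optionality of each extension is assumed. */
struct pack_ext {
	const char *name;
	bool optional;
};

enum ext_action install_action(const struct pack_ext *ext, bool generated)
{
	assert(ext);
	if (generated)
		return EXT_RENAME;
	if (!ext->optional)
		return EXT_DIE;
	return EXT_UNLINK_EXISTING;
}
```

For example, a non-cruft pack that did not produce a ".mtimes" file maps to `EXT_UNLINK_EXISTING`, matching the stale ".mtimes" case described above.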
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The name "generated_pack_data" is somewhat redundant, since the contents
of the struct *is* the data associated with the generated pack.
Rename the structure to just "generated_pack", resulting in less awkward
function names, like "generated_pack_has_ext()" which is preferable to
"generated_pack_data_has_ext()".
Rename a few related functions to align with the convention that
functions to do with a struct "S" should be prefixed with "S_".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The repack builtin defines an API for keeping track of which packs
were found in the repository at the beginning of the repack operation.
This is used to classify what state a pack was in (kept, non-kept, or
cruft), and is also used to mark which packs to delete (or keep) at the
end of a repack operation.
Now that the prerequisite refactoring is complete, this API is isolated
enough that it can be moved out to repack.[ch] and removed from the
builtin entirely.
As a result, some of its functions become static within repack.c,
cleaning up the visible API.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are a couple of spots that cause warnings within the
existing_packs API without DISABLE_SIGN_COMPARE_WARNINGS under
DEVELOPER=1 mode.
In both cases, we have int values that are being compared against size_t
ones. Neither of these two cases are incorrect, and the cast is
completely OK in practice. But both are unnecessary, since:
- in existing_packs_mark_for_deletion_1(), 'hexsz' should be defined as
a size_t anyway, since algop->hexsz is.
- in existing_packs_collect(), 'i' should be defined as a size_t since
it is counting up to the value of a string_list's 'nr' field.
(This patch is a little bit of noise, but I would rather see us squelch
these warnings ahead of moving the existing_packs API into a separate
compilation unit to avoid having to define DISABLE_SIGN_COMPARE_WARNINGS
in repack.c.)
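
A small illustration of the pattern, using a hypothetical `struct name_list` in place of a string_list: when the bound is a size_t, declaring the counter and any length threshold as size_t as well avoids the signed/unsigned comparison altogether.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a list whose "nr" field is a size_t. */
struct name_list {
	const char **items;
	size_t nr;
};

/* Count names that are at least hexsz characters long. */
size_t count_long_names(const struct name_list *list, size_t hexsz)
{
	size_t i, count = 0; /* size_t, matching list->nr and strlen() */

	assert(list);
	for (i = 0; i < list->nr; i++)
		if (strlen(list->items[i]) >= hexsz)
			count++;
	return count;
}
```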
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
builtin/repack.c defines a static "packdir" to instruct pack-objects on
where to write any new packfiles. This is also the directory scanned
when removing any packfiles which were made redundant by the latest
repack.
Prepare to move the "existing_packs_remove_redundant" function to its
own compilation unit by passing in this information as a parameter to
that function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Extract "remove_redundant_pack()" as generic repack-related
functionality by moving its implementation to the repack.[ch]
compilation unit.
This is a prerequisite to moving the "existing_packs" API, which is one
of the callers of this function. (The remaining caller in the pack
geometry code will eventually move to its own compilation unit as well,
and will likewise rely on this function.)
While moving it over, prefix the function name with "repack_" to
indicate that it belongs to the repack-subsystem.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename many of the 'struct existing_packs'-related functions according
to the convention introduced in and described by 541204aabe
(Documentation: document naming schema for structs and their functions,
2024-07-30).
Note that some functions which operate over an individual entry in the
list of existing packs are prefixed with "existing_pack_" instead of the
plural form.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Now that the 'prepare_pack_objects' function no longer refers to
external, static variables, move it out to repack.h as generic
functionality.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The static variable 'delta_base_offset' determines whether or not we
pass the "--delta-base-offset" command-line argument when spawning
pack-objects as a child process. Its introduction dates back to when
repack was rewritten in C, all the way back in a1bbc6c017 (repack:
rewrite the shell script in C, 2013-09-15).
'struct pack_objects_args' was introduced much later on in 4571324b99
(builtin/repack.c: allow configuring cruft pack generation, 2022-05-20),
but did not move the 'delta_base_offset' variable.
Since the 'delta_base_offset' is a property of an individual
pack-objects command, re-introduce that variable as a member of 'struct
pack_objects_args', which will enable further code movement in the
subsequent commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A subsequent commit will remove 'delta_base_offset' as a static variable
within builtin/repack.c, and reintroduce it as a member of the 'struct
pack_objects_args'.
As a result, the repack_config callback will need to have both the
cruft- and non-cruft 'struct pack_objects_args's in scope. Introduce a
new 'struct repack_config_ctx' to allow the caller to provide both
pointers to the callback.
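
A rough sketch of the context-struct pattern follows. The callback signature, its return convention, the config key spelling, and the naive boolean parsing are all simplifications assumed for this example; only the idea of bundling both argument structs behind one "data" pointer comes from the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, trimmed-down version of the struct named in the message. */
struct pack_objects_args {
	int delta_base_offset;
};

struct repack_config_ctx {
	struct pack_objects_args *po_args;
	struct pack_objects_args *cruft_po_args;
};

/* Illustrative callback: returns 0 if the key was handled, 1 otherwise. */
int repack_config(const char *var, const char *value, void *data)
{
	struct repack_config_ctx *ctx = data;

	assert(ctx && ctx->po_args && ctx->cruft_po_args);
	if (!strcmp(var, "repack.usedeltabaseoffset")) {
		int on = value && !strcmp(value, "true");

		/* Both kinds of pack-objects invocation see the setting. */
		ctx->po_args->delta_base_offset = on;
		ctx->cruft_po_args->delta_base_offset = on;
		return 0;
	}
	return 1;
}
```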
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Over the years, builtin/repack.c has turned into a grab-bag of
functionality powering the 'git repack' builtin. Among its many
capabilities, it:
- can build and spawn 'git pack-objects' commands, which in turn
generate new packs
- has infrastructure to manage the set of existing packs in a
repository
- has infrastructure to split a sequence of packs into a geometric
progression based on object size
- can manage both generating and combining cruft packs together
- can write new MIDXs
to name a few.
As a result, this builtin has accumulated a lot of code, making adding
new functionality difficult. In the future, 'repack' will learn how to
manage a chain of incremental MIDXs, adding yet more functionality into
the builtin.
As a prerequisite step, let's first move some of the functionality in
the builtin into its own repack.[ch].
This will be done over the course of many steps, since there are many
individual components, some of which will end up in other, yet-to-exist
compilation units of their own. Some of the code movement here is also
non-trivial, so performing it in individual steps will make it easier to
verify.
Let's start by migrating 'struct pack_objects_args' (and the related
corresponding pack_objects_args_release() function) into repack.h, and
teach both the Makefile and Meson how to build the new compilation unit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In previous commits, we started passing either repository or
git_hash_algo pointers around to various spots within builtin/repack.c
to reduce our dependency on the_repository in the hope of undef'ing
USE_THE_REPOSITORY_VARIABLE.
This commit takes us as far as we can (easily) go in that direction by
removing the only use of a convenience function that only exists when
USE_THE_REPOSITORY_VARIABLE is defined.
Unfortunately, the only other such function is "is_bare_repository()",
which is less than straightforward to convert into, say,
"repo_is_bare()", the latter of the two accepting a repository pointer.
Punt on that for now, and declare this commit as the stopping point for
our efforts in the direction of undef'ing USE_THE_REPOSITORY_VARIABLE.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as previous commits, avoid referring directly to
"the_hash_algo" in builtin/repack.c::finish_pack_objects_cmd() and
instead accept one as a parameter to the function.
Since this function has a number of callers throughout the builtin, the
diff is a little noisier than previous commits. However, each hunk is
limited to passing the hash_algo parameter from a repository pointer
that is already in scope.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as the previous commits, avoid referring directly to
"the_hash_algo" within builtin/repack.c::repack_promisor_objects().
Since there is already a repository pointer in scope, use its hash_algo
value instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a similar spirit as the previous commit, avoid referring directly to
"the_hash_algo" within builtin/repack.c::write_oid().
Unlike the previous commit, we are within a callback function, so must
introduce a new struct to pass additional data through its "data"
pointer.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "mark_packs_for_deletion_1" function uses "the_hash_algo->hexsz" to
isolate a pack's checksum before deleting it to avoid deleting a newly
written pack having the same checksum (that is, some generated pack
wound up identical to an existing pack).
Avoid this by passing down a "struct git_hash_algo" pointer, and refer to
the hash algorithm through it instead.
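
The checksum isolation can be sketched as plain string handling with the hex size passed in as a parameter. The helper names below are hypothetical, and the example assumes the checksum is simply the trailing `hexsz` characters of the pack's base name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the trailing hexsz characters of "base"
 * (e.g. the hash in "pack-<hash>"), or NULL if "base" is too short.
 */
const char *pack_checksum(const char *base, size_t hexsz)
{
	size_t len;

	assert(base);
	len = strlen(base);
	if (len < hexsz)
		return NULL;
	return base + (len - hexsz);
}

/* True if an existing pack carries the same checksum as a new one. */
bool pack_matches_generated(const char *base, const char *generated_hex,
			    size_t hexsz)
{
	const char *sum = pack_checksum(base, hexsz);

	return sum && strlen(generated_hex) == hexsz &&
	       !strcmp(sum, generated_hex);
}
```

Passing `hexsz` explicitly (here any value the caller chooses) is what lets the caller derive it from a specific hash algorithm instead of a global.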
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Pass a "struct repository" pointer to the 'repack_promisor_objects()'
function to avoid using "the_repository".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The 'remove_redundant_pack()' function uses "the_repository" to obtain,
and optionally remove, the repository's MIDX. Instead of relying on
"the_repository", pass around a "struct repository *" parameter through
its callers, and use that instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Avoid using "the_repository" in various MIDX-related ref snapshotting
functions.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are a number of spots within builtin/repack.c which refer to
"the_repository", and either make use of the "existing packs" API
or otherwise have a 'struct existing_packs *' in scope.
Add a "repo" member to "struct existing_packs" and use that instead of
"the_repository" in such locations.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Reduce builtin/repack.c's reliance on `the_repository` by using the
currently-UNUSED "repo" parameter within cmd_repack().
The following commits will continue to reduce the usage of
the_repository in other places within builtin/repack.c.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Carry over the fixups from 8c3d7c5f (RelNotes: minor fixups before
2.51.1, 2025-10-15).
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Bumps [actions/github-script](https://github.com/actions/github-script)
from 7 to 8.
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](https://github.com/actions/github-script/compare/v7...v8)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|