aboutsummaryrefslogtreecommitdiffstats
path: root/builtin/fetch.c
AgeCommit message (Collapse)AuthorFilesLines
2025-11-30Merge branch 'ja/doc-synopsis-style'Junio C Hamano1-1/+1
Doc mark-up updates. * ja/doc-synopsis-style: doc: pull-fetch-param typofix doc: convert git push to synopsis style doc: convert git pull to synopsis style doc: convert git fetch to synopsis style
2025-11-19doc: convert git fetch to synopsis styleJean-Noël Avila1-1/+1
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt1-9/+4
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: catch conflicts on case-insensitive file-systemsKarthik Nayak1-3/+18
During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <joe.drew@indexexchange.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_int()` wrapperPatrick Steinhardt1-2/+2
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_int()`. All callsites are adjusted so that they use `repo_config_get_int(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt1-1/+1
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt1-2/+2
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16Merge branch 'ph/fetch-prune-optim'Junio C Hamano1-11/+8
"git fetch --prune" used to be O(n^2) expensive when there are many refs, which has been corrected. * ph/fetch-prune-optim: clean up interface for refs_warn_dangling_symrefs refs: remove old refs_warn_dangling_symref fetch-prune: optimize dangling-ref reporting
2025-07-15Merge branch 'ps/object-store'Junio C Hamano1-10/+11
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano1-54/+73
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-01clean up interface for refs_warn_dangling_symrefsPhil Hord1-4/+1
The refs_warn_dangling_symrefs interface is a bit fragile as it passes in printf-formatting strings with expectations about the number of arguments. This patch series made it worse by adding a 2nd positional argument. But there are only two call sites, and they both use almost identical display options. Make this safer by moving the format strings into the function that uses them to make it easier to see when the arguments don't match. Pass a prefix string and a dry_run flag so the decision logic can be handled where needed. Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01fetch-prune: optimize dangling-ref reportingPhil Hord1-10/+10
When pruning during `git fetch` we check each pruned ref against the ref_store one at a time to decide whether to report it as dangling. This causes every local ref to be scanned for each ref being pruned. If there are N refs in the repo and M refs being pruned, this code is O(M*N). However, `git remote prune` uses a very similar function that is only O(N*log(M)). Remove the wasteful ref scanning for each pruned ref and use the faster version already available in refs_warn_dangling_symrefs. Change the message to include the original refname since the message is no longer printed immediately after the line that did just print the refname. In a repo with 126,000 refs, where I was pruning 28,000 refs, this code made about 3.6 billion calls to strcmp and consumed 410 seconds of CPU. (Invariably in that time, my remote would timeout and the fetch would fail anyway.) After this change, the same operation completes in under a second. Signed-off-by: Phil Hord <phil.hord@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt1-8/+9
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename `object_directory` to `odb_source`Patrick Steinhardt1-1/+1
The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location. While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example. As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more. Let's preempt this by renaming the structure. There have been a couple alternatives that were discussed: - `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept. - `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate". Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology. In the future, this change allows us to easily introduce for example a `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25Merge branch 'ps/maintenance-ref-lock'Junio C Hamano1-1/+1
"git maintenance" lacked the care "git gc" had to avoid holding onto the repository lock for too long during packing refs, which has been remedied. * ps/maintenance-ref-lock: builtin/maintenance: fix locking race when handling "gc" task builtin/gc: avoid global state in `gc_before_repack()` usage: allow dying without writing an error message builtin/maintenance: fix locking race with refs and reflogs tasks builtin/maintenance: split into foreground and background tasks builtin/maintenance: fix typedef for function pointers builtin/maintenance: extract function to run tasks builtin/maintenance: stop modifying global array of tasks builtin/maintenance: mark "--task=" and "--schedule=" as incompatible builtin/maintenance: centralize configuration of explicit tasks builtin/gc: drop redundant local variable builtin/gc: use designated field initializers for maintenance tasks
2025-06-03usage: allow dying without writing an error messagePatrick Steinhardt1-1/+1
Sometimes code wants to die in a situation where it already has written an error message. To use the same error code as `die()` we have to use `exit(128)`, which is easy to get wrong and leaves magic numbers all over our codebase. Teach `die_message_builtin()` to not print any error when passed a `NULL` pointer as error string. Like this, such users can now call `die(NULL)` to achieve the same result without any hardcoded error codes. Adapt a couple of builtins to use this new pattern to demonstrate that there is a need for such a helper. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19fetch: use batched reference updatesKarthik Nayak1-54/+73
The reference updates performed as a part of 'git-fetch(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an '--atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-fetch(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. Adding support for batched updates is simply modifying the flow to also create a batch update transaction in the non-atomic flow. With the reftable backend there is a 22x performance improvement, when performing 'git-fetch(1)' with 10000 refs: Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 3.403 s ± 0.775 s [User: 1.875 s, System: 1.417 s] Range (min … max): 2.454 s … 4.529 s 10 runs Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 154.3 ms ± 17.6 ms [User: 102.5 ms, System: 56.1 ms] Range (min … max): 145.2 ms … 220.5 ms 18 runs Summary fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.25x performance improvement: Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 605.5 ms ± 9.4 ms [User: 117.8 ms, System: 483.3 ms] Range (min … max): 595.6 ms … 621.5 ms 10 runs Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 485.8 ms ± 4.3 ms [User: 91.1 ms, System: 396.7 ms] Range (min … max): 477.6 ms … 494.3 ms 10 runs Summary fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master) With this we'll either be using a regular transaction or a batch update transaction. This helps cleanup some code which is no longer needed as we'll now always have some type of 'ref_transaction' object being propagated. One big change is that earlier, each individual update would propagate a failure. Whereas now, the `ref_transaction_for_each_rejected_update` function is called at the end of the flow to capture the exit status for 'git-fetch(1)' and also to print F/D conflict errors. This does change the order of the errors being printed, but the behavior stays the same. Since transaction errors are now explicitly defined as part of 76e760b999 (refs: introduce enum-based transaction error types, 2025-04-08), utilize them and get rid of custom errors defined within 'builtin/fetch.c'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: avoid unnecessary work when there is no current branchJohannes Schindelin1-1/+1
As pointed out by CodeQL, `branch_get()` may return `NULL`, in which case `branch_has_merge_config()` would return early, but we can even avoid enumerating the refs prefixes in that case, saving even more CPU cycles. Technically, we should enclose these two statements in an `if (branch) {...}` block, but the indentation is already quite deep, therefore I refrained from doing that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: carefully clear local variable's address after useJohannes Schindelin1-0/+1
As pointed out by CodeQL, it is a potentially dangerous practice to store local variables' addresses in non-local structs. Yet this is exactly what happens with the `acked_commits` attribute that is used in `cmd_fetch()`: The pointer to a local variable is assigned to it. Now, it is Git's convention that `cmd_*()` functions are essentially only returning just before exiting the process, therefore there is little danger that this attribute is used after the code flow returns from that function. However, code in `cmd_*()` function is often so useful that it gets lifted into a library function, at which point this issue could become a real problem. Let's make sure to clear the `acked_commits` attribute out after it was used, and before the function returns (at which point the address would go stale). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12Merge branch 'ps/object-store-cleanup'Junio C Hamano1-8/+7
Further code clean-up in the object-store layer. * ps/object-store-cleanup: object-store: drop `repo_has_object_file()` treewide: convert users of `repo_has_object_file()` to `has_object()` object-store: allow fetching objects via `has_object()` object-store: move function declarations to their respective subsystems object-store: move and rename `odb_pack_keep()` object-store: drop `loose_object_path()` object-store: move `struct packed_git` into "packfile.h"
2025-04-29treewide: convert users of `repo_has_object_file()` to `has_object()`Patrick Steinhardt1-8/+7
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement: - The new function has a short-and-sweet name. - More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions. Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`: - `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`. The replacements should be functionally equivalent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/parse-options-integers'Junio C Hamano1-2/+8
Update parse-options API to catch mistakes to pass address of an integral variable of a wrong type/size. * ps/parse-options-integers: parse-options: detect mismatches in integer signedness parse-options: introduce precision handling for `OPTION_UNSIGNED` parse-options: introduce precision handling for `OPTION_INTEGER` parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()` parse-options: support unit factors in `OPT_INTEGER()` global: use designated initializers for options parse: fix off-by-one for minimum signed values
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano1-1/+1
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-24Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanupJunio C Hamano1-1/+1
* ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-17Merge branch 'jk/fetch-follow-remote-head-fix'Junio C Hamano1-25/+11
"git fetch [<remote>]" with only the configured fetch refspec should be the only thing to update refs/remotes/<remote>/HEAD, but the code was overly eager to do so in other cases. * jk/fetch-follow-remote-head-fix: fetch: make set_head() call easier to read fetch: don't ask for remote HEAD if followRemoteHEAD is "never" fetch: only respect followRemoteHEAD with configured refspecs
2025-04-17global: use designated initializers for optionsPatrick Steinhardt1-2/+8
While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields that has one as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro. Callsites that open-code these structure initialization don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides: - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction. - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean. - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites. Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-16Merge branch 'kn/non-transactional-batch-updates'Junio C Hamano1-1/+1
Updating multiple references have only been possible in all-or-none fashion with transactions, but it can be more efficient to batch multiple updates even when some of them are allowed to fail in a best-effort manner. A new "best effort batches of updates" mode has been introduced. * kn/non-transactional-batch-updates: update-ref: add --batch-updates flag for stdin mode refs: support rejection in batch updates during F/D checks refs: implement batch reference update support refs: introduce enum-based transaction error types refs/reftable: extract code from the transaction preparation refs/files: remove duplicate duplicates check refs: move duplicate refname update check to generic layer refs/files: remove redundant check in split_symref_update()
2025-04-16Merge branch 'jt/ref-transaction-abort-fix'Junio C Hamano1-1/+8
A ref transaction corner case fix. * jt/ref-transaction-abort-fix: builtin/fetch: avoid aborting closed reference transaction
2025-04-15Merge branch 'jt/clone-guess-remote-head-fix'Junio C Hamano1-1/+1
"git clone" still gave the message about the default branch name; this message has been turned into an advice message that can be turned off. * jt/clone-guess-remote-head-fix: advice: allow disabling default branch name advice builtin/clone: suppress unexpected default branch advice remote: allow `guess_remote_head()` to suppress advice
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-09fetch: make set_head() call easier to readJeff King1-4/+5
We ignore any error returned from set_head(), but 638060dcb9 (fetch set_head: refactor to use remote directly, 2025-01-26) left its call in a noop "if" conditional as a sort of note-to-self. When c834d1a7ce (fetch: only respect followRemoteHEAD with configured refspecs, 2025-03-18) added a "do_set_head" flag, it was rolled into the same conditional, putting set_head() on the right-hand side of a short-circuit AND. That's not wrong, but it really hides the point of the line, which is (maybe) calling the function. Instead, let's have a full if() block for the flag, and then our comment (with some rewording) will be sufficient to clarify the error handling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: introduce enum-based transaction error typesKarthik Nayak1-1/+1
Replace preprocessor-defined transaction errors with a strongly-typed enum `ref_transaction_error`. This change: - Improves type safety and function signature clarity. - Makes error handling more explicit and discoverable. - Maintains existing error cases, while adding new error cases for common scenarios. This refactoring paves the way for more comprehensive error handling which we will utilize in the upcoming commits to add batch reference update support. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-25remote: allow `guess_remote_head()` to suppress adviceJustin Tobler1-1/+1
The `repo_default_branch_name()` invoked through `guess_remote_head()` is configured to always display the default branch advice message. Adapt `guess_remote_head()` to accept flags and convert the `all` parameter to a flag. Add the `REMOTE_GUESS_HEAD_QUIET` flag to to enable suppression of advice messages. Call sites are updated accordingly. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21builtin/fetch: avoid aborting closed reference transactionJustin Tobler1-1/+8
As part of the reference transaction commit phase, the transaction is set to a closed state regardless of whether it was successful of not. Attempting to abort a closed transaction via `ref_transaction_abort()` results in a `BUG()`. In c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`, 2024-08-22), logic to free a transaction after the commit phase is moved to the centralized exit path. In cases where the transaction commit failed, this results in a closed transaction being aborted and signaling a bug. Free the transaction and set it to NULL when the commit fails. This allows the exit path to correctly handle the error without attempting to abort the transaction. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21refspec: replace `refspec_item_init()` with fetch/push variantsTaylor Blau1-1/+1
For similar reasons as in the previous refactoring of `refspec_init()` into `refspec_init_fetch()` and `refspec_init_push()`, apply the same refactoring to `refspec_item_init()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18fetch: don't ask for remote HEAD if followRemoteHEAD is "never"Jeff King1-4/+2
When we are going to consider updating the refs/remotes/*/HEAD symref, we have to ask the remote side where its HEAD points. But if we know that the feature is disabled by config, we don't need to bother! This saves a little bit of work and network communication for the server. And even a little bit of effort on the client, as our local set_head() function did a bit of work matching the remote HEAD before realizing that we're not going to do anything with it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18fetch: only respect followRemoteHEAD with configured refspecsJeff King1-19/+6
The new followRemoteHEAD feature is triggered for almost every fetch, causing us to ask the server about the remote "HEAD" and to consider updating our local tracking HEAD symref. This patch limits the feature only to the case when we are fetching a remote using its configured refspecs (typically into its refs/remotes/ hierarchy). There are two reasons for this. One is efficiency. E.g., the fixes in 6c915c3f85 (fetch: do not ask for HEAD unnecessarily, 2024-12-06) and 20010b8c20 (fetch: avoid ls-refs only to ask for HEAD symref update, 2025-03-08) were aimed at reducing the work we do when we would not be able to update HEAD anyway. But they do not quite cover all cases. The remaining one is: git fetch origin refs/heads/foo:refs/remotes/origin/foo which _sometimes_ can update HEAD, but usually not. And that leads us to the second point, which is being simple and explainable. The code for updating the tracking HEAD symref requires both that we learned which ref the remote HEAD points at, and that the server advertised that ref to us. But because the v2 protocol narrows the server's advertisement, the command above would not typically update HEAD at all, unless it happened to point to the "foo" branch. Or even weirder, it probably _would_ update if the server is very old and supports only the v0 protocol, which always gives a full advertisement. This creates confusing behavior for the user: sometimes we may try to update HEAD and sometimes not, depending on vague rules. One option here would be to loosen the update code to accept the remote HEAD even if the server did not advertise that ref. I think that could work, but it may also lead to interesting corner cases (e.g., creating a dangling symref locally, even though the branch is not unborn on the server, if we happen not to have fetched it). So let's instead simplify the rules: we'll only consider updating the tracking HEAD symref when we're doing a full fetch of the remote's configured refs. This is easy to implement; we can just set a flag at the moment we realize we're using the configured refspecs. And we can drop the special case code added by 6c915c3f85 and 20010b8c20, since this covers those cases. The existing tests from those commits still pass. In t5505, an incidental call to "git fetch <remote> <refspec>" updated HEAD, which caused us to adjust the test in 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22). We can now adjust that back to how it was before the feature was added. Even though t5505 is incidentally testing our new desired behavior, we'll add an explicit test in t5510 to make sure it is covered. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: use ref prefix list to skip ls-refsJeff King1-20/+7
In git-fetch we have an optimization to avoid issuing an ls-refs command to the server if we don't care about the value of any refs (e.g., because we are fetching exact object ids), saving a round-trip to the server. This comes from e70a3030e7 (fetch: do not list refs if fetching only hashes, 2018-09-27). It uses an explicit flag "must_list_refs" to decide when we need to do so. That was needed back then, because the list of ref-prefixes was not always complete. If it was empty, it did not necessarily mean that we were not interested in any refs). But that is no longer the case; an empty list of prefixes means that we truly do not care about any refs. And so rather than an explicit flag, we can just check whether we are interested in any ref prefixes. This simplifies the code slightly, as there is now a single source of truth for the decision. It also fixes a bug in / optimizes a very unlikely case, which is: git fetch $remote ^foo $oid I.e., a negative refspec combined with an exact oid fetch. This is somewhat nonsense, in that there are no positive refspecs mentioning refs to countermand with the negative one. But we should be able to do this without issuing an ls-refs command (excluding "foo" from the empty set will obviously still be the empty set). However, the current code does not do so. The negative refspec is not counted as a noop in un-setting the must_list_refs flag (hardly the fault of e70a3030e7, as negative refspecs did not appear until much later). But by using the prefix list as a source of truth, this naturally just works; the negative refspec does not add a prefix to ask about, and hence does not trigger the ls-refs call. This is esoteric enough that I didn't bother adding a test. The real value here is in the code simplification. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: avoid ls-refs only to ask for HEAD symref updateJeff King1-3/+2
When we fetch from a configured remote, we may try to update the local refs/remotes/<origin>/HEAD, and so we ask the server to advertise its HEAD to us. But if we aren't otherwise asking about any refs at all, then we know this HEAD update can never happen! To consider a new value for HEAD, the set_head() function uses guess_remote_head(). And even if it sees an explicit symref value for HEAD, it will only report that as a match if we also saw that remote ref advertised, and it mapped to a local tracking ref via get_fetch_map(). In other words, a fetch like this: git fetch origin $exact_oid:refs/heads/foo can never update HEAD, because we will never have fetched (nor even see the advertisement for) the ref that HEAD points to. Currently the command above will still call ls-refs to ask about the HEAD, even though it is pointless. This patch teaches it to skip the ls-refs call entirely in this case, which avoids a round-trip to the server. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: stop protecting additions to ref-prefix listJeff King1-6/+4
When using the ref-prefix feature of protocol v2, a client which sends no prefixes at all will get the full advertisement. And so the code in git-fetch was historically loose about setting up that list based on our refspecs. There were cases where we needed to know about some refs, so we just didn't add anything to the ref-prefix list. And hence further code, like that for tag-following and updating origin/HEAD, had to be careful about adding to an empty list. E.g., see the bug fixed by bd52d9a058 (fetch: fix following tags when fetching specific OID, 2025-03-07). But the previous commit removed the last such case, and now we know an empty ref-prefix list (at least inside git-fetch's do_fetch() function) means that we really don't need to see any refs. So we can drop those extra conditionals. This simplifies the code a little. But it also means that some cases can now use ref prefixes when they would not otherwise. As the test shows, fetching an exact oid into a local ref can now avoid enumerating all of the refs. The refspec itself doesn't need to know about any remote refs, and the tag auto-following can just ask about refs/tags/. The same is true for asking about HEAD to update the local origin/HEAD. I didn't add a test for that yet, though, as we can optimize it even further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10fetch: ask server to advertise HEAD for config-less fetchJeff King1-0/+8
If we're not given any refspecs (either on the command line or via config) and we have no branch merge config, then we fetch the remote HEAD into our local FETCH_HEAD. In that case we do not send any ref-prefix option to the server at all, and we see the full advertisement. But this is sub-optimal. We only care about HEAD, so we can just ask for that, and ignore all of the other refs. The new test demonstrates a case where we see fewer refs (in this case only one less, but in theory we could be ignoring millions of them). This also removes the only case where we care about seeing some refs from the other side, but don't add anything to the ref_prefixes list. Cleaning this up means one less maintenance burden. Before this patch, any code which wanted to add to the list had to make sure the list was not empty, since an empty list meant "ask for everything". Now it really means "we are not interested in any refs". This should let us optimize a few more cases in subsequent patches. Note that we'll add "HEAD" to the list of prefixes, and later code for updating "refs/remotes/<remote>/HEAD" may likewise do so. In theory this could cause duplicates in the list, but in practice these can't both trigger. We hit our new case only if there are no refspecs, and the "<remote>/HEAD" feature is enabled only when we are fetching from a remote with configured refspecs. We could be defensive with a flag, but it didn't seem worth it to me (the absolute worse case is a useless redundant ref-prefix line sent to the server). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10Merge branch 'tb/fetch-follow-tags-fix'Junio C Hamano1-1/+3
* tb/fetch-follow-tags-fix: fetch: fix following tags when fetching specific OID
2025-03-07fetch: fix following tags when fetching specific OIDTaylor Blau1-1/+3
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22), unconditionally adds "HEAD" to the list of ref prefixes we send to the server. This breaks a core assumption that the list of prefixes we send to the server is complete. We must either send all prefixes we care about, or none at all (in the latter case the server then advertises everything). The tag following code is careful to only add "refs/tags/" to the list of prefixes if there are already entries in the prefix list. But because the new code from 3f763ddf28 runs after the tag code, and because it unconditionally adds to the prefix list, we may end up with a prefix list that _should_ have "refs/tags/" in it, but doesn't. When that is the case, the server does not advertise any tags, and our auto-following breaks because we never learned about any tags in the first place. Fix this by only adding "HEAD" to the ref prefixes when we know that we are already limiting the advertisement. In either case we'll learn about HEAD (either through the limited advertisement, or implicitly through a full advertisement). Reported-by: Igor Todorovski <itodorov@ca.ibm.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27fetch set_head: fix non-mirror remotes in bare repositoriesBence Ferdinandy1-5/+5
In b1b713f722 (fetch set_head: handle mirrored bare repositories, 2024-11-22) it was implicitly assumed that all remotes will be mirrors in a bare repository, thus fetching a non-mirrored remote could lead to HEAD pointing to a non-existent reference. Make sure we only overwrite HEAD if we are in a bare repository and fetching from a mirror. Otherwise, proceed as normally, and create refs/remotes/<nonmirrorremote>/HEAD instead. Reported-by: Christian Hesse <list@eworm.de> Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27fetch set_head: refactor to use remote directlyBence Ferdinandy1-8/+7
As a preparatory step to use even more properties from the remote struct, refactor set_head to take the entire struct as a parameter, instead of the necessary bits. This also allows consolidating the use of gtransport->remote in set_head, making the access of the remote's properties consistent in the function. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-06Merge branch 'bf/fetch-set-head-config'Junio C Hamano1-2/+2
A hotfix on an advice messagge added during this cycle. * bf/fetch-set-head-config: fetch: fix erroneous set_head advice message
2025-01-06fetch: fix erroneous set_head advice messageBence Ferdinandy1-2/+2
9e2b7005be (fetch set_head: add warn-if-not-$branch option, 2024-12-05) tried to expand the advice message for set_head with the new option, but unfortunately did not manage to add the right incantation. Fix the advice message with the correct usage of warn-if-not-$branch. Reported-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano1-0/+3
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-19Merge branch 'bf/fetch-set-head-config'Junio C Hamano1-6/+55
"git fetch" honors "remote.<remote>.followRemoteHEAD" settings to tweak the remote-tracking HEAD in "refs/remotes/<remote>/HEAD". * bf/fetch-set-head-config: remote set-head: set followRemoteHEAD to "warn" if "always" fetch set_head: add warn-if-not-$branch option fetch set_head: move warn advice into advise_if_enabled fetch: add configuration for set_head behaviour
2024-12-19Merge branch 'jc/set-head-symref-fix'Junio C Hamano1-1/+19
"git fetch" from a configured remote learned to update a missing remote-tracking HEAD but it asked the remote about their HEAD even when it did not need to, which has been corrected. Incidentally, this also corrects "git fetch --tags $URL" which was broken by the new feature in an unspecified way. * jc/set-head-symref-fix: fetch: do not ask for HEAD unnecessarily
2024-12-19Merge branch 'bf/set-head-symref'Junio C Hamano1-0/+74
When "git fetch $remote" notices that refs/remotes/$remote/HEAD is missing and discovers what branch the other side points with its HEAD, refs/remotes/$remote/HEAD is updated to point to it. * bf/set-head-symref: fetch set_head: handle mirrored bare repositories fetch: set remote/HEAD if it does not exist refs: add create_only option to refs_update_symref_extended refs: add TRANSACTION_CREATE_EXISTS error remote set-head: better output for --auto remote set-head: refactor for readability refs: atomically record overwritten ref in update_symref refs: standardize output of refs_read_symbolic_ref t/t5505-remote: test failure of set-head t/t5505-remote: set default branch to main
2024-12-07fetch: do not ask for HEAD unnecessarilyJunio C Hamano1-1/+19
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22), git-fetch learned to opportunistically set $REMOTE/HEAD when fetching by always asking for remote HEAD, in the hope that it will help setting refs/remotes/<name>/HEAD if missing. But it is not needed to always ask for remote HEAD. When we are fetching from a remote, for which we have remote-tracking branches, we do need to know about HEAD. But if we are doing one-shot fetch, e.g., $ git fetch --tags https://github.com/git/git we do not even know what sub-hierarchy of refs/remotes/<remote>/ we need to adjust the remote HEAD for. There is no need to ask for HEAD in such a case. Incidentally, because the unconditional request to list "HEAD" affected the number of ref-prefixes requested in the ls-remote request, this affected how the requests for tags are added to the same ls-remote request, breaking "git fetch --tags $URL" performed against a URL that is not configured as a remote. Reported-by: Josh Steadmon <steadmon@google.com> [jc: tests are also borrowed from Josh's patch] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+3
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06fetch set_head: add warn-if-not-$branch optionBence Ferdinandy1-5/+11
Currently if we want to have a remote/HEAD locally that is different from the one on the remote, but we still want to get a warning if remote changes HEAD, our only option is to have an indiscriminate warning with "follow_remote_head" set to "warn". Add a new option "warn-if-not-$branch", where $branch is a branch name we do not wish to get a warning about. If the remote HEAD is $branch do not warn, otherwise, behave as "warn". E.g. let's assume, that our remote origin has HEAD set to "master", but locally we have "git remote set-head origin seen". Setting 'remote.origin.followRemoteHEAD = "warn"' will always print a warning, even though the remote has not changed HEAD from "master". Setting 'remote.origin.followRemoteHEAD = "warn-if-not-master" will squelch the warning message, unless the remote changes HEAD from "master". Note, that should the remote change HEAD to "seen" (which we have locally), there will still be no warning. Improve the advice message in report_set_head to also include silencing the warning message with "warn-if-not-$branch". Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06fetch set_head: move warn advice into advise_if_enabledBence Ferdinandy1-4/+13
Advice about what to do when getting a warning is typed out explicitly twice and is printed as regular output. The output is also tested for. Extract the advice message into a single place and use a wrapper function, so if later the advice is made more chatty the signature only needs to be changed in once place. Remove the testing for the advice output in the tests. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04Merge branch 'ps/ref-backend-migration-optim'Junio C Hamano1-2/+2
The migration procedure between two ref backends has been optimized. * ps/ref-backend-migration-optim: reftable: rename scratch buffer refs: adapt `initial_transaction` flag to be unsigned reftable/block: optimize allocations by using scratch buffer reftable/block: rename `block_writer::buf` variable reftable/writer: optimize allocations by using a scratch buffer refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` refs: skip collision checks in initial transactions refs: use "initial" transaction semantics to migrate refs refs/files: support symbolic and root refs in initial transaction refs: introduce "initial" transaction flag refs/files: move logic to commit initial transaction refs: allow passing flags when setting up a transaction
2024-12-02fetch: add configuration for set_head behaviourBence Ferdinandy1-6/+40
In the current implementation, if refs/remotes/$remote/HEAD does not exist, running fetch will create it, but if it does exist it will not do anything, which is a somewhat safe and minimal approach. Unfortunately, for users who wish to NOT have refs/remotes/$remote/HEAD set for any reason (e.g. so that `git rev-parse origin` doesn't accidentally point them somewhere they do not want to), there is no way to remove this behaviour. On the other side of the spectrum, users may want fetch to automatically update HEAD or at least give them a warning if something changed on the remote. Introduce a new setting, remote.$remote.followRemoteHEAD with four options: - "never": do not ever do anything, not even create - "create": the current behaviour, now the default behaviour - "warn": print a message if remote and local HEAD is different - "always": silently update HEAD on every change Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-27Merge branch 'bf/set-head-symref' into bf/fetch-set-head-configJunio C Hamano1-0/+74
* bf/set-head-symref: fetch set_head: handle mirrored bare repositories fetch: set remote/HEAD if it does not exist refs: add create_only option to refs_update_symref_extended refs: add TRANSACTION_CREATE_EXISTS error remote set-head: better output for --auto remote set-head: refactor for readability refs: atomically record overwritten ref in update_symref refs: standardize output of refs_read_symbolic_ref t/t5505-remote: test failure of set-head t/t5505-remote: set default branch to main
2024-11-25fetch set_head: handle mirrored bare repositoriesBence Ferdinandy1-5/+11
When adding a remote to bare repository with "git remote add --mirror", running fetch will fail to update HEAD to the remote's HEAD, since it does not know how to handle bare repositories. On the other hand HEAD already has content, since "git init --bare" has already set HEAD to whatever is the default branch set for the user. Unless this - by chance - is the same as the remote's HEAD, HEAD will be pointing to a bad symref. Teach set_head to handle bare repositories, by overwriting HEAD so it mirrors the remote's HEAD. Note, that in this case overriding the local HEAD reference is necessary, since HEAD will exist before fetch can be run, but this should not be an issue, since the whole purpose of --mirror is to be an exact mirror of the remote, so following any changes to HEAD makes sense. Also note, that although "git remote set-head" also fails when trying to update the remote's locally tracked HEAD in a mirrored bare repository, the usage of the command does not make much sense after this patch: fetch will update the remote HEAD correctly, and setting it manually to something else is antithetical to the concept of mirroring. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-25fetch: set remote/HEAD if it does not existBence Ferdinandy1-0/+68
When cloning a repository remote/HEAD is created, but when the user creates a repository with git init, and later adds a remote, remote/HEAD is only created if the user explicitly runs a variant of "remote set-head". Attempt to set remote/HEAD during fetch, if the user does not have it already set. Silently ignore any errors. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-22Merge branch 'jk/fetch-prefetch-double-free-fix'Junio C Hamano1-6/+2
Double-free fix. * jk/fetch-prefetch-double-free-fix: refspec: store raw refspecs inside refspec_item refspec: drop separate raw_nr count fetch: adjust refspec->raw_nr when filtering prefetch refspecs
2024-11-21refs: allow passing flags when setting up a transactionPatrick Steinhardt1-2/+2
Allow passing flags when setting up a transaction such that the behaviour of the transaction itself can be altered. This functionality will be used in a subsequent patch. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12refspec: store raw refspecs inside refspec_itemJeff King1-6/+2
The refspec struct keeps two matched arrays: one for the refspec_item structs and one for the original raw refspec strings. The main reason for this is that there are other users of refspec_item that do not care about the raw strings. But it does make managing the refspec struct awkward, as we must keep the two arrays in sync. This has led to bugs in the past (both leaks and double-frees). Let's just store a copy of the raw refspec string directly in each refspec_item struct. This simplifies the handling at a small cost: 1. Direct callers of refspec_item_init() will now get an extra copy of the refspec string, even if they don't need it. This should be negligible, as the struct is already allocating two strings for the parsed src/dst values (and we tend to only do it sparingly anyway for things like the TAG_REFSPEC literal). 2. Users of refspec_appendf() will now generate a temporary string, copy it, and then free the result (versus handing off ownership of the temporary string). We could get around this by having a "nodup" variant of refspec_item_init(), but it doesn't seem worth the extra complexity for something that is not remotely a hot code path. Code which accesses refspec->raw now needs to look at refspec->item.raw. Other callers which just use refspec_item directly can remain the same. We'll free the allocated string in refspec_item_clear(), which they should be calling anyway to free src/dst. One subtle note: refspec_item_init() can return an error, in which case we'll still have set its "raw" field. But that is also true of the "src" and "dst" fields, so any caller which does not _clear() the failed item is already potentially leaking. In practice most code just calls die() on an error anyway, but you can see the exception in valid_fetch_refspec(), which does correctly call _clear() even on error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12refspec: drop separate raw_nr countJeff King1-1/+0
A refspec struct contains zero or more refspec_item structs, along with matching "raw" strings. The items and raw strings are kept in separate arrays, but those arrays will always have the same length (because we write them only via refspec_append_nodup(), which grows both). This can lead to bugs when manipulating the array, since the arrays and lengths must be modified in lockstep. For example, the bug fixed in the previous commit, which forgot to decrement raw_nr. So let's get rid of "raw_nr" and have only "nr", making this kind of bug impossible (and also making it clear that the two are always matched, something that existing code already assumed but was not guaranteed by the interface). Even though we'd expect "alloc" and "raw_alloc" to likewise move in lockstep, we still need to keep separate counts there if we want to continue to use ALLOC_GROW() for both. Conceptually this would all be simpler if refspec_item just held onto its own raw string, and we had a single array. But there are callers which use refspec_item outside of "struct refspec" (and so don't hold on to a matching "raw" string at all), which we'd possibly need to adjust. So let's not worry about refactoring that for now, and just get rid of the redundant count variable. That is the first step on the road to combining them anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12fetch: adjust refspec->raw_nr when filtering prefetch refspecsJeff King1-0/+1
In filter_prefetch_refspecs(), we may remove one or more refspecs if they point into refs/tags/. When we do, we remove the item from the refspec->items array, shifting subsequent items down, and then decrement the refspec->nr count. We also remove the item from the refspec->raw array, but fail to decrement refspec->raw_nr. This leaves us with a count that is too high, and anybody looking at the "raw" array will erroneously see either: 1. The removed entry, if there were no subsequent items to shift down. 2. A duplicate of the final entry, as everything is shifted down but there was nothing to overwrite the final item. The obvious culprit to run into this is calling refspec_clear(), which will try to free the removed entry (case 1) or double-free the final entry (case 2). But even though the bug has existed since the function was added in 2e03115d0c (fetch: add --prefetch option, 2021-04-16), we did not trigger it in the test suite. The --prefetch option is normally only used with configured refspecs, and we never bother to call refspec_clear() on those (they are stored as part of a struct remote, which is held in a global variable). But you could trigger case 2 manually like: git fetch --prefetch . refs/tags/foo refs/tags/bar Ironically you couldn't trigger case 1, because the code accidentally leaked the string in the raw array, and the two bugs (the leak and the double-free) cancelled out. But when we fixed the leak in ea4780307c (fetch: free "raw" string when shrinking refspec, 2024-09-24), it became possible to trigger that, too, with a single item: git fetch --prefetch . refs/tags/foo We can fix both cases by just correctly decrementing "raw_nr" when we shrink the array. Even though we don't expect people to use --prefetch with command-line refspecs, we'll add a test to make sure it behaves well (like the test just before it, we're just confirming that the filtered prefetch succeeds at all). Reported-by: Eric Mills <ermills@epic.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04doc: correct misleading descriptions for --shallow-excludeElijah Newren1-2/+2
The documentation for the --shallow-exclude option to clone/fetch/etc. claims that the option takes a revision, but it does not. As per upload-pack.c's process_deepen_not(), it passes the option to expand_ref() and dies if it does not find exactly one ref matching the name passed. Further, this has always been the case ever since these options were introduced by the commits merged in a460ea4a3cb1 (Merge branch 'nd/shallow-deepen', 2016-10-10). Fix the documentation to match the implementation. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08fetch: respect --server-option when fetching multiple remotesXing Xin1-0/+2
Fix an issue where server options specified via the command line (`--server-option` or `-o`) were not sent when fetching from multiple remotes using Git protocol v2. To reproduce the issue with a repository containing multiple remotes: GIT_TRACE_PACKET=1 git -c protocol.version=2 fetch --server-option=demo --all Observe that no server options are sent to any remote. The root cause was identified in `builtin/fetch.c:fetch_multiple`, which is invoked when fetching from more than one remote. This function forks a `git-fetch` subprocess for each remote but did not include the specified server options in the subprocess arguments. This commit ensures that command-line specified server options are properly passed to each subprocess. Relevant tests have been added. Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25fetch: free "raw" string when shrinking refspecJeff King1-0/+1
The "--prefetch" option to git-fetch modifies the default refspec, including eliminating some entries entirely. When we drop an entry we free the strings in the refspec_item, but we forgot to free the matching string in the "raw" array of the refspec struct. There's no behavioral bug here (since we correctly shrink the raw array, too), but we're leaking the allocated string. Let's add in the leak-fix, and while we're at it drop "const" from the type of the raw string array. These strings are always allocated by refspec_append(), etc, and this makes the memory ownership more clear. This is all a bit more intimate with the refspec code than I'd like, and I suspect it would be better if each refspec_item held on to its own raw string, we had a single array, and we could use refspec_item_clear() to clean up everything. But that's a non-trivial refactoring, since refspec_item structs can be held outside of a "struct refspec", without having a matching raw string at all. So let's leave that for now and just fix the leak in the most immediate way. This lets us mark t5582 as leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23Merge branch 'jc/pass-repo-to-builtins'Junio C Hamano1-2/+5
The convention to calling into built-in command implementation has been updated to pass the repository, if known, together with the prefix value. * jc/pass-repo-to-builtins: add: pass in repo variable instead of global the_repository builtin: remove USE_THE_REPOSITORY for those without the_repository builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h builtin: add a repository parameter for builtin functions
2024-09-13builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.hJohn Cai1-1/+1
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every builtin, remove it from builtin.h and add it to all the builtins that include builtin.h (by definition, that means all builtins/*.c). Also, remove the include statement for repository.h since it gets brought in through builtin.h. The next step will be to migrate each builtin from having to use the_repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13builtin: add a repository parameter for builtin functionsJohn Cai1-1/+4
In order to reduce the usage of the global the_repository, add a parameter to builtin functions that will get passed a repository variable. This commit uses UNUSED on most of the builtin functions, as subsequent commits will modify the actual builtins to pass the repository parameter down. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05drop trailing newline from warning/error/die messagesJeff King1-2/+2
Our error reporting routines append a trailing newline, and the strings we pass to them should not include them (otherwise we get an extra blank line after the message). These cases were all found by looking at the results of: git grep -P '[^_](error|error_errno|warning|die|die_errno)\(.*\\n"[,)]' '*.c' Note that we _do_ sometimes include a newline in the middle of such messages, to create multiline output (hence our grep matching "," or ")" after we see the newline, so we know we're at the end of the string). It's possible that one or more of these cases could intentionally be including a blank line at the end, but having looked at them all manually, I think these are all just mistakes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03Merge branch 'js/fetch-push-trace2-annotation'Junio C Hamano1-1/+15
More trace2 events at key points on push and fetch code paths have been added. * js/fetch-push-trace2-annotation: send-pack: add new tracing regions for push fetch: add top-level trace2 regions trace2: implement trace2_printf() for event target
2024-08-22fetch: add top-level trace2 regionsJosh Steadmon1-1/+15
At $DAYJOB we experienced some slow fetch operations and needed some additional data to help diagnose the issue. Add top-level trace2 regions for the various modes of operation of `git-fetch`. None of these regions are in recursive code, so any enclosed trace messages should only see their nesting level increase by one. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22builtin/fetch: fix leaking transaction with `--atomic`Patrick Steinhardt1-4/+4
With the `--atomic` flag, we use a single ref transaction to commit all ref updates in git-fetch(1). The lifetime of transactions is somewhat weird: while `ref_transaction_abort()` will free the transaction, a call to `ref_transaction_commit()` won't. We thus have to manually free the transaction in the successful case. Adapt the code to free the transaction in the exit path to plug the resulting memory leak. As `ref_transaction_abort()` already freed the transaction for us, we have to unset the transaction when we hit that code path to not cause a double free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09refs: add referent to each_ref_fnJohn Cai1-1/+2
Add a parameter to each_ref_fn so that callers to the ref APIs that use this function as a callback can have acess to the unresolved value of a symbolic ref. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20Merge branch 'kn/update-ref-symref'Junio C Hamano1-2/+2
"git update-ref --stdin" learned to handle transactional updates of symbolic-refs. * kn/update-ref-symref: update-ref: add support for 'symref-update' command reftable: pick either 'oid' or 'target' for new updates update-ref: add support for 'symref-create' command update-ref: add support for 'symref-delete' command update-ref: add support for 'symref-verify' command refs: specify error for regular refs with `old_target` refs: create and use `ref_update_expects_existing_old_ref()`
2024-06-17Merge branch 'ps/no-writable-strings'Junio C Hamano1-3/+8
Building with "-Werror -Wwrite-strings" is now supported. * ps/no-writable-strings: (27 commits) config.mak.dev: enable `-Wwrite-strings` warning builtin/merge: always store allocated strings in `pull_twohead` builtin/rebase: always store allocated string in `options.strategy` builtin/rebase: do not assign default backend to non-constant field imap-send: fix leaking memory in `imap_server_conf` imap-send: drop global `imap_server_conf` variable mailmap: always store allocated strings in mailmap blob revision: always store allocated strings in output encoding remote-curl: avoid assigning string constant to non-const variable send-pack: always allocate receive status parse-options: cast long name for OPTION_ALIAS http: do not assign string constant to non-const field compat/win32: fix const-correctness with string constants pretty: add casts for decoration option pointers object-file: make `buf` parameter of `index_mem()` a constant object-file: mark cached object buffers as const ident: add casts for fallback name and GECOS entry: refactor how we remove items for delayed checkouts line-log: always allocate the output prefix line-log: stop assigning string constant to file parent buffer ...
2024-06-07refspec: remove global tag refspec structurePatrick Steinhardt1-3/+8
We have a global tag refspec structure that is used by both git-clone(1) and git-fetch(1). Initialization of the structure will break once we enable `-Wwrite-strings`, even though the breakage is harmless. While we could just add casts, the structure isn't really required in the first place as we can simply initialize the structures at the respective callsites. Refactor the code accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07update-ref: add support for 'symref-delete' commandKarthik Nayak1-2/+2
Add a new command 'symref-delete' to allow deletions of symbolic refs in a transaction via the '--stdin' mode of the 'git-update-ref' command. The 'symref-delete' command can, when given an <old-target>, delete the provided <ref> only when it points to <old-target>. This command is only compatible with the 'no-deref' mode because we optionally want to check the 'old_target' of the ref being deleted. De-referencing a symbolic ref would provide a regular ref and we already have the 'delete' command for regular refs. While users can also use 'git symbolic-ref -d' to delete symbolic refs, the 'symref-delete' command in 'git-update-ref' allows users to do so within a transaction, which promises atomicity of the operation and can be batched with other commands. When no 'old_target' is provided it can also delete regular refs, similar to how the 'delete' command can delete symrefs when no 'old_oid' is provided. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30Merge branch 'ps/refs-without-the-repository-updates'Junio C Hamano1-1/+2
Further clean-up the refs subsystem to stop relying on the_repository, and instead use the repository associated to the ref_store object. * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-23Merge branch 'kn/ref-transaction-symref' into kn/update-ref-symrefJunio C Hamano1-1/+1
* kn/ref-transaction-symref: refs: remove `create_symref` and associated dead code refs: rename `refs_create_symref()` to `refs_update_symref()` refs: use transaction in `refs_create_symref()` refs: add support for transactional symref updates refs: move `original_update_refname` to 'refs.c' refs: support symrefs in 'reference-transaction' hook files-backend: extract out `create_symref_lock()` refs: accept symref values in `ref_transaction_update()`
2024-05-20Merge branch 'kn/ref-transaction-symref'Junio C Hamano1-1/+1
Updates to symbolic refs can now be made as a part of ref transaction. * kn/ref-transaction-symref: refs: remove `create_symref` and associated dead code refs: rename `refs_create_symref()` to `refs_update_symref()` refs: use transaction in `refs_create_symref()` refs: add support for transactional symref updates refs: move `original_update_refname` to 'refs.c' refs: support symrefs in 'reference-transaction' hook files-backend: extract out `create_symref_lock()` refs: accept symref values in `ref_transaction_update()`
2024-05-17refs: pass ref store when detecting dangling symrefsPatrick Steinhardt1-1/+2
Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the ref store via `the_repository`. Adapt them to instead take in the ref store as a parameter. While at it, rename the functions to have a `ref_` prefix to align them with other functions that take a ref store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07cocci: apply rules to rewrite callers of "refs" interfacesPatrick Steinhardt1-6/+14
Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07refs: accept symref values in `ref_transaction_update()`Karthik Nayak1-1/+1
The function `ref_transaction_update()` obtains ref information and flags to create a `ref_update` and add them to the transaction at hand. To extend symref support in transactions, we need to also accept the old and new ref targets and process it. This commit adds the required parameters to the function and modifies all call sites. The two parameters added are `new_target` and `old_target`. The `new_target` is used to denote what the reference should point to when the transaction is applied. Some functions allow this parameter to be NULL, meaning that the reference is not changed. The `old_target` denotes the value the reference must have before the update. Some functions allow this parameter to be NULL, meaning that the old value of the reference is not checked. We also update the internal function `ref_transaction_add_update()` similarly to take the two new parameters. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15Merge branch 'ds/fetch-config-parse-microfix'Junio C Hamano1-0/+1
A config parser callback function fell through instead of returning after recognising and processing a variable, wasting cycles, which has been corrected. * ds/fetch-config-parse-microfix: fetch: return when parsing submodule.recurse
2024-04-05fetch: return when parsing submodule.recurseDerrick Stolee1-0/+1
When parsing config keys, the normal pattern is to return 0 after completing the logic for a specific config key, since no other key will match. One instance, for "submodule.recurse", was missing this case in builtin/fetch.c. This is a very minor change, and will have minimal impact to performance. This particular block was edited recently in 56e8bb4fb4 (fetch: use `fetch_config` to store "fetch.recurseSubmodules" value, 2023-05-17), which led to some hesitation that perhaps this omission was on purpose. However, no later cases within git_fetch_config() will match the key if equal to "submodule.recurse" and neither will any key matches within the catch-all git_default_config(). Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11Merge branch 'js/merge-base-with-missing-commit'Junio C Hamano1-0/+2
Make sure failure return from merge_bases_many() is properly caught. * js/merge-base-with-missing-commit: merge-ort/merge-recursive: do report errors in `merge_submodule()` merge-recursive: prepare for `merge_submodule()` to report errors commit-reach(repo_get_merge_bases_many_dirty): pass on errors commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors commit-reach(get_octopus_merge_bases): pass on "missing commits" errors commit-reach(repo_get_merge_bases): pass on "missing commits" errors commit-reach(get_merge_bases_many_0): pass on "missing commits" errors commit-reach(merge_bases_many): pass on "missing commits" errors commit-reach(paint_down_to_common): start reporting errors commit-reach(paint_down_to_common): prepare for handling shallow commits commit-reach(repo_in_merge_bases_many): report missing commits commit-reach(repo_in_merge_bases_many): optionally expect missing commits commit-reach(paint_down_to_common): plug two memory leaks
2024-02-28commit-reach(repo_in_merge_bases_many): report missing commitsJohannes Schindelin1-0/+2
Some functions in Git's source code follow the convention that returning a negative value indicates a fatal error, e.g. repository corruption. Let's use this convention in `repo_in_merge_bases()` to report when one of the specified commits is missing (i.e. when `repo_parse_commit()` reports an error). Also adjust the callers of `repo_in_merge_bases()` to handle such negative return values. Note: As of this patch, errors are returned only if any of the specified merge heads is missing. Over the course of the next patches, missing commits will also be reported by the `paint_down_to_common()` function, which is called by `repo_in_merge_bases_many()`, and those errors will be properly propagated back to the caller at that stage. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26fetch: convert strncmp() with strlen() to starts_with()René Scharfe1-3/+2
Using strncmp() and strlen() to check whether a string starts with another one requires repeating the prefix candidate. Use starts_with() instead, which reduces repetition and is more readable. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19Merge branch 'tb/fetch-all-configuration'Junio C Hamano1-1/+16
"git fetch" learned to pay attention to "fetch.all" configuration variable, which pretends as if "--all" was passed from the command line when no remote parameter was given. * tb/fetch-all-configuration: fetch: add new config option fetch.all
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano1-2/+0
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2024-01-08fetch: add new config option fetch.allTamino Bauknecht1-1/+16
Introduce a boolean configuration option fetch.all which allows to fetch all available remotes by default. The config option can be overridden by explicitly specifying a remote or by using --no-all. The behavior for --all is unchanged and calling git-fetch with --all and a remote will still result in an error. Additionally, describe the configuration variable in the config documentation and implement new tests to cover the expected behavior. Also add --no-all to the command-line documentation of git-fetch. Signed-off-by: Tamino Bauknecht <dev@tb6.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-2/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-18fetch: no redundant error message for atomic fetchJiang Xin1-5/+9
If an error occurs during an atomic fetch, a redundant error message will appear at the end of do_fetch(). It was introduced in b3a804663c (fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17). Because a failure message is displayed before setting retcode in the function do_fetch(), calling error() on the err message at the end of this function may result in redundant or empty error message to be displayed. We can remove the redundant error() function, because we know that the function ref_transaction_abort() never fails. While we can find a common pattern for calling ref_transaction_abort() by running command "git grep -A1 ref_transaction_abort", e.g.: if (ref_transaction_abort(transaction, &error)) error("abort: %s", error.buf); Following this pattern, we can tolerate the return value of the function ref_transaction_abort() being changed in the future. We also delay the output of the err message to the end of do_fetch() to reduce redundant code. With these changes, the test case "fetch porcelain output (atomic)" in t5574 will also be fixed. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-13Merge branch 'jk/unused-post-2.42-part2'Junio C Hamano1-2/+2
Unused parameters to functions are marked as such, and/or removed, in order to bring us closer to -Wunused-parameter clean. * jk/unused-post-2.42-part2: parse-options: mark unused parameters in noop callback interpret-trailers: mark unused "unset" parameters in option callbacks parse-options: add more BUG_ON() annotations merge: do not pass unused opt->value parameter parse-options: mark unused "opt" parameter in callbacks parse-options: prefer opt->value to globals in callbacks checkout-index: delay automatic setting of to_tempfile format-patch: use OPT_STRING_LIST for to/cc options merge: simplify parsing of "-n" option merge: make xopts a strvec
2023-09-05parse-options: prefer opt->value to globals in callbacksJeff King1-2/+2
We have several parse-options callbacks that ignore their "opt" parameters entirely. This is a little unusual, as we'd normally put the result of the parsing into opt->value. In the case of these callbacks, though, they directly manipulate global variables instead (and in most cases the caller sets opt->value to NULL in the OPT_CALLBACK declaration). The immediate symptom we'd like to deal with is that the unused "opt" variables trigger -Wunused-parameter. But how to fix that is debatable. One option is to annotate them with UNUSED. But another is to have the caller pass in the appropriate variable via opt->value, and use it. That has the benefit of making the callbacks reusable (in theory at least), and makes it clear from the OPT_CALLBACK declaration which variables will be affected (doubly so for the cases in builtin/fast-export.c, where we do set opt->value, but it is completely ignored!). The slight downside is that we lose type safety, since they're now passing through void pointers. I went with the "just use them" approach here. The loss of type safety is unfortunate, but that is already an issue with most of the other callbacks. If we want to try to address that, we should do so more consistently (and this patch would prepare these callbacks for whatever we choose to do there). Note that in the cases in builtin/fast-export.c, we are passing anonymous enums. We'll have to give them names so that we can declare the appropriate pointer type within the callbacks. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29fetch: mark unused parameter in ref_transaction callbackJeff King1-1/+1
Since this callback is just trying to collect the set of queued tag updates, there is no need for it to look at old_oid at all. Mark it as unused to appease -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-27Merge branch 'jc/transport-parseopt-fix'Junio C Hamano1-4/+1
Command line parser fixes. * jc/transport-parseopt-fix: fetch: reject --no-ipv[46] parse-options: introduce OPT_IPVERSION()
2023-07-18parse-options: introduce OPT_IPVERSION()Junio C Hamano1-4/+1
The command line option parsing for "git clone", "git fetch", and "git push" have duplicated implementations of parsing "--ipv4" and "--ipv6" options, by having two OPT_SET_INT() for "ipv4" and "ipv6". Introduce a new OPT_IPVERSION() macro and use it in these three commands. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-5/+8
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-06-28config: pass kvi to die_bad_number()Glen Choo1-2/+2
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-3/+6
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren1-0/+1
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren1-2/+1
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "submodule.fetchJobs" valuePatrick Steinhardt1-5/+6
Move the parsed "submodule.fetchJobs" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "fetch.parallel" valuePatrick Steinhardt1-7/+8
Move the parsed "fetch.parallel" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "fetch.recurseSubmodules" valuePatrick Steinhardt1-15/+16
Move the parsed "fetch.recurseSubmodules" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "fetch.showForcedUpdates" valuePatrick Steinhardt1-14/+21
Move the parsed "fetch.showForcedUpdaets" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "fetch.pruneTags" valuePatrick Steinhardt1-4/+5
Move the parsed "fetch.pruneTags" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: use `fetch_config` to store "fetch.prune" valuePatrick Steinhardt1-4/+5
Move the parsed "fetch.prune" config value into the `fetch_config` structure. This reduces our reliance on global variables and further unifies the way we parse the configuration in git-fetch(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: pass through `fetch_config` directlyPatrick Steinhardt1-15/+16
The `fetch_config` structure currently only has a single member, which is the `display_format`. We're about extend it to contain all parsed config values and will thus need it available in most of the code. Prepare for this change by passing through the `fetch_config` directly instead of only passing its single member. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: drop unneeded NULL-check for `remote_ref`Patrick Steinhardt1-3/+2
Drop the `NULL` check for `remote_ref` in `update_local_ref()`. The function only has a single caller, and that caller always passes in a non-`NULL` value. This fixes a false positive in Coverity. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum valuePatrick Steinhardt1-1/+0
With 50957937f9 (fetch: introduce `display_format` enum, 2023-05-10), a new enumeration was introduced to determine the display format that is to be used by git-fetch(1). The `DISPLAY_FORMAT_UNKNOWN` value isn't ever used though, and neither do we rely on the explicit `0` value for initialization anywhere. Remove the enum value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-15Merge branch 'ps/fetch-output-format'Junio C Hamano1-197/+285
"git fetch" learned the "--porcelain" option that emits what it did in a machine-parseable format. * ps/fetch-output-format: fetch: introduce machine-parseable "porcelain" output format fetch: move option related variables into main function fetch: lift up parsing of "fetch.output" config variable fetch: introduce `display_format` enum fetch: refactor calculation of the display table width fetch: print left-hand side when fetching HEAD:foo fetch: add a test to exercise invalid output formats fetch: split out tests for output format fetch: fix `--no-recurse-submodules` with multi-remote fetches
2023-05-10fetch: introduce machine-parseable "porcelain" output formatPatrick Steinhardt1-19/+69
The output of git-fetch(1) is obviously designed for consumption by users, only: we neatly columnize data, we abbreviate reference names, we print neat arrows and we don't provide information about actual object IDs that have changed. This makes the output format basically unusable in the context of scripted invocations of git-fetch(1) that want to learn about the exact changes that the command performs. Introduce a new machine-parseable "porcelain" output format that is supposed to fix this shortcoming. This output format is intended to provide information about every reference that is about to be updated, the old object ID that the reference has been pointing to and the new object ID it will be updated to. Furthermore, the output format provides the same flags as the human-readable format to indicate basic conditions for each reference update like whether it was a fast-forward update, a branch deletion, a rejected update or others. The output format is quite simple: ``` <flag> <old-object-id> <new-object-id> <local-reference>\n ``` We assume two conditions which are generally true: - The old and new object IDs have fixed known widths and cannot contain spaces. - References cannot contain newlines. With these assumptions, the output format becomes unambiguously parseable. Furthermore, given that this output is designed to be consumed by scripts, the machine-readable data is printed to stdout instead of stderr like the human-readable output is. This is mostly done so that other data printed to stderr, like error messages or progress meters, don't interfere with the parseable data. A notable ommission here is that the output format does not include the remote from which a reference was fetched, which might be important information especially in the context of multi-remote fetches. But as such a format would require us to print the remote for every single reference update due to parallelizable fetches it feels wasteful for the most likely usecase, which is when fetching from a single remote. In a similar spirit, a second restriction is that this cannot be used with `--recurse-submodules`. This is because any reference updates would be ambiguous without also printing the repository in which the update happens. Considering that both multi-remote and submodule fetches are user-facing features, using them in conjunction with `--porcelain` that is intended for scripting purposes is likely not going to be useful in the majority of cases. With that in mind these restrictions feel acceptable. If usecases for either of these come up in the future though it is easy enough to add a new "porcelain-v2" format that adds this information. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: move option related variables into main functionPatrick Steinhardt1-97/+100
The options of git-fetch(1) which we pass to `parse_options()` are declared globally in `builtin/fetch.c`. This means we're forced to use global variables for all the options, which is more likely to cause confusion than explicitly passing state around. Refactor the code to move the options into `cmd_fetch()`. Move variables that were previously forced to be declared globally and which are only used by `cmd_fetch()` into function-local scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: lift up parsing of "fetch.output" config variablePatrick Steinhardt1-18/+32
Parsing the display format happens inside of `display_state_init()`. As we only need to check for a simple config entry, this is a natural location to put this code as it means that display-state logic is neatly contained in a single location. We're about to introduce a new "porcelain" output format though that is intended to be parseable by machines, for example inside of a script. This format can be enabled by passing the `--porcelain` switch to git-fetch(1). As a consequence, we'll have to add a second callsite that influences the output format, which will become awkward to handle. Refactor the code such that callers are expected to pass the display format that is to be used into `display_state_init()`. This allows us to lift up the code into the main function, where we can then hook it into command line options parser in a follow-up commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: introduce `display_format` enumPatrick Steinhardt1-21/+46
We currently have two different display formats in git-fetch(1) with the "full" and "compact" formats. This is tracked with a boolean value that simply denotes whether the display format is supposed to be compacted or not. This works reasonably well while there are only two formats, but we're about to introduce another format that will make this a bit more awkward to use. Introduce a `enum display_format` that is more readily extensible. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: refactor calculation of the display table widthPatrick Steinhardt1-41/+34
When displaying reference updates, we try to print the references in a neat table. As the table's width is determined its contents we thus need to precalculate the overall width before we can start printing updated references. The calculation is driven by `display_state_init()`, which invokes `refcol_width()` for every reference that is to be printed. This split is somewhat confusing. For one, we filter references that shall be attributed to the overall width in both places. And second, we needlessly recalculate the maximum line length based on the terminal columns and display format for every reference. Refactor the code so that the complete width calculations are neatly contained in `refcol_width()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: print left-hand side when fetching HEAD:fooPatrick Steinhardt1-18/+19
`store_updated_refs()` parses the remote reference for two purposes: - It gets used as a note when writing FETCH_HEAD. - It is passed through to `display_ref_update()` to display updated references in the following format: ``` * branch master -> master ``` In most cases, the parsed remote reference is the prettified reference name and can thus be used for both cases. But if the remote reference is HEAD, the parsed remote reference becomes empty. This is intended when we write the FETCH_HEAD, where we skip writing the note in that case. But when displaying the updated references this leads to inconsistent output where the left-hand side of reference updates is missing in some cases: ``` $ git fetch origin HEAD HEAD:explicit-head :implicit-head main From https://github.com/git/git * branch HEAD -> FETCH_HEAD * [new ref] -> explicit-head * [new ref] -> implicit-head * branch main -> FETCH_HEAD ``` This behaviour has existed ever since the table-based output has been introduced for git-fetch(1) via 165f390250 (git-fetch: more terse fetch output, 2007-11-03) and was never explicitly documented either in the commit message or in any of our tests. So while it may not be a bug per se, it feels like a weird inconsistency and not like it was a concious design decision. The logic of how we compute the remote reference name that we ultimately pass to `display_ref_update()` is not easy to follow. There are three different cases here: - When the remote reference name is "HEAD" we set the remote reference name to the empty string. This is the case that causes the left-hand side to go missing, where we would indeed want to print "HEAD" instead of the empty string. This is what `prettify_refname()` would return. - When the remote reference name has a well-known prefix then we strip this prefix. This matches what `prettify_refname()` does. - Otherwise, we keep the fully qualified reference name. This also matches what `prettify_refname()` does. As the return value of `prettify_refname()` would do the correct thing for us in all three cases, we can thus fix the inconsistency by passing through the full remote reference name to `display_ref_update()`, which learns to call `prettify_refname()`. At the same time, this also simplifies the code a bit. Note that this patch also changes formatting of the block that computes the "kind" (which is the category like "branch" or "tag") and "what" (which is the prettified reference name like "master" or "v1.0") variables. This is done on purpose so that it is part of the diff, hopefully making the change easier to comprehend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10fetch: fix `--no-recurse-submodules` with multi-remote fetchesPatrick Steinhardt1-0/+2
When running `git fetch --no-recurse-submodules`, the exectation is that we don't fetch any submodules. And while this works for fetches of a single remote, it doesn't when fetching multiple remotes at once. The result is that we do recurse into submodules even though the user has explicitly asked us not to. This is because while we pass on `--recurse-submodules={yes,on-demand}` if specified by the user, we don't pass on `--no-recurse-submodules` to the subprocess spawned to perform the submodule fetch. Fix this by also forwarding this flag as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09Merge branch 'en/header-split-cache-h-part-2'Junio C Hamano1-0/+1
More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...
2023-04-25Merge branch 'en/header-split-cache-h'Junio C Hamano1-0/+6
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-24pkt-line.h: move declarations for pkt-line.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11pager.h: move declarations for pager.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-name.h: move declarations for object-name.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on oid-array.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on advice.hElijah Newren1-0/+1
Dozens of files made use of advice functions, without explicitly including advice.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include advice.h if they are using it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on trace.h & trace2.hElijah Newren1-0/+2
Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'ds/fetch-bundle-uri-with-all'Junio C Hamano1-1/+6
"git fetch --all" does not have to download and handle the same bundleURI over and over, which has been corrected. * ds/fetch-bundle-uri-with-all: fetch: download bundles once, even with --all
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano1-0/+2
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06Merge branch 'ab/remove-implicit-use-of-the-repository'Junio C Hamano1-10/+11
Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06Merge branch 'ps/fetch-ref-update-reporting'Junio C Hamano1-129/+138
Clean-up of the code path that reports what "git fetch" did to each ref. * ps/fetch-ref-update-reporting: fetch: centralize printing of reference updates fetch: centralize logic to print remote URL fetch: centralize handling of per-reference format fetch: pass the full local reference name to `format_display` fetch: move output format into `display_state` fetch: move reference width calculation into `display_state`
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano1-10/+11
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-31fetch: download bundles once, even with --allDerrick Stolee1-1/+6
When fetch.bundleURI is set, 'git fetch' downloads bundles from the given bundle URI before fetching from the specified remote. However, when using non-file remotes, 'git fetch --all' will launch 'git fetch' subprocesses which then read fetch.bundleURI and fetch the bundle list again. We do not expect the bundle list to have new information during these multiple runs, so avoid these extra calls by un-setting fetch.bundleURI in the subprocess arguments. Be careful to skip fetching bundles for the empty bundle string. Fetching bundles from the empty list presents some interesting test failures. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "promisor-remote.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-2/+2
Apply the part of "the_repository.pending.cocci" pertaining to "promisor-remote.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "object-store.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-6/+6
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "commit-reach.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-1/+2
Apply the part of "the_repository.pending.cocci" pertaining to "commit-reach.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "cache.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-1/+1
Apply the part of "the_repository.pending.cocci" pertaining to "cache.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21environment.h: move declarations for environment.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: centralize printing of reference updatesPatrick Steinhardt1-53/+55
In order to print updated references during a fetch, the two different call sites that do this will first call `format_display()` followed by a call to `fputs()`. This is needlessly roundabout now that we have the `display_state` structure that encapsulates all of the printing logic for references. Move displaying the reference updates into `format_display()` and rename it to `display_ref_update()` to better match its new purpose, which finalizes the conversion to make both the formatting and printing logic of reference updates self-contained. This will make it easier to add new output formats and printing to a different file descriptor than stderr. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: centralize logic to print remote URLPatrick Steinhardt1-55/+44
When fetching from a remote, we not only print the actual references that have changed, but will also print the URL from which we have fetched them to standard output. The logic to handle this is duplicated across two different callsites with some non-trivial logic to compute the anonymized URL. Furthermore, we're using global state to track whether we have already shown the URL to the user or not. Refactor the code by moving it into `format_display()`. Like this, we can convert the global variable into a member of `display_state`. And second, we can deduplicate the logic to compute the anonymized URL. This also works as expected when fetching from multiple remotes, for example via a group of remotes, as we do this by forking a standalone git-fetch(1) process per remote that is to be fetched. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: centralize handling of per-reference formatPatrick Steinhardt1-3/+4
The function `format_display()` is used to print a single reference update to a buffer which will then ultimately be printed by the caller. This architecture causes us to duplicate some logic across the different callsites of this function. This makes it hard to follow the code as some parts of the logic are located in one place, while other parts of the logic are located in a different place. Furthermore, by having the logic scattered around it becomes quite hard to implement a new output format for the reference updates. We can make the logic a whole lot easier to understand by making the `format_display()` function self-contained so that it handles formatting and printing of the references. This will eventually allow us to easily implement a completely different output format, but also opens the door to conditionally print to either stdout or stderr depending on the output format. As a first step towards that goal we move the formatting directive used by both callers to print a single reference update into this function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: pass the full local reference name to `format_display`Patrick Steinhardt1-12/+11
Before printing the name of the local references that would be updated by a fetch we first prettify the reference name. This is done at the calling side so that `format_display()` never sees the full name of the local reference. This restricts our ability to introduce new output formats that might want to print the full reference name. Right now, all callsites except one are prettifying the reference name anyway. And the only callsite that doesn't passes `FETCH_HEAD` as the hardcoded reference name to `format_display()`, which would never be changed by a call to `prettify_refname()` anyway. So let's refactor the code to pass in the full local reference name and then prettify it in the formatting code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: move output format into `display_state`Patrick Steinhardt1-7/+6
The git-fetch(1) command supports printing references either in "full" or "compact" format depending on the `fetch.ouput` config key. The format that is to be used is tracked in a global variable. De-globalize the variable by moving it into the `display_state` structure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20fetch: move reference width calculation into `display_state`Patrick Steinhardt1-46/+65
In order to print references in proper columns we need to calculate the width of the reference column before starting to print the references. This is done with the help of a global variable `refcol_width`. Refactor the code to instead use a new structure `display_state` that contains the computed width and plumb it through the stack as required. This is only the first step towards de-globalizing the state required to print references. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-19Merge branch 'ew/fetch-no-write-fetch-head-fix'Junio C Hamano1-0/+2
* ew/fetch-no-write-fetch-head-fix: fetch: pass --no-write-fetch-head to subprocesses
2023-03-17Merge branch 'ew/fetch-hiderefs'Junio C Hamano1-0/+2
A new "fetch.hideRefs" option can be used to exclude specified refs from "rev-list --objects --stdin --not --all" traversal for checking object connectivity, most useful when there are many unrelated histories in a single repository. * ew/fetch-hiderefs: fetch: support hideRefs to speed up connectivity checks
2023-03-17Merge branch 'jk/unused-post-2.39-part2'Junio C Hamano1-2/+4
More work towards -Wunused. * jk/unused-post-2.39-part2: (21 commits) help: mark unused parameter in git_unknown_cmd_config() run_processes_parallel: mark unused callback parameters userformat_want_item(): mark unused parameter for_each_commit_graft(): mark unused callback parameter rewrite_parents(): mark unused callback parameter fetch-pack: mark unused parameter in callback function notes: mark unused callback parameters prio-queue: mark unused parameters in comparison functions for_each_object: mark unused callback parameters list-objects: mark unused callback parameters mark unused parameters in signal handlers run-command: mark error routine parameters as unused mark "pointless" data pointers in callbacks ref-filter: mark unused callback parameters http-backend: mark unused parameters in virtual functions http-backend: mark argc/argv unused object-name: mark unused parameters in disambiguate callbacks serve: mark unused parameters in virtual functions serve: use repository pointer to get config ls-refs: drop config caching ...
2023-03-17Merge branch 'en/header-cleanup'Junio C Hamano1-0/+1
Code clean-up to clarify the rule that "git-compat-util.h" must be the first to be included. * en/header-cleanup: diff.h: remove unnecessary include of object.h Remove unnecessary includes of builtin.h treewide: replace cache.h with more direct headers, where possible replace-object.h: move read_replace_refs declaration from cache.h to here object-store.h: move struct object_info from cache.h dir.h: refactor to no longer need to include cache.h object.h: stop depending on cache.h; make cache.h depend on object.h ident.h: move ident-related declarations out of cache.h pretty.h: move has_non_ascii() declaration from commit.h cache.h: remove dependence on hex.h; make other files include it explicitly hex.h: move some hex-related declarations from cache.h hash.h: move some oid-related declarations from cache.h alloc.h: move ALLOC_GROW() functions from cache.h treewide: remove unnecessary cache.h includes in source files treewide: remove unnecessary cache.h includes treewide: remove unnecessary git-compat-util.h includes in headers treewide: ensure one of the appropriate headers is sourced first
2023-03-09fetch: pass --no-write-fetch-head to subprocessesEric Wong1-0/+2
It seems a user would expect this option would work regardless of whether it's fetching from a single remote, many remotes, or recursing into submodules. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27fetch: support hideRefs to speed up connectivity checksEric Wong1-0/+2
With roughly 800 remotes all fetching into their own refs/remotes/$REMOTE/* island, the connectivity check[1] gets expensive for each fetch on systems which lack sufficient RAM to cache objects. To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now allows the no-op fetch to take ~30 seconds instead of ~20 minutes on a noisy, RAM-constrained machine (localhost, so no network latency): git -c fetch.hideRefs=refs \ -c fetch.hideRefs='!refs/remotes/$REMOTE/' \ fetch $REMOTE [1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs' Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24run_processes_parallel: mark unused callback parametersJeff King1-2/+4
Our parallel process API takes several callbacks via function pointers in the run_process_paralell_opts struct. Not every callback needs every parameter; let's mark the unused ones to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21fetch: choose a sensible default with --jobs=0 againMatthias Aßhauer1-0/+3
prior to 51243f9 (run-command API: don't fall back on online_cpus(), 2022-10-12) `git fetch --multiple --jobs=0` would choose some default amount of jobs, similar to `git -c fetch.parallel=0 fetch --multiple`. While our documentation only ever promised that `fetch.parallel` would fall back to a "sensible default", it makes sense to do the same for `--jobs`. So fall back to online_cpus() and not BUG() out. This fixes https://github.com/git-for-windows/git/issues/4302 Reported-by: Drew Noakes <drnoakes@microsoft.com> Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15Merge branch 'ds/bundle-uri-5'Junio C Hamano1-0/+6
The bundle-URI subsystem adds support for creation-token heuristics to help incremental fetches. * ds/bundle-uri-5: bundle-uri: test missing bundles with heuristic bundle-uri: store fetch.bundleCreationToken fetch: fetch from an external bundle URI bundle-uri: drop bundle.flag from design doc clone: set fetch.bundleURI if appropriate bundle-uri: download in creationToken order bundle-uri: parse bundle.<id>.creationToken values bundle-uri: parse bundle.heuristic=creationToken t5558: add tests for creationToken heuristic bundle: verify using check_connected() bundle: test unbundling with incomplete history
2023-01-31fetch: fetch from an external bundle URIDerrick Stolee1-0/+6
When a user specifies a URI via 'git clone --bundle-uri', that URI may be a bundle list that advertises a 'bundle.heuristic' value. In that case, the Git client stores a 'fetch.bundleURI' config value storing that URI. Teach 'git fetch' to check for this config value and download bundles from that URI before fetching from the Git remote(s). Likely, the bundle provider has configured a heuristic (such as "creationToken") that will allow the Git client to download only a portion of the bundles before continuing the fetch. Since this URI is completely independent of the remote server, we want to be sure that we connect to the bundle URI before creating a connection to the Git remote. We do not want to hold a stateful connection for too long if we can avoid it. To test that this works correctly, extend the previous tests that set 'fetch.bundleURI' to do follow-up fetches. The bundle list is updated incrementally at each phase to demonstrate that the heuristic avoids downloading older bundles. This includes the middle fetch downloading the objects in bundle-3.bundle from the Git remote, and therefore not needing that bundle in the third fetch. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-19fetch: fix duplicate remote parallel fetch bugCalvin Wan1-0/+1
Fetching in parallel from a remote group with a duplicated remote results in the following: error: cannot lock ref '<ref>': is at <oid> but expected <oid> This doesn't happen in serial since fetching from the same remote that has already been fetched from is a noop. Therefore, remove any duplicated remotes after remote groups are parsed. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-08Merge branch 'rs/no-more-run-command-v'Taylor Blau1-3/+6
Simplify the run-command API. * rs/no-more-run-command-v: replace and remove run_command_v_opt() replace and remove run_command_v_opt_cd_env_tr2() replace and remove run_command_v_opt_tr2() replace and remove run_command_v_opt_cd_env() use child_process members "args" and "env" directly use child_process member "args" instead of string array variable sequencer: simplify building argument list in do_exec() bisect--helper: factor out do_bisect_run() bisect: simplify building "checkout" argument list am: simplify building "show" argument list run-command: fix return value comment merge: remove always-the-same "verbose" arguments
2022-10-30replace and remove run_command_v_opt()René Scharfe1-3/+6
Replace the remaining calls of run_command_v_opt() with run_command() calls and explict struct child_process variables. This is more verbose, but not by much overall. The code becomes more flexible, e.g. it's easy to extend to conditionally add a new argument. Then remove the now unused function and its own flag names, simplifying the run-command API. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-12run-command API: move *_tr2() users to "run_processes_parallel()"Ævar Arnfjörð Bjarmason1-6/+12
Have the users of the "run_processes_parallel_tr2()" function use "run_processes_parallel()" instead. In preceding commits the latter was refactored to take a "struct run_process_parallel_opts" argument, since the only reason for "run_processes_parallel_tr2()" to exist was to take arguments that are now a part of that struct we can do away with it. See ee4512ed481 (trace2: create new combined trace facility, 2019-02-22) for the addition of the "*_tr2()" variant of the function, it was used by every caller except "t/helper/test-run-command.c".. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12run-command API: don't fall back on online_cpus()Ævar Arnfjörð Bjarmason1-0/+2
When a "jobs = 0" is passed let's BUG() out rather than fall back on online_cpus(). The default behavior was added when this API was implemented in c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15). Most of our code in-tree that scales up to "online_cpus()" by default calls that function by itself. Keeping this default behavior just for the sake of two callers means that we'd need to maintain this one spot where we're second-guessing the config passed down into pp_init(). The preceding commit has an overview of the API callers that passed "jobs = 0". There were only two of them (actually three, but they resolved to these two config parsing codepaths). The "fetch.parallel" caller already had a test for the "fetch.parallel=0" case added in 0353c688189 (fetch: do not run a redundant fetch from submodule, 2022-05-16), but there was no such test for "submodule.fetchJobs". Let's add one here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12run-command API: have "run_processes_parallel{,_tr2}()" return voidÆvar Arnfjörð Bjarmason1-9/+8
Change the "run_processes_parallel{,_tr2}()" functions to return void, instead of int. Ever since c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15) they have unconditionally returned 0. To get a "real" return value out of this function the caller needs to get it via the "task_finished_fn" callback, see the example in hook.c added in 96e7225b310 (hook: add 'run' subcommand, 2021-12-22). So the "result = " and "if (!result)" code added to "builtin/fetch.c" d54dea77dba (fetch: let --jobs=<n> parallelize --multiple, too, 2019-10-05) has always been redundant, we always took that "if" path. Likewise the "ret =" in "t/helper/test-run-command.c" added in be5d88e1128 (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04) wasn't used, instead we got the return value from the "if (suite.failed.nr > 0)" block seen in the context. Subsequent commits will alter this API interface, getting rid of this always-zero return value makes it easier to understand those changes. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19Merge branch 'jk/list-objects-filter-cleanup'Junio C Hamano1-1/+1
A couple of bugfixes with code clean-up. * jk/list-objects-filter-cleanup: list-objects-filter: convert filter_spec to a strbuf list-objects-filter: add and use initializers list-objects-filter: handle null default filter spec list-objects-filter: don't memset after releasing filter struct
2022-09-15Merge branch 'jk/proto-v2-ref-prefix-fix'Junio C Hamano1-3/+15
"git fetch" over protocol v2 sent an incorrect ref prefix request to the server and made "git pull" with configured fetch refspec that does not cover the remote branch to merge with fail, which has been corrected. * jk/proto-v2-ref-prefix-fix: fetch: add branch.*.merge to default ref-prefix extension fetch: stop checking for NULL transport->remote in do_fetch()
2022-09-14Merge branch 'ab/unused-annotation'Junio C Hamano1-4/+4
Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14Merge branch 'jk/unused-annotation'Junio C Hamano1-4/+5
Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro
2022-09-12list-objects-filter: add and use initializersJeff King1-1/+1
In 7e2619d8ff (list_objects_filter_options: plug leak of filter_spec strings, 2022-09-08), we noted that the filter_spec string_list was inconsistent in how it handled memory ownership of strings stored in the list. The fix there was a bit of a band-aid to set the "strdup_strings" variable right before adding anything. That works OK, and it lets the users of the API continue to zero-initialize the struct. But it makes the code a bit hard to follow and accident-prone, as any other spots appending the filter_spec need to think about whether to set the strdup_strings value, too (there's one such spot in partial_clone_get_default_filter_spec(), which is probably a possible memory leak). So let's do that full cleanup now. We'll introduce a LIST_OBJECTS_FILTER_INIT macro and matching function, and use them as appropriate (though it is for the "_options" struct, this matches the corresponding list_objects_filter_release() function). This is harder than it seems! Many other structs, like git_transport_data, embed the filter struct. So they need to initialize it themselves even if the rest of the enclosing struct is OK with zero-initialization. I found all of the relevant spots by grepping manually for declarations of list_objects_filter_options. And then doing so recursively for structs which embed it, and ones which embed those, and so on. I'm pretty sure I got everything, but there's no change that would alert the compiler if any topics in flight added new declarations. To catch this case, we now double-check in the parsing function that things were initialized as expected and BUG() if appropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08fetch: add branch.*.merge to default ref-prefix extensionJeff King1-3/+15
When running "git pull" with no arguments, we'll do a default "git fetch" and then try to merge the branch specified by the branch.*.merge config. There's code in get_ref_map() to treat that "merge" branch as something we want to fetch, even if it is not otherwise covered by the default refspec. This works fine with the v0 protocol, as the server tells us about all of the refs, and get_ref_map() is the ultimate decider of what we fetch. But in the v2 protocol, we send the ref-prefix extension to the server, asking it to limit the ref advertisement. And we only tell it about the default refspec for the remote; we don't mention the branch.*.merge config at all. This usually doesn't matter, because the default refspec matches "refs/heads/*", which covers all branches. But if you explicitly use a narrow refspec, then "git pull" on some branches may fail. The server doesn't advertise the branch, so we don't fetch it, and "git pull" thinks that it went away upstream. We can fix this by including any branch.*.merge entries for the current branch in the list of ref-prefixes we pass to the server. This only needs to happen when using the default configured refspec (since command-line refspecs are already added, and take precedence in deciding what we fetch). We don't otherwise need to replicate any of the "what to fetch" logic in get_ref_map(). These ref-prefixes are an optimization, so it's OK if we tell the server to advertise the branch.*.merge ref, even if we're not going to pull it. We'll just choose not to fetch it. The test here is based on one constructed by Johannes. I modified the branch names to trigger the ref-prefix issue (and be more descriptive), and to confirm that "git pull" actually updated the local ref, which should be more robust than just checking stderr. Reported-by: Lana Deere <lana.deere@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08fetch: stop checking for NULL transport->remote in do_fetch()Jeff King1-1/+1
This field will never be NULL; if it were, we'd segfault earlier in the function when we unconditionally check transport->remote->fetch_tags. Likewise, many other functions dereference it unconditionally. This is a small simplification, but it will make things easier as we extend this conditional in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-4/+4
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29Merge branch 'ds/decorate-filter-tweak'Junio C Hamano1-2/+4
The namespaces used by "log --decorate" from "refs/" hierarchy by default has been tightened. * ds/decorate-filter-tweak: fetch: use ref_namespaces during prefetch maintenance: stop writing log.excludeDecoration log: create log.initialDecorationSet=all log: add --clear-decorations option log: add default decoration filter log-tree: use ref_namespaces instead of if/else-if refs: use ref_namespaces for replace refs base refs: add array of ref namespaces t4207: test coloring of grafted decorations t4207: modernize test refs: allow "HEAD" as decoration filter
2022-08-19hashmap: mark unused callback parametersJeff King1-1/+1
Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19refs: mark unused each_ref_fn parametersJeff King1-3/+4
Functions used with for_each_ref(), etc, need to conform to the each_ref_fn interface. But most of them don't need every parameter; let's annotate the unused ones to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05fetch: use ref_namespaces during prefetchDerrick Stolee1-2/+4
The "refs/prefetch/" namespace is used by 'git fetch --prefetch' as a replacement of the destination of the refpsec for a remote. Git also removes refspecs that include tags. Instead of using string literals for the 'refs/tags/ and 'refs/prefetch/' namespaces, use the entries in the ref_namespaces array. This kind of change could be done in many places around the codebase, but we are isolating only to this change because of the way the refs/prefetch/ namespace somewhat motivated the creation of the ref_namespaces array. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18Merge branch 'ab/cocci-unused'Junio C Hamano1-2/+1
Add Coccinelle rules to detect the pattern of initializing and then finalizing a structure without using it in between at all, which happens after code restructuring and the compilers fail to recognize as an unused variable. * ab/cocci-unused: cocci: generalize "unused" rule to cover more than "strbuf" cocci: add and apply a rule to find "unused" strbufs cocci: have "coccicheck{,-pending}" depend on "coccicheck-test" cocci: add a "coccicheck-test" target and test *.cocci rules Makefile & .gitignore: ignore & clean "git.res", not "*.res" Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
2022-07-11Merge branch 'ds/branch-checked-out'Junio C Hamano1-29/+17
Introduce a helper to see if a branch is already being worked on (hence should not be newly checked out in a working tree), which performs much better than the existing find_shared_symref() to replace many uses of the latter. * ds/branch-checked-out: branch: drop unused worktrees variable fetch: stop passing around unused worktrees variable branch: fix branch_checked_out() leaks branch: use branch_checked_out() when deleting refs fetch: use new branch_checked_out() and add tests branch: check for bisects and rebases branch: add branch_checked_out() helper
2022-07-06cocci: add and apply a rule to find "unused" strbufsÆvar Arnfjörð Bjarmason1-2/+1
Add a coccinelle rule to remove "struct strbuf" initialization followed by calling "strbuf_release()" function, without any uses of the strbuf in the same function. See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's intended to find and replace. The inclusion of "contrib/scalar/scalar.c" is because "spatch" was manually run on it (we don't usually run spatch on contrib). Per the "buggy code" comment we also match a strbuf_init() before the xmalloc(), but we're not seeking to be so strict as to make checks that the compiler will catch for us redundant. Saying we'll match either "init" or "xmalloc" lines makes the rule simpler. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21fetch: stop passing around unused worktrees variableJeff King1-15/+9
In 12d47e3b1f (fetch: use new branch_checked_out() and add tests, 2022-06-14), fetch's update_local_ref() function stopped using its "worktrees" parameter. It doesn't need it, since the branch_checked_out() function examines the global worktrees under the hood. So we can not only drop the unused parameter from that function, but also from its entire call chain. And as we do so all the way up to do_fetch(), we can see that nobody uses it at all, and we can drop the local variable there entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15fetch: use new branch_checked_out() and add testsDerrick Stolee1-14/+8
When fetching refs from a remote, it is possible that the refspec will cause use to overwrite a ref that is checked out in a worktree. The existing logic in builtin/fetch.c uses a possibly-slow mechanism. Update those sections to use the new, more efficient branch_checked_out() helper. These uses were not previously tested, so add a test case that can be used for these kinds of collisions. There is only one test now, but more tests will be added as other consumers of branch_checked_out() are added. Note that there are two uses in builtin/fetch.c, but only one of the messages is tested. This is because the tested check is run before completing the fetch, and the untested check is not reachable without concurrent updates to the filesystem. Thus, it is beneficial to keep that extra check for the sake of defense-in-depth. However, we should not attempt to test the check, as the effort required is too complicated to be worth the effort. This use in update_local_ref() also requires a change in the error message because we no longer have access to the worktree struct, only the path of the worktree. This error is so rare that making a distinction between the two is not critical. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-25Merge branch 'jc/avoid-redundant-submodule-fetch'Junio C Hamano1-1/+15
"git fetch --recurse-submodules" from multiple remotes (either from a remote group, or "--all") used to make one extra "git fetch" in the submodules, which has been corrected. * jc/avoid-redundant-submodule-fetch: fetch: do not run a redundant fetch from submodule
2022-05-18fetch: do not run a redundant fetch from submoduleJunio C Hamano1-1/+15
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option, 2010-11-12) introduced the "--recurse-submodule" option, the approach taken was to perform fetches in submodules only once, after all the main fetching (it may usually be a fetch from a single remote, but it could be fetching from a group of remotes using fetch_multiple()) succeeded. Later we added "--all" to fetch from all defined remotes, which complicated things even more. If your project has a submodule, and you try to run "git fetch --recurse-submodule --all", you'd see a fetch for the top-level, which invokes another fetch for the submodule, followed by another fetch for the same submodule. All but the last fetch for the submodule come from a "git fetch --recurse-submodules" subprocess that is spawned via the fetch_multiple() interface for the remotes, and the last fetch comes from the code at the end. Because recursive fetching from submodules is done in each fetch for the top-level in fetch_multiple(), the last fetch in the submodule is redundant. It only matters when fetch_one() interacts with a single remote at the top-level. While we are at it, there is one optimization that exists in dealing with a group of remote, but is missing when "--all" is used. In the former, when the group turns out to be a group of one, instead of spawning "git fetch" as a subprocess via the fetch_multiple() interface, we use the normal fetch_one() code path. Do the same when handing "--all", if it turns out that we have only one remote defined. Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-16fetch: limit shared symref check only for local branchesOrgad Shaneh1-0/+1
This check was introduced in 8ee5d73137f (Fix fetch/pull when run without --update-head-ok, 2008-10-13) in order to protect against replacing the ref of the active branch by mistake, for example by running git fetch origin master:master. It was later extended in 8bc1f39f411 (fetch: protect branches checked out in all worktrees, 2021-12-01) to scan all worktrees. This operation is very expensive (takes about 30s in my repository) when there are many tags or branches, and it is executed on every fetch, even if no local heads are updated at all. Limit it to protect only refs/heads/* to improve fetch performance. Signed-off-by: Orgad Shaneh <orgads@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-04Merge branch 'rc/fetch-refetch'Junio C Hamano1-2/+32
"git fetch --refetch" learned to fetch everything without telling the other side what we already have, which is useful when you cannot trust what you have in the local object store. * rc/fetch-refetch: docs: mention --refetch fetch option fetch: after refetch, encourage auto gc repacking t5615-partial-clone: add test for fetch --refetch fetch: add --refetch option builtin/fetch-pack: add --refetch option fetch-pack: add refetch fetch-negotiator: add specific noop initializer
2022-03-28fetch: after refetch, encourage auto gc repackingRobert Coup1-1/+18
After invoking `fetch --refetch`, the object db will likely contain many duplicate objects. If auto-maintenance is enabled, invoke it with appropriate settings to encourage repacking/consolidation. * gc.autoPackLimit: unless this is set to 0 (disabled), override the value to 1 to force pack consolidation. * maintenance.incremental-repack.auto: unless this is set to 0, override the value to -1 to force incremental repacking. Signed-off-by: Robert Coup <robert@coup.net.nz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-28fetch: add --refetch optionRobert Coup1-1/+14
Teach fetch and transports the --refetch option to force a full fetch without negotiating common commits with the remote. Use when applying a new partial clone filter to refetch all matching objects. Signed-off-by: Robert Coup <robert@coup.net.nz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25Merge branch 'gc/recursive-fetch-with-unused-submodules'Junio C Hamano1-7/+7
When "git fetch --recurse-submodules" grabbed submodule commits that would be needed to recursively check out newly fetched commits in the superproject, it only paid attention to submodules that are in the current checkout of the superproject. We now do so for all submodules that have been run "git submodule init" on. * gc/recursive-fetch-with-unused-submodules: submodule: fix latent check_has_commit() bug fetch: fetch unpopulated, changed submodules submodule: move logic into fetch_task_create() submodule: extract get_fetch_task() submodule: store new submodule commits oid_array in a struct submodule: inline submodule_commits() into caller submodule: make static functions read submodules from commits t5526: create superproject commits with test helper t5526: stop asserting on stderr literally t5526: introduce test helper to assert on fetches
2022-03-16Merge branch 'ps/fetch-mirror-optim'Junio C Hamano1-15/+27
Various optimization for "git fetch". * ps/fetch-mirror-optim: refs/files-backend: optimize reading of symbolic refs remote: read symbolic refs via `refs_read_symbolic_ref()` refs: add ability for backends to special-case reading of symbolic refs fetch: avoid lookup of commits when not appending to FETCH_HEAD upload-pack: look up "want" lines via commit-graph
2022-03-16fetch: fetch unpopulated, changed submodulesGlen Choo1-7/+7
"git fetch --recurse-submodules" only considers populated submodules (i.e. submodules that can be found by iterating the index), which makes "git fetch" behave differently based on which commit is checked out. As a result, even if the user has initialized all submodules correctly, they may not fetch the necessary submodule commits, and commands like "git checkout --recurse-submodules" might fail. Teach "git fetch" to fetch cloned, changed submodules regardless of whether they are populated. This is in addition to the current behavior of fetching populated submodules (which is always attempted regardless of what was fetched in the superproject, or even if nothing was fetched in the superproject). A submodule may be encountered multiple times (via the list of populated submodules or via the list of changed submodules). When this happens, "git fetch" only reads the 'populated copy' and ignores the 'changed copy'. Amend the verify_fetch_result() test helper so that we can assert on which 'copy' is being read. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-13Merge branch 'ps/fetch-atomic'Junio C Hamano1-61/+129
"git fetch" can make two separate fetches, but ref updates coming from them were in two separate ref transactions under "--atomic", which has been corrected. * ps/fetch-atomic: fetch: make `--atomic` flag cover pruning of refs fetch: make `--atomic` flag cover backfilling of tags refs: add interface to iterate over queued transactional updates fetch: report errors when backfilling tags fails fetch: control lifecycle of FETCH_HEAD in a single place fetch: backfill tags before setting upstream fetch: increase test coverage of fetches
2022-03-01fetch: avoid lookup of commits when not appending to FETCH_HEADPatrick Steinhardt1-15/+27
When fetching from a remote repository we will by default write what has been fetched into the special FETCH_HEAD reference. The order in which references are written depends on whether the reference is for merge or not, which, despite some other conditions, is also determined based on whether the old object ID the reference is being updated from actually exists in the repository. To write FETCH_HEAD we thus loop through all references thrice: once for the references that are about to be merged, once for the references that are not for merge, and finally for all references that are ignored. For every iteration, we then look up the old object ID to determine whether the referenced object exists so that we can label it as "not-for-merge" if it doesn't exist. It goes without saying that this can be expensive in case where we are fetching a lot of references. While this is hard to avoid in the case where we're writing FETCH_HEAD, users can in fact ask us to skip this work via `--no-write-fetch-head`. In that case, we do not care for the result of those lookups at all because we don't have to order writes to FETCH_HEAD in the first place. Skip this busywork in case we're not writing to FETCH_HEAD. The following benchmark performs a mirror-fetch in a repository with about two million references via `git fetch --prune --no-write-fetch-head +refs/*:refs/*`: Benchmark 1: HEAD~ Time (mean ± σ): 75.388 s ± 1.942 s [User: 71.103 s, System: 8.953 s] Range (min … max): 73.184 s … 76.845 s 3 runs Benchmark 2: HEAD Time (mean ± σ): 69.486 s ± 1.016 s [User: 65.941 s, System: 8.806 s] Range (min … max): 68.864 s … 70.659 s 3 runs Summary 'HEAD' ran 1.08 ± 0.03 times faster than 'HEAD~' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-01Merge branch 'ps/fetch-atomic' into ps/fetch-mirror-optimJunio C Hamano1-61/+129
* ps/fetch-atomic: fetch: make `--atomic` flag cover pruning of refs fetch: make `--atomic` flag cover backfilling of tags refs: add interface to iterate over queued transactional updates fetch: report errors when backfilling tags fails fetch: control lifecycle of FETCH_HEAD in a single place fetch: backfill tags before setting upstream fetch: increase test coverage of fetches
2022-02-25Merge branch 'ja/i18n-common-messages'Junio C Hamano1-2/+2
Unify more messages to help l10n. * ja/i18n-common-messages: i18n: fix some misformated placeholders in command synopsis i18n: remove from i18n strings that do not hold translatable parts i18n: factorize "invalid value" messages i18n: factorize more 'incompatible options' messages
2022-02-23Merge branch 'ps/fetch-optim-with-commit-graph'Junio C Hamano1-2/+6
A couple of optimization to "git fetch". * ps/fetch-optim-with-commit-graph: fetch: skip computing output width when not printing anything fetch-pack: use commit-graph when computing cutoff
2022-02-18Merge branch 'js/short-help-outside-repo-fix'Junio C Hamano1-2/+4
"git cmd -h" outside a repository should error out cleanly for many commands, but instead it hit a BUG(), which has been corrected. * js/short-help-outside-repo-fix: t0012: verify that built-ins handle `-h` even without gitdir checkout/fetch/pull/pack-objects: allow `-h` outside a repository