| Age | Commit message (Collapse) | Author | Files | Lines |
|
Code refactoring around object database sources.
* ps/object-source-management:
odb: handle recreation of quarantine directories
odb: handle changing a repository's commondir
chdir-notify: add function to unregister listeners
odb: handle initialization of sources in `odb_new()`
http-push: stop setting up `the_repository` for each reference
t/helper: stop setting up `the_repository` repeatedly
builtin/index-pack: fix deferred fsck outside repos
oidset: introduce `oidset_equal()`
odb: move logic to disable ref updates into repo
odb: refactor `odb_clear()` to `odb_free()`
odb: adopt logic to close object databases
setup: convert `set_git_dir()` to have file scope
path: move `enter_repo()` into "setup.c"
|
|
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.
* jc/optional-path:
config: really treat missing optional path as not configured
config: really pretend missing :(optional) value is not there
config: mark otherwise unused function as file-scope static
|
|
These callers expect that git_config_pathname() that returns 0 is a
signal that the variable they passed has a string they need to act
on. But with the introduction of ":(optional)path" earlier, that is
no longer the case. If the path specified by the configuration
variable is missing, their variable will get a NULL in it, and they
need to act on it (often, just refraining from copying it elsewhere).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `enter_repo()` is used to enter a repository at a given
path. As such it sits way closer to setting up a repository than it does
with handling paths, but regardless of that it's located in "path.c"
instead of in "setup.c".
Move the function into "setup.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `each_ref_fn` callback function type is used across our code base
for several different functions that iterate through reference. There's
a bunch of callbacks implementing this type, which makes any changes to
the callback signature extremely noisy. An example of the required churn
is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a
single argument required us to change 48 files.
It was already proposed back then [1] that we might want to introduce a
wrapper structure to alleviate the pain going forward. While this of
course requires the same kind of global refactoring as just introducing
a new parameter, it at least allows us to more change the callback type
afterwards by just extending the wrapper structure.
One counterargument to this refactoring is that it makes the structure
more opaque. While it is obvious which callsites need to be fixed up
when we change the function type, it's not obvious anymore once we use
a structure. That being said, we only have a handful of sites that
actually need to populate this wrapper structure: our ref backends,
"refs/iterator.c" as well as very few sites that invoke the iterator
callback functions directly.
Introduce this wrapper structure so that we can adapt the iterator
interfaces more readily.
[1]: <ZmarVcF5JjsZx0dl@tanuki>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `reprepare_packed_git()` we perform a couple of operations:
- We reload alternate object directories.
- We clear the loose object cache.
- We reprepare packfiles.
While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.
Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Reduce implicit assumption and dependence on the_repository in the
object-file subsystem.
* ps/object-file-wo-the-repository:
object-file: get rid of `the_repository` in index-related functions
object-file: get rid of `the_repository` in `force_object_loose()`
object-file: get rid of `the_repository` in `read_loose_object()`
object-file: get rid of `the_repository` in loose object iterators
object-file: remove declaration for `for_each_file_in_obj_subdir()`
object-file: inline `for_each_loose_file_in_objdir_buf()`
object-file: get rid of `the_repository` when writing objects
odb: introduce `odb_write_object()`
loose: write loose objects map via their source
object-file: get rid of `the_repository` in `finalize_object_file()`
object-file: get rid of `the_repository` in `loose_object_info()`
object-file: get rid of `the_repository` when freshening objects
object-file: inline `check_and_freshen()` functions
object-file: get rid of `the_repository` in `has_loose_object()`
object-file: stop using `the_hash_algo`
object-file: fix -Wsign-compare warnings
|
|
In 036876a1067 (config: hide functions using `the_repository` by
default, 2024-08-13) we have moved around a bunch of functions in the
config subsystem that depend on `the_repository`. Those function have
been converted into mere wrappers around their equivalent function that
takes in a repository as parameter, and the intent was that we'll
eventually remove those wrappers to make the dependency on the global
repository variable explicit at the callsite.
Follow through with that intent and remove `git_config()`. All callsites
are adjusted so that they use `repo_config(the_repository, ...)`
instead. While some callsites might already have a repository available,
this mechanical conversion is the exact same as the current situation
and thus cannot cause any regression. Those sites should eventually be
cleaned up in a later patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* bc/use-sha256-by-default-in-3.0:
Enable SHA-256 by default in breaking changes mode
help: add a build option for default hash
t5300: choose the built-in hash outside of a repo
t4042: choose the built-in hash outside of a repo
t1007: choose the built-in hash outside of a repo
t: default to compile-time default hash if not set
setup: use the default algorithm to initialize repo format
Use legacy hash for legacy formats
builtin: use default hash when outside a repository
hash: add a constant for the legacy hash algorithm
hash: add a constant for the default hash algorithm
|
|
We do not have a backend-agnostic way to write objects into an object
database. While there is `write_object_file()`, this function is rather
specific to the loose object format.
Introduce `odb_write_object()` to plug this gap. For now, this function
is a simple wrapper around `write_object_file()` and doesn't even use
the passed-in object database yet. This will change in subsequent
commits, where `write_object_file()` is converted so that it works on
top of an `odb_source`. `odb_write_object()` will then become
responsible for deciding which source an object shall be written to.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up around object access API.
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
|
|
* ps/object-store:
odb: rename `read_object_with_reference()`
odb: rename `pretend_object_file()`
odb: rename `has_object()`
odb: rename `repo_read_object_file()`
odb: rename `oid_object_info()`
odb: trivial refactorings to get rid of `the_repository`
odb: get rid of `the_repository` when handling submodule sources
odb: get rid of `the_repository` when handling the primary source
odb: get rid of `the_repository` in `for_each()` functions
odb: get rid of `the_repository` when handling alternates
odb: get rid of `the_repository` in `odb_mkstemp()`
odb: get rid of `the_repository` in `assert_oid_type()`
odb: get rid of `the_repository` in `find_odb()`
odb: introduce parent pointers
object-store: rename files to "odb.{c,h}"
object-store: rename `object_directory` to `odb_source`
object-store: rename `raw_object_store` to `object_database`
|
|
"git push" and "git fetch" are taught to update refs in batches to
gain performance.
* kn/fetch-push-bulk-ref-update:
receive-pack: handle reference deletions separately
refs/files: skip updates with errors in batched updates
receive-pack: use batched reference updates
send-pack: fix memory leak around duplicate refs
fetch: use batched reference updates
refs: add function to translate errors to strings
|
|
We have a large variety of data formats and protocols where no hash
algorithm was defined and the default was assumed to always be SHA-1.
Instead of explicitly stating SHA-1, let's use the constant to represent
the legacy hash algorithm (which is still SHA-1) so that it's clear
for documentary purposes that it's a legacy fallback option and not an
intentional choice to use SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename `has_object()` to `odb_has_object()` to match other functions
related to the object database and our modern coding guidelines.
Introduce a compatibility wrapper so that any in-flight topics will
continue to compile.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are a couple of iterator-style functions that execute a callback
for each instance of a given set, all of which currently depend on
`the_repository`. Refactor them to instead take an object database as
parameter so that we can get rid of this dependency.
Rename the functions accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19)
we updated the 'git-receive-pack(1)' command to use batched reference
updates. One edge case which was missed during this implementation was
when a user pushes multiple branches such as:
delete refs/heads/branch/conflict
create refs/heads/branch
Before using batched updates, the references would be applied
sequentially and hence no conflicts would arise. With batched updates,
while the first update applies, the second fails due to D/F conflict. A
similar issue was present in 'git-fetch(1)' and was fixed by separating
out reference pruning into a separate transaction in the commit 'fetch:
use batched reference updates'. Apply a similar mechanism for
'git-receive-pack(1)' and separate out reference deletions into its own
batch.
This means 'git-receive-pack(1)' will now use up to two transactions,
whereas before using batched updates it would use _at least_ two
transactions. So using batched updates is still the better option.
Add a test to validate this behavior.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
During git-receive-pack(1), connectivity of the object graph is
validated to ensure that the received packfile does not leave the
repository in a broken state. This is done via git-rev-list(1) and
walking the objects, which can be expensive for large repositories.
Generally, this check is critical to avoid an incomplete received
packfile from corrupting a repository. Server operators may have
additional knowledge though around exactly how Git is being used on the
server-side which can be used to facilitate more efficient connectivity
computation of incoming objects.
For example, if it can be ensured that all objects in a repository are
connected and do not depend on any missing objects, the connectivity of
newly written objects can be checked by walking the object graph
containing only the new objects from the updated tips and identifying
the missing objects which represent the boundary between the new objects
and the repository. These boundary objects can be checked in the
canonical repository to ensure the new objects connect as expected and
thus avoid walking the rest of the object graph.
Git itself cannot make the guarantees required for such an optimization
as it is possible for a repository to contain an unreachable object that
references a missing object without the repository being considered
corrupt.
Introduce the --skip-connectivity-check option for git-receive-pack(1)
which bypasses this connectivity check to give more control to the
server-side. Note that without proper server-side validation of newly
received objects handled outside of Git, usage of this option risks
corrupting a repository.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The reference updates performed as a part of 'git-receive-pack(1)', take
place one at a time. For each reference update, a new transaction is
created and committed. This is necessary to ensure we can allow
individual updates to fail without failing the entire command. The
command also supports an 'atomic' mode, which uses a single transaction
to update all of the references. But this mode has an all-or-nothing
approach, where if a single update fails, all updates would fail.
In 23fc8e4f61 (refs: implement batch reference update support,
2025-04-08), we introduced a new mechanism to batch reference updates.
Under the hood, this uses a single transaction to perform a batch of
reference updates, while allowing only individual updates to fail.
Utilize this newly introduced batch update mechanism in
'git-receive-pack(1)'. This provides a significant bump in performance,
especially when dealing with repositories with large number of
references.
With the reftable backend there is a 18x performance improvement, when
performing receive-pack with 10000 refs:
Benchmark 1: receive: many refs (refformat = reftable, refcount = 10000, revision = master)
Time (mean ± σ): 4.276 s ± 0.078 s [User: 0.796 s, System: 3.318 s]
Range (min … max): 4.185 s … 4.430 s 10 runs
Benchmark 2: receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD)
Time (mean ± σ): 235.4 ms ± 6.9 ms [User: 75.4 ms, System: 157.3 ms]
Range (min … max): 228.5 ms … 254.2 ms 11 runs
Summary
receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran
18.16 ± 0.63 times faster than receive: many refs (refformat = reftable, refcount = 10000, revision = master)
In similar conditions, the files backend sees a 1.21x performance
improvement:
Benchmark 1: receive: many refs (refformat = files, refcount = 10000, revision = master)
Time (mean ± σ): 1.121 s ± 0.021 s [User: 0.128 s, System: 0.975 s]
Range (min … max): 1.097 s … 1.156 s 10 runs
Benchmark 2: receive: many refs (refformat = files, refcount = 10000, revision = HEAD)
Time (mean ± σ): 927.9 ms ± 22.6 ms [User: 99.0 ms, System: 815.2 ms]
Range (min … max): 903.1 ms … 978.0 ms 10 runs
Summary
receive: many refs (refformat = files, refcount = 10000, revision = HEAD) ran
1.21 ± 0.04 times faster than receive: many refs (refformat = files, refcount = 10000, revision = master)
As using batched updates requires the error handling to be moved to the
end of the flow, create and use a 'struct strset' to track the failed
refs and attribute the correct errors to them.
This change also uncovers an issue when a client provides multiple
updates to the same reference. For example:
$ git send-pack remote.git A:foo B:foo
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 20 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 226 bytes | 226.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0)
remote: error: cannot lock ref 'refs/heads/foo': reference already exists
To remote.git
! [remote rejected] A -> foo (failed to update ref)
! [remote failure] B -> foo (remote failed to report status)
As you can see, the remote runs into an error because it cannot lock the
target reference for the second update. Furthermore, the remote complains
that the first update has been rejected whereas the second update didn't
receive any status update because we failed to lock it. Reading this status
message alone a user would probably expect that `foo` has not been updated
at all. But that's not the case: while we claim that the ref wasn't updated,
it surprisingly points to `A` now.
One could argue that this is merely an error in how we report the result of
this push. But ultimately, the user's request itself is already broken and
doesn't make any sense in the first place and cannot ever lead to a sensible
outcome that honors the full request.
The conversion to batched transactions fixes the issue because we now try to
queue both updates in the same transaction. As such, the transaction itself
will notice this conflict and refuse the update altogether before we commit
any of the values.
Note that this requires changes to a couple of tests in t5408 that happened
to exercise this behaviour. Given that the generated output is misleading
and given that the user request cannot ever be fully honored this really
feels more like a bug than properly designed behaviour. As such, changing
the behaviour feels like the right thing to do.
Since now reference updates are batched, the 'reference-transaction'
hook will be invoked with all updates together. Currently git will 'die'
when the hook returns with a non-zero exit status in the 'prepared'
stage. For 'git-receive-pack(1)', this allowed users to reject an
individual reference update, git would have applied previous updates but
immediately abort further execution. This is definitely an incorrect
usage of this hook, since the right place to do this would be the
'update' hook. This patch retains the latter behavior, but
'reference-transaction' hook now changes to a all-or-nothing behavior
when a non-zero exit status is returned in the 'prepared' stage, since
batch updates use a transaction under the hood. This explains the change
in 't1416'.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As the comment of `repo_has_object_file()` and its `_with_flags()`
variant tells us, these functions are considered to be deprecated in
favor of `has_object()`. There are a couple of slight benefits in favor
of the replacement:
- The new function has a short-and-sweet name.
- More explicit defaults: `has_object()` doesn't fetch missing objects
via promisor remotes, and neither does it reload packfiles if an
object wasn't found by default. This ensures that it becomes
immediately obvious when a simple object existence check may result
in expensive actions.
Most importantly though, it is confusing that we have two sets of
functions that ultimately do the same thing, but with different
defaults.
Start sunsetting `repo_has_object_file()` and its `_with_flags()`
sibling by replacing all callsites with `has_object()`:
- `repo_has_object_file(...)` is equivalent to
`has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)`
is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)`
is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.
The replacements should be functionally equivalent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "object-store-ll.h" header has been introduced to keep transitive
header dependendcies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.
Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we have the "object-store.h" header, most of the functionality for
object stores is actually hosted in "object-file.c". This makes it hard
to find relevant functions and causes us to mix up concerns.
Split out functions relating to the object store subsystem into a new
"object-store.c" file.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `null_oid()` function returns the object ID that only consists of
zeroes. Naturally, this ID also depends on the hash algorithm used, as
the number of zeroes is different between SHA1 and SHA256. Consequently,
the function returns the hash-algorithm-specific null object ID.
This is currently done by depending on `the_hash_algo`, which implicitly
makes us depend on `the_repository`. Refactor the function to instead
pass in the hash algorithm for which we want to retrieve the null object
ID. Adapt callsites accordingly by passing in `the_repository`, thus
bubbling up the dependency on that global variable by one layer.
There are a couple of trivial exceptions for subsystems that already got
rid of `the_repository`. These subsystems instead use the repository
that is available via the calling context:
- "builtin/grep.c"
- "grep.c"
- "refs/debug.c"
There are also two non-trivial exceptions:
- "diff-no-index.c": Here we know that we may not have a repository
initialized at all, so we cannot rely on `the_repository`. Instead,
we adapt `diff_no_index()` to get a `struct git_hash_algo` as
parameter. The only caller is located in "builtin/diff.c", where we
know to call `repo_set_hash_algo()` in case we're running outside of
a Git repository. Consequently, it is fine to continue passing
`the_repository->hash_algo` even in this case.
- "builtin/ls-files.c": There is an in-flight patch series that drops
`USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic
conflict because we use `null_oid()` in `show_submodule()`. The
value is passed to `repo_submodule_init()`, which may use the object
ID to resolve a tree-ish in the superproject from which we want to
read the submodule config. As such, the object ID should refer to an
object in the superproject, and consequently we need to use its hash
algorithm.
This means that we could in theory just not bother about this edge
case at all and just use `the_repository` in "diff-no-index.c". But
doing so would feel misdesigned.
Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in
"hash.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The path.[ch] API takes an explicit repository parameter passed
throughout the callchain, instead of relying on the_repository
singleton instance.
* ps/path-sans-the-repository:
path: adjust last remaining users of `the_repository`
environment: move access to "core.sharedRepository" into repo settings
environment: move access to "core.hooksPath" into repo settings
repo-settings: introduce function to clear struct
path: drop `git_path()` in favor of `repo_git_path()`
rerere: let `rerere_path()` write paths into a caller-provided buffer
path: drop `git_common_path()` in favor of `repo_common_path()`
worktree: return allocated string from `get_worktree_git_dir()`
path: drop `git_path_buf()` in favor of `repo_git_path_replace()`
path: drop `git_pathdup()` in favor of `repo_git_path()`
path: drop unused `strbuf_git_path()` function
path: refactor `repo_submodule_path()` family of functions
submodule: refactor `submodule_to_gitdir()` to accept a repo
path: refactor `repo_worktree_path()` family of functions
path: refactor `repo_git_path()` family of functions
path: refactor `repo_common_path()` family of functions
|
|
Further code clean-up on the use of hash functions. Now the
context object knows what hash function it is working with.
* ps/hash-cleanup:
global: adapt callers to use generic hash context helpers
hash: provide generic wrappers to update hash contexts
hash: stop typedeffing the hash context
hash: convert hashing context to a structure
|
|
The `get_worktree_git_dir()` function returns a string constant that
does not need to be free'd by the caller. This string is computed for
three different cases:
- If we don't have a worktree we return a path into the Git directory.
The returned string is owned by `the_repository`, so there is no
need for the caller to free it.
- If we have a worktree, but no worktree ID then the caller requests
the main worktree. In this case we return a path into the common
directory, which again is owned by `the_repository` and thus does
not need to be free'd.
- In the third case, where we have an actual worktree, we compute the
path relative to "$GIT_COMMON_DIR/worktrees/". This string does not
need to be released either, even though `git_common_path()` ends up
allocating memory. But this doesn't result in a memory leak either
because we write into a buffer returned by `get_pathname()`, which
returns one out of four static buffers.
We're about to drop `git_common_path()` in favor of `repo_common_path()`,
which doesn't use the same mechanism but instead returns an allocated
string owned by the caller. While we could adapt `get_worktree_git_dir()`
to also use `get_pathname()` and print the derived common path into that
buffer, the whole schema feels a lot like premature optimization in this
context. There are some callsites where we call `get_worktree_git_dir()`
in a loop that iterates through all worktrees. But none of these loops
seem to be even remotely in the hot path, so saving a single allocation
there does not feel worth it.
Refactor the function to instead consistently return an allocated path
so that we can start using `repo_common_path()` in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* kn/pack-write-with-reduced-globals:
pack-write: pass hash_algo to internal functions
pack-write: pass hash_algo to `write_rev_file()`
pack-write: pass hash_algo to `write_idx_file()`
pack-write: pass repository to `index_pack_lockfile()`
pack-write: pass hash_algo to `fixup_pack_header_footer()`
|
|
Adapt callers to use generic hash context helpers instead of using the
hash algorithm to update them. This makes the callsites easier to reason
about and removes the possibility that the wrong hash algorithm is used
to update the hash context's state. And as a nice side effect this also
gets rid of a bunch of users of `the_hash_algo`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We generally avoid using `typedef` in the Git codebase. One exception
though is the `git_hash_ctx`, likely because it used to be a union
rather than a struct until the preceding commit refactored it. But now
that it is a normal `struct` there isn't really a need for a typedef
anymore.
Drop the typedef and adapt all callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `index_pack_lockfile()` function uses the global `the_repository`
variable to access the repository. To avoid global variable usage, pass
the repository from the layers above.
Altough the layers above could have access to the repository internally,
simply pass in `the_repository`. This avoids any compatibility issues
and bubbles up global variable usage to upper layers which can be
eventually resolved.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
More code paths have a repository passed through the callchain,
instead of assuming the primary the_repository object.
* ps/the-repository:
match-trees: stop using `the_repository`
graph: stop using `the_repository`
add-interactive: stop using `the_repository`
tmp-objdir: stop using `the_repository`
resolve-undo: stop using `the_repository`
credential: stop using `the_repository`
mailinfo: stop using `the_repository`
diagnose: stop using `the_repository`
server-info: stop using `the_repository`
send-pack: stop using `the_repository`
serve: stop using `the_repository`
trace: stop using `the_repository`
pager: stop using `the_repository`
progress: stop using `the_repository`
|
|
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.
Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Stop using `the_repository` in the "tmp-objdir" subsystem by passing
in the repostiroy when creating a new temporary object directory.
While we could trivially update the caller to pass in the hash algorithm
used by the index itself, we instead pass in `the_hash_algo`. This is
mostly done to stay consistent with the rest of the code in that file,
which isn't prepared to handle arbitrary repositories, either.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Stop using `the_repository` in the "server-info" subsystem by passing in
a repository when updating server info and storing the repository in the
`update_info_ctx` structure to make it accessible to other functions.
Adjust callers accordingly by using `the_repository`. While there may be
some callers that have a repository available in their context, this
trivial conversion allows for easier verification and bubbles up the use
of `the_repository` by one level.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Allow passing flags when setting up a transaction such that the
behaviour of the transaction itself can be altered. This functionality
will be used in a subsequent patch.
Adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The push reports that report failures to the user when pushing a
reference leak in several places. Plug these leaks by introducing a new
function `ref_push_report_free()` that frees the list of reports and
call it as required. While at it, fix a trivially leaking error string
in the vicinity.
These leaks get hit in t5411, but plugging them does not make the whole
test suite pass.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Typofix.
* ak/typofix-builtins:
builtin: fix typos
|
|
The reftable backend learned to more efficiently handle exclude
patterns while enumerating the refs.
* ps/reftable-exclude:
refs/reftable: wire up support for exclude patterns
reftable/reader: make table iterator reseekable
t/unit-tests: introduce reftable library
Makefile: stop listing test library objects twice
builtin/receive-pack: fix exclude patterns when announcing refs
refs: properly apply exclude patterns to namespaced refs
|
|
Fix typos in comments.
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In `write_head_info()` we announce references to the remote client. We
need to honor "transfer.hideRefs" here so that we do not announce any
references that the client shouldn't be able to learn about. This is
done via two separate mechanisms:
- We hand over exclude patterns to the reference backend. We can only
honor "plain" exclude patterns here that do not have prefixes with
special meaning such as "^" or "!". Filtering down the references is
handled by `hidden_refs_to_excludes()`.
- In `show_ref_cb()` we perform a second check against hidden refs.
For one this is done such that we can handle those special prefixes.
And second, handling exclude patterns in ref backends is optional,
so we also have to handle "normal" patterns.
The special-meaning "^" prefix alters whether a hidden ref applies to
the namespace-stripped reference name or the full name. So while we
would usually call `refs_for_each_namespaced_ref()` to only get those
references in the current namespace, we can't because we'd get the
already-rewritten reference names. Instead, we are forced to use
`refs_for_each_fullref_in()` and then manually strip away the namespace
prefix such that we have access to both names.
But this also means that we do not get namespace handling for exclude
patterns, which `refs_for_each_namespaced_ref()` brings for free. This
results in a bug because we potentially end up hiding away references
based on their namespaced name and not on the stripped name as we really
should be doing.
Fix this by manually rewriting the exclude patterns to their namespaced
variants.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every
builtin, remove it from builtin.h and add it to all the builtins that
include builtin.h (by definition, that means all builtins/*.c).
Also, remove the include statement for repository.h since it gets
brought in through builtin.h.
The next step will be to migrate each builtin
from having to use the_repository.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to reduce the usage of the global the_repository, add a
parameter to builtin functions that will get passed a repository
variable.
This commit uses UNUSED on most of the builtin functions, as subsequent
commits will modify the actual builtins to pass the repository parameter
down.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.
* ps/config-wo-the-repository:
config: hide functions using `the_repository` by default
global: prepare for hiding away repo-less config functions
config: don't depend on `the_repository` with branch conditions
config: don't have setters depend on `the_repository`
config: pass repo to functions that rename or copy sections
config: pass repo to `git_die_config()`
config: pass repo to `git_config_get_expiry_in_days()`
config: pass repo to `git_config_get_expiry()`
config: pass repo to `git_config_get_max_percent_split_change()`
config: pass repo to `git_config_get_split_index()`
config: pass repo to `git_config_get_index_threads()`
config: expose `repo_config_clear()`
config: introduce missing setters that take repo as parameter
path: hide functions using `the_repository` by default
path: stop relying on `the_repository` in `worktree_git_path()`
path: stop relying on `the_repository` when reporting garbage
hooks: remove implicit dependency on `the_repository`
editor: do not rely on `the_repository` for interactive edits
path: expose `do_git_common_path()` as `repo_common_pathv()`
path: expose `do_git_path()` as `repo_git_pathv()`
|
|
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.
* ps/use-the-repository:
hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
t/helper: remove dependency on `the_repository` in "proc-receive"
t/helper: fix segfault in "oid-array" command without repository
t/helper: use correct object hash in partial-clone helper
compat/fsmonitor: fix socket path in networked SHA256 repos
replace-object: use hash algorithm from passed-in repository
protocol-caps: use hash algorithm from passed-in repository
oidset: pass hash algorithm when parsing file
http-fetch: don't crash when parsing packfile without a repo
hash-ll: merge with "hash.h"
refs: avoid include cycle with "repository.h"
global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
hash: require hash algorithm in `empty_tree_oid_hex()`
hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
hash: make `is_null_oid()` independent of `the_repository`
hash: convert `oidcmp()` and `oideq()` to compare whole hash
global: ensure that object IDs are always padded
hash: require hash algorithm in `oidread()` and `oidclr()`
hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
|
|
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.
* kn/update-ref-symref:
update-ref: add support for 'symref-update' command
reftable: pick either 'oid' or 'target' for new updates
update-ref: add support for 'symref-create' command
update-ref: add support for 'symref-delete' command
update-ref: add support for 'symref-verify' command
refs: specify error for regular refs with `old_target`
refs: create and use `ref_update_expects_existing_old_ref()`
|
|
The `empty_tree_oid_hex()` function use `the_repository` to derive the
hash function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.
While at it, remove the unused `empty_blob_oid_hex()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash
function that shall be used. Require callers to pass in the hash
algorithm to get rid of this implicit dependency.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new command 'symref-delete' to allow deletions of symbolic refs in
a transaction via the '--stdin' mode of the 'git-update-ref' command.
The 'symref-delete' command can, when given an <old-target>, delete the
provided <ref> only when it points to <old-target>.
This command is only compatible with the 'no-deref' mode because we
optionally want to check the 'old_target' of the ref being deleted.
De-referencing a symbolic ref would provide a regular ref and we already
have the 'delete' command for regular refs.
While users can also use 'git symbolic-ref -d' to delete symbolic refs,
the 'symref-delete' command in 'git-update-ref' allows users to do so
within a transaction, which promises atomicity of the operation and can
be batched with other commands.
When no 'old_target' is provided it can also delete regular refs,
similar to how the 'delete' command can delete symrefs when no 'old_oid'
is provided.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
|
|
Updates to symbolic refs can now be made as a part of ref
transaction.
* kn/ref-transaction-symref:
refs: remove `create_symref` and associated dead code
refs: rename `refs_create_symref()` to `refs_update_symref()`
refs: use transaction in `refs_create_symref()`
refs: add support for transactional symref updates
refs: move `original_update_refname` to 'refs.c'
refs: support symrefs in 'reference-transaction' hook
files-backend: extract out `create_symref_lock()`
refs: accept symref values in `ref_transaction_update()`
|
|
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.
To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.
The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.
The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.
We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "receive-pack" program (which responds to "git push") was not
converted to run "git maintenance --auto" when other codepaths that
used to run "git gc --auto" were updated, which has been corrected.
* ps/run-auto-maintenance-in-receive-pack:
builtin/receive-pack: convert to use git-maintenance(1)
run-command: introduce function to prepare auto-maintenance process
|
|
In 850b6edefa (auto-gc: extract a reusable helper from "git fetch",
2020-05-06), we have introduced a helper function `run_auto_gc()` that
kicks off `git gc --auto`. The intent of this function was to pass down
the "--quiet" flag to git-gc(1) as required without duplicating this at
all callsites. In 7c3e9e8cfb (auto-gc: pass --quiet down from am,
commit, merge and rebase, 2020-05-06) we then converted callsites that
need to pass down this flag to use the new helper function. This has the
notable omission of git-receive-pack(1), which is the only remaining
user of `git gc --auto` that sets up the proccess manually. This is
probably because it unconditionally passes down the `--quiet` flag and
thus didn't benefit much from the new helper function.
In a95ce12430 (maintenance: replace run_auto_gc(), 2020-09-17) we then
replaced `run_auto_gc()` with `run_auto_maintenance()` which invokes
git-maintenance(1) instead of git-gc(1). This command is the modern
replacement for git-gc(1) and is both more thorough and also more
flexible because administrators can configure which tasks exactly to run
during maintenance.
But due to git-receive-pack(1) not using `run_auto_gc()` in the first
place it did not get converted to use git-maintenance(1) like we do
everywhere else now. Address this oversight and start to use the newly
introduced function `prepare_auto_maintenance()`. This will also make it
easier for us to adapt this code together with all the other callsites
that invoke auto-maintenance in the future.
This removes the last internal user of `git gc --auto`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.
Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).
Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.
Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use the public function find_commit_header() and remove find_header(),
as it becomes unused. This is safe and appropriate because we pass the
NUL-terminated payload buffer to check_nonce() instead of its start and
length. The underlying strbuf push_cert cannot contain NULs, as it is
built using strbuf_addstr(), only.
We no longer need to call strlen(), as find_commit_header() returns the
length of nonce already.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use the public function find_commit_header() instead of find_header() to
simplify the code. This is possible and safe because we're operating on
a strbuf, which is always NUL-terminated, so there is no risk of running
over the end of the buffer. It cannot contain NUL within the buffer, as
it is built using strbuf_addstr(), only.
The string comparison becomes more complicated because we need to check
for NUL explicitly after comparing the length-limited option, but on the
flip side we don't need to clean up allocations or track the remaining
buffer length.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
|
|
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When parsing fsck.*, receive.fsck.*, or fetch.fsck.*, we don't check for
an implicit bool. So any of:
[fsck]
badTree
[receive "fsck"]
badTree
[fetch "fsck"]
badTree
will cause us to segfault. We can fix it with config_error_nonbool() in
the usual way, but we have to make a few more changes to get good error
messages. The problem is that all three spots do:
if (skip_prefix(var, "fsck.", &var))
to match and parse the actual message id. But that means that "var" now
just says "badTree" instead of "receive.fsck.badTree", making the
resulting message confusing. We can fix that by storing the parsed
message id in its own separate variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
transfer.unpackLimit ought to be used as a fallback, but overrode
fetch.unpackLimit and receive.unpackLimit instead.
* ts/unpacklimit-config-fix:
transfer.unpackLimit: fetch/receive.unpackLimit takes precedence
|
|
The transfer.unpackLimit configuration variable is documented to be
used only as a fallback value when the more operation-specific
fetch.unpackLimit and receive.unpackLimit variables are not set, but
the implementation had the precedence reversed. Apparently this was
broken since the transfer.unpackLimit was introduced in e28714c5
(Consolidate {receive,fetch}.unpackLimit, 2007-01-24).
Often when documentation and code have diverged for so long, we
prefer to change the documentation instead, to avoid disrupting
users. But doing so would make these weirdly unlike most other
"specific overrides general" config options. And the fact that the
bug has existed for so long without anyone noticing implies to me
that nobody really tries to mix and match them much.
Signed-off-by: Taylor Santiago <taylorsantiago@google.com>
[jc: rewrote the log message, added tests, covered receive-pack as well]
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.
* tb/refs-exclusion-and-packed-refs:
ls-refs.c: avoid enumerating hidden refs where possible
upload-pack.c: avoid enumerating hidden refs where possible
builtin/receive-pack.c: avoid enumerating hidden references
refs.h: implement `hidden_refs_to_excludes()`
refs.h: let `for_each_namespaced_ref()` take excluded patterns
revision.h: store hidden refs in a `strvec`
refs/packed-backend.c: add trace2 counters for jump list
refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
refs/packed-backend.c: refactor `find_reference_location()`
refs: plumb `exclude_patterns` argument throughout
builtin/for-each-ref.c: add `--exclude` option
ref-filter.c: parameterize match functions over patterns
ref-filter: add `ref_filter_clear()`
ref-filter: clear reachable list pointers after freeing
ref-filter.h: provide `REF_FILTER_INIT`
refs.c: rename `ref_filter`
|
|
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
|
|
Now that `refs_for_each_fullref_in()` has the ability to avoid
enumerating references matching certain pattern(s), use that to avoid
visiting hidden refs when constructing the ref advertisement via
receive-pack.
Note that since this exclusion is best-effort, we still need
`show_ref_cb()` to check whether or not each reference is hidden or not
before including it in the advertisement.
As was the case when applying this same optimization to `upload-pack`,
`receive-pack`'s reference advertisement phase can proceed much quicker
by avoiding enumerating references that will not be part of the
advertisement.
(Below, we're still using linux.git with one hidden refs/pull/N ref per
commit):
$ hyperfine -L v ,.compile 'git{v} -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
Benchmark 1: git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
Time (mean ± σ): 89.1 ms ± 1.7 ms [User: 82.0 ms, System: 7.0 ms]
Range (min … max): 87.7 ms … 95.5 ms 31 runs
Benchmark 2: git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 0.5 ms, System: 3.9 ms]
Range (min … max): 4.1 ms … 5.6 ms 508 runs
Summary
'git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git' ran
20.00 ± 1.05 times faster than 'git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.
Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.
Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:
ref = xstrdup(value);
len = strlen(ref);
while (len && ref[len - 1] == '/')
ref[--len] = '\0';
strvec_push(hide_refs, ref);
free(ref);
but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Reduce reliance on a global state in the config reading API.
* gc/config-context:
config: pass source to config_parser_event_fn_t
config: add kvi.path, use it to evaluate includes
config.c: remove config_reader from configsets
config: pass kvi to die_bad_number()
trace2: plumb config kvi
config.c: pass ctx with CLI config
config: pass ctx with config files
config.c: pass ctx in configsets
config: add ctx arg to config_fn_t
urlmatch.h: use config_fn_t type
config: inline git_color_default_config
|
|
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
More header clean-up.
* en/header-split-cache-h-part-2: (22 commits)
reftable: ensure git-compat-util.h is the first (indirect) include
diff.h: reduce unnecessary includes
object-store.h: reduce unnecessary includes
commit.h: reduce unnecessary includes
fsmonitor: reduce includes of cache.h
cache.h: remove unnecessary headers
treewide: remove cache.h inclusion due to previous changes
cache,tree: move basic name compare functions from read-cache to tree
cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
hash-ll.h: split out of hash.h to remove dependency on repository.h
tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
dir.h: move DTYPE defines from cache.h
versioncmp.h: move declarations for versioncmp.c functions from cache.h
ws.h: move declarations for ws.c functions from cache.h
match-trees.h: move declarations for match-trees.c functions from cache.h
pkt-line.h: move declarations for pkt-line.c functions from cache.h
base85.h: move declarations for base85.c functions from cache.h
copy.h: move declarations for copy.c functions from cache.h
server-info.h: move declarations for server-info.c functions from cache.h
packfile.h: move pack_window and pack_entry from cache.h
...
|
|
The code to parse capability list for v0 on-wire protocol fell into
an infinite loop when a capability appears multiple times, which
has been corrected.
* jk/protocol-cap-parse-fix:
v0 protocol: use size_t for capability length/offset
t5512: test "ls-remote --heads --symref" filtering with v0 and v2
t5512: allow any protocol version for filtered symref test
t5512: add v2 support for "ls-remote --symref" test
v0 protocol: fix sha1/sha256 confusion for capabilities^{}
t5512: stop referring to "v1" protocol
v0 protocol: fix infinite loop when parsing multi-valued capabilities
|
|
Header clean-up.
* en/header-split-cache-h: (24 commits)
protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
mailmap, quote: move declarations of global vars to correct unit
treewide: reduce includes of cache.h in other headers
treewide: remove double forward declaration of read_in_full
cache.h: remove unnecessary includes
treewide: remove cache.h inclusion due to pager.h changes
pager.h: move declarations for pager.c functions from cache.h
treewide: remove cache.h inclusion due to editor.h changes
editor: move editor-related functions and declarations into common file
treewide: remove cache.h inclusion due to object.h changes
object.h: move some inline functions and defines from cache.h
treewide: remove cache.h inclusion due to object-file.h changes
object-file.h: move declarations for object-file.c functions from cache.h
treewide: remove cache.h inclusion due to git-zlib changes
git-zlib: move declarations for git-zlib functions from cache.h
treewide: remove cache.h inclusion due to object-name.h changes
object-name.h: move declarations for object-name.c functions from cache.h
treewide: remove unnecessary cache.h inclusion
treewide: be explicit about dependence on mem-pool.h
treewide: be explicit about dependence on oid-array.h
...
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When parsing server capabilities, we use "int" to store lengths and
offsets. At first glance this seems like a spot where our parser may be
confused by integer overflow if somebody sent us a malicious response.
In practice these strings are all bounded by the 64k limit of a
pkt-line, so using "int" is OK. However, it makes the code simpler to
audit if they just use size_t everywhere. Note that because we take
these parameters as pointers, this also forces many callers to update
their declared types.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h. This made it more difficult
to find which files could remove a dependence on cache.h. Make C files
explicitly include trace.h or trace2.h if they are using them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.
* en/header-split-cleanup:
csum-file.h: remove unnecessary inclusion of cache.h
write-or-die.h: move declarations for write-or-die.c functions from cache.h
treewide: remove cache.h inclusion due to setup.h changes
setup.h: move declarations for setup.c functions from cache.h
treewide: remove cache.h inclusion due to environment.h changes
environment.h: move declarations for environment.c functions from cache.h
treewide: remove unnecessary includes of cache.h
wrapper.h: move declarations for wrapper.c functions from cache.h
path.h: move function declarations for path.c functions from cache.h
cache.h: remove expand_user_path()
abspath.h: move absolute path functions from cache.h
environment: move comment_line_char from cache.h
treewide: remove unnecessary cache.h inclusion from several sources
treewide: remove unnecessary inclusion of gettext.h
treewide: be explicit about dependence on gettext.h
treewide: remove unnecessary cache.h inclusion from a few headers
|
|
Code clean-up around the use of the_repository.
* ab/remove-implicit-use-of-the-repository:
libs: use "struct repository *" argument, not "the_repository"
post-cocci: adjust comments for recent repo_* migration
cocci: apply the "revision.h" part of "the_repository.pending"
cocci: apply the "rerere.h" part of "the_repository.pending"
cocci: apply the "refs.h" part of "the_repository.pending"
cocci: apply the "promisor-remote.h" part of "the_repository.pending"
cocci: apply the "packfile.h" part of "the_repository.pending"
cocci: apply the "pretty.h" part of "the_repository.pending"
cocci: apply the "object-store.h" part of "the_repository.pending"
cocci: apply the "diff.h" part of "the_repository.pending"
cocci: apply the "commit.h" part of "the_repository.pending"
cocci: apply the "commit-reach.h" part of "the_repository.pending"
cocci: apply the "cache.h" part of "the_repository.pending"
cocci: add missing "the_repository" macros to "pending"
cocci: sort "the_repository" rules by header
cocci: fix incorrect & verbose "the_repository" rules
cocci: remove dead rule from "the_repository.pending.cocci"
|
|
en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
libs: use "struct repository *" argument, not "the_repository"
post-cocci: adjust comments for recent repo_* migration
cocci: apply the "revision.h" part of "the_repository.pending"
cocci: apply the "rerere.h" part of "the_repository.pending"
cocci: apply the "refs.h" part of "the_repository.pending"
cocci: apply the "promisor-remote.h" part of "the_repository.pending"
cocci: apply the "packfile.h" part of "the_repository.pending"
cocci: apply the "pretty.h" part of "the_repository.pending"
cocci: apply the "object-store.h" part of "the_repository.pending"
cocci: apply the "diff.h" part of "the_repository.pending"
cocci: apply the "commit.h" part of "the_repository.pending"
cocci: apply the "commit-reach.h" part of "the_repository.pending"
cocci: apply the "cache.h" part of "the_repository.pending"
cocci: add missing "the_repository" macros to "pending"
cocci: sort "the_repository" rules by header
cocci: fix incorrect & verbose "the_repository" rules
cocci: remove dead rule from "the_repository.pending.cocci"
|
|
Code clean-up to include and/or uninclude parse-options.h file as
needed.
* sg/parse-options-h-users:
treewide: remove unnecessary inclusions of parse-options.h from headers
treewide: include parse-options.h in source files
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"commit-reach.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git receive-pack" that responds to "git push" requests failed to
clean a stale lockfile when killed in the middle, which has been
corrected.
* ps/receive-pack-unlock-before-die:
receive-pack: fix stale packfile locks when dying
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is another step towards letting us remove the include of cache.h in
strbuf.c. It does mean that we also need to add includes of abspath.h
in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and
'send-pack' use parse_options(), but their source files don't directly
include 'parse-options.h'. Furthermore, the source files
'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and
'send-pack.c' define option parsing callback functions, while
'revision.c' defines an option parsing helper function, and thus need
access to various fields in 'struct option' and 'struct
parse_opt_ctx_t', but they don't directly include 'parse-options.h'
either. They all can still be built, of course, because they include
one of the header files that does include 'parse-options.h' (though
unnecessarily, see the next commit).
Add those missing includes to these files, as our general rule is that
"a C file must directly include the header files that declare the
functions and the types it uses".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git push" has been taught to allow deletion of refs with one-level
names to help repairing a repository who acquired such a ref by
mistake. In general, we don't encourage use of such a ref, and
creation or update to such a ref is rejected as before.
* zh/push-to-delete-onelevel-ref:
push: allow delete single-level ref
receive-pack: fix funny ref error messsage
|
|
Instead of forcing each command to choose to honor GPG related
configuration variables, make the subsystem lazily initialize
itself.
* jc/gpg-lazy-init:
drop pure pass-through config callbacks
gpg-interface: lazily initialize and read the configuration
|
|
When accepting a packfile in git-receive-pack(1), we feed that packfile
into git-index-pack(1) to generate the packfile index. As the packfile
would often only contain unreachable objects until the references have
been updated, concurrently running garbage collection might be tempted
to delete the packfile right away and thus cause corruption. To fix
this, we ask git-index-pack(1) to create a `.keep` file before moving
the packfile into place, which is getting deleted again once all of the
reference updates have been processed.
Now in production systems we have observed that those `.keep` files are
sometimes not getting deleted as expected, where the result is that
repositories tend to grow packfiles that are never deleted over time.
This seems to be caused by a race when git-receive-pack(1) is killed
after we have migrated the kept packfile from the quarantine directory
into the main object database. While this race window is typically small
it can be extended for example by installing a `proc-receive` hook.
Fix this race by registering the lockfile as a tempfile so that it will
automatically be removed at exit or when receiving a signal.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We discourage the creation/update of single-level refs
because some upper-layer applications only work in specified
reference namespaces, such as "refs/heads/*" or "refs/tags/*",
these single-level refnames may not be recognized. However,
we still hope users can delete them which have been created
by mistake.
Therefore, when updating branches on the server with
"git receive-pack", by checking whether it is a branch deletion
operation, it will determine whether to allow the update of
a single-level refs. This avoids creating/updating such
single-level refs, but allows them to be deleted.
On the client side, "git push" also does not properly fill in
the old-oid of single-level refs, which causes the server-side
"git receive-pack" to think that the ref's old-oid has changed
when deleting single-level refs, this causes the push to be
rejected. So the solution is to fix the client to be able to
delete single-level refs by properly filling old-oid.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When the user deletes the remote one level branch through
"git push origin -d refs/foo", remote will return an error:
"refusing to create funny ref 'refs/foo' remotely", here we
are not creating "refs/foo" instead wants to delete it, so a
better error description here would be: "refusing to update
funny ref 'refs/foo' remotely".
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of forcing the porcelain commands to always read the
configuration variables related to the signing and verifying
signatures, lazily initialize the necessary subsystem on demand upon
the first use.
This hopefully would make it more future-proof as we do not have to
think and decide whether we should call git_gpg_config() in the
git_config() callback for each command.
A few git_config() callback functions that used to be custom
callbacks are now just a thin wrapper around git_default_config().
We could further remove, git_FOO_config and replace calls to
git_config(git_FOO_config) with git_config(git_default_config), but
to make it clear which ones are affected and the effect is only the
removal of git_gpg_config(), it is vastly preferred not to do such a
change in this step (they can be done on top once the dust settled).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix a memory leak that's been with us since this code was introduced
in [1]. Later in [2] we started using FLEX_ALLOC_MEM() to allocate the
"struct command *".
1. 575f497456e (Add first cut at "git-receive-pack", 2005-06-29)
2. eb1af2df0b1 (git-receive-pack: start parsing ref update commands,
2005-06-29)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.
Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.
We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.
This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs
as it is configured in Gitaly leads to a 4.5-fold speedup:
Benchmark 1: main
Time (mean ± σ): 30.977 s ± 0.157 s [User: 30.226 s, System: 1.083 s]
Range (min … max): 30.796 s … 31.071 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 6.799 s ± 0.063 s [User: 6.803 s, System: 0.354 s]
Range (min … max): 6.729 s … 6.850 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
4.56 ± 0.05 times faster than 'main'
As we mostly go through the same codepaths even in the case where there
are no hidden refs at all compared to the code before there is no change
in performance when no refs are hidden:
Benchmark 1: main
Time (mean ± σ): 48.188 s ± 0.432 s [User: 49.326 s, System: 5.009 s]
Range (min … max): 47.706 s … 48.539 s 3 runs
Benchmark 2: pks-connectivity-check-hide-refs
Time (mean ± σ): 48.027 s ± 0.500 s [User: 48.934 s, System: 5.025 s]
Range (min … max): 47.504 s … 48.500 s 3 runs
Summary
'pks-connectivity-check-hide-refs' ran
1.00 ± 0.01 times faster than 'main'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|
We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.
Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
|
|
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75de (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.
Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.
This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b240347543 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.
1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The start_async(), etc, functions need a "proc" callback that conforms
to a particular interface. Not every callback needs every parameter
(e.g., the caller might not even ask to open an input descriptor, in
which case there is no point in the callback looking at it). Let's mark
these for -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Functions used with for_each_ref(), etc, need to conform to the
each_ref_fn interface. But most of them don't need every parameter;
let's annotate the unused ones to quiet -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
|
|
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
|
|
Start following-up on the rename mentioned in c7c4bdeccf3 (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".
The "env_array" name was picked in 19a583dc39e (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.
This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.
The rest of this is all a result of applying [1]:
* make contrib/coccinelle/run_command.cocci.patch
* patch -p1 <contrib/coccinelle/run_command.cocci.patch
* git add -u
1. cat contrib/coccinelle/run_command.pending.cocci
@@
struct child_process E;
@@
- E.env_array
+ E.env
@@
struct child_process *E;
@@
- E->env_array
+ E->env
I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Amend code added in a6a84319686 (receive-pack.c: shorten the
execute_commands loop over all commands, 2015-01-07) and amended to
hard die in b6a4788586d (receive-pack.c: die instead of error in case
of possible future bug, 2015-01-07) to use the new bug() function
instead.
Let's also rename the warn_if_*() function that code is in to
BUG_if_*(), its name became outdated in b6a4788586d.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* tb/receive-pack-code-cleanup:
builtin/receive-pack.c: remove redundant 'if'
|
|
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
|
|
In c7c4bdeccf (run-command API: remove "env" member, always use
"env_array", 2021-11-25), there was a push to replace
cld.env = env->v;
with
strvec_pushv(&cld.env_array, env->v);
The conversion in c7c4bdeccf was mostly plug-and-play, with the snag
that some instances of strvec_pushv() became guarded with a NULL check
to ensure that the second argument was non-NULL.
This conversion was slightly over-eager to add a conditional in
builtin/receive-pack.c::unpack(), since we know at the point that we
add the result of `tmp_objdir_env()` into the child process's
environment, that `tmp_objdir` is non-NULL.
This follows from the conditional just before our strvec_pushv() call
(which returns from the function if `tmp_objdir` was NULL), as well as
the call to tmp_objdir_add_as_alternate() just below, which relies on
its argument (`tmp_objdir`) being non-NULL.
In the meantime, this extra conditional isn't hurting anything. But it
is redundant and thus unnecessarily confusing. So let's remove it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Count string_list items in size_t, not "unsigned int".
* ab/string-list-count-in-size-t:
string-list API: change "nr" and "alloc" to "size_t"
gettext API users: don't explicitly cast ngettext()'s "n"
|
|
Code clean-up to allow callers of run_commit_hook() to learn if it
got "success" because the hook succeeded or because there wasn't
any hook.
* ab/racy-hooks:
hooks: fix an obscure TOCTOU "did we just run a hook?" race
merge: don't run post-hook logic on --no-verify
|
|
Object-file API shuffling.
* ab/object-file-api-updates:
object-file API: pass an enum to read_object_with_reference()
object-file.c: add a literal version of write_object_file_prepare()
object-file API: have hash_object_file() take "enum object_type"
object API: rename hash_object_file_literally() to write_*()
object-file API: split up and simplify check_object_signature()
object API users + docs: check <0, not !0 with check_object_signature()
object API docs: move check_object_signature() docs to cache.h
object API: correct "buf" v.s. "map" mismatch in *.c and *.h
object-file API: have write_object_file() take "enum object_type"
object-file API: add a format_object_header() function
object-file API: return "void", not "int" from hash_object_file()
object-file.c: split up declaration of unrelated variables
|
|
Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
680ee550d72 (commit: skip discarding the index if there is no
pre-commit hook, 2017-08-14).
This obscure race condition can occur if we e.g. ran the "pre-commit"
hook and it modified the index, but hook_exists() returns false later
on (e.g., because the hook itself went away, the directory became
unreadable, etc.). Then we won't call discard_cache() when we should
have.
The race condition itself probably doesn't matter, and users would
have been unlikely to run into it in practice. This problem has been
noted on-list when 680ee550d72 was discussed[1], but had not been
fixed.
This change is mainly intended to improve the readability of the code
involved, and to make reasoning about it more straightforward. It
wasn't as obvious what we were trying to do here, but by having an
"invoked_hook" it's clearer that e.g. our discard_cache() is happening
because of the earlier hook execution.
Let's also change this for the push-to-checkout hook. Now instead of
checking if the hook exists and either doing a push to checkout or a
push to deploy we'll always attempt a push to checkout. If the hook
doesn't exist we'll fall back on push to deploy. The same behavior as
before, without the TOCTOU race. See 0855331941b (receive-pack:
support push-to-checkout hook, 2014-12-01) for the introduction of the
previous behavior.
This leaves uses of hook_exists() in two places that matter. The
"reference-transaction" check in refs.c, see 67541597670 (refs:
implement reference transaction hook, 2020-06-19), and the
"prepare-commit-msg" hook, see 66618a50f9c (sequencer: run
'prepare-commit-msg' hook, 2018-01-24).
In both of those cases we're saving ourselves CPU time by not
preparing data for the hook that we'll then do nothing with if we
don't have the hook. So using this "invoked_hook" pattern doesn't make
sense in those cases.
The "reference-transaction" and "prepare-commit-msg" hook also aren't
racy. In those cases we'll skip the hook runs if we race with a new
hook being added, whereas in the TOCTOU races being fixed here we were
incorrectly skipping the required post-hook logic.
1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change the "nr" and "alloc" members of "struct string_list" to use
"size_t" instead of "nr". On some platforms the size of an "unsigned
int" will be smaller than a "size_t", e.g. a 32 bit unsigned v.s. 64
bit unsigned. As "struct string_list" is a generic API we use in a lot
of places this might cause overflows.
As one example: code in "refs.c" keeps track of the number of refs
with a "size_t", and auxiliary code in builtin/remote.c in
get_ref_states() appends those to a "struct string_list".
While we're at it split the "nr" and "alloc" in string-list.h across
two lines, which is the case for most such struct member
declarations (e.g. in "strbuf.h" and "strvec.h").
Changing e.g. "int i" to "size_t i" in run_and_feed_hook() isn't
strictly necessary, and there are a lot more cases where we'll use a
local "int", "unsigned int" etc. variable derived from the "nr" in the
"struct string_list". But in that case as well as
add_wrapped_shortlog_msg() in builtin/shortlog.c we need to adjust the
printf format referring to "nr" anyway, so let's also change the other
variables referring to it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change the write_object_file() function to take an "enum object_type"
instead of a "const char *type". Its callers either passed
{commit,tree,blob,tag}_type and can pass the corresponding OBJ_* type
instead, or were hardcoding strings like "blob".
This avoids the back & forth fragility where the callers of
write_object_file() would have the enum type, and convert it
themselves via type_name(). We do have to now do that conversion
ourselves before calling write_object_file_prepare(), but those
codepaths will be similarly adjusted in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"receive-pack" checks if it will do any ref updates (various
conditions could reject a push) before received objects are taken
out of the temporary directory used for quarantine purposes, so
that a push that is known-to-fail will not leave crufts that a
future "gc" needs to clean up.
* cb/clear-quarantine-early-on-all-ref-update-errors:
receive-pack: purge temporary data if no command is ready to run
|
|
More "config-based hooks".
* ab/config-based-hooks-2:
run-command: remove old run_hook_{le,ve}() hook API
receive-pack: convert push-to-checkout hook to hook.h
read-cache: convert post-index-change to use hook.h
commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
git-p4: use 'git hook' to run hooks
send-email: use 'git hook run' for 'sendemail-validate'
git hook run: add an --ignore-missing flag
hooks: convert worktree 'post-checkout' hook to hook library
hooks: convert non-worktree 'post-checkout' hook to hook library
merge: convert post-merge to use hook.h
am: convert applypatch-msg to use hook.h
rebase: convert pre-rebase to use hook.h
hook API: add a run_hooks_l() wrapper
am: convert {pre,post}-applypatch to use hook.h
gc: use hook library for pre-auto-gc hook
hook API: add a run_hooks() wrapper
hook: add 'run' subcommand
|
|
Code clean-up.
* jc/find-header:
receive-pack.c: consolidate find header logic
|
|
When pushing a hidden ref, e.g.:
$ git push origin HEAD:refs/hidden/foo
"receive-pack" will reject our request with an error message like this:
! [remote rejected] HEAD -> refs/hidden/foo (deny updating a hidden ref)
The remote side ("git-receive-pack") will not create the hidden ref as
expected, but the pack file sent by "git-send-pack" is left inside the
remote repository. I.e. the quarantine directory is not purged as it
should be.
Add a checkpoint before calling "tmp_objdir_migrate()" and after calling
the "pre-receive" hook to purge that temporary data in the quarantine
area when there is no command ready to run.
The reason we do not add the checkpoint before the "pre-receive" hook,
but after it, is that the "pre-receive" hook is called with a switch-off
"skip_broken" flag, and all commands, even broken ones, should be fed
by calling "feed_receive_hook()".
Add a new test case in t5516 as well.
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Helped-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Chen Bojun <bojun.cbj@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move the push-to-checkout hook away from run-command.h to and over to
the new hook.h library.
This removes the last direct user of run_hook_le(), so we could remove
that function now, but let's leave that to a follow-up cleanup commit.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are two functions that have very similar logic of finding a header
value. find_commit_header, and find_header. We can conslidate the logic
by introducing a new function find_header_mem, which is equivalent to
find_commit_header except it takes a len parameter that determines how
many bytes will be read. find_commit_header and find_header can then both
call find_header_mem.
This reduces duplicate logic, as the logic for finding header values
can now all live in one place.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
New interface into the tmp-objdir API to help in-core use of the
quarantine feature.
* ns/tmp-objdir:
tmp-objdir: disable ref updates when replacing the primary odb
tmp-objdir: new API for creating temporary writable databases
|
|
"git fetch" without the "--update-head-ok" option ought to protect
a checked out branch from getting updated, to prevent the working
tree that checks it out to go out of sync. The code was written
before the use of "git worktree" got widespread, and only checked
the branch that was checked out in the current worktree, which has
been updated.
(originally called ak/fetch-not-overwrite-any-current-branch)
* ak/protect-any-current-branch:
branch: protect branches checked out in all worktrees
receive-pack: protect current branch for bare repository worktree
receive-pack: clean dead code from update_worktree()
fetch: protect branches checked out in all worktrees
worktree: simplify find_shared_symref() memory ownership model
branch: lowercase error messages
receive-pack: lowercase error messages
fetch: lowercase error messages
|
|
Extend the signing of objects with SSH keys and learn to pay
attention to the key validity time range when verifying.
* fs/ssh-signing-key-lifetime:
ssh signing: verify ssh-keygen in test prereq
ssh signing: make fmt-merge-msg consider key lifetime
ssh signing: make verify-tag consider key lifetime
ssh signing: make git log verify key lifetime
ssh signing: make verify-commit consider key lifetime
ssh signing: add key lifetime test prereqs
ssh signing: use sigc struct to pass payload
t/fmt-merge-msg: make gpgssh tests more specific
t/fmt-merge-msg: do not redirect stderr
|
|
When the "git push" command is killed while the receiving end is
trying to report what happened to the ref update proposals, the
latter used to die, due to SIGPIPE. The code now ignores SIGPIPE
to increase our chances to run the post-receive hook after it
happens.
* rj/receive-pack-avoid-sigpipe-during-status-reporting:
receive-pack: ignore SIGPIPE while reporting status to client
|
|
To be able to extend the payload metadata with things like its creation
timestamp or the creators ident we remove the payload parameters to
check_signature() and use the already existing sigc->payload field
instead, only adding the length field to the struct. This also allows
us to get rid of the xmemdupz() calls in the verify functions. Since
sigc is now used to input data as well as output the result move it to
the front of the function list.
- Add payload_length to struct signature_check
- Populate sigc.payload/payload_len on all call sites
- Remove payload parameters to check_signature()
- Remove payload parameters to internal verify_* functions and use sigc
instead
- Remove xmemdupz() used for verbose output since payload is now already
populated.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it. The
subprocesses would view it as their primary datastore and write to it.
Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.
For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.
Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.
Based-on-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A bare repository won’t have a working tree at "..", but it may still
have separate working trees created with git worktree. We should protect
the current branch of such working trees from being updated or deleted,
according to receive.denyCurrentBranch.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
update_worktree() can only be called with a non-NULL worktree parameter,
because that’s the only case where we set do_update_worktree = 1.
worktree->path is always initialized to non-NULL.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Storing the worktrees list in a static variable meant that
find_shared_symref() had to rebuild the list on each call (which is
inefficient when the call site is in a loop), and also that each call
invalidated the pointer returned by the previous call (which is
confusing).
Instead, make it the caller’s responsibility to pass in the worktrees
list and manage its lifetime.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Documentation/CodingGuidelines says “do not end error messages with a
full stop” and “do not capitalize the first word”. Clean up existing
messages, some of which we will be touching in later steps in the
series, that deviate from these rules in this file, as a preparation for
the main part of the topic.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Remove the "env" member from "struct child_process" in favor of always
using the "env_array". As with the preceding removal of "argv" in
favor of "args" this gets rid of current and future oddities around
memory management at the API boundary (see the amended API docs).
For some of the conversions we can replace patterns like:
child.env = env->v;
With:
strvec_pushv(&child.env_array, env->v);
But for others we need to guard the strvec_pushv() with a NULL check,
since we're not passing in the "v" member of a "struct strvec",
e.g. in the case of tmp_objdir_env()'s return value.
Ideally we'd rename the "env_array" member to simply "env" as a
follow-up, since it and "args" are now inconsistent in not having an
"_array" suffix, and seemingly without any good reason, unless we look
at the history of how they came to be.
But as we've currently got 122 in-tree hits for a "git grep env_array"
let's leave that for now (and possibly forever). Doing that rename
would be too disruptive.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change a pattern of hardcoding an "argv" array size, populating it and
assigning to the "argv" member of "struct child_process" to instead
use "strvec_push()" to add data to the "args" member.
As noted in the preceding commit this moves us further towards being
able to remove the "argv" member in a subsequent commit
These callers could have used strvec_pushl(), but moving to
strvec_push() makes the diff easier to read, and keeps the arguments
aligned as before.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Change a pattern of hardcoding an "argv" array size, populating it and
assigning to the "argv" member of "struct child_process" to instead
use "strvec_pushl()" to add data to the "args" member.
This implements the same behavior as before in fewer lines of code,
and moves us further towards being able to remove the "argv" member in
a subsequent commit.
Since we've entirely removed the "argv" variable(s) we can be sure
that no potential logic errors of the type discussed in a preceding
commit are being introduced here, i.e. ones where the local "argv" was
being modified after the assignment to "struct child_process"'s
"argv".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before running the post-receive hook, status info is reported back to
the client. If a remote client exits before or during the status report,
receive-pack is killed by SIGPIPE and post-receive is never executed.
The post-receive hook is often used to send email notifications (see
contrib/hooks/post-receive-email), update bug trackers, start automatic
builds, etc. Not executing it after an interrupted yet "successful" push
can lead to inconsistencies.
Ignore SIGPIPE before reporting status to the client to increase the
chances of post-receive running if pre-receive was successful. This does
not guarantee 100% consistency but it should resist early disconnection
by the client.
Signed-off-by: Robin Jarry <robin@jarry.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use ssh public crypto for object and push-cert signing.
* fs/ssh-signing:
ssh signing: test that gpg fails for unknown keys
ssh signing: tests for logs, tags & push certs
ssh signing: duplicate t7510 tests for commits
ssh signing: verify signatures using ssh-keygen
ssh signing: provide a textual signing_key_id
ssh signing: retrieve a default key from ssh-agent
ssh signing: add ssh key format and signing code
ssh signing: add test prereqs
ssh signing: preliminary refactoring and clean-up
|
|
Use the new hook_exists() function instead of find_hook() where the
latter was called in boolean contexts. This make subsequent changes in
a series where we further refactor the hook API clearer, as we won't
conflate wanting to get the path of the hook with checking for its
existence.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move the find_hook() function from run-command.c to a new hook.c
library. This change establishes a stub library that's pretty
pointless right now, but will see much wider use with Emily Shaffer's
upcoming "configuration-based hooks" series.
Eventually all the hook related code will live in hook.[ch]. Let's
start that process by moving the simple find_hook() function over
as-is.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The run-command API has been updated so that the callers can easily
ask the file descriptors open for packfiles to be closed immediately
before spawning commands that may trigger auto-gc.
* js/run-command-close-packs:
Close object store closer to spawning child processes
run_auto_maintenance(): implicitly close the object store
run-command: offer to close the object store before running
run-command: prettify the `RUN_COMMAND_*` flags
pull: release packs before fetching
commit-graph: when closing the graph, also release the slab
|
|
Code clean-up around "git serve".
* ab/serve-cleanup:
upload-pack: document and rename --advertise-refs
serve.[ch]: remove "serve_options", split up --advertise-refs code
{upload,receive}-pack tests: add --advertise-refs tests
serve.c: move version line to advertise_capabilities()
serve: move transfer.advertiseSID check into session_id_advertise()
serve.[ch]: don't pass "struct strvec *keys" to commands
serve: use designated initializers
transport: use designated initializers
transport: rename "fetch" in transport_vtable to "fetch_refs"
serve: mark has_capability() as static
|
|
To verify a ssh signature we first call ssh-keygen -Y find-principal to
look up the signing principal by their public key from the
allowedSignersFile. If the key is found then we do a verify. Otherwise
we only validate the signature but can not verify the signers identity.
Verification uses the gpg.ssh.allowedSignersFile (see ssh-keygen(1) "ALLOWED
SIGNERS") which contains valid public keys and a principal (usually
user@domain). Depending on the environment this file can be managed by
the individual developer or for example generated by the central
repository server from known ssh keys with push access. This file is usually
stored outside the repository, but if the repository only allows signed
commits/pushes, the user might choose to store it in the repository.
To revoke a key put the public key without the principal prefix into
gpg.ssh.revocationKeyring or generate a KRL (see ssh-keygen(1)
"KEY REVOCATION LISTS"). The same considerations about who to trust for
verification as with the allowedSignersFile apply.
Using SSH CA Keys with these files is also possible. Add
"cert-authority" as key option between the principal and the key to mark
it as a CA and all keys signed by it as valid for this CA.
See "CERTIFICATES" in ssh-keygen(1).
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In many cases where we spawned child processes that _may_ trigger a
repack, we explicitly closed the object store first (so that the
`repack` process can delete the `.pack` files, which would otherwise not
be possible on Windows since files cannot be deleted as long as they as
still in use).
Wherever possible, we now use the new `close_object_store` bit of the
`run_command()` API, to delay closing the object store even further.
This makes the code easier to maintain because it is now more obvious
that we only release those file handles because of those child
processes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.
Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:
Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s]
Range (min … max): 29.934 s … 30.406 s 10 runs
Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s]
Range (min … max): 29.696 s … 29.996 s 10 runs
Summary
'328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'
While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The --advertise-refs documentation in git-upload-pack added in
9812f2136b3 (upload-pack.c: use parse-options API, 2016-05-31) hasn't
been entirely true ever since v2 support was implemented in
e52449b6722 (connect: request remote refs using v2, 2018-03-15). Under
v2 we don't advertise the refs at all, but rather dump the
capabilities header.
This option has always been an obscure internal implementation detail,
it wasn't even documented for git-receive-pack. Since it has exactly
one user let's rename it to --http-backend-info-refs, which is more
accurate and points the reader in the right direction. Let's also
cross-link this from the protocol v1 and v2 documentation.
I'm retaining a hidden --advertise-refs alias in case there's any
external users of this, and making both options hidden to the bash
completion (as with most other internal-only options).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
9cf6d3357aa (Add git-index-pack utility, 2005-10-12) and
466dbc42f58 (receive-pack: Send internal errors over side-band #2,
2010-02-10) we added these static functions and forward-declared their
__attribute__((printf)).
I think this may have been to work around some compiler limitation at
the time, but in any case we have a lot of code that uses the briefer
way of declaring these that I'm using here, so if we had any such
issues with compilers we'd have seen them already.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
CALLOC_ARRAY() macro replaces many uses of xcalloc().
* rs/calloc-array:
cocci: allow xcalloc(1, size)
use CALLOC_ARRAY
git-compat-util.h: drop trailing semicolon from macro definition
|
|
Code clean-up.
* jc/calloc-fix:
xcalloc: use CALLOC_ARRAY() when applicable
|
|
These are for codebase before Git 2.31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead. It shortens the code and infers the
element size automatically.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example. Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.
* jt/transfer-fsck-across-packs:
fetch-pack: print and use dangling .gitmodules
fetch-pack: with packfile URIs, use index-pack arg
http-fetch: allow custom index-pack args
http: allow custom index-pack args
|
|
Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.
This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have a function which parses a buffer with a signature at the end,
parse_signature, and this function is used for signed tags. However,
we'll need to store values for multiple algorithms, and we'll do this by
using a header for the non-default algorithm.
Adjust the parse_signature interface to store the parsed data in two
strbufs and turn the existing function into parse_signed_buffer. The
latter is still used in places where we know we always have a signed
buffer, such as push certs.
Adjust all the callers to deal with this new interface.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The transport layer was taught to optionally exchange the session
ID assigned by the trace2 subsystem during fetch/push transactions.
* js/trace2-session-id:
receive-pack: log received client session ID
send-pack: advertise session ID in capabilities
upload-pack, serve: log received client session ID
fetch-pack: advertise session ID in capabilities
transport: log received server session ID
serve: advertise session ID in v2 capabilities
receive-pack: advertise session ID in v0 capabilities
upload-pack: advertise session ID in v0 capabilities
trace2: add a public function for getting the SID
docs: new transfer.advertiseSID option
docs: new capability to advertise session IDs
|
|
When receive-pack receives a session-id capability from the client, log
the received session ID via a trace2 data event.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When transfer.advertiseSID is true, advertise receive-pack's session ID
via the new session-id capability.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the verison negotiation phase between "receive-pack" and
"proc-receive", "proc-receive" can send an empty flush-pkt to end the
negotiation and use default version 0. Capabilities (such as
"push-options") are not supported in version 0.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the
osx-clang job of the CI/PR builds, and ran into an issue when using
the `--stress` option with the following error messages:
fatal: unable to write flush packet: Broken pipe
send-pack: unexpected disconnect while reading sideband packet
fatal: the remote end hung up unexpectedly
In this test case, the "proc-receive" hook sends an error message and
dies earlier. While "receive-pack" on the other side of the pipe
should forward the error message of the "proc-receive" hook to the
client side, but it fails to do so. This is because "receive-pack"
uses `packet_write_fmt()` and `packet_flush()` to write pkt-line
message to "proc-receive" hook, and these functions die immediately
when pipe is broken. Using "gently" forms for these functions will get
more predicable output.
Add more "--die-*" options to test helper to test different stages of
the protocol between "receive-pack" and "proc-receive" hook.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git receive-pack" that accepts requests by "git push" learned to
outsource most of the ref updates to the new "proc-receive" hook.
* jx/proc-receive-hook:
doc: add documentation for the proc-receive hook
transport: parse report options for tracking refs
t5411: test updates of remote-tracking branches
receive-pack: new config receive.procReceiveRefs
doc: add document for capability report-status-v2
New capability "report-status-v2" for git-push
receive-pack: feed report options to post-receive
receive-pack: add new proc-receive hook
t5411: add basic test cases for proc-receive hook
transport: not report a non-head push as a branch
|
|
Add a new multi-valued config variable "receive.procReceiveRefs"
for `receive-pack` command, like the follows:
git config --system --add receive.procReceiveRefs refs/for
git config --system --add receive.procReceiveRefs refs/drafts
If the specific prefix strings given by the config variables match the
reference names of the commands which are sent from git client to
`receive-pack`, these commands will be executed by an external hook
(named "proc-receive"), instead of the internal `execute_commands`
function.
For example, if it is set to "refs/for", pushing to a reference such as
"refs/for/master" will not create or update reference "refs/for/master",
but may create or update a pull request directly by running the hook
"proc-receive".
Optional modifiers can be provided in the beginning of the value to
filter commands for specific actions: create (a), modify (m),
delete (d). A `!` can be included in the modifiers to negate the
reference prefix entry. E.g.:
git config --system --add receive.procReceiveRefs ad:refs/heads
git config --system --add receive.procReceiveRefs !:refs/heads
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid). Current "report-status" protocol cannot report the status for
such reference rewrite.
Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.
If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:
ok ref/for/master/topic
But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.
ok refs/for/master/topic
option refname refs/changes/23/123/1
option new-oid <new-oid>
ok refs/for/master/topic
option refname refs/changes/24/124/1
option new-oid <new-oid>
The client will report two new created references to the end user.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`. Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission. A contributor who has no
write-permission, cannot push to the repository directly. So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways. We call this workflow as a
distributed workflow.
It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases. For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.
git push origin \
HEAD:refs/for/<branch-name>/<session>
The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar". The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".
We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks. Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.
So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command. Commands with this filed
turned on will be executed by a new handler (a hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.
Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format. In the following example, the letter "S" stands for
"receive-pack" and letter "H" stands for the hook.
# Version and features negotiation.
S: PKT-LINE(version=1\0push-options atomic...)
S: flush-pkt
H: PKT-LINE(version=1\0push-options...)
H: flush-pkt
# Send commands from server to the hook.
S: PKT-LINE(<old-oid> <new-oid> <ref>)
S: ... ...
S: flush-pkt
# Send push-options only if the 'push-options' feature is enabled.
S: PKT-LINE(push-option)
S: ... ...
S: flush-pkt
# Receive result from the hook.
# OK, run this command successfully.
H: PKT-LINE(ok <ref>)
# NO, I reject it.
H: PKT-LINE(ng <ref> <reason>)
# Fall through, let 'receive-pack' to execute it.
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option fall-through)
# OK, but has an alternate reference. The alternate reference name
# and other status can be given in options
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option refname <refname>)
H: PKT-LINE(option old-oid <old-oid>)
H: PKT-LINE(option new-oid <new-oid>)
H: PKT-LINE(option forced-update)
H: ... ...
H: flush-pkt
After receiving a command, the hook will execute the command, and may
create/update different reference. For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head". The alternate reference name and other
status are given in option lines.
The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines. Finally, the result of the execution of these commands
will be reported to end user.
The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").
Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code which split an argv_array call across multiple lines, like:
argv_array_pushl(&args, "one argument",
"another argument", "and more",
NULL);
was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:
strvec_pushl(&args, "one argument",
"another argument", "and more",
NULL);
Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:
git jump grep 'strvec_.*,$'
and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).
This patch converts all of the files in builtin/ to keep the diff to a
manageable size.
The conversion was done purely mechanically with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe '
s/ARGV_ARRAY/STRVEC/g;
s/argv_array/strvec/g;
'
and then selectively staging files with "git add builtin/". We'll deal
with any indentation/style fallouts separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This requires updating #include lines across the code-base, but that's
all fairly mechanical, and was done with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe 's/argv-array.h/strvec.h/'
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
|
|
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Advertise the current hash algorithm in use by using the object-format
capability as part of the ref advertisement.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code cleanup.
* tb/shallow-cleanup:
shallow: use struct 'shallow_lock' for additional safety
shallow.h: document '{commit,rollback}_shallow_file'
shallow: extract a header file for shallow-related functions
commit: make 'commit_graft_pos' non-static
|
|
The <stdlib.h> header on NetBSD brings in its own definition of
hmac() function (eek), which conflicts with our own and unrelated
function with the same name. Our function has been renamed to work
around the issue.
* cb/avoid-colliding-with-netbsd-hmac:
builtin/receive-pack: avoid generic function name hmac()
|
|
fabec2c5c3 (builtin/receive-pack: switch to use the_hash_algo, 2019-08-18)
renames hmac_sha1 to hmac, as it was updated to use the hash function used
by git (which won't be sha1 in the future).
hmac() is provided by NetBSD >= 8 libc and therefore conflicts as shown by :
builtin/receive-pack.c:421:13: error: conflicting types for 'hmac'
static void hmac(unsigned char *out,
^~~~
In file included from ./git-compat-util.h:172:0,
from ./builtin.h:4,
from builtin/receive-pack.c:1:
/usr/include/stdlib.h:305:10: note: previous declaration of 'hmac' was here
ssize_t hmac(const char *, const void *, size_t, const void *, size_t, void *,
^~~~
Rename it again to hmac_hash to reflect it will use the git's defined hash
function and avoid the conflict, while at it update a comment to better
describe the HMAC function that was used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
|
|
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Validation of push certificate has been made more robust against
timing attacks.
* bc/constant-memequal:
receive-pack: compilation fix
builtin/receive-pack: use constant-time comparison for HMAC value
|
|
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least shallow refs
that have at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We do not use C99 "for loop initial declaration" in our codebase
(yet), but one snuck in.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we're comparing a push cert nonce, we currently do so using strcmp.
Most implementations of strcmp short-circuit and exit as soon as they
know whether two values are equal. This, however, is a problem when
we're comparing the output of HMAC, as it leaks information in the time
taken about how much of the two values match if they do indeed differ.
In our case, the nonce is used to prevent replay attacks against our
server via the embedded timestamp and replay attacks using requests from
a different server via the HMAC. Push certs, which contain the nonces,
are signed, so an attacker cannot tamper with the nonces without
breaking validation of the signature. They can, of course, create their
own signatures with invalid nonces, but they can also create their own
signatures with valid nonces, so there's nothing to be gained. Thus,
there is no security problem.
Even though it doesn't appear that there are any negative consequences
from the current technique, for safety and to encourage good practices,
let's use a constant time comparison function for nonce verification.
POSIX does not provide one, but they are easy to write.
The technique we use here is also used in NaCl and the Go standard
library and relies on the fact that bitwise or and xor are constant time
on all known architectures.
We need not be concerned about exiting early if the actual and expected
lengths differ, since the standard cryptographic assumption is that
everyone, including an attacker, knows the format of and algorithm used
in our nonces (and in any event, they have the source code and can
determine it easily). As a result, we assume everyone knows how long
our nonces should be. This philosophy is also taken by the Go standard
library and other cryptographic libraries when performing constant time
comparisons on HMAC values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The receive.denyCurrentBranch config option controls what happens if
you push to a branch that is checked out into a non-bare repository.
By default, it rejects it. It can be disabled via `ignore` or `warn`.
Another yet trickier option is `updateInstead`.
However, this setting was forgotten when the git worktree command was
introduced: only the main worktree's current branch is respected.
With this change, all worktrees are respected.
That change also leads to revealing another bug,
i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a
non-bare repository's unborn current branch using ref namespaces. As
`is_ref_checked_out()` returns 0 which means `receive-pack` does not get
into conditional statement to switch `deny_current_branch` accordingly
(ignore, warn, refuse, unconfigured, updateInstead).
receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()`
(called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but
that function fails when HEAD does not point at a valid commit.
As we replace the call to `refs_resolve_ref_unsafe()` with
`find_shared_symref()`, which has no problem finding the worktree for a
given branch even if it is unborn yet, this bug is fixed at the same
time: receive.denyCurrentBranch now also handles worktrees with unborn
branches as intended even while using ref namespaces.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex.
Rename several variables to indicate that they can contain any hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The push cert code uses HMAC-SHA-1 to create a nonce. This is a secure
use of SHA-1 which is not affected by its collision resistance (or lack
thereof). However, it makes sense for us to use a better algorithm if
one is available, one which may even be more performant. Futhermore,
until we have specialized functions for computing the hex value of an
arbitrary function, it simplifies the code greatly to use the same hash
algorithm everywhere.
Switch this code to use GIT_MAX_BLKSZ and the_hash_algo for computing
the push cert nonce, and rename the hmac_sha1 function to simply "hmac".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The tips of refs from the alternate object store can be used as
starting point for reachability computation now.
* jk/check-connected-with-alternates:
check_everything_connected: assume alternate ref tips are valid
object-store.h: move for_each_alternate_ref() from transport.h
|
|
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.
* ds/close-object-store:
packfile: rename close_all_packs to close_object_store
packfile: close commit-graph in close_all_packs
commit-graph: use raw_object_store when closing
|
|
There's nothing inherently transport-related about enumerating the
alternate ref tips. The code has lived in transport.[ch] because the
only use so far had been advertising available tips during transport.
But it could be used for more, and a future patch will teach rev-list to
access these refs.
Let's move it alongside the other alt-odb code, declaring it in
object-store.h with the implementation in sha1-file.c.
This lets us drop the inclusion of transport.h from receive-pack, which
perhaps shows how it was misplaced (though receive-pack is about
transporting objects, transport.h is mostly about the client side).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The close_all_packs() method is now responsible for more than just pack-files.
It also closes the commit-graph and the multi-pack-index. Rename the function
to be more descriptive of its larger role. The name also fits because the
input parameter is a raw_object_store.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|