aboutsummaryrefslogtreecommitdiffstats
path: root/shallow.c
AgeCommit message (Collapse)AuthorFilesLines
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt1-12/+4
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22treewide: pass strvecs around for setup_revisions_from_strvec()Jeff King1-2/+2
The previous commit converted callers of setup_revisions() with a strvec to use the safer and easier _from_strvec() variant. Let's now convert spots that don't directly have a strvec, but receive an argc/argv pair that eventually comes from one. We'll instead pass the strvec down to the point where we call setup_revisions(). That makes these functions slightly less flexible if they were to grow other callers that don't use strvecs, but this rigidity is buying us some safety. It is only safe to pass the free_removed_argv_elements option to setup_revisions() if we know the elements of argv/argc are allocated on the heap. That isn't communicated in the type system when we are passed the bare elements. But if we get a strvec, we know that the elements are allocated strings. And at any rate, each of these modified functions has only a single caller (that has a strvec), so the loss of flexibility is unlikely to ever matter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt1-6/+6
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29treewide: convert users of `repo_has_object_file()` to `has_object()`Patrick Steinhardt1-3/+6
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement: - The new function has a short-and-sweet name. - More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions. Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`: - `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`. The replacements should be functionally equivalent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10object: stop depending on `the_repository`Patrick Steinhardt1-5/+5
There are a couple of functions exposed by "object.c" that implicitly depend on `the_repository`. Remove this dependency by injecting the repository via a parameter. Adapt callers accordingly by simply using `the_repository`, except in cases where the subsystem is already free of the repository. In that case, we instead pass the repository provided by the caller's context. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28path: drop `git_path()` in favor of `repo_git_path()`Patrick Steinhardt1-1/+3
Remove `git_path()` in favor of the `repo_git_path()` family of functions, which makes the implicit dependency on `the_repository` go away. Note that `git_path()` returned a string allocated via `get_pathname()`, which uses a rotating set of statically allocated buffers. Consequently, callers didn't have to free the returned string. The same isn't true for `repo_common_path()`, so we also have to add logic to free the returned strings. This refactoring also allows us to remove `repo_common_pathv()` as well as `get_pathname()` from the public interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27sign-compare: avoid comparing ptrdiff with an int/unsignedJunio C Hamano1-1/+1
Instead, offset the base pointer with integer and compare it with the other pointer. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27shallow: fix -Wsign-compare warningsPatrick Steinhardt1-20/+18
Fix a couple of -Wsign-compare issues in "shallow.c" and mark the file as -Wsign-compare-clean. This change prepares the code for a refactoring of `repo_in_merge_bases_many()`, which will be adapted to accept the number of commits as `size_t` instead of `int`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25shallow: fix leak when unregistering last shallow rootPatrick Steinhardt1-3/+2
When unregistering a shallow root we shrink the array of grafts by one and move remaining grafts one to the left. This can of course only happen when there are any grafts left, because otherwise there is nothing to move. As such, this code is guarded by a condition that only performs the move in case there are grafts after the position of the graft to be unregistered. By mistake we also put the call to free the unregistered graft into that condition. But that doesn't make any sense, as we want to always free the graft when it exists. Fix the resulting memory leak by doing so. This leak is exposed by t5500, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05shallow: fix leaking members of `struct shallow_info`Patrick Steinhardt1-0/+9
We do not free several struct members in `clear_shallow_info()`. Fix this to plug the resulting leaks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05shallow: free grafts when unregistering themPatrick Steinhardt1-1/+3
When removing a graft via `unregister_shallow()` we remove it from the grafts array, but do not free the structure. Fix this to plug the leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05object: clear grafts when clearing parsed object poolPatrick Steinhardt1-1/+1
We do not clear grafts part of the parsed object pool when clearing the pool itself, which can lead to memory leaks when a repository is being cleared. Fix this by moving `reset_commit_grafts()` into "object.c" and making it part of the `struct parsed_object_pool` interface such that we can call it from `parsed_object_pool_clear()`. Adapt `parsed_object_pool_new()` to take and store a reference to its owning repository, which is needed by `unparse_commit()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09refs: add referent to each_ref_fnJohn Cai1-0/+2
Add a parameter to each_ref_fn so that callers to the ref APIs that use this function as a callback can have acess to the unresolved value of a symbolic ref. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt1-0/+2
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07cocci: apply rules to rewrite callers of "refs" interfacesPatrick Steinhardt1-6/+10
Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28commit-reach(repo_in_merge_bases_many): report missing commitsJohannes Schindelin1-6/+12
Some functions in Git's source code follow the convention that returning a negative value indicates a fatal error, e.g. repository corruption. Let's use this convention in `repo_in_merge_bases()` to report when one of the specified commits is missing (i.e. when `repo_parse_commit()` reports an error). Also adjust the callers of `repo_in_merge_bases()` to handle such negative return values. Note: As of this patch, errors are returned only if any of the specified merge heads is missing. Over the course of the next patches, missing commits will also be reported by the `paint_down_to_common()` function, which is called by `repo_in_merge_bases_many()`, and those errors will be properly propagated back to the caller at that stage. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28commit-reach(repo_in_merge_bases_many): optionally expect missing commitsJohannes Schindelin1-2/+3
Currently this function treats unrelated commit histories the same way as commit histories with missing commit objects. Typically, missing commit objects constitute a corrupt repository, though, and should be reported as such. The next commits will make it so, but there is one exception: In `git fetch --update-shallow` we _expect_ commit objects to be missing, and we do want to treat the now-incomplete commit histories as unrelated. To allow for that, let's introduce an additional parameter that is passed to `repo_in_merge_bases_many()` to trigger this behavior, and use it in the two callers in `shallow.c`. This commit changes behavior slightly: unless called from the `shallow.c` functions that set the `ignore_missing_commits` bit, any non-existing tip commit that is passed to `repo_in_merge_bases_many()` will now result in an error. Note: When encountering missing commits while traversing the commit history in search for merge bases, with this commit there won't be a change in behavior just yet, their children will still be interpreted as root commits. This bug will get fixed by follow-up commits. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-1/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07shallow: fix memory leak when registering shallow rootsPatrick Steinhardt1-1/+3
When registering shallow roots, we unset the list of parents of the to-be-registered commit if it's already been parsed. This causes us to leak memory though because we never free this list. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan1-1/+0
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren1-0/+1
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren1-1/+1
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21statinfo: move stat_{data,validity} functions from cache/read-cacheElijah Newren1-0/+1
These functions do not depend upon struct cache_entry or struct index_state in any way, and it seems more logical to break them out into this file, especially since statinfo.h already has the struct stat_data declaration. Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on trace.h & trace2.hElijah Newren1-0/+1
Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano1-10/+11
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28libs: use "struct repository *" argument, not "the_repository"Ævar Arnfjörð Bjarmason1-1/+1
As can easily be seen from grepping in our sources, we had these uses of "the_repository" in various library code in cases where the function in question was already getting a "struct repository *" argument. Let's use that argument instead. Out of these changes only the changes to "cache-tree.c", "commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly applied before the migration away from the "repo_*()" wrapper macros in the preceding commits. The rest aren't new, as we'd previously implicitly refer to "the_repository", but it's now more obvious that we were doing the wrong thing all along, and should have used the parameter instead. The change to change "get_index_format_default(the_repository)" in "read-cache.c" to use the "r" variable instead should arguably have been part of [1], or in the subsequent cleanup in [2]. Let's do it here, as can be seen from the initial code in [3] it's not important that we use "the_repository" there, but would prefer to always use the current repository. This change excludes the "the_repository" use in "upload-pack.c"'s upload_pack_advertise(), as the in-flight [4] makes that change. 1. ee1f0c242ef (read-cache: add index.skipHash config option, 2023-01-06) 2. 6269f8eaad0 (treewide: always have a valid "index_state.repo" member, 2023-01-17) 3. 7211b9e7534 (repo-settings: consolidate some config settings, 2019-08-13) 4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "object-store.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-3/+3
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "commit.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-2/+2
Apply the part of "the_repository.pending.cocci" pertaining to "commit.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "commit-reach.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-4/+5
Apply the part of "the_repository.pending.cocci" pertaining to "commit-reach.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21wrapper.h: move declarations for wrapper.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: remove unnecessary cache.h inclusion from a few headersElijah Newren1-1/+1
Ever since a64215b6cd ("object.h: stop depending on cache.h; make cache.h depend on object.h", 2023-02-24), we have a few headers that could have replaced their include of cache.h with an include of object.h. Make that change now. Some C files had to start including cache.h after this change (or some smaller header it had brought in), because the C files were depending on things from cache.h but were only formerly implicitly getting cache.h through one of these headers being modified in this patch. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren1-1/+2
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-5/+5
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19refs: mark unused each_ref_fn parametersJeff King1-4/+8
Functions used with for_each_ref(), etc, need to conform to the each_ref_fn interface. But most of them don't need every parameter; let's annotate the unused ones to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-13Merge branch 'jt/unparse-commit-upon-graft-change'Junio C Hamano1-0/+7
Updating the graft information invalidates the list of parents of in-core commit objects that used to be in the graft file. * jt/unparse-commit-upon-graft-change: commit,shallow: unparse commits if grafts changed
2022-06-07Merge branch 'ab/plug-leak-in-revisions'Junio C Hamano1-0/+1
Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...
2022-06-06commit,shallow: unparse commits if grafts changedJonathan Tan1-0/+7
When a commit is parsed, it pretends to have a different (possibly empty) list of parents if there is graft information for that commit. But there is a bug that could occur when a commit is parsed, the graft information is updated (for example, when a shallow file is rewritten), and the same commit is subsequently used: the parents of the commit do not conform to the updated graft information, but the information at the time of parsing. This is usually not an issue, as a commit is usually introduced into the repository at the same time as its graft information. That means that when we try to parse that commit, we already have its graft information. But it is an issue when fetching a shallow point directly into a repository with submodules. The function assign_shallow_commits_to_refs() parses all sought objects (including the shallow point, which we are directly fetching). In update_shallow() in fetch-pack.c, assign_shallow_commits_to_refs() is called before commit_shallow_file(), which means that the shallow point would have been parsed before graft information is updated. Once a commit is parsed, it is no longer sensitive to any graft information updates. This parsed commit is subsequently used when we do a revision walk to search for submodules to fetch, meaning that the commit is considered to have parents even though it is a shallow point (and therefore should be treated as having no parents). Therefore, whenever graft information is updated, mark the commits that were previously grafts and the commits that are newly grafts as unparsed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-20Merge branch 'ep/maint-equals-null-cocci'Junio C Hamano1-1/+1
Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13revisions API users: add straightforward release_revisions()Ævar Arnfjörð Bjarmason1-0/+1
Add a release_revisions() to various users of "struct rev_list" in those straightforward cases where we only need to add the release_revisions() call to the end of a block, and don't need to e.g. refactor anything to use a "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17shallow: reset commit grafts when shallow is resetJonathan Tan1-0/+1
When reset_repository_shallow() is called, Git clears its cache of shallow information, so that if shallow information is re-requested, Git will read fresh data from disk instead of reusing its stale cached data. However, the cache of commit grafts is not likewise cleared, even though there are commit grafts created from shallow information. This means that if on-disk shallow information were to be updated and then a commit-graft-using codepath were run (for example, a revision walk), Git would be using stale commit graft information. This can be seen from the test in this patch, in which Git performs a revision walk (to check for changed submodules) after a fetch with --update-shallow. Therefore, clear the cache of commit grafts whenever reset_repository_shallow() is called. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-12git-rev-list: add --exclude-first-parent-only flagJerry Zhang1-1/+1
It is useful to know when a branch first diverged in history from some integration branch in order to be able to enumerate the user's local changes. However, these local changes can include arbitrary merges, so it is necessary to ignore this merge structure when finding the divergence point. In order to do this, teach the "rev-list" family to accept "--exclude-first-parent-only", which restricts the traversal of excluded commits to only follow first parent links. -A-----E-F-G--main \ / / B-C-D--topic In this example, the goal is to return the set {B, C, D} which represents a topic branch that has been merged into main branch. `git rev-list topic ^main` will end up returning no commits since excluding main will end up traversing the commits on topic as well. `git rev-list --exclude-first-parent-only topic ^main` however will return {B, C, D} as desired. Add docs for the new flag, and clarify the doc for --first-parent to indicate that it applies to traversing the set of included commits only. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28commit_graft_pos(): take an oid instead of a bare hashJeff King1-1/+1
All of our callers have an object_id, and are just dereferencing the hash field to pass to us. Let's take the actual object_id instead. We still access the hash to pass to hash_pos, but it's a step in the right direction. This makes the callers slightly simpler, but also gets rid of the untyped pointer, as well as the now-inaccurate name "sha1". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30Merge branch 'sg/commit-graph-cleanups' into masterJunio C Hamano1-9/+5
The changed-path Bloom filter is improved using ideas from an independent implementation. * sg/commit-graph-cleanups: commit-graph: simplify write_commit_graph_file() #2 commit-graph: simplify write_commit_graph_file() #1 commit-graph: simplify parse_commit_graph() #2 commit-graph: simplify parse_commit_graph() #1 commit-graph: clean up #includes diff.h: drop diff_tree_oid() & friends' return value commit-slab: add a function to deep free entries on the slab commit-graph-format.txt: all multi-byte numbers are in network byte order commit-graph: fix parsing the Chunk Lookup table tree-walk.c: don't match submodule entries for 'submod/anything'
2020-06-08commit-slab: add a function to deep free entries on the slabSZEDER Gábor1-9/+5
clear_##slabname() frees only the memory allocated for a commit slab itself, but entries in the commit slab might own additional memory outside the slab that should be freed as well. We already have (at least) one such commit slab, and this patch series is about to add one more. To free all additional memory owned by entries on the commit slab the user of such a slab could iterate over all commits it knows about, peek whether there is a valid entry associated with each commit, and free the additional memory, if any. Or it could rely on intimate knowledge about the internals of the commit slab implementation, and could itself iterate directly through all entries in the slab, and free the additional memory. Or it could just leak the additional memory... Introduce deep_clear_##slabname() to allow releasing memory owned by commit slab entries by invoking the 'void free_fn(elemtype *ptr)' function specified as parameter for each entry in the slab. Use it in get_shallow_commits() in 'shallow.c' to replace an open-coded iteration over a commit slab's entries. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-13Merge branch 'tb/shallow-cleanup'Junio C Hamano1-11/+25
Code cleanup. * tb/shallow-cleanup: shallow: use struct 'shallow_lock' for additional safety shallow.h: document '{commit,rollback}_shallow_file' shallow: extract a header file for shallow-related functions commit: make 'commit_graft_pos' non-static
2020-05-01Merge branch 'tb/reset-shallow'Junio C Hamano1-9/+21
Fix in-core inconsistency after fetching into a shallow repository that broke the code to write out commit-graph. * tb/reset-shallow: shallow.c: use '{commit,rollback}_shallow_file' t5537: use test_write_lines and indented heredocs for readability
2020-04-30shallow: use struct 'shallow_lock' for additional safetyTaylor Blau1-11/+11
In previous patches, the functions 'commit_shallow_file' and 'rollback_shallow_file' were introduced to reset the shallowness validity checks on a repository after potentially modifying '.git/shallow'. These functions can be made safer by wrapping the 'struct lockfile *' in a new type, 'shallow_lock', so that they cannot be called with a raw lock (and potentially misused by other code that happens to possess a lockfile, but has nothing to do with shallowness). This patch introduces that type as a thin wrapper around 'struct lockfile', and updates the two aforementioned functions and their callers to use it. Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-30shallow: extract a header file for shallow-related functionsTaylor Blau1-0/+14
There are many functions in commit.h that are more related to shallow repositories than they are to any sort of generic commit machinery. Likely this began when there were only a few shallow-related functions, and commit.h seemed a reasonable enough place to put them. But, now there are a good number of shallow-related functions, and placing them all in 'commit.h' doesn't make sense. This patch extracts a 'shallow.h', which takes all of the declarations from 'commit.h' for functions which already exist in 'shallow.c'. We will bring the remaining shallow-related functions defined in 'commit.c' in a subsequent patch. For now, move only the ones that already are implemented in 'shallow.c', and update the necessary includes. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-24shallow.c: use '{commit,rollback}_shallow_file'Taylor Blau1-9/+21
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily, 2019-01-10), the author noted that 'is_repository_shallow' produces visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'. This is a problem for e.g., fetching with '--update-shallow' in a shallow repository with 'fetch.writeCommitGraph' enabled, since the update to '.git/shallow' will cause Git to think that the repository isn't shallow when it is, thereby circumventing the commit-graph compatibility check. This causes problems in shallow repositories with at least shallow refs that have at least one ancestor (since the client won't have those objects, and therefore can't take the reachability closure over commits when writing a commit-graph). Address this by introducing thin wrappers over 'commit_lock_file' and 'rollback_lock_file' for use specifically when the lock is held over '.git/shallow'. These wrappers (appropriately called 'commit_shallow_file' and 'rollback_shallow_file') call into their respective functions in 'lockfile.h', but additionally reset validity checks used by the shallow machinery. Replace each instance of 'commit_lock_file' and 'rollback_lock_file' with 'commit_shallow_file' and 'rollback_shallow_file' when the lock being held is over the '.git/shallow' file. As a result, 'prune_shallow' can now only be called once (since 'check_shallow_file_for_update' will die after calling 'reset_repository_shallow'). But, this is OK since we only call 'prune_shallow' at most once per process. Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30oid_array: rename source file from sha1-arrayJeff King1-1/+1
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-11Merge branch 'rs/dedup-includes'Junio C Hamano1-3/+0
Code cleanup. * rs/dedup-includes: treewide: remove duplicate #include directives
2019-10-09Merge branch 'as/shallow-slab-use-fix'Junio C Hamano1-0/+2
Correct code that tried to reference all entries in a sparse array of pointers by mistake. * as/shallow-slab-use-fix: shallow.c: don't free unallocated slabs
2019-10-04treewide: remove duplicate #include directivesRené Scharfe1-3/+0
Found with "git grep '^#include ' '*.c' | sort | uniq -d". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-02shallow.c: don't free unallocated slabsAli Utku Selen1-0/+2
Fix possible segfault when cloning a submodule shallow. Signed-off-by: Ali Utku Selen <auselen@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27Use the right 'struct repository' instead of the_repositoryNguyễn Thái Ngọc Duy1-1/+2
There are a couple of places where 'struct repository' is already passed around, but the_repository is still used. Use the right repo. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-10fetch-pack: do not take shallow lock unnecessarilyJonathan Tan1-0/+7
When fetching using protocol v2, the remote may send a "shallow-info" section if the client is shallow. If so, Git as the client currently takes the shallow file lock, even if the "shallow-info" section is empty. This is not a problem except that Git does not support taking the shallow file lock after modifying the shallow file, because is_repository_shallow() stores information that is never cleared. And this take-after-modify occurs when Git does a tag-following fetch from a shallow repository on a transport that does not support tag following (since in this case, 2 fetches are performed). To solve this issue, take the shallow file lock (and perform all other shallow processing) only if the "shallow-info" section is non-empty; otherwise, behave as if it were empty. A full solution (probably, ensuring that any action of committing shallow file locks also includes clearing the information stored by is_repository_shallow()) would solve the issue without need for this patch, but this patch is independently useful (as an optimization to prevent writing a file in an unnecessary case), hence why I wrote it. I have included a NEEDSWORK outlining the full solution. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-06Merge branch 'js/shallow-and-fetch-prune'Junio C Hamano1-6/+17
"git repack" in a shallow clone did not correctly update the shallow points in the repository, leading to a repository that does not pass fsck. * js/shallow-and-fetch-prune: repack -ad: prune the list of shallow commits shallow: offer to prune only non-existing entries repack: point out a bug handling stale shallow info
2018-10-25shallow: offer to prune only non-existing entriesJohannes Schindelin1-6/+17
The `prune_shallow()` function wants a full reachability check to be completed before it goes to work, to ensure that all unreachable entries are removed from the shallow file. However, in the upcoming patch we do not even want to go that far. We really only need to remove entries corresponding to pruned commits, i.e. to commits that no longer exist. Let's support that use case. Rather than extending the signature of `prune_shallow()` to accept another Boolean, let's turn it into a bit field and declare constants, for readability. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19Merge branch 'nd/the-index'Junio C Hamano1-1/+1
Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...
2018-09-21revision.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20commit.h: remove method declarationsDerrick Stolee1-0/+1
These methods are now declared in commit-reach.h. Remove them from commit.h and add new include statements in all files that require these declarations. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29tag: add repository argument to deref_tagStefan Beller1-1/+3
Add a repository argument to allow the callers of deref_tag to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29commit: add repository argument to lookup_commitStefan Beller1-7/+10
Add a repository argument to allow callers of lookup_commit to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29commit: add repository argument to lookup_commit_reference_gentlyStefan Beller1-3/+6
Add a repository argument to allow callers of lookup_commit_reference_gently to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29Merge branch 'sb/object-store-grafts' into sb/object-store-lookupJunio C Hamano1-36/+38
* sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-06-25Merge branch 'nd/reject-empty-shallow-request'Junio C Hamano1-0/+3
"git fetch --shallow-since=<cutoff>" that specifies the cut-off point that is newer than the existing history used to end up grabbing the entire history. Such a request now errors out. * nd/reject-empty-shallow-request: upload-pack: reject shallow requests that would return nothing
2018-06-25Merge branch 'nd/commit-util-to-slab'Junio C Hamano1-12/+28
The in-core "commit" object had an all-purpose "void *util" field, which was tricky to use especially in library-ish part of the code. All of the existing uses of the field has been migrated to a more dedicated "commit-slab" mechanism and the field is eliminated. * nd/commit-util-to-slab: commit.h: delete 'util' field in struct commit merge: use commit-slab in merge remote desc instead of commit->util log: use commit-slab in prepare_bases() instead of commit->util show-branch: note about its object flags usage show-branch: use commit-slab for commit-name instead of commit->util name-rev: use commit-slab for rev-name instead of commit->util bisect.c: use commit-slab for commit weight instead of commit->util revision.c: use commit-slab for show_source sequencer.c: use commit-slab to associate todo items to commits sequencer.c: use commit-slab to mark seen commits shallow.c: use commit-slab for commit depth instead of commit->util describe: use commit-slab for commit names instead of commit->util blame: use commit-slab for blame suspects instead of commit->util commit-slab: support shared commit-slab commit-slab.h: code split
2018-06-04upload-pack: reject shallow requests that would return nothingNguyễn Thái Ngọc Duy1-0/+3
Shallow clones with --shallow-since or --shalow-exclude work by running rev-list to get all reachable commits, then draw a boundary between reachable and unreachable and send "shallow" requests based on that. The code does miss one corner case: if rev-list returns nothing, we'll have no border and we'll send no shallow requests back to the client (i.e. no history cuts). This essentially means a full clone (or a full branch if the client requests just one branch). One example is the oldest commit is older than what is specified by --shallow-since. To avoid this, if rev-list returns nothing, we abort the clone/fetch. The user could adjust their request (e.g. --shallow-since further back in the past) and retry. Another possible option for this case is to fall back to a default depth (like depth 1). But I don't like too much magic that way because we may return something unexpected to the user. If they request "history since 2008" and we return a single depth at 2000, that might break stuff for them. It is better to tell them that something is wrong and let them take the best course of action. Note that we need to die() in get_shallow_commits_by_rev_list() instead of just checking for empty result from its caller deepen_by_rev_list() and handling the error there. The reason is, empty result could be a valid case: if you have commits in year 2013 and you request --shallow-since=year.2000 then you should get a full clone (i.e. empty result). Reported-by: Andreas Krey <a.krey@gmx.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'js/use-bug-macro'Junio C Hamano1-3/+3
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-21shallow.c: use commit-slab for commit depth instead of commit->utilNguyễn Thái Ngọc Duy1-12/+28
It's done so that commit->util can be removed. See more explanation in the commit that removes commit->util. While at there, plug a leak for keeping track of depth in this code. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: migrate shallow information into the object parserStefan Beller1-27/+23
We need to convert the shallow functions all at the same time as we move the data structures they operate on into the repository. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18path.c: migrate global git_path_* to take a repository argumentStefan Beller1-5/+7
Migrate all git_path_* functions that are defined in path.c to take a repository argument. Unlike other patches in this series, do not use the #define trick, as we rewrite the whole function, which is rather small. This doesn't migrate all the functions, as other builtins have their own local path functions defined using GIT_PATH_FUNC. So keep that macro around to serve the other locations. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: add repository argument to is_repository_shallowStefan Beller1-4/+4
Add a repository argument to allow callers of is_repository_shallow to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: add repository argument to check_shallow_file_for_updateStefan Beller1-3/+4
Add a repository argument to allow callers of check_shallow_file_for_update to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: add repository argument to register_shallowStefan Beller1-2/+2
Add a repository argument to allow callers of register_shallow to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: add repository argument to set_alternate_shallow_fileStefan Beller1-1/+1
Add a repository argument to allow callers of set_alternate_shallow_file to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18commit: add repository argument to lookup_commit_graftJonathan Nieder1-2/+3
Add a repository argument to allow callers of lookup_commit_graft to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16commit: add repository argument to register_commit_graftJonathan Nieder1-1/+2
Add a repository argument to allow callers of register_commit_graft to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16object-store: move object access functions to object-store.hStefan Beller1-0/+1
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-10lock_file: make function-local locks non-staticMartin Ågren1-1/+1
Placing `struct lock_file`s on the stack used to be a bad idea, because the temp- and lockfile-machinery would keep a pointer into the struct. But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap, 2017-09-05), we can safely have lockfiles on the stack. (This applies even if a user returns early, leaving a locked lock behind.) These `struct lock_file`s are local to their respective functions and we can drop their staticness. For good measure, I have inspected these sites and come to believe that they always release the lock, with the possible exception of bailing out using `die()` or `exit()` or by returning from a `cmd_foo()`. As pointed out by Jeff King, it would be bad if someone held on to a `struct lock_file *` for some reason. After some grepping, I agree with his findings: no-one appears to be doing that. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin1-3/+3
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-29Merge branch 'ma/leakplugs'Junio C Hamano1-1/+1
Memory leaks in various codepaths have been plugged. * ma/leakplugs: pack-bitmap[-write]: use `object_array_clear()`, don't leak object_array: add and use `object_array_pop()` object_array: use `object_array_clear()`, not `free()` leak_pending: use `object_array_clear()`, not `free()` commit: fix memory leak in `reduce_heads()` builtin/commit: fix memory leak in `prepare_index()`
2017-09-25Merge branch 'jk/write-in-full-fix'Junio C Hamano1-3/+3
Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-24object_array: add and use `object_array_pop()`Martin Ågren1-1/+1
In a couple of places, we pop objects off an object array `foo` by decreasing `foo.nr`. We access `foo.nr` in many places, but most if not all other times we do so read-only, e.g., as we iterate over the array. But when we change `foo.nr` behind the array's back, it feels a bit nasty and looks like it might leak memory. Leaks happen if the popped element has an allocated `name` or `path`. At the moment, that is not the case. Still, 1) the object array might gain more fields that want to be freed, 2) a code path where we pop might start using names or paths, 3) one of these code paths might be copied to somewhere where we do, and 4) using a dedicated function for popping is conceptually cleaner. Introduce and use `object_array_pop()` instead. Release memory in the new function. Document that popping an object leaves the associated elements in limbo. The converted places were identified by grepping for "\.nr\>" and looking for "--". Make the new function return NULL on an empty array. This is consistent with `pop_commit()` and allows the following: while ((o = object_array_pop(&foo)) != NULL) { // do something } But as noted above, we don't need to go out of our way to avoid reading `foo.nr`. This is probably more readable: while (foo.nr) { ... o = object_array_pop(&foo); // do something } The name of `object_array_pop()` does not quite align with `add_object_array()`. That is unfortunate. On the other hand, it matches `object_array_clear()`. Arguably it's `add_...` that is the odd one out, since it reads like it's used to "add" an "object array". For that reason, side with `object_array_clear()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14avoid "write_in_full(fd, buf, len) != len" patternJeff King1-3/+3
The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to f6aa66cb95 (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is signed (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In d34cf19b89 (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many byte will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06tempfile: auto-allocate tempfiles on heapJeff King1-7/+6
The previous commit taught the tempfile code to give up ownership over tempfiles that have been renamed or deleted. That makes it possible to use a stack variable like this: struct tempfile t; create_tempfile(&t, ...); ... if (!err) rename_tempfile(&t, ...); else delete_tempfile(&t); But doing it this way has a high potential for creating memory errors. The tempfile we pass to create_tempfile() ends up on a global linked list, and it's not safe for it to go out of scope until we've called one of those two deactivation functions. Imagine that we add an early return from the function that forgets to call delete_tempfile(). With a static or heap tempfile variable, the worst case is that the tempfile hangs around until the program exits (and some functions like setup_shallow_temporary rely on this intentionally, creating a tempfile and then leaving it for later cleanup). But with a stack variable as above, this is a serious memory error: the variable goes out of scope and may be filled with garbage by the time the tempfile code looks at it. Let's see if we can make it harder to get this wrong. Since many callers need to allocate arbitrary numbers of tempfiles, we can't rely on static storage as a general solution. So we need to turn to the heap. We could just ask all callers to pass us a heap variable, but that puts the burden on them to call free() at the right time. Instead, let's have the tempfile code handle the heap allocation _and_ the deallocation (when the tempfile is deactivated and removed from the list). This changes the return value of all of the creation functions. For the cleanup functions (delete and rename), we'll add one extra bit of safety: instead of taking a tempfile pointer, we'll take a pointer-to-pointer and set it to NULL after freeing the object. This makes it safe to double-call functions like delete_tempfile(), as the second call treats the NULL input as a noop. Several callsites follow this pattern. The resulting patch does have a fair bit of noise, as each caller needs to be converted to handle: 1. Storing a pointer instead of the struct itself. 2. Passing the pointer instead of taking the struct address. 3. Handling a "struct tempfile *" return instead of a file descriptor. We could play games to make this less noisy. For example, by defining the tempfile like this: struct tempfile { struct heap_allocated_part_of_tempfile { int fd; ...etc } *actual_data; } Callers would continue to have a "struct tempfile", and it would be "active" only when the inner pointer was non-NULL. But that just makes things more awkward in the long run. There aren't that many callers, so we can simply bite the bullet and adjust all of them. And the compiler makes it easy for us to find them all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06tempfile: do not delete tempfile on failed closeJeff King1-1/+1
When close_tempfile() fails, we delete the tempfile and reset the fields of the tempfile struct. This makes it easier for callers to return without cleaning up, but it also makes this common pattern: if (close_tempfile(tempfile)) return error_errno("error closing %s", tempfile->filename.buf); wrong, because the "filename" field has been reset after the failed close. And it's not easy to fix, as in many cases we don't have another copy of the filename (e.g., if it was created via one of the mks_tempfile functions, and we just have the original template string). Let's drop the feature that a failed close automatically deletes the file. This puts the burden on the caller to do the deletion themselves, but this isn't that big a deal. Callers which do: if (write(...) || close_tempfile(...)) { delete_tempfile(...); return -1; } already had to call delete when the write() failed, and so aren't affected. Likewise, any caller which just calls die() in the error path is OK; we'll delete the tempfile during the atexit handler. Because this patch changes the semantics of close_tempfile() without changing its signature, all callers need to be manually checked and converted to the new scheme. This patch covers all in-tree callers, but there may be others for not-yet-merged topics. To catch these, we rename the function to close_tempfile_gently(), which will attract compile-time attention to new callers. (Technically the original could be considered "gentle" already in that it didn't die() on errors, but this one is even more so). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06always check return value of close_tempfileJeff King1-2/+2
If close_tempfile() encounters an error, then it deletes the tempfile and resets the "struct tempfile". But many code paths ignore the return value and continue to use the tempfile. Instead, we should generally treat this the same as a write() error. Note that in the postimage of some of these cases our error message will be bogus after a failed close because we look at tempfile->filename (either directly or via get_tempfile_path). But after the failed close resets the tempfile object, this is guaranteed to be the empty string. That will be addressed in a future patch (because there are many more cases of the same problem than just these instances). Note also in the hunk in gpg-interface.c that it's fine to call delete_tempfile() in the error path, even if close_tempfile() failed and already deleted the file. The tempfile code is smart enough to know the second deletion is a noop. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06setup_temporary_shallow: move tempfile struct into functionJeff King1-6/+5
The setup_temporary_shallow() function creates a temporary file, but we never access the tempfile struct outside of the function. This is OK, since it means we'll just clean up the tempfile on exit. But we can simplify the code a bit by moving the global tempfile struct to the only function in which it's used. Note that it must remain "static" due to tempfile.c's requirement that tempfile storage never goes away until program exit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06setup_temporary_shallow: avoid using inactive tempfileJeff King1-1/+1
When there are no shallow entries to write, we skip creating the tempfile entirely and try to return the empty string. But we do so by calling get_tempfile_path() on the inactive tempfile object. This will trigger an assertion that kills the program. The bug was introduced by 6e122b449b (setup_temporary_shallow(): use tempfile module, 2015-08-10). But nobody seems to have noticed since then because we do not end up calling this function at all when there are no shallow items. In other words, this code path is completely unexercised. Since the tempfile object is a static global, it _is_ possible that we call the function twice, writing out shallow info the first time and then "reusing" our tempfile object the second time. But: 1. It seems unlikely that this was the intent, as hitting this code path would imply somebody clearing the shallow_info list between calls. And if somebody _did_ call the function multiple times without clearing the shallow_info list, we'd hit a different BUG for trying to reuse an already-active tempfile. 2. I verified by code inspection that the function is only called once per program. And also replacing this code with a BUG() and running the test suite demonstrates that it is not triggered there. So we could probably just replace this with an assertion and confirm that it's never called. However, the original intent does seem to be that you _could_ call it when the shallow_info is empty. And that's easy enough to do; since the return value doesn't need to point to a writable buffer, we can just return a string literal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13commit: convert lookup_commit_graft to struct object_idStefan Beller1-2/+2
With this patch, commit.h doesn't contain the string 'sha1' any more. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-12Merge branch 'rs/use-div-round-up'Junio C Hamano1-4/+4
Code cleanup. * rs/use-div-round-up: use DIV_ROUND_UP
2017-07-10use DIV_ROUND_UPRené Scharfe1-4/+4
Convert code that divides and rounds up to use DIV_ROUND_UP to make the intent clearer and reduce the number of magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'js/plug-leaks'Junio C Hamano1-2/+6
Fix memory leaks pointed out by Coverity (and people). * js/plug-leaks: (26 commits) checkout: fix memory leak submodule_uses_worktrees(): plug memory leak show_worktree(): plug memory leak name-rev: avoid leaking memory in the `deref` case remote: plug memory leak in match_explicit() add_reflog_for_walk: avoid memory leak shallow: avoid memory leak line-log: avoid memory leak receive-pack: plug memory leak in update() fast-export: avoid leaking memory in handle_tag() mktree: plug memory leaks reported by Coverity pack-redundant: plug memory leak setup_discovered_git_dir(): plug memory leak setup_bare_git_dir(): help static analysis split_commit_in_progress(): simplify & fix memory leak checkout: fix memory leak cat-file: fix memory leak mailinfo & mailsplit: check for EOF while parsing status: close file descriptor after reading git-rebase-todo difftool: address a couple of resource/memory leaks ...
2017-05-08Convert lookup_commit* to struct object_idbrian m. carlson1-10/+10
Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id *, but lookup_commit would take unsigned char *, leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08Convert remaining callers of lookup_commit_reference* to object_idbrian m. carlson1-3/+3
There are a small number of remaining callers of lookup_commit_reference and lookup_commit_reference_gently that still need to be converted to struct object_id. Convert these. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08shallow: convert shallow registration functions to object_idbrian m. carlson1-6/+6
Convert register_shallow and unregister_shallow to take struct object_id. register_shallow is a caller of lookup_commit, which we will convert later. It doesn't make sense for the registration and unregistration functions to have incompatible interfaces, so convert them both. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08shallow: avoid memory leakJohannes Schindelin1-2/+6
Reported by Coverity. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Rename sha1_array to oid_arraybrian m. carlson1-6/+6
Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28sha1-array: convert internal storage for struct sha1_array to object_idbrian m. carlson1-13/+13
Make the internal storage for struct sha1_array use an array of struct object_id internally. Update the users of this struct which inspect its internals. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-21Merge branch 'nd/shallow-fixup'Junio C Hamano1-19/+20
Code cleanup in shallow boundary computation. * nd/shallow-fixup: shallow.c: remove useless code shallow.c: bit manipulation tweaks shallow.c: avoid theoretical pointer wrap-around shallow.c: make paint_alloc slightly more robust shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools shallow.c: rename fields in paint_info to better express their purposes
2016-12-07shallow.c: remove useless codeNguyễn Thái Ngọc Duy1-4/+0
Some context before we talk about the removed code. This paint_down() is part of step 6 of 58babff (shallow.c: the 8 steps to select new commits for .git/shallow - 2013-12-05). When we fetch from a shallow repository, we need to know if one of the new/updated refs needs new "shallow commits" in .git/shallow (because we don't have enough history of those refs) and which one. The question at step 6 is, what (new) shallow commits are required in other to maintain reachability throughout the repository _without_ cutting our history short? To answer, we mark all commits reachable from existing refs with UNINTERESTING ("rev-list --not --all"), mark shallow commits with BOTTOM, then for each new/updated refs, walk through the commit graph until we either hit UNINTERESTING or BOTTOM, marking the ref on the commit as we walk. After all the walking is done, we check the new shallow commits. If we have not seen any new ref marked on a new shallow commit, we know all new/updated refs are reachable using just our history and .git/shallow. The shallow commit in question is not needed and can be thrown away. So, the code. The loop here (to walk through commits) is basically 1. get one commit from the queue 2. ignore if it's SEEN or UNINTERESTING 3. mark it 4. go through all the parents and.. 5a. mark it if it's never marked before 5b. put it back in the queue What we do in this patch is drop step 5a because it is not necessary. The commit being marked at 5a is put back on the queue, and will be marked at step 3 at the next iteration. The only case it will not be marked is when the commit is already marked UNINTERESTING (5a does not check this), which will be ignored at step 2. But we don't care about refs marking on UNINTERESTING. We care about the marking on _shallow commits_ that are not reachable from our current history (and having UNINTERESTING on it means it's reachable). So it's ok for an UNINTERESTING not to be ref-marked. Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07shallow.c: bit manipulation tweaksRasmus Villemoes1-4/+4
First of all, 1 << 31 is technically undefined behaviour, so let's just use an unsigned literal. If i is 'signed int' and gcc doesn't know that i is positive, gcc generates code to compute the C99-mandated values of "i / 32" and "i % 32", which is a lot more complicated than simple a simple shifts/mask. The only caller of paint_down actually passes an "unsigned int" value, but the prototype of paint_down causes (completely well-defined) conversion to signed int, and gcc has no way of knowing that the converted value is non-negative. Just make the id parameter unsigned. In update_refstatus, the change in generated code is much smaller, presumably because gcc is smart enough to see that i starts as 0 and is only incremented, so it is allowed (per the UD of signed overflow) to assume that i is always non-negative. But let's just help less smart compilers generate good code anyway. Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07shallow.c: avoid theoretical pointer wrap-aroundRasmus Villemoes1-1/+1
The expression info->free+size is technically undefined behaviour in exactly the case we want to test for. Moreover, the compiler is likely to translate the expression to (unsigned long)info->free + size > (unsigned long)info->end where there's at least a theoretical chance that the LHS could wrap around 0, giving a false negative. This might as well be written using pointer subtraction avoiding these issues. Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07shallow.c: make paint_alloc slightly more robustNguyễn Thái Ngọc Duy1-0/+3
paint_alloc() allocates a big block of memory and splits it into smaller, fixed size, chunks of memory whenever it's called. Each chunk contains enough bits to present all "new refs" [1] in a fetch from a shallow repository. We do not check if the new "big block" is smaller than the requested memory chunk though. If it happens, we'll happily pass back a memory region smaller than expected. Which will lead to problems eventually. A normal fetch may add/update a dozen new refs. Let's stay on the "reasonably extreme" side and say we need 16k refs (or bits from paint_alloc's perspective). Each chunk of memory would be 2k, much smaller than the memory pool (512k). So, normally, the under-allocation situation should never happen. A bad guy, however, could make a fetch that adds more than 4m new/updated refs to this code which results in a memory chunk larger than pool size. Check this case and abort. Noticed-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Reviewed-by: Jeff King <peff@peff.net> [1] Details are in commit message of 58babff (shallow.c: the 8 steps to select new commits for .git/shallow - 2013-12-05), step 6. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory poolsNguyễn Thái Ngọc Duy1-2/+4
We need to allocate a "big" block of memory in paint_alloc(). The exact size does not really matter. But the pool size has no relation with commit-slab. Stop using that macro here. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07shallow.c: rename fields in paint_info to better express their purposesNguyễn Thái Ngọc Duy1-9/+9
paint_alloc() is basically malloc(), tuned for allocating a fixed number of bits on every call without worrying about freeing any individual allocation since all will be freed at the end. It does it by allocating a big block of memory every time it runs out of "free memory". "slab" is a poor choice of name, at least poorer than "pool". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/filter-process'Junio C Hamano1-1/+1
The smudge/clean filter API expect an external process is spawned to filter the contents for each path that has a filter defined. A new type of "process" filter API has been added to allow the first request to run the filter for a path to spawn a single process, and all filtering need is served by this single process for multiple paths, reducing the process creation overhead. * ls/filter-process: contrib/long-running-filter: add long running filter example convert: add filter.<driver>.process option convert: prepare filter.<driver>.process option convert: make apply_filter() adhere to standard Git error handling pkt-line: add functions to read/write flush terminated packet streams pkt-line: add packet_write_gently() pkt-line: add packet_flush_gently() pkt-line: add packet_write_fmt_gently() pkt-line: extract set_packet_header() pkt-line: rename packet_write() to packet_write_fmt() run-command: add clean_on_exit_handler run-command: move check_pipe() from write_or_die to run_command convert: modernize tests convert: quote filter names in error messages
2016-10-17pkt-line: rename packet_write() to packet_write_fmt()Lars Schneider1-1/+1
packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10Merge branch 'nd/shallow-deepen'Junio C Hamano1-0/+78
The existing "git fetch --depth=<n>" option was hard to use correctly when making the history of an existing shallow clone deeper. A new option, "--deepen=<n>", has been added to make this easier to use. "git clone" also learned "--shallow-since=<date>" and "--shallow-exclude=<tag>" options to make it easier to specify "I am interested only in the recent N months worth of history" and "Give me only the history since that version". * nd/shallow-deepen: (27 commits) fetch, upload-pack: --deepen=N extends shallow boundary by N commits upload-pack: add get_reachable_list() upload-pack: split check_unreachable() in two, prep for get_reachable_list() t5500, t5539: tests for shallow depth excluding a ref clone: define shallow clone boundary with --shallow-exclude fetch: define shallow boundary with --shallow-exclude upload-pack: support define shallow boundary by excluding revisions refs: add expand_ref() t5500, t5539: tests for shallow depth since a specific date clone: define shallow clone boundary based on time with --shallow-since fetch: define shallow boundary with --shallow-since upload-pack: add deepen-since to cut shallow repos based on time shallow.c: implement a generic shallow boundary finder based on rev-list fetch-pack: use a separate flag for fetch in deepening mode fetch-pack.c: mark strings for translating fetch-pack: use a common function for verbose printing fetch-pack: use skip_prefix() instead of starts_with() upload-pack: move rev-list code out of check_non_tip() upload-pack: make check_non_tip() clean things up on error upload-pack: tighten number parsing at "deepen" lines ...
2016-08-01pass constants as first argument to st_mult()René Scharfe1-1/+1
The result of st_mult() is the same no matter the order of its arguments. It invokes the macro unsigned_mult_overflows(), which divides the second parameter by the first one. Pass constants first to allow that division to be done already at compile time. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13shallow.c: implement a generic shallow boundary finder based on rev-listNguyễn Thái Ngọc Duy1-0/+78
Instead of a custom commit walker like get_shallow_commits(), this new function uses rev-list to mark NOT_SHALLOW to all reachable commits, except borders. The definition of reachable is to be defined by the protocol later. This makes it more flexible to define shallow boundary. The way we find border is paint all reachable commits NOT_SHALLOW. Any of them that "touches" commits without NOT_SHALLOW flag are considered shallow (e.g. zero parents via grafting mechanism). Shallow commits and their true parents are all marked SHALLOW. Then NOT_SHALLOW is removed from shallow commits at the end. There is an interesting observation. With a generic walker, we can produce all kinds of shallow cutting. In the following graph, every commit but "x" is reachable. "b" is a parent of "a". x -- a -- o / / x -- c -- b -- o After this function is run, "a" and "c" are both considered shallow commits. After grafting occurs at the client side, what we see is a -- o / c -- b -- o Notice that because of grafting, "a" has zero parents, so "b" is no longer a parent of "a". This is unfortunate and may be solved in two ways. The first is change the way shallow grafting works and keep "a -- b" connection if "b" exists and always ends at shallow commits (iow, no loose ends). This is hard to detect, or at least not cheap to do. The second way is mark one "x" as shallow commit instead of "a" and produce this graph at client side: x -- a -- o / / c -- b -- o More commits, but simpler grafting rules. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22use st_add and st_mult for allocation size computationJeff King1-1/+1
If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22convert trivial cases to ALLOC_ARRAYJeff King1-3/+3
Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20Remove get_object_hash.brian m. carlson1-1/+1
Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Convert struct object to object_idbrian m. carlson1-2/+2
struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Add several uses of get_object_hash.brian m. carlson1-1/+1
Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-10-30Merge branch 'rs/pop-commit'Junio C Hamano1-5/+1
Code simplification. * rs/pop-commit: use pop_commit() for consuming the first entry of a struct commit_list
2015-10-29Merge branch 'tk/sigchain-unnecessary-post-tempfile'Junio C Hamano1-1/+0
Remove no-longer used #include. * tk/sigchain-unnecessary-post-tempfile: shallow: remove unused #include "sigchain.h" read-cache: remove unused #include "sigchain.h" diff: remove unused #include "sigchain.h" credential-cache--daemon: remove unused #include "sigchain.h"
2015-10-26use pop_commit() for consuming the first entry of a struct commit_listRené Scharfe1-5/+1
Instead of open-coding the function pop_commit() just call it. This makes the intent clearer and reduces code size. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-22shallow: remove unused #include "sigchain.h"Tobias Klauser1-1/+0
After switching to use the tempfile module in commit 6e122b44 (setup_temporary_shallow(): use tempfile module), no declarations from sigchain.h are used in read-cache.c anymore. Thus, remove the #include. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-25Merge branch 'mh/tempfile'Junio C Hamano1-31/+10
The "lockfile" API has been rebuilt on top of a new "tempfile" API. * mh/tempfile: credential-cache--daemon: use tempfile module credential-cache--daemon: delete socket from main() gc: use tempfile module to handle gc.pid file lock_repo_for_gc(): compute the path to "gc.pid" only once diff: use tempfile module setup_temporary_shallow(): use tempfile module write_shared_index(): use tempfile module register_tempfile(): new function to handle an existing temporary file tempfile: add several functions for creating temporary files prepare_tempfile_object(): new function, extracted from create_tempfile() tempfile: a new module for handling temporary files commit_lock_file(): use get_locked_file_path() lockfile: add accessor get_lock_file_path() lockfile: add accessors get_lock_file_fd() and get_lock_file_fp() create_bundle(): duplicate file descriptor to avoid closing it twice lockfile: move documentation to lockfile.h and lockfile.c
2015-08-10memoize common git-path "constant" filesJeff King1-5/+5
One of the most common uses of git_path() is to pass a constant, like git_path("MERGE_MSG"). This has two drawbacks: 1. The return value is a static buffer, and the lifetime is dependent on other calls to git_path, etc. 2. There's no compile-time checking of the pathname. This is OK for a one-off (after all, we have to spell it correctly at least once), but many of these constant strings appear throughout the code. This patch introduces a series of functions to "memoize" these strings, which are essentially globals for the lifetime of the program. We compute the value once, take ownership of the buffer, and return the cached value for subsequent calls. cache.h provides a helper macro for defining these functions as one-liners, and defines a few common ones for global use. Using a macro is a little bit gross, but it does nicely document the purpose of the functions. If we need to touch them all later (e.g., because we learned how to change the git_dir variable at runtime, and need to invalidate all of the stored values), it will be much easier to have the complete list. Note that the shared-global functions have separate, manual declarations. We could do something clever with the macros (e.g., expand it to a declaration in some places, and a declaration _and_ a definition in path.c). But there aren't that many, and it's probably better to stay away from too-magical macros. Likewise, if we abandon the C preprocessor in favor of generating these with a script, we could get much fancier. E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz". But the small amount of saved typing is probably not worth the resulting confusion to readers who want to grep for the function's definition. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10setup_temporary_shallow(): use tempfile moduleMichael Haggerty1-28/+7
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10lockfile: add accessor get_lock_file_path()Michael Haggerty1-3/+3
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25shallow: rewrite functions to take object_id argumentsMichael Haggerty1-18/+11
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25each_ref_fn: change to take an object_id parameterMichael Haggerty1-6/+13
Change typedef each_ref_fn to take a "const struct object_id *oid" parameter instead of "const unsigned char *sha1". To aid this transition, implement an adapter that can be used to wrap old-style functions matching the old typedef, which is now called "each_ref_sha1_fn"), and make such functions callable via the new interface. This requires the old function and its cb_data to be wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be used as the cb_data for an adapter function, each_ref_fn_adapter(). This is an enormous diff, but most of it consists of simple, mechanical changes to the sites that call any of the "for_each_ref" family of functions. Subsequent to this change, the call sites can be rewritten one by one to use the new interface. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05Merge branch 'bc/object-id'Junio C Hamano1-4/+4
Identify parts of the code that knows that we use SHA-1 hash to name our objects too much, and use (1) symbolic constants instead of hardcoded 20 as byte count and/or (2) use struct object_id instead of unsigned char [20] for object names. * bc/object-id: apply: convert threeway_stage to object_id patch-id: convert to use struct object_id commit: convert parts to struct object_id diff: convert struct combine_diff_path to object_id bulk-checkin.c: convert to use struct object_id zip: use GIT_SHA1_HEXSZ for trailers archive.c: convert to use struct object_id bisect.c: convert leaf functions to use struct object_id define utility functions for object IDs define a structure for object IDs
2015-03-13commit: convert parts to struct object_idbrian m. carlson1-4/+4
Convert struct commit_graft and necessary local parts of commit.c. Also, convert several constants based on the hex length of an SHA-1 to use GIT_SHA1_HEXSZ, and move several magic constants into variables for readability. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-24Merge branch 'jk/blame-commit-label' into maintJunio C Hamano1-1/+1
"git blame HEAD -- missing" failed to correctly say "HEAD" when it tried to say "No such path 'missing' in HEAD". * jk/blame-commit-label: blame.c: fix garbled error message use xstrdup_or_null to replace ternary conditionals builtin/commit.c: use xstrdup_or_null instead of envdup builtin/apply.c: use xstrdup_or_null instead of null_strdup git-compat-util: add xstrdup_or_null helper
2015-02-11Merge branch 'jc/unused-symbols'Junio C Hamano1-1/+1
Mark file-local symbols as "static", and drop functions that nobody uses. * jc/unused-symbols: shallow.c: make check_shallow_file_for_update() static remote.c: make clear_cas_option() static urlmatch.c: make match_urls() static revision.c: make save_parents() and free_saved_parents() static line-log.c: make line_log_data_init() static pack-bitmap.c: make pack_bitmap_filename() static prompt.c: remove git_getpass() nobody uses http.c: make finish_active_slot() and handle_curl_result() static
2015-02-11Merge branch 'jk/blame-commit-label'Junio C Hamano1-1/+1
"git blame HEAD -- missing" failed to correctly say "HEAD" when it tried to say "No such path 'missing' in HEAD". * jk/blame-commit-label: blame.c: fix garbled error message use xstrdup_or_null to replace ternary conditionals builtin/commit.c: use xstrdup_or_null instead of envdup builtin/apply.c: use xstrdup_or_null instead of null_strdup git-compat-util: add xstrdup_or_null helper
2015-01-15shallow.c: make check_shallow_file_for_update() staticJunio C Hamano1-1/+1
No external callers exist. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-13use xstrdup_or_null to replace ternary conditionalsJeff King1-1/+1
This replaces "x ? xstrdup(x) : NULL" with xstrdup_or_null(x). The change is fairly mechanical, with the exception of resolve_refdup, which can eliminate a temporary variable. There are still a few hits grepping for "?.*xstrdup", but these are of slightly different forms and cannot be converted (e.g., "x ? xstrdup(x->foo) : NULL"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-24Merge branch 'eb/no-pthreads'Junio C Hamano1-5/+2
Allow us build with NO_PTHREADS=NoThanks compilation option. * eb/no-pthreads: Handle atexit list internaly for unthreaded builds pack-objects: set number of threads before checking and warning index-pack: fix compilation with NO_PTHREADS
2014-10-19Handle atexit list internaly for unthreaded buildsEtienne Buira1-5/+2
Wrap atexit()s calls on unthreaded builds to handle callback list internally. This is needed because on unthreaded builds, asyncs inherits parent's atexit() list, that gets run as soon as the async exit()s (and again at the end of async's parent process). That led to remove temporary files too early. Also remove a by-atexit-callback guard against this kind of issue in clone.c, as this patch makes it redundant. Fixes test 5537 (temporary shallow file vanished before unpack-objects could open it) BTW remove an unused variable in shallow.c. Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Andreas Schwab <schwab@linux-m68k.org> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Etienne Buira <etienne.buira@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01lockfile.h: extract new header file for the functions in lockfile.cMichael Haggerty1-0/+1
Move the interface declaration for the functions in lockfile.c from cache.h to a new file, lockfile.h. Add #includes where necessary (and remove some redundant includes of cache.h by files that already include builtin.h). Move the documentation of the lock_file state diagram from lockfile.c to the new header file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01lockfile: change lock_file::filename into a strbufMichael Haggerty1-3/+3
For now, we still make sure to allocate at least PATH_MAX characters for the strbuf because resolve_symlink() doesn't know how to expand the space for its return value. (That will be fixed in a moment.) Another alternative would be to just use a strbuf as scratch space in lock_file() but then store a pointer to the naked string in struct lock_file. But lock_file objects are often reused. By reusing the same strbuf, we can avoid having to reallocate the string most times when a lock_file object is reused. Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18use REALLOC_ARRAY for changing the allocation size of arraysRené Scharfe1-2/+1
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13trace: improve trace performanceKarsten Blees1-5/+5
The trace API currently rechecks the environment variable and reopens the trace file on every API call. This has the ugly side effect that errors (e.g. file cannot be opened, or the user specified a relative path) are also reported on every call. Performance can be improved by about factor three by remembering the environment state and keeping the file open. Replace the 'const char *key' parameter in the API with a pointer to a 'struct trace_key' that bundles the environment variable name with additional, trace-internal state. Change the call sites of these APIs to use a static 'struct trace_key' instead of a string constant. In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct trace_key'. Add a 'trace_disable()' API, so that packet_trace() can cleanly disable tracing when it encounters packed data (instead of using unsetenv()). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17shallow: verify shallow file after taking lockJeff King1-2/+2
Before writing the shallow file, we stat() the existing file to make sure it has not been updated since our operation began. However, we do not do so under a lock, so there is a possible race: 1. Process A takes the lock. 2. Process B calls check_shallow_file_for_update and finds no update. 3. Process A commits the lockfile. 4. Process B takes the lock, then overwrite's process A's changes. We can fix this by doing our check while we hold the lock. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27shallow: automatically clean up shallow tempfilesJeff King1-7/+34
We sometimes write tempfiles of the form "shallow_XXXXXX" during fetch/push operations with shallow repositories. Under normal circumstances, we clean up the result when we are done. However, we do no take steps to clean up after ourselves when we exit due to die() or signal death. This patch teaches the tempfile creation code to register handlers to clean up after ourselves. To handle this, we change the ownership semantics of the filename returned by setup_temporary_shallow. It now keeps a copy of the filename itself, and returns only a const pointer to it. We can also do away with explicit tempfile removal in the callers. They all exit not long after finishing with the file, so they can rely on the auto-cleanup, simplifying the code. Note that we keep things simple and maintain only a single filename to be cleaned. This is sufficient for the current caller, but we future-proof it with a die("BUG"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27shallow: use stat_validity to check for up-to-date fileJeff King1-17/+7
When we are about to write the shallow file, we check that it has not changed since we last read it. Instead of hand-rolling this, we can use stat_validity. This is built around the index stat-check, so it is more robust than just checking the mtime, as we do now (it uses the same check as we do for index files). The new code also handles the case of a shallow file appearing unexpectedly. With the current code, two simultaneous processes making us shallow (e.g., two "git fetch --depth=1" running at the same time in a non-shallow repository) can race to overwrite each other. As a bonus, we also remove a race in determining the stat information of what we read (we stat and then open, leaving a race window; instead we should open and then fstat the descriptor). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17Merge branch 'nd/shallow-clone'Junio C Hamano1-7/+462
Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
2014-01-06shallow: remove unused codeRamsay Jones1-16/+0
Commit 58babfff ("shallow.c: the 8 steps to select new commits for .git/shallow", 05-12-2013) added a function to implement step 5 of the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'. This function implements an optional optimization step in the new shallow commit selection algorithm. However, this function has no callers. (The commented out call sites would need to change, in order to provide information required by the function.) Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10prune: clean .git/shallow after pruning objectsNguyễn Thái Ngọc Duy1-2/+53
This patch teaches "prune" to remove shallow roots that are no longer reachable from any refs (e.g. when the relevant refs are removed). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10receive-pack: allow pushes that update .git/shallowNguyễn Thái Ngọc Duy1-0/+23
The basic 8 steps to update .git/shallow does not fully apply here because the user may choose to accept just a few refs (while fetch always accepts all refs). The steps are modified a bit. 1-6. same as before. After calling assign_shallow_commits_to_refs at step 6, each shallow commit has a bitmap that marks all refs that require it. 7. mark all "ours" shallow commits that are reachable from any refs. We will need to do the original step 7 on them later. 8. go over all shallow commit bitmaps, mark refs that require new shallow commits. 9. setup a strict temporary shallow file to plug all the holes, even if it may cut some of our history short. This file is used by all hooks. The hooks could use --shallow-file=$GIT_DIR/shallow to overcome this and reach everything in current repo. 10. go over the new refs one by one. For each ref, do the reachability test if it needs a shallow commit on the list from step 7. Remove it if it's reachable from our refs. Gather all required shallow commits, run check_everything_connected() with the new ref, then install them to .git/shallow. This mode is disabled by default and can be turned on with receive.shallowupdate Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10add GIT_SHALLOW_FILE to propagate --shallow-file to subprocessesNguyễn Thái Ngọc Duy1-1/+3
This may be needed when a hook is run after a new shallow pack is received, but .git/shallow is not settled yet. A temporary shallow file to plug all loose ends should be used instead. GIT_SHALLOW_FILE is overriden by --shallow-file. --shallow-file does not work in this case because the hook may spawn many git subprocesses and the launch commands do not have --shallow-file as it's a recent addition. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10upload-pack: make sure deepening preserves shallow rootsNguyễn Thái Ngọc Duy1-1/+5
When "fetch --depth=N" where N exceeds the longest chain of history in the source repo, usually we just send an "unshallow" line to the client so full history is obtained. When the source repo is shallow we need to make sure to "unshallow" the current shallow point _and_ "shallow" again when the commit reaches its shallow bottom in the source repo. This should fix both cases: large <N> and --unshallow. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10shallow.c: steps 6 and 7 to select new commits for .git/shallowNguyễn Thái Ngọc Duy1-0/+294
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10shallow.c: the 8 steps to select new commits for .git/shallowNguyễn Thái Ngọc Duy1-0/+72
Suppose a fetch or push is requested between two shallow repositories (with no history deepening or shortening). A pack that contains necessary objects is transferred over together with .git/shallow of the sender. The receiver has to determine whether it needs to update .git/shallow if new refs needs new shallow comits. The rule here is avoid updating .git/shallow by default. But we don't want to waste the received pack. If the pack contains two refs, one needs new shallow commits installed in .git/shallow and one does not, we keep the latter and reject/warn about the former. Even if .git/shallow update is allowed, we only add shallow commits strictly necessary for the former ref (remember the sender can send more shallow commits than necessary) and pay attention not to accidentally cut the receiver history short (no history shortening is asked for) So the steps to figure out what ref need what new shallow commits are: 1. Split the sender shallow commit list into "ours" and "theirs" list by has_sha1_file. Those that exist in current repo in "ours", the remaining in "theirs". 2. Check the receiver .git/shallow, remove from "ours" the ones that also exist in .git/shallow. 3. Fetch the new pack. Either install or unpack it. 4. Do has_sha1_file on "theirs" list again. Drop the ones that fail has_sha1_file. Obviously the new pack does not need them. 5. If the pack is kept, remove from "ours" the ones that do not exist in the new pack. 6. Walk the new refs to answer the question "what shallow commits, both ours and theirs, are required in .git/shallow in order to add this ref?". Shallow commits not associated to any refs are removed from their respective list. 7. (*) Check reachability (from the current refs) of all remaining commits in "ours". Those reachable are removed. We do not want to cut any part of our (reachable) history. We only check up commits. True reachability test is done by check_everything_connected() at the end as usual. 8. Combine the final "ours" and "theirs" and add them all to .git/shallow. Install new refs. The case where some hook rejects some refs on a push is explained in more detail in the push patches. Of these steps, #6 and #7 are expensive. Both require walking through some commits, or in the worst case all commits. And we rather avoid them in at least common case, where the transferred pack does not contain any shallow commits that the sender advertises. Let's look at each scenario: 1) the sender has longer history than the receiver All shallow commits from the sender will be put into "theirs" list at step 1 because none of them exists in current repo. In the common case, "theirs" becomes empty at step 4 and exit early. 2) the sender has shorter history than the receiver All shallow commits from the sender are likely in "ours" list at step 1. In the common case, if the new pack is kept, we could empty "ours" and exit early at step 5. If the pack is not kept, we hit the expensive step 6 then exit after "ours" is emptied. There'll be only a handful of objects to walk in fast-forward case. If it's forced update, we may need to walk to the bottom. 3) the sender has same .git/shallow as the receiver This is similar to case 2 except that "ours" should be emptied at step 2 and exit early. A fetch after "clone --depth=X" is case 1. A fetch after "clone" (from a shallow repo) is case 3. Luckily they're cheap for the common case. A push from "clone --depth=X" falls into case 2, which is expensive. Some more work may be done at the sender/client side to avoid more work on the server side: if the transferred pack does not contain any shallow commits, send-pack should not send any shallow commits to the receive-pack, effectively turning it into a normal push and avoid all steps. This patch implements all steps except #3, already handled by fetch-pack and receive-pack, #6 and #7, which has their own patch due to their size. (*) in previous versions step 7 was put before step 3. I reorder it so that the common case that keeps the pack does not need to walk commits at all. In future if we implement faster commit reachability check (maybe with the help of pack bitmaps or commit cache), step 7 could become cheap and be moved up before 6 again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10shallow.c: extend setup_*_shallow() to accept extra shallow commitsNguyễn Thái Ngọc Duy1-5/+15
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10make the sender advertise shallow commits to the receiverNguyễn Thái Ngọc Duy1-0/+15
If either receive-pack or upload-pack is called on a shallow repository, shallow commits (*) will be sent after the ref advertisement (but before the packet flush), so that the receiver has the full "shape" of the sender's commit graph. This will be needed for the receiver to update its .git/shallow if necessary. This breaks the protocol for all clients trying to push to a shallow repo, or fetch from one. Which is basically the same end result as today's "is_repository_shallow() && die()" in receive-pack and upload-pack. New clients will be made aware of shallow upstream and can make use of this information. The sender must send all shallow commits that are sent in the following pack. It may send more shallow commits than necessary. upload-pack for example may choose to advertise no shallow commits if it knows in advance that the pack it's going to send contains no shallow commits. But upload-pack is the server, so we choose the cheaper way, send full .git/shallow and let the client deal with it. Smart HTTP is not affected by this patch. Shallow support on smart-http comes later separately. (*) A shallow commit is a commit that terminates the revision walker. It is usually put in .git/shallow in order to keep the revision walker from going out of bound because there is no guarantee that objects behind this commit is available. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05Merge branch 'jk/robustify-parse-commit'Junio C Hamano1-2/+1
* jk/robustify-parse-commit: checkout: do not die when leaving broken detached HEAD use parse_commit_or_die instead of custom message use parse_commit_or_die instead of segfaulting assume parse_commit checks for NULL commit assume parse_commit checks commit->object.parsed log_tree_diff: die when we fail to parse a commit
2013-10-24use parse_commit_or_die instead of custom messageJeff King1-2/+1
Many calls to parse_commit detect errors and die. In some cases, the custom error messages are more useful than what parse_commit_or_die could produce, because they give some context, like which ref the commit came from. Some, however, just say "invalid commit". Let's convert the latter to use parse_commit_or_die; its message is slightly more informative, and it makes the error more consistent throughout git. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28shallow: add setup_temporary_shallow()Nguyễn Thái Ngọc Duy1-0/+23
This function is like setup_alternate_shallow() except that it does not lock $GIT_DIR/shallow. It is supposed to be used when a program generates temporary shallow for use by another program, then throw the shallow file away. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28shallow: only add shallow graft points to new shallow fileNguyễn Thái Ngọc Duy1-0/+2
for_each_commit_graft() goes through all graft points, and shallow boundaries are just one special kind of grafting. If $GIT_DIR/shallow and $GIT_DIR/info/grafts are both present, write_shallow_commits() may catch both sets, accidentally turning some graft points to shallow boundaries. Don't do that. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-18move setup_alternate_shallow and write_shallow_commits to shallow.cNguyễn Thái Ngọc Duy1-0/+54
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22Merge branch 'mk/upload-pack-off-by-one-dead-code-removal'Junio C Hamano1-11/+6
* mk/upload-pack-off-by-one-dead-code-removal: upload-pack: remove a piece of dead code
2013-07-15upload-pack: remove a piece of dead codeMatthijs Kooijman1-11/+6
Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow clone) introduced a new check in get_shallow_commits to decide when to stop traversing the history and mark the current commit as a shallow root. With this new check in place, the old check can no longer be true, since the first check always fires first. This commit removes that check, making the code a bit more simple again. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl> Acked-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-28fetch-pack: prepare updated shallow file before fetching the packNguyễn Thái Ngọc Duy1-2/+40
index-pack --strict looks up and follows parent commits. If shallow information is not ready by the time index-pack is run, index-pack may be led to non-existent objects. Make fetch-pack save shallow file to disk before invoking index-pack. git learns new global option --shallow-file to pass on the alternate shallow file path. Undocumented (and not even support --shallow-file= syntax) because it's unlikely to be used again elsewhere. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-11upload-pack: fix off-by-one depth calculation in shallow cloneNguyễn Thái Ngọc Duy1-1/+7
get_shallow_commits() is used to determine the cut points at a given depth (i.e. the number of commits in a chain that the user likes to get). However we count current depth up to the commit "commit" but we do the cutting at its parents (i.e. current depth + 1). This makes upload-pack always return one commit more than requested. This patch fixes it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-29object.h: Add OBJECT_ARRAY_INIT macro and make use of it.Thiago Farina1-1/+1
Signed-off-by: Thiago Farina <tfransosi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-18Merge branch 'mk/maint-parse-careful'Junio C Hamano1-1/+2
* mk/maint-parse-careful: peel_onion: handle NULL check return value from parse_commit() in various functions parse_commit: don't fail, if object is NULL revision.c: handle tag->tagged == NULL reachable.c::process_tree/blob: check for NULL process_tag: handle tag->tagged == NULL check results of parse_commit in merge_bases list-objects.c::process_tree/blob: check for NULL reachable.c::add_one_tree: handle NULL from lookup_tree mark_blob/tree_uninteresting: check for NULL get_sha1_oneline: check return value of parse_object read_object_with_reference: don't read beyond the buffer
2008-02-18check return value from parse_commit() in various functionsMartin Koegler1-1/+2
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-17deref_tag: handle return value NULLMartin Koegler1-1/+1
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano1-1/+0
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-01-21is_repository_shallow(): prototype fix.Junio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24get_shallow_commits: Avoid memory leak if a commit has been reached already.Alexandre Julliard1-1/+3
Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24Shallow clone: do not ignore shallowness when following tagsAlexandre Julliard1-1/+2
Tags should be considered when truncating the commit list. The patch below fixes it, and fetches the right number of commits for each tag. However the correct fix is probably to not fetch historical tags at all. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24allow deepening of a shallow repositoryJohannes Schindelin1-2/+6
Now, by saying "git fetch -depth <n> <repo>" you can deepen a shallow repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-24support fetching into a shallow repositoryJohannes Schindelin1-0/+97
A shallow commit is a commit which has parents, which in turn are "grafted away", i.e. the commit appears as if it were a root. Since these shallow commits should not be edited by the user, but only by core git, they are recorded in the file $GIT_DIR/shallow. A repository containing shallow commits is called shallow. The advantage of a shallow repository is that even if the upstream contains lots of history, your local (shallow) repository needs not occupy much disk space. The disadvantage is that you might miss a merge base when pulling some remote branch. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>