path: root/http-walker.c
Age | Commit message | Author | Files | Lines
2025-10-30 | http: refactor subsystem to use `packfile_list`s | Patrick Steinhardt | 1 | -17/+9
The dumb HTTP protocol directly fetches packfiles from the remote server and temporarily stores them in a list of packfiles. Those packfiles are not yet added to the repository's packfile store until we finalize the whole fetch. Refactor the code to instead use a `struct packfile_list` to store those packs. This prepares us for a subsequent change where the `->next` pointer of `struct packed_git` will go away. Note that this refactoring creates some temporary duplication of code, as we now have both `packfile_list_find_oid()` and `find_oid_pack()`. The latter function will be removed in a subsequent commit though. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 | odb: rename `has_object()` | Patrick Steinhardt | 1 | -4/+4
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 | object-store: rename files to "odb.{c,h}" | Patrick Steinhardt | 1 | -1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 | object-store: rename `object_directory` to `odb_source` | Patrick Steinhardt | 1 | -1/+1
The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location.

While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example.

As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more.

Let's preempt this by renaming the structure. There have been a couple of alternatives that were discussed:

- `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept.
- `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate".

Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology.

In the future, this change allows us to easily introduce for example an `odb_files_source` and other format-specific implementations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29 | treewide: convert users of `repo_has_object_file()` to `has_object()` | Patrick Steinhardt | 1 | -2/+4
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement:

- The new function has a short-and-sweet name.
- More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions.

Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`:

- `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`.
- `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`.

The replacements should be functionally equivalent.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
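The four equivalences listed in this commit message follow one rule: each old "skip" flag clears one of the new opt-in flags. The sketch below illustrates that rule only. The enum names and numeric values are stand-ins invented for this example, not Git's real constants, and `translate_flags()` is a hypothetical helper that does not exist in Git.

```c
#include <assert.h>

/* Stand-in values for illustration; Git's real constants may differ. */
enum {
	OLD_INFO_QUICK = 1 << 0,             /* models OBJECT_INFO_QUICK */
	OLD_INFO_SKIP_FETCH_OBJECT = 1 << 1  /* models OBJECT_INFO_SKIP_FETCH_OBJECT */
};

enum {
	NEW_RECHECK_PACKED = 1 << 0,  /* models HAS_OBJECT_RECHECK_PACKED */
	NEW_FETCH_PROMISOR = 1 << 1   /* models HAS_OBJECT_FETCH_PROMISOR */
};

/*
 * Hypothetical helper: map the old "with_flags" argument to the flags
 * that the commit message says give the same behavior with has_object().
 */
unsigned translate_flags(unsigned old_flags)
{
	/* The old default did both expensive things. */
	unsigned new_flags = NEW_RECHECK_PACKED | NEW_FETCH_PROMISOR;

	/* Only the two flags discussed in the commit message are modeled. */
	assert((old_flags & ~(unsigned)(OLD_INFO_QUICK | OLD_INFO_SKIP_FETCH_OBJECT)) == 0);

	if (old_flags & OLD_INFO_QUICK)
		new_flags &= ~(unsigned)NEW_RECHECK_PACKED;  /* quick: no pack recheck */
	if (old_flags & OLD_INFO_SKIP_FETCH_OBJECT)
		new_flags &= ~(unsigned)NEW_FETCH_PROMISOR;  /* skip fetch: no promisor fetch */
	return new_flags;
}
```

Passing `0` models plain `repo_has_object_file(...)` and yields both new flags; passing both old flags yields `0`, matching the second bullet.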
2025-04-29 | object-store: drop `loose_object_path()` | Patrick Steinhardt | 1 | -1/+2
The function `loose_object_path()` is a trivial wrapper around `odb_loose_path()`, with the only exception that it always uses the primary object database of the given repository. This doesn't really add a ton of value though, so let's drop the function and inline it at every callsite. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 | object-store: merge "object-store-ll.h" and "object-store.h" | Patrick Steinhardt | 1 | -1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependencies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06 | global: mark code units that generate warnings with `-Wsign-compare` | Patrick Steinhardt | 1 | -0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-25 | packfile: convert find_sha1_pack() to use object_id | Jeff King | 1 | -1/+1
The find_sha1_pack() function has a few problems:

- it's badly named, since it works with any object hash
- it takes the hash as a bare pointer rather than an object_id struct

We can fix both of these easily, as all callers actually have a real object_id anyway.

I also found the existence of this function somewhat confusing, as it is about looking in an arbitrary set of linked packed_git structs. It's good for things like dumb-http which are looking in downloaded remote packs, and not our local packs. But despite the name, it is not a good way to find the pack which contains a local object (it skips the use of the midx, the pack mru list, and so on). So let's also add an explanatory comment above the declaration that may point people in the right direction.

I suspect the calls in fast-import.c, which use the packed_git list from the repository struct, could actually just be using find_pack_entry(). But since we'd need to keep it anyway for dumb-http, I didn't dig further there. If we eventually drop dumb-http support, then it might be worth examining them to see if we can get rid of the function entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 | http-walker: use object_id instead of bare hash | Jeff King | 1 | -12/+13
We long ago switched most code to using object_id structs instead of bare "unsigned char *" hashes. This gives us more type safety from the compiler, and generally makes it easier to understand what we expect in each parameter. But the dumb-http code has lagged behind. And indeed, the whole "walker" subsystem interface has the same problem, though http-walker is the only user left. So let's update the walker interface to pass object_id structs (which we already have anyway at all call sites!), and likewise use those within the http-walker methods that it calls. This cleans up the dumb-http code a bit, but will also let us fix a few more commonly used helper functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-09-25 | http-walker: free fake packed_git list | Jeff King | 1 | -0/+10
The dumb-http walker code creates a "fake" packed_git list representing packs we've downloaded from the remote (I call it "fake" because generally that struct is only used and managed by the local repository struct). But during our cleanup phase we don't touch those at all, causing a leak. There's no support here from the rest of the object-database API, as these structs are not meant to be freed, except when closing the object store completely. But we can see that raw_object_store_clear() just calls free() on them, and that's enough here to fix the leak. I also added a call to close_pack() before each. In the regular code this happens via close_object_store(), which we do as part of raw_object_store_clear(). This is necessary to prevent leaking mmap'd data (like the pack idx) or descriptors. The leak-checker won't catch either of these itself, but I did confirm with some hacky warning() calls and running t5550 that it's easy to leak at least index data. This is all much more intimate with the packed_git struct than I'd like, but I think fixing it would be a pretty big refactor. And it's just not worth it for dumb-http code which is rarely used these days. If we can silence the leak-checker without creating too much hassle, we should just do that. This lets us mark t5550 as leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
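The cleanup described in this commit message, closing each pack before freeing it, can be sketched roughly as follows. Everything here is a simplified stand-in: `struct demo_pack`, `demo_close_pack()` and the counter are hypothetical names for this example and do not reproduce Git's `struct packed_git` or `close_pack()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a downloaded pack in a singly linked list. */
struct demo_pack {
	struct demo_pack *next;
	int is_open;  /* models mmap'd index data or an open descriptor */
};

/* Counts how many packs were closed, so the behavior can be observed. */
int demo_closed_count;

/* Prepend a freshly allocated, "open" pack and return the new head. */
struct demo_pack *demo_pack_push(struct demo_pack *head)
{
	struct demo_pack *p = calloc(1, sizeof(*p));
	assert(p != NULL);
	p->is_open = 1;
	p->next = head;
	return p;
}

/* Release per-pack resources; freeing alone would leak them. */
void demo_close_pack(struct demo_pack *p)
{
	if (p->is_open) {
		p->is_open = 0;
		demo_closed_count++;
	}
}

/* Close and free every pack, then clear the caller's head pointer. */
size_t demo_free_pack_list(struct demo_pack **head)
{
	size_t freed = 0;
	struct demo_pack *p = *head;

	while (p) {
		struct demo_pack *next = p->next;  /* read before free() */
		demo_close_pack(p);
		free(p);
		p = next;
		freed++;
	}
	*head = NULL;
	return freed;
}
```

The order inside the loop is the point: the `next` pointer is saved first, resources are released second, and the struct is freed last.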
2024-09-25 | http: fix leak of http_object_request struct | Jeff King | 1 | -4/+4
The new_http_object_request() function allocates a struct on the heap, along with some fields inside the struct. But the matching function to clean it up, release_http_object_request(), only frees the interior fields without freeing the struct itself, causing a leak. The related http_pack_request new/release pair gets this right, and at first glance we should be able to do the same thing and just add a single free() call. But there's a catch. These http_object_request structs are typically embedded in the object_request struct of http-walker.c. And when we clean up that parent struct, it sanity-checks the embedded struct to make sure we are not leaking descriptors. Which means a use-after-free if we simply free() the embedded struct. I have no idea how valuable that sanity-check is, or whether it can simply be deleted. This all goes back to 5424bc557f (http*: add helper methods for fetching objects (loose), 2009-06-06). But the obvious way to make it all work is to be sure we set the pointer to NULL after freeing it (and our freeing process closes the descriptor, so we know there is no leak). To make sure we do that consistently, we'll switch the pointer we take in release_http_object_request() to a pointer-to-pointer, and we'll set it to NULL ourselves. And then the compiler can help us find each caller which needs to be updated. Most cases will just pass "&obj_req->req", which will obviously do the right thing. In a few cases, like http-push's finish_request(), we are working with a copy of the pointer, so we don't NULL the original. But it's OK because the next step is to free the struct containing the original pointer anyway. This lets us mark t5551 as leak-free. Ironically this is the "smart" http test, and the leak here only affects dumb http. But there's a single dumb-http invocation in there. The full dumb tests are in t5550, which still has some more leaks. This also makes t5559 leak-free, as it's just an HTTP/2 variant of t5551. 
But we don't need to mark it as such, since it inherits the flag from t5551. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
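The pointer-to-pointer idiom this commit message describes can be illustrated with a minimal sketch. `struct demo_request` and both functions are hypothetical stand-ins invented for this example; they only model the shape of the change, not the real `http_object_request` code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an HTTP object request. */
struct demo_request {
	char *url;  /* interior field owned by the struct */
	int fd;     /* -1 when no descriptor is open */
};

struct demo_request *demo_request_new(void)
{
	struct demo_request *req = calloc(1, sizeof(*req));
	if (req)
		req->fd = -1;
	return req;
}

/*
 * Takes a pointer to the caller's pointer so that it can free the
 * struct and then set the caller's copy to NULL. A later sanity check
 * on that pointer sees NULL instead of freed memory.
 */
void demo_request_release(struct demo_request **req_p)
{
	struct demo_request *req;

	assert(req_p != NULL);
	req = *req_p;
	if (!req)
		return;  /* already released; calling again is harmless */
	free(req->url);
	free(req);
	*req_p = NULL;
}
```

A caller that holds the request inside a parent struct would pass the address of that member, which is the `&obj_req->req` case the message mentions. Changing the signature also means every old call site fails to compile until it is updated.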
2024-06-14 | global: introduce `USE_THE_REPOSITORY_VARIABLE` macro | Patrick Steinhardt | 1 | -0/+2
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit. For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "builtin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 | hash: require hash algorithm in `oidread()` and `oidclr()` | Patrick Steinhardt | 1 | -1/+1
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 | hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` | Patrick Steinhardt | 1 | -1/+1
Many of our hash functions have two variants, one receiving a `struct git_hash_algo` and one that derives it via `the_repository`. Adapt all of those functions to always require the hash algorithm as input and drop the variants that do not accept one. As those functions are now independent of `the_repository`, we can move them from "hash.h" to "hash-ll.h". Note that both in this and subsequent commits in this series we always just pass `the_repository->hash_algo` as input even if it is obvious that there is a repository in the context that we should be using the hash from instead. This is done to be on the safe side and not introduce any regressions. All callsites should eventually be amended to use a repo passed via parameters, but this is outside the scope of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 | treewide: remove unnecessary includes in source files | Elijah Newren | 1 | -1/+0
Each of these was checked with

    gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}

to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively).

...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were:

* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 | object-store-ll.h: split this header out of object-store.h | Elijah Newren | 1 | -1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch:

    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25 | Merge branch 'en/header-split-cache-h' | Junio C Hamano | 1 | -1/+1
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-11 | treewide: remove cache.h inclusion due to object-file.h changes | Elijah Newren | 1 | -1/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 | Merge branch 'ab/remove-implicit-use-of-the-repository' | Junio C Hamano | 1 | -2/+2
Code clean-up around the use of the_repository.

* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 | Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h | Junio C Hamano | 1 | -2/+2
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28 | cocci: apply the "object-store.h" part of "the_repository.pending" | Ævar Arnfjörð Bjarmason | 1 | -2/+2
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 | http: mark unused parameter in fill_active_slot() callbacks | Jeff King | 1 | -2/+2
We have a generic "fill" function that is used by both the dumb http push and fetch code paths. It takes a void parameter in case the caller wants to pass along extra data, but (since the previous commit) neither does so. So we could simply drop the extra parameter. But since it's good practice to provide a void pointer in callback functions, we'll leave it here for the future, and just annotate it as unused (to appease -Wunused-parameter). While we're marking it, let's also fix the type in http-walker's function to have the correct "void" type. The original had to cast the function pointer and was technically undefined behavior (though generally OK in practice). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 | http: drop unused parameter from start_object_request() | Jeff King | 1 | -4/+3
We take a "walker" parameter for the request, but don't actually look at it. This is due to 5424bc557f (http*: add helper methods for fetching objects (loose), 2009-06-06). Before then, we consulted the "walker" struct to tell us if we should be verbose, but now those messages are printed elsewhere. Let's drop the unused parameter to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 | cache.h: remove dependence on hex.h; make other files include it explicitly | Elijah Newren | 1 | -0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02 | tree-wide: apply equals-null.cocci | Junio C Hamano | 1 | -7/+7
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 | http: rename CURLOPT_FILE to CURLOPT_WRITEDATA | Ævar Arnfjörð Bjarmason | 1 | -1/+1
The CURLOPT_FILE name is an alias for CURLOPT_WRITEDATA; the CURLOPT_WRITEDATA name has been preferred since curl 7.9.7, released in May 2002[1]. 1. https://curl.se/libcurl/c/CURLOPT_WRITEDATA.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 | http: drop support for curl < 7.16.0 | Jeff King | 1 | -12/+0
In the last commit we dropped support for curl < 7.11.1, let's continue that and drop support for versions older than 7.16.0. This allows us to get rid of some now-obsolete #ifdefs.

Choosing 7.16.0 is a somewhat arbitrary cutoff:

1. It came out in October of 2006, almost 15 years ago. Besides being a nice round number, around 10 years is a common end-of-life support period, even for conservative distributions.
2. That version introduced the curl_multi interface, which gives us a lot of bang for the buck in removing #ifdefs.

RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with curl 7.19.7 (released in November 2009), and RHEL 7 comes with 7.29.0 (released in February 2013).

1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 | Always use oidread to read into struct object_id | brian m. carlson | 1 | -1/+1
In the future, we'll want oidread to automatically set the hash algorithm member for an object ID we read into it, so ensure we use oidread instead of hashcpy everywhere we're copying a hash value into a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-10 | http: refactor finish_http_pack_request() | Jonathan Tan | 1 | -2/+3
finish_http_pack_request() does multiple tasks, including some housekeeping on a struct packed_git - (1) closing its index, (2) removing it from a list, and (3) installing it. These concerns are independent of fetching a pack through HTTP: they are there only because (1) the calling code opens the pack's index before deciding to fetch it, (2) the calling code maintains a list of packfiles that can be fetched, and (3) the calling code fetches it in order to make use of its objects in the same process. In preparation for a subsequent commit, which adds a feature that does not need any of this housekeeping, remove (1), (2), and (3) from finish_http_pack_request(). (2) and (3) are now done by a helper function, and (1) is the responsibility of the caller (in this patch, done closer to the point where the pack index is opened). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-25 | Merge branch 'bc/hash-transition-16' | Junio C Hamano | 1 | -9/+9
Conversion from unsigned char[20] to struct object_id continues.

* bc/hash-transition-16: (35 commits)
  gitweb: make hash size independent
  Git.pm: make hash size independent
  read-cache: read data in a hash-independent way
  dir: make untracked cache extension hash size independent
  builtin/difftool: use parse_oid_hex
  refspec: make hash size independent
  archive: convert struct archiver_args to object_id
  builtin/get-tar-commit-id: make hash size independent
  get-tar-commit-id: parse comment record
  hash: add a function to lookup hash algorithm by length
  remote-curl: make hash size independent
  http: replace sha1_to_hex
  http: compute hash of downloaded objects using the_hash_algo
  http: replace hard-coded constant with the_hash_algo
  http-walker: replace sha1_to_hex
  http-push: remove remaining uses of sha1_to_hex
  http-backend: allow 64-character hex names
  http-push: convert to use the_hash_algo
  builtin/pull: make hash-size independent
  builtin/am: make hash size independent
  ...
2019-04-01 | http-walker: replace sha1_to_hex | brian m. carlson | 1 | -9/+9
Since sha1_to_hex is limited to SHA-1, replace the uses of it in this file with hash_to_hex. Rename several variables accordingly to reflect that they are no longer limited to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01 | object-store: rename and expand packed_git's sha1 member | brian m. carlson | 1 | -1/+1
This member is used to represent the pack checksum of the pack in question. Expand this member to be GIT_MAX_RAWSZ bytes in length so it works with longer hashes and rename it to be "hash" instead of "sha1". This transformation was made with a change to the definition and the following semantic patch:

    @@
    struct packed_git *E1;
    @@
    - E1->sha1
    + E1->hash

    @@
    struct packed_git E1;
    @@
    - E1.sha1
    + E1.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24 | http: use normalize_curl_result() instead of manual conversion | Jeff King | 1 | -11/+2
When we switched off CURLOPT_FAILONERROR in 17966c0a63 (http: avoid disconnecting on 404s for loose objects, 2016-07-11), the fetch_object() function started manually handling 404's. Since we now have normalize_curl_result() for use elsewhere, we can use it here as well, shortening the code. Note that we lose the check for http/https in the URL here. None of the other result-normalizing code paths bother with this. Looking at missing_target(), which checks specifically for an FTP-specific CURLcode and "http" code 550, it seems likely that git-over-ftp has been subtly broken since 17966c0a63. This patch does nothing to fix that, but nor should it make anything worse (in fact, it may be slightly better because we'll actually recognize an error as such, rather than assuming CURLE_OK means we actually got some data). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24 | http: normalize curl results for dumb loose and alternates fetches | Jeff King | 1 | -0/+8
If the dumb-http walker encounters a 404 when fetching a loose object, it then looks at any http-alternates for the object. The 404 check is implemented by missing_target(), which checks not only the http code, but also that we got an http error from the CURLcode. That broke when we stopped using CURLOPT_FAILONERROR in 17966c0a63 (http: avoid disconnecting on 404s for loose objects, 2016-07-11), since our CURLcode will now be CURLE_OK. As a result, fetching over dumb-http from a repository with alternates could result in Git printing "Unable to find abcd1234..." and aborting. We could probably fix this just by loosening missing_target(). However, there's other code which looks at the curl result, and it would have to be tweaked as well. Instead, let's just normalize the result the same way the smart-http code does. There's a similar case in processing the alternates (where we failover from "info/http-alternates" to "info/alternates"). We'll give it the same treatment. After this patch, we should be hitting all code paths that need this normalization (notably absent here is the http_pack_request path, but it does not use FAILONERROR, nor missing_target()). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
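The normalization idea in this commit message is that a transfer which succeeded at the transport level but returned an HTTP error status should be reported as an HTTP error, as it would have been with CURLOPT_FAILONERROR enabled. The sketch below models that idea without linking libcurl. The enum, the function name, the `>= 400` threshold and the message text are assumptions made for this example and are not copied from Git's `normalize_curl_result()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the two libcurl result codes that matter here. */
enum demo_result {
	DEMO_OK = 0,                  /* models CURLE_OK */
	DEMO_HTTP_RETURNED_ERROR = 1  /* models CURLE_HTTP_RETURNED_ERROR */
};

/*
 * Hypothetical helper: if the transfer "succeeded" but the server
 * answered with an error status, rewrite the result so that later
 * checks (such as a missing-target test) see an HTTP error.
 */
void demo_normalize_result(enum demo_result *result, long http_code,
			   char *errbuf, size_t errlen)
{
	assert(result != NULL);
	if (*result == DEMO_OK && http_code >= 400) {
		*result = DEMO_HTTP_RETURNED_ERROR;
		if (errbuf && errlen)
			snprintf(errbuf, errlen,
				 "The requested URL returned error: %ld",
				 http_code);
	}
}
```

With this in place, a 404 on a loose object and a 404 on "info/http-alternates" can both be handled by the same error-checking code, which is the unification the message argues for.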
2019-01-08 | convert has_sha1_file() callers to has_object_file() | Jeff King | 1 | -2/+2
The only remaining callers of has_sha1_file() actually have an object_id already. They can use the "object" variant, rather than dereferencing the hash themselves. The code changes here were completely generated by the included coccinelle patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 | sha1-file: modernize loose object file functions | Jeff King | 1 | -1/+1
The loose object access code in sha1-file.c is some of the oldest in Git, and could use some modernizing. It mostly uses "unsigned char *" for object ids, which these days should be "struct object_id". It also uses the term "sha1_file" in many functions, which is confusing. The term "loose_objects" is much better. It clearly distinguishes them from packed objects (which didn't even exist back when the name "sha1_file" came into being). And it also distinguishes it from the checksummed-file concept in csum-file.c (which until recently was actually called "struct sha1file"!). This patch converts the functions {open,close,map,stat}_sha1_file() into open_loose_object(), etc, and switches their sha1 arguments for object_id structs. Similarly, path functions like fill_sha1_path() become fill_loose_path() and use object_ids. The function sha1_loose_object_info() already says "loose", so we can just drop the "sha1" (and teach it to use object_id). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 | http: use struct object_id instead of bare sha1 | Jeff King | 1 | -3/+3
The dumb-http walker code still passes around and stores object ids as "unsigned char *sha1". Let's modernize it. There's probably still more work to be done to handle dumb-http fetches with a new, larger hash. But that can wait; this is enough that we can now convert some of the low-level object routines that we call into from here (and in fact, some of the "oid.hash" references added here will be further improved in the next patch). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-13sha1_file_name(): overwrite buffer instead of appendingJeff King1-1/+1
The sha1_file_name() function is used to generate the path to a loose object in the object directory. It doesn't make much sense for it to append, since the path we write may be absolute (i.e., you cannot reliably build up a path with it). Because many callers use it with a static buffer, they have to strbuf_reset() manually before each call (and the other callers always use an empty buffer, so they don't care either way). Let's handle this automatically. Since we're changing the semantics, let's take the opportunity to give it a more hash-neutral name (which will also catch any callers from topics in flight). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "hashcmp() != 0" to "!hasheq()"Jeff King1-1/+1
This rounds out the previous three patches, covering the inequality logic for the "hash" variant of the functions. As with the previous three, the accompanying code changes are the mechanical result of applying the coccinelle patch; see those patches for more discussion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "hashcmp() == 0" to hasheq()Jeff King1-1/+1
This is the partner patch to the previous one, but covering the "hash" variants instead of "oid". Note that our coccinelle rule is slightly more complex to avoid triggering the call in hasheq(). I didn't bother to add a new rule to convert: - hasheq(E1->hash, E2->hash) + oideq(E1, E2) Since these are new functions, there won't be any such existing callers. And since most of the code is already using oideq, we're not likely to introduce new ones. We might still see "!hashcmp(E1->hash, E2->hash)" from topics in flight. But because our new rule comes after the existing ones, that should first get converted to "!oidcmp(E1, E2)" and then to "oideq(E1, E2)". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11Merge branch 'sb/object-store'Junio C Hamano1-1/+3
Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...
2018-03-26sha1_file: add repository argument to sha1_file_nameStefan Beller1-1/+2
Add a repository argument to allow sha1_file_name callers to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. While at it, move the declaration to object-store.h, where it should be easier to find. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26object-store: move packed_git and packed_git_mru to object storeStefan Beller1-0/+1
In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at there, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without caller doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14http-walker: convert struct object_request to use struct object_idbrian m. carlson1-8/+8
Convert struct object_request to use struct object_id by updating the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct object_request E1; @@ - E1.sha1 + E1.oid.hash @@ struct object_request *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-17sha1_file: remove static strbuf from sha1_file_name()Christian Couder1-2/+4
Using a static buffer in sha1_file_name() is error prone and the performance improvements it gives are not needed in many of the callers. So let's get rid of this static buffer and, if necessary or helpful, let's use one in the caller. Suggested-by: Jeff Hostetler <git@jeffhostetler.com> Helped-by: Kevin Daudt <me@ikke.info> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_sha1_pack()Jonathan Tan1-0/+1
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17Merge branch 'jk/http-walker-buffer-underflow-fix'Junio C Hamano1-4/+7
"Dumb http" transport used to misparse a nonsense http-alternates response, which has been fixed. * jk/http-walker-buffer-underflow-fix: http-walker: fix buffer underflow processing remote alternates
2017-03-13http-walker: fix buffer underflow processing remote alternatesJeff King1-4/+7
If we parse a remote alternates (or http-alternates), we expect relative lines like: ../../foo.git/objects which we convert into "$URL/../foo.git/" (and then use that as a base for fetching more objects). But if the remote feeds us nonsense like just: ../ we will try to blindly strip the last 7 characters, assuming they contain the string "objects". Since we don't _have_ 7 characters at all, this results in feeding a small negative value to strbuf_add(), which converts it to a size_t, resulting in a big positive value. This should consistently fail (since we can't generally allocate the max size_t minus 7 bytes), so there shouldn't be any security implications. Let's fix this by using strbuf_strip_suffix() to drop the characters we want. If they're not present, we'll ignore the alternate (in theory we could use it as-is, but the rest of the http-walker code unconditionally tacks "objects/" back on, so it is not prepared to handle such a case). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
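The underflow described above comes from subtracting a suffix length from a shorter buffer length. A minimal sketch of the safe pattern follows; `strip_suffix_len` is a hypothetical standalone helper written for illustration, not Git's `strbuf_strip_suffix()`, and it only assumes the suffix check is done before any subtraction.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Git's strbuf API): if the first *len bytes of
 * buf end with suffix, shrink *len to drop the suffix and return 1.
 * Otherwise leave *len untouched and return 0.
 */
static int strip_suffix_len(const char *buf, size_t *len, const char *suffix)
{
	size_t suflen = strlen(suffix);

	/* Compare lengths first, so "*len - suflen" can never wrap around. */
	if (*len < suflen)
		return 0;
	if (memcmp(buf + (*len - suflen), suffix, suflen) != 0)
		return 0;
	*len -= suflen;
	return 1;
}
```

With this shape, an input such as "../" is simply reported as "no suffix", and the caller can skip that alternate instead of computing a huge size_t.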
2017-03-06http: release strbuf on disabled alternatesEric Wong1-0/+2
This likely has no real-world impact on memory usage, but it is cleaner for future readers. Fixes: abcbdc03895f ("http: respect protocol.*.allow=user for http-alternates") Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-06http: inform about alternates-as-redirects behaviorEric Wong1-3/+5
It is disconcerting for users to not notice the behavior change in handling alternates from commit cb4d2d35c4622ec2 ("http: treat http-alternates like redirects"). Give the user a hint about the config option so they can see the URL and decide whether or not they want to enable http.followRedirects in their config. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15http: respect protocol.*.allow=user for http-alternatesJeff King1-11/+41
The http-walker may fetch the http-alternates (or alternates) file from a remote in order to find more objects. This should count as a "not from the user" use of the protocol. But because we implement the redirection ourselves and feed the new URL to curl, it will use the CURLOPT_PROTOCOLS rules, not the more restrictive CURLOPT_REDIR_PROTOCOLS. The ideal solution would be for each curl request we make to know whether or not it is directly from the user or part of an alternates redirect, and then set CURLOPT_PROTOCOLS as appropriate. However, that would require plumbing that information through all of the various layers of the http code. Instead, let's check the protocol at the source: when we are parsing the remote http-alternates file. The only downside is that if there's any mismatch between what protocol we think it is versus what curl thinks it is, it could violate the policy. To address this, we'll make the parsing err on the picky side, and only allow protocols that it can parse definitively. So for example, you can't elude the "http" policy by asking for "HTTP://", even though curl might handle it; we would reject it as unknown. The only unsafe case would be if you have a URL that starts with "http://" but curl interprets as another protocol. That seems like an unlikely failure mode (and we are still protected by our base CURLOPT_PROTOCOLS setting, so the worst you could do is trigger one of https, ftp, or ftps). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
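The "err on the picky side" parsing can be sketched as an exact, case-sensitive prefix match. The function name `known_alternate_protocol` and the protocol table below are assumptions made for illustration (the message only names http, https, ftp, and ftps); this is not the code the patch added.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: return the protocol name if url starts with
 * exactly "<proto>://" for a protocol in our table, or NULL if we
 * cannot identify it with certainty. The match is deliberately
 * case-sensitive, so "HTTP://" is rejected as unknown.
 */
static const char *known_alternate_protocol(const char *url)
{
	/* Assumed table, taken from the protocols named in the message. */
	static const char *const known[] = { "http", "https", "ftp", "ftps" };
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		size_t len = strlen(known[i]);

		if (!strncmp(url, known[i], len) &&
		    !strncmp(url + len, "://", 3))
			return known[i];
	}
	return NULL;
}
```

A caller would then ask its protocol policy about the returned name, and ignore the alternate entirely when the result is NULL.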
2016-12-06http-walker: complain about non-404 loose object errorsJeff King1-2/+5
Since commit 17966c0a6 (http: avoid disconnecting on 404s for loose objects, 2016-07-11), we turn off curl's FAILONERROR option and instead manually deal with failing HTTP codes. However, the logic to do so only recognizes HTTP 404 as a failure. This is probably the most common result, but if we were to get another code, the curl result remains CURLE_OK, and we treat it as success. We still end up detecting the failure when we try to zlib-inflate the object (which will fail), but instead of reporting the HTTP error, we just claim that the object is corrupt. Instead, let's catch anything in the 300's or above as an error (300's are redirects which are not an error at the HTTP level, but are an indication that we've explicitly disabled redirects, so we should treat them as such; we certainly don't have the resulting object content). Note that we also fill in req->errorstr, which we didn't do before. Without FAILONERROR, curl will not have filled this in, and it will remain a blank string. This never mattered for the 404 case, because in the logic below we hit the "missing_target()" branch and print nothing. But for other errors, we'd want to say _something_, if only to fill in the blank slot in the error message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
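Without FAILONERROR, the caller has to turn the HTTP status into a result itself. The sketch below is a simplification under stated assumptions: the enum, the function name `classify_http_code`, and the error text are all hypothetical, and 404 is split out into its own value here only to make the "missing target" case visible.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch only. */
enum fetch_status { FETCH_OK, FETCH_MISSING, FETCH_ERROR };

/*
 * Classify an HTTP status code when the transfer itself succeeded.
 * err receives a message for the generic error case, so the caller
 * never prints a blank error slot.
 */
static enum fetch_status classify_http_code(long http_code,
					    char *err, size_t errlen)
{
	if (errlen)
		err[0] = '\0';
	if (http_code == 404)
		return FETCH_MISSING;	/* caller may try alternates */
	if (http_code >= 300) {
		/* Redirects count too: we have no object content. */
		snprintf(err, errlen, "HTTP request failed with code %ld",
			 http_code);
		return FETCH_ERROR;
	}
	return FETCH_OK;
}
```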
2016-12-06Merge branch 'ew/http-walker' into jk/http-walker-limit-redirectJunio C Hamano1-29/+26
* ew/http-walker: list: avoid incompatibility with *BSD sys/queue.h http-walker: reduce O(n) ops with doubly-linked list http: avoid disconnecting on 404s for loose objects http-walker: remove unused parameter from fetch_object
2016-12-06http: treat http-alternates like redirectsJeff King1-3/+5
The previous commit made HTTP redirects more obvious and tightened up the default behavior. However, there's another way for a server to ask a git client to fetch arbitrary content: by having an http-alternates file (or a regular alternates file, which is used as a backup). Similar to the HTTP redirect case, a malicious server can claim to have refs pointing at object X, return a 404 when the client asks for X, but point to some other URL via http-alternates, which the client will transparently fetch. The end result is that it looks from the user's perspective like the objects came from the malicious server, as the other URL is not mentioned at all. Worse, because we feed the new URL to curl ourselves, the usual protocol restrictions do not kick in (neither curl's default of disallowing file://, nor the protocol whitelisting in f4113cac0 (http: limit redirection to protocol-whitelist, 2015-09-22)). Let's apply the same rules here as we do for HTTP redirects. Namely:
- unless http.followRedirects is set to "always", we will not follow remote redirects from http-alternates (or alternates) at all
- set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS to restrict ourselves to a known-safe set and respect any user-provided whitelist
- mention alternate object stores on stderr so that the user is aware another source of objects may be involved
The first item may prove to be too restrictive. The most common use of alternates is to point to another path on the same server. While it's possible for a single-server redirect to be an attack, it takes a fairly obscure setup (victim and evil repository on the same host, host speaks dumb http, and evil repository has access to edit its own http-alternates file). So we could make the checks more specific, and only cover cross-server redirects. But that means parsing the URLs ourselves, rather than letting curl handle them. This patch goes for the simpler approach. 
Given that they are only used with dumb http, http-alternates are probably pretty rare. And there's an escape hatch: the user can allow redirects on a specific server by setting http.<url>.followRedirects to "always". Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-12http-walker: reduce O(n) ops with doubly-linked listEric Wong1-27/+15
Using a Linux-kernel-derived doubly-linked list implementation from the Userspace RCU library allows us to enqueue and delete items from the object request queue in constant time. This change reduces enqueue times in the prefetch() function where the object request queue could grow to several thousand objects. I left out the list_for_each_entry* family macros from list.h which relied on the __typeof__ operator as we support platforms without it. Thus, list_entry (aka "container_of") needs to be called explicitly inside macro-wrapped for loops. The downside is this costs us an additional pointer per object request, but this is offset by reduced overhead on queue operations leading to improved performance and shorter queue depths. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
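The constant-time enqueue and delete, and the explicit list_entry call inside a plain for loop, can be illustrated with a small intrusive list. This is a from-scratch sketch written for illustration and not a copy of list.h; `list_init`, `struct request_sketch`, and `sum_ids` are hypothetical names.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list, embedded in the owning struct. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

/* O(1): link item just before head, i.e. at the tail of the queue. */
static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* O(1): unlink item without walking the list to find its predecessor. */
static void list_del(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
}

/* Recover the containing struct from a pointer to its list member. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for an object request. */
struct request_sketch {
	int id;
	struct list_head node;	/* the one extra pointer pair per request */
};

/* Without __typeof__-based macros, list_entry is spelled out per step. */
static int sum_ids(struct list_head *head)
{
	struct list_head *pos;
	int sum = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		sum += list_entry(pos, struct request_sketch, node)->id;
	return sum;
}
```

The trade-off matches the message: each request carries the embedded node, but neither adding at the tail nor removing a finished request needs a traversal.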
2016-07-12http: avoid disconnecting on 404s for loose objectsEric Wong1-0/+9
404s are common when fetching loose objects on static HTTP servers, and reestablishing a connection for every single 404 adds additional latency. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-12http-walker: remove unused parameter from fetch_objectEric Wong1-2/+2
This parameter has not been used since commit 1d389ab65dc6 ("Add support for parallel HTTP transfers") back in 2005. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25http-walker: store url in a strbufJeff King1-9/+10
We do an unchecked sprintf directly into our url buffer. This doesn't overflow because we know that it was sized for "$base/objects/info/http-alternates", and we are writing "$base/objects/info/alternates", which must be smaller. But that is not immediately obvious to a reader who is looking for buffer overflows. Let's switch to a strbuf, so that we do not have to think about this issue at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-02http-walker: simplify process_alternates_response() using strbufRené Scharfe1-9/+6
Use strbuf to build the new base, which takes care of allocations and the terminating NUL character automatically. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19use xstrfmt to replace xmalloc + sprintfJeff King1-2/+1
This is one line shorter, and makes sure the length in the malloc and sprintf steps match. These conversions are very straightforward; we can drop the malloc entirely, and replace the sprintf with xstrfmt. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
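The point of the conversion is that the allocation size and the formatted length come from the same place. A minimal sketch of that idea in standard C is below; `fmt_alloc` is a hypothetical name, and unlike Git's xstrfmt() (which does not return on allocation failure) this version returns NULL on error.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: format into a freshly allocated buffer that is
 * exactly large enough. The caller frees the result.
 */
static char *fmt_alloc(const char *fmt, ...)
{
	va_list ap, ap2;
	int len;
	char *out;

	va_start(ap, fmt);
	va_copy(ap2, ap);

	/* First pass: measure only, nothing is written. */
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}

	/* Second pass: the size is derived from the measured length. */
	out = malloc((size_t)len + 1);
	if (out)
		vsnprintf(out, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return out;
}
```

A call such as `fmt_alloc("%s/objects/%s", base, path)` replaces a hand-computed malloc size followed by sprintf, so the two can no longer drift apart.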
2014-06-19use xstrdup instead of xmalloc + strcpyJeff King1-2/+1
This is one line shorter, and makes sure the length in the malloc and copy steps match. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12Rename static function fetch_pack() to http_fetch_pack()Michael Haggerty1-2/+2
Avoid confusion with the non-static function of the same name from fetch-pack.h. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-04http: make curl callbacks match contracts from curl headerDan McGee1-2/+2
Yes, these don't match perfectly with the void* first parameter of the fread/fwrite in the standard library, but they do match the curl expected method signature. This is needed when a refactor passes a curl_write_callback around, which would otherwise give incorrect parameter warnings. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16standardize brace placement in struct definitionsJonathan Nieder1-4/+2
In struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char *baz; }; Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-21Merge branch 'gv/portable'Junio C Hamano1-1/+1
* gv/portable: test-lib: use DIFF definition from GIT-BUILD-OPTIONS build: propagate $DIFF to scripts Makefile: Tru64 portability fix Makefile: HP-UX 10.20 portability fixes Makefile: HPUX11 portability fixes Makefile: SunOS 5.6 portability fix inline declaration does not work on AIX Allow disabling "inline" Some platforms lack socklen_t type Make NO_{INET_NTOP,INET_PTON} configured independently Makefile: some platforms do not have hstrerror anywhere git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition test_cmp: do not use "diff -u" on platforms that lack one fixup: do not unconditionally disable "diff -u" tests: use "test_cmp", not "diff", when verifying the result Do not use "diff" found on PATH while building and installing enums: omit trailing comma for portability Makefile: -lpthread may still be necessary when libc has only pthread stubs Rewrite dynamic structure initializations to runtime assignment Makefile: pass CPPFLAGS through to fllow customization Conflicts: Makefile wt-status.h
2010-05-31enums: omit trailing comma for portabilityGary V. Vaughan1-1/+1
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX 5.1 fails to compile git. enum style is inconsistent already, with some enums declared on one line, some over 3 lines with the enum values all on the middle line, sometimes with 1 enum value per line... and independently of that the trailing comma is sometimes present and other times absent, often mixing with/without trailing comma styles in a single file, and sometimes in consecutive enum declarations. Clearly, omitting the comma is the more portable style, and this patch changes all enum declarations to use the portable omitted dangling comma style consistently. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-21Merge branch 'sp/maint-dumb-http-pack-reidx'Junio C Hamano1-1/+1
* sp/maint-dumb-http-pack-reidx: http.c::new_http_pack_request: do away with the temp variable filename http-fetch: Use temporary files for pack-*.idx until verified http-fetch: Use index-pack rather than verify-pack to check packs Allow parse_pack_index on temporary files Extract verify_pack_index for reuse from verify_pack Introduce close_pack_index to permit replacement http.c: Remove unnecessary strdup of sha1_to_hex result http.c: Don't store destination name in request structures http.c: Drop useless != NULL test in finish_http_pack_request http.c: Tiny refactoring of finish_http_pack_request t5550-http-fetch: Use subshell for repository operations http.c: Remove bad free of static block
2010-04-17http.c: Don't store destination name in request structuresShawn O. Pearce1-1/+1
The destination name within the object store is easily computed on demand, reusing a static buffer held by sha1_file.c. We don't need to copy the entire path into the request structure for safe keeping, when it can be easily reformatted after the download has been completed. This reduces the size of the per-request structure, and removes yet another PATH_MAX based limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-02http: init and cleanup separately from http-walkerTay Ray Chuan1-5/+1
Previously, all our http operations were done with http-walker. With the new remote-curl helper, we find ourselves using http methods outside of http-walker - for example, fetching info/refs. Accommodate this by separating http_init() and http_cleanup() invocations from http-walker. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-02http-walker: cleanup more thoroughlyTay Ray Chuan1-0/+17
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: add helper methods for fetching objects (loose)Tay Ray Chuan1-227/+40
The code handling the fetching of loose objects in http-push.c and http-walker.c has been refactored into new methods and a new struct (http_object_request) in http.c. They are not meant to be invoked elsewhere. The new methods in http.c are
- new_http_object_request
- process_http_object_request
- finish_http_object_request
- abort_http_object_request
- release_http_object_request
and the new struct is http_object_request. RANGE_HEADER_SIZE and no_pragma_header are no longer made available outside of http.c, since after the above changes, there are no other instances of usage outside of http.c. Remove members of the transfer_request struct in http-push.c and http-walker.c, including filename, real_sha1 and zret, as they are no longer used. Move the methods append_remote_object_url() and get_remote_object_url() from http-push.c to http.c. Additionally, get_remote_object_url() is no longer defined only when USE_CURL_MULTI is defined, since non-USE_CURL_MULTI code in http.c uses it (namely, in new_http_object_request()). Refactor code from http-push.c::start_fetch_loose() and http-walker.c::start_object_fetch_request() that deals with the details of coming up with the filename to store the retrieved object, resuming a previously aborted request, and making a new curl request, into a new function, new_http_object_request(). Refactor code from http-walker.c::process_object_request() into the function, process_http_object_request(). Refactor code from http-push.c::finish_request() and http-walker.c::finish_object_request() into a new function, finish_http_object_request(). It returns the result of the move_temp_to_file() invocation. Add a function, release_http_object_request(), which cleans up object request data. http-push.c and http-walker.c invoke this function separately; http-push.c::release_request() and http-walker.c::release_object_request() do not invoke this function. 
Add a function, abort_http_object_request(), which unlink()s the object file and invokes release_http_object_request(). Update http-walker.c::abort_object_request() to use this. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: add helper methods for fetching packsTay Ray Chuan1-67/+18
The code handling the fetching of packs in http-push.c and http-walker.c has been refactored into new methods and a new struct (http_pack_request) in http.c. They are not meant to be invoked elsewhere. The new methods in http.c are
- new_http_pack_request
- finish_http_pack_request
- release_http_pack_request
and the new struct is http_pack_request. Add a function, new_http_pack_request(), that deals with the details of coming up with the filename to store the retrieved packfile, resuming a previously aborted request, and making a new curl request. Update http-push.c::start_fetch_packed() and http-walker.c::fetch_pack() to use this. Add a function, finish_http_pack_request(), that deals with renaming the pack, advancing the pack list, and installing the pack. Update http-push.c::finish_request() and http-walker.c::fetch_pack to use this. Update release_request() in http-push.c and http-walker.c to invoke release_http_pack_request() to clean up pack request helper data. The local_stream member of the transfer_request struct in http-push.c has been removed, as the packfile pointer will be managed in the struct http_pack_request. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: add http_get_info_packsTay Ray Chuan1-175/+9
http-push.c and http-walker.c no longer have to use fetch_index or setup_index; they simply need to use http_get_info_packs, a new http method, in their fetch_indices implementations. Move fetch_index() and rename to fetch_pack_index() in http.c; this method is not meant to be used outside of http.c. It invokes end_url_with_slash with base_url; apart from that change, the code is identical. Move setup_index() and rename to fetch_and_setup_pack_index() in http.c; this method is not meant to be used outside of http.c. Do not immediately set ret to 0 in http-walker.c::fetch_indices(); instead do it in the HTTP_MISSING_TARGET case, to make it clear that the HTTP_OK and HTTP_MISSING_TARGET cases both return 0. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: move common variables and macros to http.[ch]Tay Ray Chuan1-13/+5
Move RANGE_HEADER_SIZE to http.h. Create no_pragma_header, the curl header list containing the header "Pragma:" in http.[ch]. It is allocated in http_init, and freed in http_cleanup. This replaces the no_pragma_header in http-push.c, and the no_pragma_header member in walker_data in http-walker.c. Create http_is_verbose. It is to be used by methods in http.c, and is modified at the entry points of http.c's users, namely http-push.c (when parsing options) and http-walker.c (in get_http_walker). Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: copy string returned by sha1_to_hexTay Ray Chuan1-22/+23
In the fetch_index implementations in http-push.c and http-walker.c, the string returned by sha1_to_hex is assumed to stay immutable. This patch ensures that hex stays immutable by copying the string returned by sha1_to_hex (via xstrdup) and frees it subsequently. It also refactors free()'s and fclose()'s with labels. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http-walker: verify remote packsTay Ray Chuan1-3/+30
In c17fb6e ("Verify remote packs, speed up pending request queue"), changes were made to index fetching in http-push.c, particularly the methods fetch_index and setup_index. Since http-walker.c has similar code for index fetching, these improvements should apply to http-walker.c's fetch_index and setup_index. Invocations of free() of string memory are reproduced as well. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http-push, http-walker: style fixesTay Ray Chuan1-33/+50
- Use tabs to indent, instead of spaces. - Do not use curly-braces around a single statement body in if/while statement; - Do not start multi-line comment with description on the first line after "/*", i.e. /* * We prefer this over... */ /* comments like * this (notice the first line) */ Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06Merge branch 'rc/maint-http-local-slot-fix' into rc/http-pushJunio C Hamano1-0/+6
* rc/maint-http-local-slot-fix: http*: cleanup slot->local after fclose
2009-06-06http*: cleanup slot->local after fcloseTay Ray Chuan1-0/+6
Set slot->local to NULL after doing a fclose() on the file it points to. This prevents the passing of a FILE* pointer to a fclose()'d file to ftell() in http.c::run_active_slot(). This issue was raised by Clemens Buchacher on 30th May 2009: http://www.spinics.net/lists/git/msg104623.html Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
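The fix pattern is to clear the pointer in the same place the stream is closed, so later code can test it before calling ftell(). The sketch below uses a hypothetical `struct slot_sketch` with a single `local` member; it is not the real slot structure from http.h, and the helper names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for a request slot that owns a local file. */
struct slot_sketch {
	FILE *local;
};

/* Close the local file, if any, and forget the now-dangling pointer. */
static int close_local(struct slot_sketch *slot)
{
	int ret = 0;

	if (slot->local) {
		ret = fclose(slot->local);
		slot->local = NULL;	/* never leave a closed FILE* behind */
	}
	return ret;
}

/* Safe to call at any time: reports -1 when no file is attached. */
static long local_offset(const struct slot_sketch *slot)
{
	return slot->local ? ftell(slot->local) : -1L;
}
```

Because the pointer is reset at the close site, a progress check that runs after the close sees "no file" rather than passing a stale FILE* to ftell().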
2009-04-29replace direct calls to unlink(2) with unlink_or_warnAlex Riesen1-7/+7
This helps to notice when something's going wrong, especially on systems which lock open files. I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which is already printing something or is called from such a function
- it is in a static function, returning void and the function is only called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obviously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-27Move chmod(foo, 0444) into move_temp_to_file()Johan Herland1-1/+0
When writing out a loose object or a pack (index), move_temp_to_file() is called to finalize the resulting file. These files (loose files and packs) should all have permission mode 0444 (modulo adjust_shared_perm()). Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite (or even forgetting to chmod() at all), do the chmod() call from within move_temp_to_file(). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21Merge branch 'lt/maint-wrap-zlib'Junio C Hamano1-4/+4
* lt/maint-wrap-zlib: Wrap inflate and other zlib routines for better error reporting Conflicts: http-push.c http-walker.c sha1_file.c
2009-01-11Wrap inflate and other zlib routines for better error reportingLinus Torvalds1-4/+4
R. Tyler Ballance reported a mysterious transient repository corruption; after much digging, it turns out that we were not catching and reporting memory allocation errors from some calls we make to zlib. This one _just_ wraps things; it doesn't do the "retry on low memory error" part, at least not yet. It is an independent issue from the reporting. Some of the errors are expected and passed back to the caller, but we die when zlib reports it failed to allocate memory for now. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-02fix openssl headers conflicting with custom SHA1 implementationsNicolas Pitre1-5/+5
On ARM I have the following compilation errors: CC fast-import.o In file included from cache.h:8, from builtin.h:6, from fast-import.c:142: arm/sha1.h:14: error: conflicting types for 'SHA_CTX' /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here arm/sha1.h:16: error: conflicting types for 'SHA1_Init' /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here arm/sha1.h:17: error: conflicting types for 'SHA1_Update' /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here arm/sha1.h:18: error: conflicting types for 'SHA1_Final' /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here make: *** [fast-import.o] Error 1 This is because openssl header files are always included in git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason. Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file. As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
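The renaming approach can be illustrated roughly like this. Everything here is an assumption made for the sketch: the `toy_*` functions are a placeholder that only counts bytes (it is not SHA-1), and the `git_*` macro names are chosen to show the pattern rather than quoted from the commit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Placeholder "implementation" with its own names. It only counts
 * bytes; a real build would map to a real SHA-1 implementation.
 */
typedef struct {
	unsigned long total;
} toy_sha_ctx;

static void toy_sha1_init(toy_sha_ctx *c)
{
	assert(c);
	c->total = 0;
}

static void toy_sha1_update(toy_sha_ctx *c, const void *data, size_t len)
{
	assert(c && data);
	c->total += (unsigned long)len;
}

/*
 * Callers only ever spell the project-specific names below, so a
 * system header that declares SHA_CTX or SHA1_Init cannot clash with
 * whichever implementation was selected at build time.
 */
#define git_SHA_CTX     toy_sha_ctx
#define git_SHA1_Init   toy_sha1_init
#define git_SHA1_Update toy_sha1_update
```

Switching implementations then means changing only the macro definitions, not the callers.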
2008-07-19Merge branch 'maint'Junio C Hamano1-0/+2
* maint: GIT 1.5.6.4 builtin-rm: fix index lock file path http-fetch: do not SEGV after fetching a bad pack idx file rev-list: honor --quiet option api-run-command.txt: typofix
2008-07-18http-fetch: do not SEGV after fetching a bad pack idx fileJunio C Hamano1-0/+2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-24move show_pack_info() where it belongsNicolas Pitre1-1/+1
This is called when verify_pack() has its verbose argument set, and verbose in this context makes sense only for the actual 'git verify-pack' command. Therefore let's move show_pack_info() to builtin-verify-pack.c instead and remove useless verbose argument from verify_pack(). Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-26Make walker.fetch_ref() take a struct ref.Daniel Barkalow1-2/+2
This simplifies a few things, makes a few things slightly more complicated, but, more importantly, allows that, when struct ref can represent a symref, http_fetch_ref() can return one. Incidentally makes the string that http_fetch_ref() gets include "refs/" (if appropriate), because that's how the name field of struct ref works. As far as I can tell, the usage in walker:interpret_target() wouldn't have worked previously, if it ever would have been used, which it wouldn't (since the fetch process uses the hash instead of the name of the ref there). Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-27Set proxy override with http_init()Mike Hommey1-2/+2
In transport.c, proxy setting (the one from the remote conf) was set through curl_easy_setopt() call, while http.c already does the same with the http.proxy setting. We now just use this infrastructure instead, and make http_init() now take the struct remote as argument so that it can take the http_proxy setting from there, and any other property that would be added later. At the same time, we make get_http_walker() take a struct remote argument too, and pass it to http_init(), which makes remote defined proxy be used for more than get_refs_via_curl(). We leave out http-fetch and http-push, which don't use remotes for the moment, purposefully. Signed-off-by: Mike Hommey <mh@glandium.org> Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14Move fetch_ref from http-push.c and http-walker.c to http.cMike Hommey1-79/+1
Reconcile the differences between the two copies and rename the function to http_fetch_ref. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14Fix various memory leaks in http-push.c and http-walker.cMike Hommey1-15/+25
Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14Use strbuf in http codeMike Hommey1-38/+21
Also, replace whitespace with tabs in some places. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14Avoid redundant declaration of missing_target()Mike Hommey1-13/+0
Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14Fix random sha1 in error message in http-fetch and http-pushMike Hommey1-2/+3
When a downloaded ref doesn't contain a sha1, the error message displays a random sha1 because of uninitialized memory. This happens when cloning a repository that is already a clone of another one, in which case refs/remotes/origin/HEAD is a symref. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
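The class of bug described here can be sketched as below. This is not the patch itself: `parse_ref_hex` is a hypothetical helper, and the only point is that the output buffer is given a defined value before any failure path can return, so an error message never prints leftover memory.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical parser for the body of a downloaded ref file.
 * On success, copies 40 hex digits into hex_out (at least 41 bytes)
 * and returns 0. On failure (for example a symref such as
 * "ref: refs/heads/master"), returns -1 and leaves hex_out as an
 * empty string instead of uninitialized bytes.
 */
static int parse_ref_hex(const char *content, char *hex_out)
{
	size_t i;

	assert(content && hex_out);
	hex_out[0] = '\0'; /* defined value on every path */

	for (i = 0; i < 40; i++) {
		/* A NUL terminator is not a hex digit, so short input stops here. */
		if (!isxdigit((unsigned char)content[i]))
			return -1;
	}
	memcpy(hex_out, content, 40);
	hex_out[40] = '\0';
	return 0;
}
```

A caller that reports an error can then print the downloaded content, or nothing, rather than a hash that was never filled in.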
2007-11-25Print the real filename that we failed to open.André Goddard Rosa1-2/+2
When we fail to open a temporary file to be renamed to something else, we reported the final filename, not the temporary file we failed to open. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
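A small sketch of the corrected reporting, assuming a hypothetical helper `open_temp` and a hypothetical ".temp" suffix for the temporary name: the error text names the path that was actually passed to fopen(), with the final name given only as context.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open "<filename>.temp" for writing.
 * On failure, returns NULL and writes a message into err that names
 * the temporary path that could not be opened.
 */
static FILE *open_temp(const char *filename, char *err, size_t errlen)
{
	char tmpfile[4096];
	FILE *f;

	assert(filename && err && errlen > 0);
	err[0] = '\0';

	snprintf(tmpfile, sizeof(tmpfile), "%s.temp", filename);
	f = fopen(tmpfile, "wb");
	if (!f)
		snprintf(err, errlen,
			 "Couldn't create temporary file %s for %s: %s",
			 tmpfile, filename, strerror(errno));
	return f;
}
```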
2007-09-19Modularize commit-walkerDaniel Barkalow1-0/+1035
This turns the extern functions to be provided by the backend into a struct of pointers, renames the functions to be more namespace-friendly, and updates http-fetch to this interface. It removes the unused include from http-push.c. It makes git-http-fetch a builtin (with the implementation a separate file, accessible directly). Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
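The "struct of pointers" shape can be sketched as follows. The names (`demo_walker`, `demo_http_*`) and the reduced set of callbacks are assumptions for illustration, not the actual walker interface; the point is that generic walking code calls through the struct and never references a backend by name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced interface: backend state plus callbacks. */
struct demo_walker {
	void *data;
	int (*fetch)(struct demo_walker *w, const char *object_id);
	void (*cleanup)(struct demo_walker *w);
};

/* Hypothetical backend state; a real backend would hold connection data. */
struct demo_http_state {
	int fetched;
	int cleaned;
};

static int demo_http_fetch(struct demo_walker *w, const char *object_id)
{
	struct demo_http_state *s = w->data;

	assert(object_id);
	s->fetched++; /* stand-in for an actual transfer */
	return 0;
}

static void demo_http_cleanup(struct demo_walker *w)
{
	struct demo_http_state *s = w->data;

	s->cleaned++;
}

/* Build a walker bound to one backend instance. */
static struct demo_walker make_demo_http_walker(struct demo_http_state *state)
{
	struct demo_walker w;

	w.data = state;
	w.fetch = demo_http_fetch;
	w.cleanup = demo_http_cleanup;
	return w;
}

/* Generic code: fetch every id, stopping at the first failure. */
static int demo_walk(struct demo_walker *w, const char **ids, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (w->fetch(w, ids[i]) < 0)
			return -1;
	return 0;
}
```

Because the backend is reached only through the pointers, a second transport could supply its own `fetch` and `cleanup` without any change to `demo_walk`.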