aboutsummaryrefslogtreecommitdiffstats
path: root/connect.c
AgeCommit message (Collapse)AuthorFilesLines
2025-08-21Merge branch 'jc/string-list-split'Junio C Hamano1-1/+1
string_list_split*() family of functions have been extended to simplify common use cases. * jc/string-list-split: string-list: split-then-remove-empty can be done while splitting string-list: optionally omit empty string pieces in string_list_split*() diff: simplify parsing of diff.colormovedws string-list: optionally trim string pieces split by string_list_split*() string-list: unify string_list_split* functions string-list: align string_list_split() with its _in_place() counterpart string-list: report programming error with BUG
2025-08-02string-list: align string_list_split() with its _in_place() counterpartJunio C Hamano1-1/+1
The string_list_split_in_place() function was updated by 52acddf3 (string-list: multi-delimiter `string_list_split_in_place()`, 2023-04-24) to take more than one delimiter characters, hoping that we can later use it to replace our uses of strtok(). We however did not make a matching change to the string_list_split() function, which is very similar. Before giving both functions more features in future commits, allow string_list_split() to also take more than one delimiter characters to make them closer to each other. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt1-2/+2
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt1-1/+1
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01Use legacy hash for legacy formatsbrian m. carlson1-3/+3
We have a large variety of data formats and protocols where no hash algorithm was defined and the default was assumed to always be SHA-1. Instead of explicitly stating SHA-1, let's use the constant to represent the legacy hash algorithm (which is still SHA-1) so that it's clear for documentary purposes that it's a legacy fallback option and not an intentional choice to use SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05Merge branch 'cc/lop-remote'Junio C Hamano1-0/+9
Large-object promisor protocol extension. * cc/lop-remote: doc: add technical design doc for large object promisors promisor-remote: check advertised name or URL Add 'promisor-remote' capability to protocol v2
2025-02-27Merge branch 'ua/os-version-capability'Junio C Hamano1-1/+1
The value of "uname -s" is by default sent over the wire as a part of the "version" capability. * ua/os-version-capability: agent: advertise OS name via agent capability t5701: add setup test to remove side-effect dependency version: extend get_uname_info() to hide system details version: refactor get_uname_info() version: refactor redact_non_printables() version: replace manual ASCII checks with isprint() for clarity
2025-02-19agent: advertise OS name via agent capabilityUsman Akinyemi1-1/+1
As some issues that can happen with a Git client can be operating system specific, it can be useful for a server to know which OS a client is using. In the same way it can be useful for a client to know which OS a server is using. Our current agent capability is in the form of "package/version" (e.g., "git/1.8.3.1"). Let's extend it to include the operating system name (os) i.e in the form "package/version-os" (e.g., "git/1.8.3.1-Linux"). Including OS details in the agent capability simplifies implementation, maintains backward compatibility, avoids introducing a new capability, encourages adoption across Git-compatible software, and enhances debugging by providing complete environment information without affecting functionality. The operating system name is retrieved using the 'sysname' field of the `uname(2)` system call or its equivalent. However, there are differences between `uname(1)` (command-line utility) and `uname(2)` (system call) outputs on Windows. These discrepancies complicate testing on Windows platforms. For example: - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\ .2024-02-14.20:17.UTC.x86_64 - `uname(2)` output: Windows.10.0.20348 On Windows, uname(2) is not actually system-supplied but is instead already faked up by Git itself. We could have overcome the test issue on Windows by implementing a new `uname` subcommand in `test-tool` using uname(2), but except uname(2), which would be tested against itself, there would be nothing platform specific, so it's just simpler to disable the tests on Windows. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18Add 'promisor-remote' capability to protocol v2Christian Couder1-0/+9
When a server S knows that some objects from a repository are available from a promisor remote X, S might want to suggest to a client C cloning or fetching the repo from S that C may use X directly instead of S for these objects. Note that this could happen both in the case S itself doesn't have the objects and borrows them from X, and in the case S has the objects but knows that X is better connected to the world (e.g., it is in a $LARGEINTERNETCOMPANY datacenter with petabit/s backbone connections) than S. Implementation of the latter case, which would require S to omit in its response the objects available on X, is left for future improvement though. Then C might or might not, want to get the objects from X. If S and C can agree on C using X directly, S can then omit objects that can be obtained from X when answering C's request. To allow S and C to agree and let each other know about C using X or not, let's introduce a new "promisor-remote" capability in the protocol v2, as well as a few new configuration variables: - "promisor.advertise" on the server side, and: - "promisor.acceptFromServer" on the client side. By default, or if "promisor.advertise" is set to 'false', a server S will not advertise the "promisor-remote" capability. If S doesn't advertise the "promisor-remote" capability, then a client C replying to S shouldn't advertise the "promisor-remote" capability either. If "promisor.advertise" is set to 'true', S will advertise its promisor remotes with a string like: promisor-remote=<pr-info>[;<pr-info>]... where each <pr-info> element contains information about a single promisor remote in the form: name=<pr-name>[,url=<pr-url>] where <pr-name> is the urlencoded name of a promisor remote and <pr-url> is the urlencoded URL of the promisor remote named <pr-name>. For now, the URL is passed in addition to the name. In the future, it might be possible to pass other information like a filter-spec that the client may use when cloning from S, or a token that the client may use when retrieving objects from X. It is C's responsibility to arrange how it can reach X though, so pieces of information that are usually outside Git's concern, like proxy configuration, must not be distributed over this protocol. It might also be possible in the future for "promisor.advertise" to have other values. For example a value like "onlyName" could prevent S from advertising URLs, which could help in case C should use a different URL for X than the URL S is using. (The URL S is using might be an internal one on the server side for example.) By default or if "promisor.acceptFromServer" is set to "None", C will not accept to use the promisor remotes that might have been advertised by S. In this case, C will not advertise any "promisor-remote" capability in its reply to S. If "promisor.acceptFromServer" is set to "All" and S advertised some promisor remotes, then on the contrary, C will accept to use all the promisor remotes that S advertised and C will reply with a string like: promisor-remote=<pr-name>[;<pr-name>]... where the <pr-name> elements are the urlencoded names of all the promisor remotes S advertised. In a following commit, other values for "promisor.acceptFromServer" will be implemented, so that C will be able to decide the promisor remotes it accepts depending on the name and URL it received from S. So even if that name and URL information is not used much right now, it will be needed soon. Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17connect: address -Wsign-compare warningsMike Hommey1-12/+11
Most of the warnings were about loop variables being declared as ints with a condition using a size_t, whereby switching the variable to size_t fixes the warning. One other case was comparing the result of strlen to an int passed as an argument, which turns out could just as well be passed as a size_t, albeit trickling to other functions. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25connect: clear child process before freeing in diagnostic modeJeff King1-0/+1
The git_connect() function has a special CONNECT_DIAG_URL mode, where we stop short of actually connecting to the other side and just print some parsing details. For URLs that require a child process (like ssh), we free() the child_process struct but forget to clear it, leaking the strings we stuffed into its "env" list. This leak is triggered many times in t5500, which uses "fetch-pack --diag-url", but we're not yet ready to mark it as leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13global: prepare for hiding away repo-less config functionsPatrick Steinhardt1-0/+2
We're about to hide config functions that implicitly depend on `the_repository` behind the `USE_THE_REPOSITORY_VARIABLE` macro. This will uncover a bunch of dependents that transitively relied on the global variable, but didn't define the macro yet. Adapt them such that we define the macro to prepare for this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04refs: call branches branchesJunio C Hamano1-2/+2
These things in refs/heads/ hierarchy are called "branches" in human parlance. Replace REF_HEADS with REF_BRANCHES to make it clearer. No end-user visible change intended at this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-2/+2
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-2/+2
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren1-0/+1
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-25Merge branch 'jk/protocol-cap-parse-fix'Junio C Hamano1-14/+16
The code to parse capability list for v0 on-wire protocol fell into an infinite loop when a capability appears multiple times, which has been corrected. * jk/protocol-cap-parse-fix: v0 protocol: use size_t for capability length/offset t5512: test "ls-remote --heads --symref" filtering with v0 and v2 t5512: allow any protocol version for filtered symref test t5512: add v2 support for "ls-remote --symref" test v0 protocol: fix sha1/sha256 confusion for capabilities^{} t5512: stop referring to "v1" protocol v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25Merge branch 'en/header-split-cache-h'Junio C Hamano1-1/+1
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-14v0 protocol: use size_t for capability length/offsetJeff King1-11/+11
When parsing server capabilities, we use "int" to store lengths and offsets. At first glance this seems like a spot where our parser may be confused by integer overflow if somebody sent us a malicious response. In practice these strings are all bounded by the 64k limit of a pkt-line, so using "int" is OK. However, it makes the code simpler to audit if they just use size_t everywhere. Note that because we take these parameters as pointers, this also forces many callers to update their declared types. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14v0 protocol: fix sha1/sha256 confusion for capabilities^{}Jeff King1-1/+2
Commit eb398797cd (connect: advertized capability is not a ref, 2016-09-09) added support for an upload-pack server responding with: 0000000000000000000000000000000000000000 capabilities^{} followed by a NUL and the actual capabilities. We correctly parse the oid using the packet_reader's hash_algo field, but then we compare it to null_oid(), which will instead use our current repo's default algorithm. If we're defaulting to sha256 locally but the other side is sha1, they won't match and we'll fail to parse the line (and thus die()). This can cause a test failure when the suite is run with GIT_TEST_DEFAULT_HASH=sha256, and we even do so regularly via the linux-sha256 CI job. But since the test requires JGit to run, it's usually just skipped, and nobody noticed the problem. The reason the original patch used JGit is that Git itself does not ever produce such a line via upload-pack; the feature was added to fix a real-world problem when interacting with JGit. That was good for verifying that the incompatibility was fixed, but it's not a good regression test: - hardly anybody runs it, because you have to have jgit installed; hence this bug going unnoticed - we're depending on jgit's behavior for the test to do anything useful. In particular, this behavior is only relevant to the v0 protocol, but these days we ask for the v2 protocol by default. So for modern jgit, this is probably testing nothing. - it's complicated and slow. We had to do some fifo trickery to handle races, and this one test makes up 40% of the runtime of the total script. Instead, let's just hard-code the response that's of interest to us. That will test exactly what we want for every run, and reveals the bug when run in sha256 mode. And of course we'll fix the actual bug by using the correct hash_algo struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14v0 protocol: fix infinite loop when parsing multi-valued capabilitiesJeff King1-2/+3
If Git's client-side parsing of an upload-pack response (so git-fetch or ls-remote) sees multiple instances of a single capability, it can enter an infinite loop due to a bug in advancing the "offset" parameter in the parser. This bug can't happen between a client and server of the same Git version. The client bug is in parse_feature_value() when the caller passes in an offset parameter. And that only happens when the v0 protocol is parsing "symref" and "object-format" capabilities, via next_server_feature_value(). But Git has never produced multiple object-format capabilities, and it stopped producing multiple symref values in d007dbf7d6 (Revert "upload-pack: send non-HEAD symbolic refs", 2013-11-18). However, upload-pack did produce multiple symref entries for a while, and they are valid. Plus other implementations, such as Dulwich will still do so. So we should handle them. And even if we do not expect it, it is obviously a bug for the parser to enter an infinite loop. The bug itself is pretty simple. Commit 2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25) added the "offset" parameter, which is used as both an in- and out-parameter. When parsing the first "symref" capability, *offset will be 0 on input, and after parsing the capability, we set *offset to an index just past the value by taking a pointer difference "(value + end) - feature_list". But on the second call, now *offset is set to that larger index, which lets us skip past the first "symref" capability. However, we do so by incrementing feature_list. That means our pointer difference is now too small; it is counting from where we resumed parsing, not from the start of the original feature_list pointer. And because we incremented feature_list only inside our function, and not the caller, that increment is lost next time the function is called. One solution would be to account for those skipped bytes by incrementing *offset, rather than assigning to it. But wait, there's more! We also increment feature_list if we have a near-miss. Say we are looking for "symref" and find "almost-symref". In that case we'll point feature_list to the "y" in "almost-symref" and restart our search. But that again means our offset won't be correct, as it won't account for the bytes between the start of the string and that "y". So instead, let's just record the beginning of the feature_list string in a separate pointer that we never touch. That offset we take in and return is meant to be using that point as a base, and now we'll do so consistently. Since the bug can't be reproduced using the current version of git-upload-pack, we'll instead hard-code an input which triggers the problem. Before this patch it loops forever re-parsing the second symref entry. Now we check both that it finishes, and that it parses both entries correctly (a case we could not test at all before). We don't need to worry about testing v2 here; it communicates the capabilities in a completely different way, and doesn't use this code at all. There are tests earlier in t5512 that are meant to cover this (they don't, but we'll address that in a future patch). Reported-by: Jonas Haag <jonas@lophus.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: remove unnecessary cache.h inclusionElijah Newren1-1/+0
Several files were including cache.h solely to get other headers, such as trace.h and trace2.h. Since the last few commits have modified files to make these dependencies more explicit, the inclusion of cache.h is no longer needed in several cases. Remove it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on trace.h & trace2.hElijah Newren1-0/+1
Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano1-0/+2
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-03-28Merge branch 'jk/fix-proto-downgrade-to-v0'Junio C Hamano1-3/+5
Transports that do not support protocol v2 did not correctly fall back to protocol v0 under certain conditions, which has been corrected. * jk/fix-proto-downgrade-to-v0: git_connect(): fix corner cases in downgrading v2 to v0
2023-03-21environment.h: move declarations for environment.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-19Merge branch 'zh/push-to-delete-onelevel-ref'Junio C Hamano1-1/+2
"git push" has been taught to allow deletion of refs with one-level names to help repairing a repository who acquired such a ref by mistake. In general, we don't encourage use of such a ref, and creation or update to such a ref is rejected as before. * zh/push-to-delete-onelevel-ref: push: allow delete single-level ref receive-pack: fix funny ref error messsage
2023-03-17git_connect(): fix corner cases in downgrading v2 to v0Jeff King1-3/+5
There's code in git_connect() that checks whether we are doing a push with protocol_v2, and if so, drops us to protocol_v0 (since we know how to do v2 only for fetches). But it misses some corner cases: 1. it checks the "prog" variable, which is actually the path to receive-pack on the remote side. By default this is just "git-receive-pack", but it could be an arbitrary string (like "/path/to/git receive-pack", etc). We'd accidentally stay in v2 mode in this case. 2. besides "receive-pack" and "upload-pack", there's one other value we'd expect: "upload-archive" for handling "git archive --remote". Like receive-pack, this doesn't understand v2, and should use the v0 protocol. In practice, neither of these causes bugs in the real world so far. We do send a "we understand v2" probe to the server, but since no server implements v2 for anything but upload-pack, it's simply ignored. But this would eventually become a problem if we do implement v2 for those endpoints, as older clients would falsely claim to understand it, leading to a server response they can't parse. We can fix (1) by passing in both the program path and the "name" of the operation. I treat the name as a string here, because that's the pattern set in transport_connect(), which is one of our callers (we were simply throwing away the "name" value there before). We can fix (2) by allowing only known-v2 protocols ("upload-pack"), rather than blocking unknown ones ("receive-pack" and "upload-archive"). That will mean whoever eventually implements v2 push will have to adjust this list, but that's reasonable. We'll do the safe, conservative thing (sticking to v0) by default, and anybody working on v2 will quickly realize this spot needs to be updated. The new tests cover the receive-pack and upload-archive cases above, and re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that already worked before this patch, of course, but it would be an easy thing to break if we flipped the allow/block logic without also handling "name" separately). Here are a few miscellaneous implementation notes, since I had to do a little head-scratching to understand who calls what: - transport_connect() is called only for git-upload-archive. For non-http git remotes, that resolves to the virtual connect_git() function (which then calls git_connect(); confused yet?). So plumbing through "name" in connect_git() covers that. - for regular fetches and pushes, callers use higher-level functions like transport_fetch_refs(). For non-http git remotes, that means calling git_connect() under the hood via connect_setup(). And that uses the "for_push" flag to decide which name to use. - likewise, plumbing like fetch-pack and send-pack may call git_connect() directly; they each know which name to use. - for remote helpers (including http), we already have separate parameters for "name" and "exec" (another name for "prog"). In process_connect_service(), we feed the "name" to the helper via "connect" or "stateless-connect" directives. There's also a "servpath" option, which can be used to tell the helper about the "exec" path. But no helpers we implement support it! For http it would be useless anyway (no reasonable server implementation will allow you to send a shell command to run the server). In theory it would be useful for more obscure helpers like remote-ext, but even there it is not implemented. It's tempting to get rid of it simply to reduce confusion, but we have publicly documented it since it was added in fa8c097cc9 (Support remote helpers implementing smart transports, 2009-12-09), so it's possible some helper in the wild is using it. - So for v2, helpers (again, including http) are mainly used via stateless-connect, driven by the main program. But they do still need to decide whether to do a v2 probe. And so there's similar logic in remote-curl.c's discover_refs() that looks for "git-receive-pack". But it's not buggy in the same way. Since it doesn't support servpath, it is always dealing with a "service" string like "git-receive-pack". And since it doesn't support straight "connect", it can't be used for "upload-archive". So we could leave that spot alone. But I've updated it here to match the logic we're changing in connect_git(). That seems like the least confusing thing for somebody who has to touch both of these spots later (say, to add v2 push support). I didn't add a new test to make sure this doesn't break anything; we already have several tests (in t5551 and elsewhere) that make sure we are using v2 over http. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-01push: allow delete single-level refZheNing Hu1-1/+2
We discourage the creation/update of single-level refs because some upper-layer applications only work in specified reference namespaces, such as "refs/heads/*" or "refs/tags/*", these single-level refnames may not be recognized. However, we still hope users can delete them which have been created by mistake. Therefore, when updating branches on the server with "git receive-pack", by checking whether it is a branch deletion operation, it will determine whether to allow the update of a single-level refs. This avoids creating/updating such single-level refs, but allows them to be deleted. On the client side, "git push" also does not properly fill in the old-oid of single-level refs, which causes the server-side "git receive-pack" to think that the ref's old-oid has changed when deleting single-level refs, this causes the push to be rejected. So the solution is to fix the client to be able to delete single-level refs by properly filling old-oid. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-02Merge branch 'ds/bundle-uri-4'Junio C Hamano1-0/+44
Bundle URIs part 4. * ds/bundle-uri-4: clone: unbundle the advertised bundles bundle-uri: download bundles from an advertised list bundle-uri: allow relative URLs in bundle lists strbuf: introduce strbuf_strip_file_from_path() bundle-uri: serve bundle.* keys from config bundle-uri client: add helper for testing server transport: rename got_remote_heads bundle-uri client: add boolean transfer.bundleURI setting clone: request the 'bundle-uri' command when available t: create test harness for 'bundle-uri' command protocol v2: add server-side "bundle-uri" skeleton
2022-12-25clone: request the 'bundle-uri' command when availableÆvar Arnfjörð Bjarmason1-0/+44
Set up all the needed client parts of the 'bundle-uri' protocol v2 command, without actually doing anything with the bundle URIs. If the server says it supports 'bundle-uri' teach Git to issue the 'bundle-uri' command after the 'ls-refs' during 'git clone'. The returned key=value pairs are passed to the bundle list code which is tested using a different ingest mechanism in t5750-bundle-uri-parse.sh. At this point, Git does nothing with that bundle list. It will not download any of the bundles. That will come in a later change after these protocol bits are finalized. The no-op client is initially used only by 'git clone' to test the basic functionality, and eventually will bootstrap the initial download of Git objects during a fresh clone. The bundle URI client will not be integrated into other fetches until a mechanism is created to select a subset of bundles for download. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13server_supports_v2(): use a separate function for die_on_errorJeff King1-9/+12
The server_supports_v2() helper lets a caller find out if the server supports a feature, and will optionally die if it's not supported. This makes the return value confusing, as it's only meaningful when the function is not asked to die. Coverity flagged a new call like: /* check that we support "foo" */ server_supports_v2("foo", 1); complaining that we usually checked the return value, but this time we didn't. But this call is correct, and other ones that did: if (server_supports_v2("foo", 1)) do_something_with_foo(); are "wrong", in the sense that we know the conditional will always be true (but there's no bug; the code is simply misleading). Let's split the "die" behavior into its own function which returns void, and modify each caller to use the correct one. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-10Merge branch 'ab/env-array'Junio C Hamano1-4/+5
Rename .env_array member to .env in the child_process structure. * ab/env-array: run-command API users: use "env" not "env_array" in comments & names run-command API: rename "env_array" to "env"
2022-06-02run-command API: rename "env_array" to "env"Ævar Arnfjörð Bjarmason1-4/+5
Start following-up on the rename mentioned in c7c4bdeccf3 (run-command API: remove "env" member, always use "env_array", 2021-11-25) of "env_array" to "env". The "env_array" name was picked in 19a583dc39e (run-command: add env_array, an optional argv_array for env, 2014-10-19) because "env" was taken. Let's not forever keep the oddity of "*_array" for this "struct strvec", but not for its "args" sibling. This commit is almost entirely made with a coccinelle rule[1]. The only manual change here is in run-command.h to rename the struct member itself and to change "env_array" to "env" in the CHILD_PROCESS_INIT initializer. The rest of this is all a result of applying [1]: * make contrib/coccinelle/run_command.cocci.patch * patch -p1 <contrib/coccinelle/run_command.cocci.patch * git add -u 1. cat contrib/coccinelle/run_command.pending.cocci @@ struct child_process E; @@ - E.env_array + E.env @@ struct child_process *E; @@ - E->env_array + E->env I've avoided changing any comments and derived variable names here, that will all be done in the next commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-16connect.c: refactor sending of agent & object-formatÆvar Arnfjörð Bjarmason1-13/+20
Refactor the sending of the "agent" and "object-format" capabilities into a function. This was added in its current form in ab67235bc4 (connect: parse v2 refs with correct hash algorithm, 2020-05-25). When we connect to a v2 server we need to know about its object-format, and it needs to know about ours. Since most things in connect.c and transport.c piggy-back on the eager getting of remote refs via the handshake() those commands can make use of the just-sent-over object-format by ls-refs. But I'm about to add a command that may come after ls-refs, and may not, but we need the server to know about our user-agent and object-format. So let's split this into a function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-06ls-remote & transport API: release "struct transport_ls_refs_options"Ævar Arnfjörð Bjarmason1-2/+2
Fix a memory leak in codepaths that use the "struct transport_ls_refs_options" API. Since the introduction of the struct in 39835409d10 (connect, transport: encapsulate arg in struct, 2021-02-05) the caller has been responsible for freeing it. That commit in turn migrated code originally added in 402c47d9391 (clone: send ref-prefixes when using protocol v2, 2018-07-20) and b4be74105fe (ls-remote: pass ref prefixes when requesting a remote's refs, 2018-03-15). Only some of those codepaths were releasing the allocated resources of the struct, now all of them will. Mark the "t/t5511-refspec.sh" test as passing when git is compiled with SANITIZE=leak. They'll now be listed as running under the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI target). Previously 24/47 tests would fail. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-03Merge branch 'ah/connect-parse-feature-v0-fix'Junio C Hamano1-0/+2
Protocol v0 clients can get stuck parsing a malformed feature line. * ah/connect-parse-feature-v0-fix: connect: also update offset for features without values
2021-09-27connect: also update offset for features without valuesAndrzej Hunt1-0/+2
parse_feature_value() takes an offset, and uses it to seek past the point in features_list that we've already seen. However if the feature being searched for does not specify a value, the offset is not updated. Therefore if we call parse_feature_value() in a loop on a value-less feature, we'll keep on parsing the same feature over and over again. This usually isn't an issue: there's no point in using next_server_feature_value() to search for repeated instances of the same capability unless that capability typically specifies a value - but a broken server could send a response that omits the value for a feature even when we are expecting a value. Therefore we add an offset update calculation for the no-value case, which helps ensure that loops using next_server_feature_value() will always terminate. next_server_feature_value(), and the offset calculation, were first added in 2.28 in 2c6a403d96 (connect: add function to parse multiple v1 capability values, 2020-05-25). Thanks to Peff for authoring the test. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10connect, protocol: log negotiated protocol versionJosh Steadmon1-0/+2
It is useful for performance monitoring and debugging purposes to know the wire protocol used for remote operations. This may differ from the version set in local configuration due to differences in version and/or configuration between the server and the client. Therefore, log the negotiated wire protocol version via trace2, for both clients and servers. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27hash: provide per-algorithm null OIDsbrian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17Merge branch 'jt/clone-unborn-head'Junio C Hamano1-3/+29
"git clone" tries to locally check out the branch pointed at by HEAD of the remote repository after it is done, but the protocol did not convey the information necessary to do so when copying an empty repository. The protocol v2 learned how to do so. * jt/clone-unborn-head: clone: respect remote unborn HEAD connect, transport: encapsulate arg in struct ls-refs: report unborn targets of symrefs
2021-02-05clone: respect remote unborn HEADJonathan Tan1-2/+26
Teach Git to use the "unborn" feature introduced in a previous patch as follows: Git will always send the "unborn" argument if it is supported by the server. During "git clone", if cloning an empty repository, Git will use the new information to determine the local branch to create. In all other cases, Git will ignore it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05connect, transport: encapsulate arg in structJonathan Tan1-1/+3
In a future patch we plan to return the name of an unborn current branch from deep in the callchain to a caller via a new pointer parameter that points at a variable in the caller when the caller calls get_remote_refs() and transport_get_remote_refs(). In preparation for that, encapsulate the existing ref_prefixes parameter into a struct. The aforementioned unborn current branch will go into this new struct in the future patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25Merge branch 'jk/forbid-lf-in-git-url'Junio C Hamano1-0/+2
Newline characters in the host and path part of git:// URL are now forbidden. * jk/forbid-lf-in-git-url: fsck: reject .gitmodules git:// urls with newlines git_connect_git(): forbid newlines in host and path
2021-01-07git_connect_git(): forbid newlines in host and pathJeff King1-0/+2
When we connect to a git:// server, we send an initial request that looks something like: 002dgit-upload-pack repo.git\0host=example.com If the repo path contains a newline, then it's included literally, and we get: 002egit-upload-pack repo .git\0host=example.com This works fine if you really do have a newline in your repository name; the server side uses the pktline framing to parse the string, not newlines. However, there are many _other_ protocols in the wild that do parse on newlines, such as HTTP. So a carefully constructed git:// URL can actually turn into a valid HTTP request. For example: git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 %0d%0aHost:localhost%0d%0a%0d%0a becomes: 0050git-upload-pack / GET / HTTP/1.1 Host:localhost host=localhost:1234 on the wire. Again, this isn't a problem for a real Git server, but it does mean that feeding a malicious URL to Git (e.g., through a submodule) can cause it to make unexpected cross-protocol requests. Since repository names with newlines are presumably quite rare (and indeed, we already disallow them in git-over-http), let's just disallow them over this protocol. Hostnames could likewise inject a newline, but this is unlikely a problem in practice; we'd try resolving the hostname with a newline in it, which wouldn't work. Still, it doesn't hurt to err on the side of caution there, since we would not expect them to work in the first place. The ssh and local code paths are unaffected by this patch. In both cases we're trying to run upload-pack via a shell, and will quote the newline so that it makes it intact. An attacker can point an ssh url at an arbitrary port, of course, but unless there's an actual ssh server there, we'd never get as far as sending our shell command anyway. We _could_ similarly restrict newlines in those protocols out of caution, but there seems little benefit to doing so. The new test here is run alongside the git-daemon tests, which cover the same protocol, but it shouldn't actually contact the daemon at all. In theory we could make the test more robust by setting up an actual repository with a newline in it (so that our clone would succeed if our new check didn't kick in). But a repo directory with newline in it is likely not portable across all filesystems. Likewise, we could check git-daemon's log that it was not contacted at all, but we do not currently record the log (and anyway, it would make the test racy with the daemon's log write). We'll just check the client-side stderr to make sure we hit the expected code path. Reported-by: Harold Kim <h.kim@flatt.tech> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27Merge branch 'jk/leakfix'Junio C Hamano1-2/+2
Code clean-up. * jk/leakfix: submodule--helper: fix leak of core.worktree value config: fix leak in git_config_get_expiry_in_days() config: drop git_config_get_string_const() config: fix leaks from git_config_get_string_const() checkout: fix leak of non-existent branch names submodule--helper: use strbuf_release() to free strbufs clear_pattern_list(): clear embedded hashmaps
2020-08-14config: fix leaks from git_config_get_string_const()Jeff King1-2/+2
There are two functions to get a single config string: - git_config_get_string() - git_config_get_string_const() One might naively think that the first one allocates a new string and the second one just points us to the internal configset storage. But in fact they both allocate a new copy; the second one exists only to avoid having to cast when using it with a const global which we never intend to free. The documentation for the function explains that clearly, but it seems I'm not alone in being surprised by this. Of 17 calls to the function, 13 of them leak the resulting value. We could obviously fix these by adding the appropriate free(). But it would be simpler still if we actually had a non-allocating way to get the string. There's git_config_get_value() but that doesn't quite do what we want. If the config key is present but is a boolean with no value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the string versions will print an error and die). So let's introduce a new variant, git_config_get_string_tmp(), that behaves as these callers expect. We need a new name because we have new semantics but the same function signature (so even if we converted the four remaining callers, topics in flight might be surprised). The "tmp" is because this value should only be held onto for a short time. In practice it's rare for us to clear and refresh the configset, invalidating the pointer, but hopefully the "tmp" makes callers think about the lifetime. In each of the converted cases here the value only needs to last within the local function or its immediate caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30strvec: rename struct fieldsJeff King1-8/+8
The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: fix indentation in renamed callsJeff King1-3/+4
Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: convert more callers away from argv_array nameJeff King1-24/+24
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06Merge branch 'bc/sha-256-part-2'Junio C Hamano1-29/+109
SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...
2020-05-27serve: advertise object-format capability for protocol v2brian m. carlson1-0/+2
In order to communicate the protocol supported by the server side, add support for advertising the object-format capability. We check that the client side sends us an identical algorithm if it sends us its own object-format capability, and assume it speaks SHA-1 if not. In the test, when we're using an algorithm other than SHA-1, we need to specify the algorithm in use so we don't get a failure with an "unknown format" message. Add a test that we handle a mismatched algorithm. Remove the test_oid_init call since it's no longer necessary. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: parse v2 refs with correct hash algorithmbrian m. carlson1-5/+16
When using protocol v2, we need to know what hash algorithm is used by the remote end. See if the server has sent us an object-format capability, and if so, use it to determine the hash algorithm in use and set that value in the packet reader. Parse the refs using this algorithm. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: pass full packet reader when parsing v2 refsbrian m. carlson1-2/+3
When we're parsing refs, we need to know not only what the line we're parsing is, but also the hash algorithm we should use to parse it, which is stored in the reader object. Pass the packet reader object through to the protocol v2 ref parsing function. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: detect algorithm when fetching refsbrian m. carlson1-4/+17
If we're fetching refs, detect the hash algorithm and parse the refs using that algorithm. As mentioned in the documentation, if multiple versions of the object-format capability are provided, we use the first. No known implementation supports multiple algorithms now, but they may in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: make parse_feature_value externbrian m. carlson1-2/+1
We're going to be using this function in other files, so no longer mark this function static. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: add function to detect supported v1 hash functionsbrian m. carlson1-0/+22
Add a function, server_supports_hash, to see if the remote server supports a particular hash algorithm when speaking protocol v1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: add function to fetch value of a v2 server capabilitybrian m. carlson1-0/+15
So far in protocol v2, all of our server capabilities that have values have not had values that we've been interested in parsing. For example, we receive but ignore the agent value. However, in a future commit, we're going to want to parse out the value of a server capability. To make this easy, add a function, server_feature_v2, that can fetch the value provided as part of the server capability. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: add function to parse multiple v1 capability valuesbrian m. carlson1-9/+21
In a capability response, we can have multiple symref entries. In the future, we will also allow for multiple hash algorithms to be specified. To avoid duplication, expand the parse_feature_value function to take an optional offset where the parsing should begin next time. Add a wrapper function that allows us to query the next server feature value, and use it in the existing symref parsing code. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27connect: have ref processing code take struct packet_readerbrian m. carlson1-11/+16
In a future patch, we'll want to access multiple members from struct packet_reader when parsing references. Therefore, have the ref parsing code take pointers to struct reader instead of having to pass multiple arguments to each function. Rename the len variable to "linelen" to make it clearer what the variable does in light of the variable change. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24stateless-connect: send response end packetDenton Liu1-1/+15
Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated before the transaction is complete, remote-curl will blindly forward the packets before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the transaction and continue reading, expecting more input to continue the transaction. This results in a deadlock between the two processes. This can be seen in the following command which does not terminate: $ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... whereas the v1 version does terminate as expected: $ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... fatal: the remote end hung up unexpectedly Instead of blindly forwarding packets, make remote-curl insert a response end packet after proxying the responses from the remote server when using stateless_connect(). On the RPC client side, ensure that each response ends as described. A separate control packet is chosen because we need to be able to differentiate between what the remote server sends and remote-curl's control packets. By ensuring in the remote-curl code that a server cannot send response end packets, we prevent a malicious server from being able to perform a denial of service attack in which they spoof a response end packet and cause the described deadlock to happen. Reported-by: Force Charlie <charlieio@outlook.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24pkt-line: define PACKET_READ_RESPONSE_ENDDenton Liu1-0/+2
In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30oid_array: rename source file from sha1-arrayJeff King1-1/+1
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-06Sync with 2.23.1Johannes Schindelin1-1/+1
* maint-2.23: (44 commits) Git 2.23.1 Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters ...
2019-12-06Sync with 2.21.1Johannes Schindelin1-1/+1
* maint-2.21: (42 commits) Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh ...
2019-12-06Sync with 2.20.2Johannes Schindelin1-1/+1
* maint-2.20: (36 commits) Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories ...
2019-12-06Sync with 2.19.3Johannes Schindelin1-1/+1
* maint-2.19: (34 commits) Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams ...
2019-12-06Sync with 2.18.2Johannes Schindelin1-1/+1
* maint-2.18: (33 commits) Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up ...
2019-12-06Sync with 2.17.3Johannes Schindelin1-1/+1
* maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ...
2019-12-06Sync with 2.15.4Johannes Schindelin1-1/+1
* maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...
2019-12-06Sync with 2.14.6Johannes Schindelin1-1/+1
* maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...
2019-12-05mingw: handle `subst`-ed "DOS drives"Johannes Schindelin1-1/+1
Over a decade ago, in 25fe217b86c (Windows: Treat Windows style path names., 2008-03-05), Git was taught to handle absolute Windows paths, i.e. paths that start with a drive letter and a colon. Unbeknownst to us, while drive letters of physical drives are limited to letters of the English alphabet, there is a way to assign virtual drive letters to arbitrary directories, via the `subst` command, which is _not_ limited to English letters. It is therefore possible to have absolute Windows paths of the form `1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode letters can also be used, e.g. `ä:\tschibät.sch`. While it can be sensibly argued that users who set up such funny drive letters really seek adverse consequences, the Windows Operating System is known to be a platform where many users are at the mercy of administrators who have their very own idea of what constitutes a reasonable setup. Therefore, let's just make sure that such funny paths are still considered absolute paths by Git, on Windows. In addition to Unicode characters, pretty much any character is a valid drive letter, as far as `subst` is concerned, even `:` and `"` or even a space character. While it is probably the opposite of smart to use them, let's safeguard `is_dos_drive_prefix()` against all of them. Note: `[::1]:repo` is a valid URL, but not a valid path on Windows. As `[` is now considered a valid drive letter, we need to be very careful to avoid misinterpreting such a string as valid local path in `url_is_local_not_ssh()`. To do that, we use the just-introduced function `is_valid_path()` (which will label the string as invalid file name because of the colon characters). This fixes CVE-2019-1351. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-08-26mingw: support UNC in git clone file://server/share/repoTorsten Bögershausen1-0/+4
Extend the parser to accept file://server/share/repo in the way that Windows users expect it to be parsed who are used to referring to file shares by UNC paths of the form \\server\share\folder. [jes: tightened check to avoid handling file://C:/some/path as a UNC path.] This closes https://github.com/git-for-windows/git/issues/1264. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22trace2:data: add trace2 transport child classificationJeff Hostetler1-0/+3
Add trace2 child classification for transport processes. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-02pack-protocol.txt: accept error packets in any contextMasaya Suzuki1-3/+0
In the Git pack protocol definition, an error packet may appear only in a certain context. However, servers can face a runtime error (e.g. I/O error) at an arbitrary timing. This patch changes the protocol to allow an error packet to be sent instead of any packet. Without this protocol spec change, when a server cannot process a request, there's no way to tell that to a client. Since the server cannot produce a valid response, it would be forced to cut a connection without telling why. With this protocol spec change, the server can be more gentle in this situation. An old client may see these error packets as an unexpected packet, but this is not worse than having an unexpected EOF. Following this protocol spec change, the error packet handling code is moved to pkt-line.c. Implementation wise, this implementation uses pkt-line to communicate with a subprocess. Since this is not a part of Git protocol, it's possible that a packet that is not supposed to be an error packet is mistakenly parsed as an error packet. This error packet handling is enabled only for the Git pack protocol parsing code considering this. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "oidcmp() == 0" to oideq()Jeff King1-1/+1
Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23connect.c: mark more strings for translationNguyễn Thái Ngọc Duy1-35/+39
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23Update messages in preparation for i18nNguyễn Thái Ngọc Duy1-11/+10
Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-01Merge branch 'nd/command-list'Junio C Hamano1-0/+1
The list of commands with their various attributes were spread across a few places in the build procedure, but it now is getting a bit more consolidated to allow more automation. * nd/command-list: completion: allow to customize the completable command list completion: add and use --list-cmds=alias completion: add and use --list-cmds=nohelpers Move declaration for alias.c to alias.h completion: reduce completable command list completion: let git provide the completable command list command-list.txt: documentation and guide line help: use command-list.txt for the source of guides help: add "-a --verbose" to list all commands with synopsis git: support --list-cmds=list-<category> completion: implement and use --list-cmds=main,others git --list-cmds: collect command list in a string_list git.c: convert --list-* to --list-cmds=* Remove common-cmds.h help: use command-list.h for common command list generate-cmds.sh: export all commands to command-list.h generate-cmds.sh: factor out synopsis extract code
2018-05-23Merge branch 'bw/server-options'Junio C Hamano1-1/+8
The transport protocol v2 is getting updated further. * bw/server-options: fetch: send server options when using protocol v2 ls-remote: send server options when using protocol v2 serve: introduce the server-option capability
2018-05-21Move declaration for alias.c to alias.hNguyễn Thái Ngọc Duy1-0/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'nd/warn-more-for-devs'Junio C Hamano1-1/+1
The build procedure "make DEVELOPER=YesPlease" learned to enable a bit more warning options depending on the compiler used to help developers more. There also is "make DEVOPTS=tokens" knob available now, for those who want to help fixing warnings we usually ignore, for example. * nd/warn-more-for-devs: Makefile: add a DEVOPTS to get all of -Wextra Makefile: add a DEVOPTS to suppress -Werror under DEVELOPER Makefile: detect compiler and enable more warnings in DEVELOPER=1 connect.c: mark die_initial_contact() NORETURN
2018-04-24ls-remote: send server options when using protocol v2Brandon Williams1-1/+8
Teach ls-remote to optionally accept server options by specifying them on the cmdline via '-o' or '--server-option'. These server options are sent to the remote end when querying for the remote end's refs using protocol version 2. If communicating using a protocol other than v2 the provided options are ignored and not sent to the remote end. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16connect.c: mark die_initial_contact() NORETURNNguyễn Thái Ngọc Duy1-1/+1
There is a series running in parallel with this one that adds code like this switch (...) { case ...: die_initial_contact(); case ...: There is nothing wrong with this. There is no actual falling through. But since gcc is not that smart and gcc 7.x introduces -Wimplicit-fallthrough, it raises a false alarm in this case. This class of warnings may be useful elsewhere, so instead of suppressing the whole class, let's try to fix just this code. gcc is smart enough to realize that no execution can continue after a NORETURN function call and no longer raises the warning. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15connect: don't request v2 when pushingBrandon Williams1-0/+8
In order to be able to ship protocol v2 with only supporting fetch, we need clients to not issue a request to use protocol v2 when pushing (since the client currently doesn't know how to push using protocol v2). This allows a client to have protocol v2 configured in `protocol.version` and take advantage of using v2 for fetch and falling back to using v0 when pushing while v2 for push is being designed. We could run into issues if we didn't fall back to protocol v2 when pushing right now. This is because currently a server will ignore a request to use v2 when contacting the 'receive-pack' endpoint and fall back to using v0, but when push v2 is rolled out to servers, the 'receive-pack' endpoint will start responding using v2. So we don't want to get into a state where a client is requesting to push with v2 before they actually know how to push using v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15connect: refactor git_connect to only get the protocol version onceBrandon Williams1-12/+15
Instead of having each builtin transport asking for which protocol version the user has configured in 'protocol.version' by calling `get_protocol_version_config()` multiple times, factor this logic out so there is just a single call at the beginning of `git_connect()`. This will be helpful in the next patch where we can have centralized logic which determines if we need to request a different protocol version than what the user has configured. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15fetch-pack: support shallow requestsBrandon Williams1-0/+22
Enable shallow clones and deepen requests using protocol version 2 if the server 'fetch' command supports the 'shallow' feature. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15connect: request remote refs using v2Brandon Williams1-5/+133
Teach the client to be able to request a remote's refs using protocol v2. This is done by having a client issue a 'ls-refs' request to a v2 server. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14protocol: introduce enum protocol_version value protocol_v2Brandon Williams1-0/+3
Introduce protocol_v2, a new value for 'enum protocol_version'. Subsequent patches will fill in the implementation of protocol_v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14connect: discover protocol version outside of get_remote_headsBrandon Williams1-17/+10
In order to prepare for the addition of protocol_v2 push the protocol version discovery outside of 'get_remote_heads()'. This will allow for keeping the logic for processing the reference advertisement for protocol_v1 and protocol_v0 separate from the logic for protocol_v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14connect: convert get_remote_heads to use struct packet_readerBrandon Williams1-78/+95
In order to allow for better control flow when protocol_v2 is introduced convert 'get_remote_heads()' to use 'struct packet_reader' to read packet lines. This enables a client to be able to peek the first line of a server's response (without consuming it) in order to determine the protocol version its speaking and then passing control to the appropriate handler. This is needed because the initial response from a server speaking protocol_v0 includes the first ref, while subsequent protocol versions respond with a version line. We want to be able to read this first line without consuming the first ref sent in the protocol_v0 case so that the protocol version the server is speaking can be determined outside of 'get_remote_heads()' in a future patch. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21connect: correct style of C-style commentJonathan Nieder1-1/+2
Documentation/CodingGuidelines explains: - Multi-line comments include their delimiters on separate lines from the text. E.g. /* * A very long * multi-line comment. */ Reported-by: Brandon Williams <bmwill@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21ssh: 'simple' variant does not support --portJonathan Nieder1-3/+12
When trying to connect to an ssh:// URL with port explicitly specified and the ssh command configured with GIT_SSH does not support such a setting, it is less confusing to error out than to silently suppress the port setting and continue. This requires updating the GIT_SSH setting in t5603-clone-dirname.sh. That test is about the directory name produced when cloning various URLs. It uses an ssh wrapper that ignores all its arguments but does not declare that it supports a port argument; update it to set GIT_SSH_VARIANT=ssh to do so. (Real-life ssh wrappers that pass a port argument to OpenSSH would also support -G and would not require such an update.) Reported-by: William Yan <wyan@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21ssh: 'simple' variant does not support -4/-6Jonathan Nieder1-3/+22
If the user passes -4/--ipv4 or -6/--ipv6 to "git fetch" or "git push" and the ssh command configured with GIT_SSH does not support such a setting, error out instead of ignoring the option and continuing. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21ssh: 'auto' variant to select between 'ssh' and 'simple'Jonathan Nieder1-7/+25
Android's "repo" tool is a tool for managing a large codebase consisting of multiple smaller repositories, similar to Git's submodule feature. Starting with Git 94b8ae5a (ssh: introduce a 'simple' ssh variant, 2017-10-16), users noticed that it stopped handling the port in ssh:// URLs. The cause: when it encounters ssh:// URLs, repo pre-connects to the server and sets GIT_SSH to a helper ".repo/repo/git_ssh" that reuses that connection. Before 94b8ae5a, the helper was assumed to support OpenSSH options for lack of a better guess and got passed a -p option to set the port. After that patch, it uses the new default of a simple helper that does not accept an option to set the port. The next release of "repo" will set GIT_SSH_VARIANT to "ssh" to avoid that. But users of old versions and of other similar GIT_SSH implementations would not get the benefit of that fix. So update the default to use OpenSSH options again, with a twist. As observed in 94b8ae5a, we cannot assume that $GIT_SSH always handles OpenSSH options: common helpers such as travis-ci's dpl[*] are configured using GIT_SSH and do not accept OpenSSH options. So make the default a new variant "auto", with the following behavior: 1. First, check for a recognized basename, like today. 2. If the basename is not recognized, check whether $GIT_SSH supports OpenSSH options by running $GIT_SSH -G <options> <host> This returns status 0 and prints configuration in OpenSSH if it recognizes all <options> and returns status 255 if it encounters an unrecognized option. A wrapper script like exec ssh -- "$@" would fail with ssh: Could not resolve hostname -g: Name or service not known , correctly reflecting that it does not support OpenSSH options. The command is run with stdin, stdout, and stderr redirected to /dev/null so even a command that expects a terminal would exit immediately. 3. Based on the result from step (2), behave like "ssh" (if it succeeded) or "simple" (if it failed). This way, the default ssh variant for unrecognized commands can handle both the repo and dpl cases as intended. This autodetection has been running on Google workstations since 2017-10-23 with no reported negative effects. [*] https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215 Reported-by: William Yan <wyan@google.com> Improved-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21connect: split ssh option computation to its own functionJonathan Nieder1-28/+37
This puts the determination of options to pass to each ssh variant (see ssh.variant in git-config(1)) in one place. A follow-up patch will use this in an initial dry run to detect which variant to use when the ssh command is ambiguous. No functional change intended yet. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21connect: split ssh command line options into separate functionJonathan Nieder1-53/+60
The git_connect function is growing long. Split the portion that discovers an ssh command and options it accepts before the service name and path to a separate function to make it easier to read. No functional change intended. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21connect: split git:// setup into a separate functionJonathan Nieder1-44/+59
The git_connect function is growing long. Split the PROTO_GIT-specific portion to a separate function to make it easier to read. No functional change intended. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21connect: move no_fork fallback to git_tcp_connectJonathan Nieder1-15/+21
git_connect has the structure struct child_process *conn = &no_fork; ... switch (protocol) { case PROTO_GIT: if (git_use_proxy(hostandport)) conn = git_proxy_connect(fd, hostandport); else git_tcp_connect(fd, hostandport, flags); ... break; case PROTO_SSH: conn = xmalloc(sizeof(*conn)); child_process_init(conn); argv_array_push(&conn->args, ssh); ... break; ... return conn; In all cases except the git_tcp_connect case, conn is explicitly assigned a value. Make the code clearer by explicitly assigning 'conn = &no_fork' in the tcp case and eliminating the default so the compiler can ensure conn is always correctly assigned. Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17ssh: introduce a 'simple' ssh variantBrandon Williams1-47/+61
When using the 'ssh' transport, the '-o' option is used to specify an environment variable which should be set on the remote end. This allows git to send additional information when contacting the server, requesting the use of a different protocol version via the 'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL". Unfortunately not all ssh variants support the sending of environment variables to the remote end. To account for this, only use the '-o' option for ssh variants which are OpenSSH compliant. This is done by checking that the basename of the ssh command is 'ssh' or the ssh variant is overridden to be 'ssh' (via the ssh.variant config). Other options like '-p' and '-P', which are used to specify a specific port to use, or '-4' and '-6', which are used to indicate that IPV4 or IPV6 addresses should be used, may also not be supported by all ssh variants. Currently if an ssh command's basename wasn't 'plink' or 'tortoiseplink' git assumes that the command is an OpenSSH variant. Since user configured ssh commands may not be OpenSSH compliant, tighten this constraint and assume a variant of 'simple' if the basename of the command doesn't match the variants known to git. The new ssh variant 'simple' will only have the host and command to execute ([username@]host command) passed as parameters to the ssh command. Update the Documentation to better reflect the command-line options sent to ssh commands based on their variant. Reported-by: Jeffrey Yasskin <jyasskin@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17connect: tell server that the client understands v1Brandon Williams1-5/+32
Teach the connection logic to tell a serve that it understands protocol v1. This is done in 2 different ways for the builtin transports, both of which ultimately set 'GIT_PROTOCOL' to 'version=1' on the server. 1. git:// A normal request to git-daemon is structured as "command path/to/repo\0host=..\0" and due to a bug introduced in 49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19) we aren't able to place any extra arguments (separated by NULs) besides the host otherwise the parsing of those arguments would enter an infinite loop. This bug was fixed in 73bb33a94 (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) but a check was put in place to disallow extra arguments so that new clients wouldn't trigger this bug in older servers. In order to get around this limitation git-daemon was taught to recognize additional request arguments hidden behind a second NUL byte. Requests can then be structured like: "command path/to/repo\0host=..\0\0version=1\0key=value\0". git-daemon can then parse out the extra arguments and set 'GIT_PROTOCOL' accordingly. By placing these extra arguments behind a second NUL byte we can skirt around both the infinite loop bug in 49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19) as well as the explicit disallowing of extra arguments introduced in 73bb33a94 (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) because both of these versions of git-daemon check for a single NUL byte after the host argument before terminating the argument parsing. 2. ssh://, file:// Set 'GIT_PROTOCOL' environment variable with the desired protocol version. With the file:// transport, 'GIT_PROTOCOL' can be set explicitly in the locally running git-upload-pack or git-receive-pack processes. With the ssh:// transport and OpenSSH compliant ssh programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and having the server whitelist this environment variable. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17connect: teach client to recognize v1 server responseBrandon Williams1-4/+26
Teach a client to recognize that a server understands protocol v1 by looking at the first pkt-line the server sends in response. This is done by looking for the response "version 1" send by upload-pack or receive-pack. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27connect: in ref advertisement, shallows are lastJonathan Tan1-66/+123
Currently, get_remote_heads() parses the ref advertisement in one loop, allowing refs and shallow lines to intersperse, despite this not being allowed by the specification. Refactor get_remote_heads() to use two loops instead, enforcing that refs come first, and then shallows. This also makes it easier to teach get_remote_heads() to interpret other lines in the ref advertisement, which will be done in a subsequent patch. As part of this change, this patch interprets capabilities only on the first line in the ref advertisement, printing a warning message when encountering capabilities on other lines. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07connect: release strbuf on error return in git_connect()Rene Scharfe1-1/+3
Reduce the scope of the variable cmd and release it before returning early. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-04Merge tag 'v2.13.5' into maintJunio C Hamano1-0/+11
2017-08-01Merge tag 'v2.12.4' into maintJunio C Hamano1-0/+11
2017-07-30Merge tag 'v2.10.4' into maint-2.11Junio C Hamano1-0/+11
Git 2.10.4
2017-07-30Merge tag 'v2.9.5' into maint-2.10Junio C Hamano1-0/+11
Git 2.9.5
2017-07-30Merge tag 'v2.7.6' into maint-2.8Junio C Hamano1-0/+11
Git 2.7.6
2017-07-28connect: reject paths that look like command line optionsJeff King1-0/+3
If we get a repo path like "-repo.git", we may try to invoke "git-upload-pack -repo.git". This is going to fail, since upload-pack will interpret it as a set of bogus options. But let's reject this before we even run the sub-program, since we would not want to allow any mischief with repo names that actually are real command-line options. You can still ask for such a path via git-daemon, but there's no security problem there, because git-daemon enters the repo itself and then passes "." on the command line. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-28connect: reject dashed arguments for proxy commandsJeff King1-0/+5
If you have a GIT_PROXY_COMMAND configured, we will run it with the host/port on the command-line. If a URL contains a mischievous host like "--foo", we don't know how the proxy command may handle it. It's likely to break, but it may also do something dangerous and unwanted (technically it could even do something useful, but that seems unlikely). We should err on the side of caution and reject this before we even run the command. The hostname check matches the one we do in a similar circumstance for ssh. The port check is not present for ssh, but there it's not necessary because the syntax is "-p <port>", and there's no ambiguity on the parsing side. It's not clear whether you can actually get a negative port to the proxy here or not. Doing: git fetch git://remote:-1234/repo.git keeps the "-1234" as part of the hostname, with the default port of 9418. But it's a good idea to keep this check close to the point of running the command to make it clear that there's no way to circumvent it (and at worst it serves as a belt-and-suspenders check). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-28connect: factor out "looks like command line option" checkJeff King1-1/+1
We reject hostnames that start with a dash because they may be confused for command-line options. Let's factor out that notion into a helper function, as we'll use it in more places. And while it's simple now, it's not clear if some systems might need more complex logic to handle all cases. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-28connect: reject ssh hostname that begins with a dashJunio C Hamano1-0/+3
When commands like "git fetch" talk with ssh://$rest_of_URL/, the code splits $rest_of_URL into components like host, port, etc., and then spawns the underlying "ssh" program by formulating argv[] array that has: - the path to ssh command taken from GIT_SSH_COMMAND, etc. - dashed options like '-batch' (for Tortoise), '-p <port>' as needed. - ssh_host, which is supposed to be the hostname parsed out of $rest_of_URL. - then the command to be run on the other side, e.g. git upload-pack. If the ssh_host ends up getting '-<anything>', the argv[] that is used to spawn the command becomes something like: { "ssh", "-p", "22", "-<anything>", "command", "to", "run", NULL } which obviously is bogus, but depending on the actual value of "<anything>", will make "ssh" parse and use it as an option. Prevent this by forbidding ssh_host that begins with a "-". Noticed-by: Joern Schneeweisz of Recurity Labs Reported-by: Brian at GitLab Signed-off-by: Junio C Hamano <gitster@pobox.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13Merge branch 'jk/connect-symref-info-leak-fix' into maintJunio C Hamano1-1/+1
Leakfix. * jk/connect-symref-info-leak-fix: connect.c: fix leak in parse_one_symref_info()
2017-06-05Merge branch 'jk/connect-symref-info-leak-fix'Junio C Hamano1-1/+1
Leakfix. * jk/connect-symref-info-leak-fix: connect.c: fix leak in parse_one_symref_info()
2017-05-26connect.c: fix leak in parse_one_symref_info()Jeff King1-1/+1
If we successfully parse a symref value like "HEAD:refs/heads/master", we add the result to a string list. But because the string list is marked STRING_LIST_INIT_DUP, the string list code will make a copy of the string and add the copy. This patch fixes it by adding the entry with string_list_append_nodup(), which lets the string list take ownership of our newly allocated string. There are two alternatives that seem like they would work, but aren't the right solution. The first is to initialize the list with the "NODUP" initializer. That would avoid the copy, but then the string list would not realize that it owns the strings. When we eventually call string_list_clear(), it would not free the strings, causing a leak. The second option would be to use the normal string_list_append(), but free the local copy in our function. We can't do this because the local copy actually contains _two_ strings; the symref name and its target. We point to the target pointer via the "util" field, and its memory must last as long as the string list does. You may also wonder whether it's safe to ever free the local copy, since the target points into it. The answer is yes, because we duplicate it in annotaate_refs_with_symref_info before clearing the string list. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26Merge branch 'sf/putty-w-args'Junio C Hamano1-1/+3
Plug a memleak. * sf/putty-w-args: connect.c: fix leak in handle_ssh_variant
2017-04-20connect.c: fix leak in handle_ssh_variantJeff King1-1/+3
When we see an error from split_cmdline(), we exit the function without freeing the copy of the command string we made. This was sort-of introduced by 22e5ae5c8 (connect.c: handle errors from split_cmdline, 2017-04-10). The leak existed before that, but before that commit fixed the bug, we could never trigger this else clause in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19Merge branch 'sf/putty-w-args'Junio C Hamano1-1/+1
* sf/putty-w-args: connect.c: handle errors from split_cmdline
2017-04-16connect.c: handle errors from split_cmdlineJeff King1-1/+1
Commit e9d9a8a4d (connect: handle putty/plink also in GIT_SSH_COMMAND, 2017-01-02) added a call to split_cmdline(), but checks only for a non-zero return to see if we got any output. Since the function returns negative values (and a NULL argv) on error, we end up dereferencing NULL and segfaulting. Arguably we could report on the parsing error here, but it's probably not worth it. This is a best-effort attempt to see if we are using plink. So we can simply return here with "no, it wasn't plink" and let the shell actually complain about the bogus quoting. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Rename sha1_array to oid_arraybrian m. carlson1-4/+4
Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Make sha1_array_append take a struct object_id *brian m. carlson1-2/+2
Convert the callers to pass struct object_id by changing the function declaration and definition and applying the following semantic patch: @@ expression E1, E2; @@ - sha1_array_append(E1, E2.hash) + sha1_array_append(E1, &E2) @@ expression E1, E2; @@ - sha1_array_append(E1, E2->hash) + sha1_array_append(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-10connect.c: stop conflating ssh command names and overridesJunio C Hamano1-13/+32
dd33e07766 ("connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config", 2017-02-01) attempted to add support for configuration and environment variable to override the different handling of port_option and needs_batch settings suitable for variants of the ssh implementation that was autodetected by looking at the ssh command name. Because it piggybacked on the code that turns command name to specific override (e.g. "plink.exe" and "plink" means port_option needs to be set to 'P' instead of the default 'p'), yet it defined a separate namespace for these overrides (e.g. "putty" can be usable to signal that port_option needs to be 'P'), however, it made the auto-detection based on the command name less robust (e.g. the code now accepts "putty" as a SSH command name and applies the same override). Separate the code that interprets the override that was read from the configuration & environment from the original code that handles the command names, as they are in separate namespaces, to fix this confusion. This incidentally also makes it easier for future enhancement of the override syntax (e.g. "port_option=p,needs_batch=1" may want to be accepted as a more explicit syntax) without affecting the code for auto-detection based on the command name. While at it, update the return type of the handle_ssh_variant() helper function to void; the caller does not use it, and the function does not return any meaningful value. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-01connect: Add the envvar GIT_SSH_VARIANT and ssh.variant configSegev Finer1-3/+8
This environment variable and configuration value allow to override the autodetection of plink/tortoiseplink in case that Git gets it wrong. [jes: wrapped overly-long lines, factored out and changed get_ssh_variant() to handle_ssh_variant() to accomodate the change from the putty/tortoiseplink variables to port_option/needs_batch, adjusted the documentation, free()d value obtained from the config.] Signed-off-by: Segev Finer <segev208@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-01git_connect(): factor out SSH variant handlingJohannes Schindelin1-26/+46
We handle plink and tortoiseplink as OpenSSH replacements, by passing the correct command-line options when detecting that they are used. To let users override that auto-detection (in case Git gets it wrong), we need to introduce new code to that end. In preparation for this code, let's factor out the SSH variant handling into its own function, handle_ssh_variant(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26connect: rename tortoiseplink and putty variablesJunio C Hamano1-10/+13
One of these two may have originally been named after "what exact SSH implementation do we have?" so that we can tweak the command line options for that exact implementation. But "putty=1" no longer means "We are using the plink SSH implementation that comes with PuTTY" these days. It is set when we guess that either PuTTY plink or Tortoiseplink is in use. Rename them after what effect is desired. The current 'putty' option is about using "-P <port>" when OpenSSH would use "-p <port>", so rename it to 'port_option' whose value is either 'p' or 'P". The other one is about passing an extra command line option "-batch", so rename it to 'needs_batch'. [jes: wrapped overly-long line] Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-25connect: handle putty/plink also in GIT_SSH_COMMANDSegev Finer1-7/+16
Git for Windows has special support for the popular SSH client PuTTY: when using PuTTY's non-interactive version ("plink.exe"), we use the -P option to specify the port rather than OpenSSH's -p option. TortoiseGit ships with its own, forked version of plink.exe, that adds support for the -batch option, and for good measure we special-case that, too. However, this special-casing of PuTTY only covers the case where the user overrides the SSH command via the environment variable GIT_SSH (which allows specifying the name of the executable), not GIT_SSH_COMMAND (which allows specifying a full command, including additional command-line options). When users want to pass any additional arguments to (Tortoise-)Plink, such as setting a private key, they are required to either use a shell script named plink or tortoiseplink or duplicate the logic that is already in Git for passing the correct style of command line arguments, which can be difficult, error prone and annoying to get right. This patch simply reuses the existing logic and expands it to cover GIT_SSH_COMMAND, too. Note: it may look a little heavy-handed to duplicate the entire command-line and then split it, only to extract the name of the executable. However, this is not a performance-critical code path, and the code is much more readable this way. Signed-off-by: Segev Finer <segev208@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/filter-process'Junio C Hamano1-1/+1
The smudge/clean filter API expect an external process is spawned to filter the contents for each path that has a filter defined. A new type of "process" filter API has been added to allow the first request to run the filter for a path to spawn a single process, and all filtering need is served by this single process for multiple paths, reducing the process creation overhead. * ls/filter-process: contrib/long-running-filter: add long running filter example convert: add filter.<driver>.process option convert: prepare filter.<driver>.process option convert: make apply_filter() adhere to standard Git error handling pkt-line: add functions to read/write flush terminated packet streams pkt-line: add packet_write_gently() pkt-line: add packet_flush_gently() pkt-line: add packet_write_fmt_gently() pkt-line: extract set_packet_header() pkt-line: rename packet_write() to packet_write_fmt() run-command: add clean_on_exit_handler run-command: move check_pipe() from write_or_die to run_command convert: modernize tests convert: quote filter names in error messages
2016-10-17pkt-line: rename packet_write() to packet_write_fmt()Lars Schneider1-1/+1
packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26Merge branch 'va/i18n-more'Junio C Hamano1-4/+4
Even more i18n. * va/i18n-more: i18n: stash: mark messages for translation i18n: notes-merge: mark die messages for translation i18n: ident: mark hint for translation i18n: i18n: diff: mark die messages for translation i18n: connect: mark die messages for translation i18n: commit: mark message for translation
2016-09-21Merge branch 'jt/accept-capability-advertisement-when-fetching-from-void'Junio C Hamano1-6/+26
JGit can show a fake ref "capabilities^{}" to "git fetch" when it does not advertise any refs, but "git fetch" was not prepared to see such an advertisement. When the other side disconnects without giving any ref advertisement, we used to say "there may not be a repository at that URL", but we may have seen other advertisement like "shallow" and ".have" in which case we definitely know that a repository is there. The code to detect this case has also been updated. * jt/accept-capability-advertisement-when-fetching-from-void: connect: advertized capability is not a ref connect: tighten check for unexpected early hang up tests: move test_lazy_prereq JGIT to test-lib.sh
2016-09-19i18n: connect: mark die messages for translationVasco Almeida1-4/+4
Mark messages passed to die() in die_initial_contact(). Update test to reflect changes. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-09connect: advertized capability is not a refJonathan Tan1-0/+14
When cloning an empty repository served by standard git, "git clone" produces the following reassuring message: $ git clone git://localhost/tmp/empty Cloning into 'empty'... warning: You appear to have cloned an empty repository. Checking connectivity... done. Meanwhile when cloning an empty repository served by JGit, the output is more haphazard: $ git clone git://localhost/tmp/empty Cloning into 'empty'... Checking connectivity... done. warning: remote HEAD refers to nonexistent ref, unable to checkout. This is a common command to run immediately after creating a remote repository as preparation for adding content to populate it and pushing. The warning is confusing and needlessly worrying. The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities with no refs in upload service., 2013-08-08), JGit's ref advertisement includes a ref named capabilities^{} to advertise its capabilities on, while git's ref advertisement is empty in this case. This allows the client to learn about the server's capabilities and is needed, for example, for fetch-by-sha1 to work when no refs are advertised. This also affects "ls-remote". For example, against an empty repository served by JGit: $ git ls-remote git://localhost/tmp/empty 0000000000000000000000000000000000000000 capabilities^{} Git advertises the same capabilities^{} ref in its ref advertisement for push but since it never did so for fetch, the client didn't need to handle this case. Handle it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-09connect: tighten check for unexpected early hang upJonathan Nieder1-6/+12
A server hanging up immediately to mark access being denied does not send any .have refs, shallow lines, or anything else before hanging up. If the server has sent anything, then the hangup is unexpected. That is, if the server hangs up after a shallow line but before sending any refs, then git should tell me so: fatal: The remote end hung up upon initial contact instead of suggesting an access control problem: fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. Noticed while examining this code. This case isn't likely to come up in practice but tightening the check makes the code easier to read and manipulate. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06connect: read $GIT_SSH_COMMAND from config fileNguyễn Thái Ngọc Duy1-1/+14
Similar to $GIT_ASKPASS or $GIT_PROXY_COMMAND, we also read from config file first then fall back to $GIT_SSH_COMMAND. This is useful for selecting different private keys targetting the same host (e.g. github) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-16Merge branch 'cn/deprecate-ssh-git-url'Junio C Hamano1-2/+2
The two alternative ways to spell "ssh://" transport have been deprecated for a long time. The last mention of them has finally removed from the documentation. * cn/deprecate-ssh-git-url: Disown ssh+git and git+ssh
2016-03-09Disown ssh+git and git+sshCarlos Martín Nieto1-2/+2
Some people argue that these were silly from the beginning (see http://thread.gmane.org/gmane.comp.version-control.git/285590/focus=285601 for example), but we have to support them for compatibility. That doesn't mean we have to show them in the documentation. These were already left out of the main list, but a reference in the main manpage was left, so remove that. Also add a note to discourage their use if anybody goes looking for them in the source code. Signed-off-by: Carlos Martín Nieto <cmn@dwim.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12connect & http: support -4 and -6 switches for remote operationsEric Wong1-0/+8
Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20get_remote_heads: convert to struct object_idbrian m. carlson1-10/+12
Replace an unsigned char array with struct object_id and express several hard-coded constants in terms of GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Convert struct ref to use object_id.brian m. carlson1-1/+1
Use struct object_id in three fields in struct ref and convert all the necessary places that use it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-10-20Merge branch 'jk/war-on-sprintf'Junio C Hamano1-1/+1
Many allocations that is manually counted (correctly) that are followed by strcpy/sprintf have been replaced with a less error prone constructs such as xstrfmt. Macintosh-specific breakage was noticed and corrected in this reroll. * jk/war-on-sprintf: (70 commits) name-rev: use strip_suffix to avoid magic numbers use strbuf_complete to conditionally append slash fsck: use for_each_loose_file_in_objdir Makefile: drop D_INO_IN_DIRENT build knob fsck: drop inode-sorting code convert strncpy to memcpy notes: document length of fanout path with a constant color: add color_set helper for copying raw colors prefer memcpy to strcpy help: clean up kfmclient munging receive-pack: simplify keep_arg computation avoid sprintf and strcpy with flex arrays use alloc_ref rather than hand-allocating "struct ref" color: add overflow checks for parsing colors drop strcpy in favor of raw sha1_to_hex use sha1_to_hex_r() instead of strcpy daemon: use cld->env_array when re-spawning stat_tracking_info: convert to argv_array http-push: use an argv_array for setup_revisions fetch-pack: use argv_array for index-pack / unpack-objects ...
2015-10-14Merge branch 'tk/typofix-connect-unknown-proto-error'Junio C Hamano1-1/+1
* tk/typofix-connect-unknown-proto-error: connect: fix typo in result string of prot_name()
2015-10-05Sync with 2.6.1Junio C Hamano1-0/+6
2015-09-28Sync with 2.3.10Junio C Hamano1-0/+5
2015-09-25convert trivial sprintf / strcpy calls to xsnprintfJeff King1-1/+1
We sometimes sprintf into fixed-size buffers when we know that the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25connect: fix typo in result string of prot_name()Tobias Klauser1-1/+1
Replace 'unkown' with 'unknown'. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-23transport: add a protocol-whitelist environment variableJeff King1-0/+5
If we are cloning an untrusted remote repository into a sandbox, we may also want to fetch remote submodules in order to get the complete view as intended by the other side. However, that opens us up to attacks where a malicious user gets us to clone something they would not otherwise have access to (this is not necessarily a problem by itself, but we may then act on the cloned contents in a way that exposes them to the attacker). Ideally such a setup would sandbox git entirely away from high-value items, but this is not always practical or easy to set up (e.g., OS network controls may block multiple protocols, and we would want to enable some but not others). We can help this case by providing a way to restrict particular protocols. We use a whitelist in the environment. This is more annoying to set up than a blacklist, but defaults to safety if the set of protocols git supports grows). If no whitelist is specified, we continue to default to allowing all protocols (this is an "unsafe" default, but since the minority of users will want this sandboxing effect, it is the only sensible one). A note on the tests: ideally these would all be in a single test file, but the git-daemon and httpd test infrastructure is an all-or-nothing proposition rather than a test-by-test prerequisite. By putting them all together, we would be unable to test the file-local code on machines without apache. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-08git_connect: clarify conn->use_shell flagJeff King1-9/+13
When executing user-specified programs, we generally always want to use a shell, for flexibility and consistency. One big exception is executing $GIT_SSH, which for historical reasons must not use a shell. Once upon a time the logic in git_connect looked like: if (protocol == PROTO_SSH) { ... setup ssh ... } else { ... setup local connection ... conn->use_shell = 1; } But over time the PROTO_SSH block has grown, and the "local" block has shrunk so that it contains only conn->use_shell; it's easy to miss at the end of the large block. Moreover, PROTO_SSH now also sometimes sets use_shell, when the new GIT_SSH_COMMAND is used. Let's just set conn->use_shell when we're setting up the "conn" struct, and unset it (with a comment) in the historical GIT_SSH case. This will make the flow easier to follow. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-04git_connect: clear GIT_* environment for sshJeff King1-2/+2
When we "switch" to another local repository to run the server side of a fetch or push, we must clear the variables in local_repo_env so that our local $GIT_DIR, etc, do not pollute the upload-pack or receive-pack that is executing in the "remote" repository. We have never done so for ssh connections. For the most part, nobody has noticed because ssh will not pass unknown environment variables by default. However, it is not out of the question for a user to configure ssh to pass along GIT_* variables using SendEnv/AcceptEnv. We can demonstrate the problem by using "git -c" on a local command and seeing its impact on a remote repository. This config ends up in $GIT_CONFIG_PARAMETERS. In the local case, the config has no impact, but in the ssh transport, it does (our test script has a fake ssh that passes through all environment variables; this isn't normal, but does simulate one possible setup). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-05Merge branch 'bc/connect-plink' into maintJunio C Hamano1-21/+33
The connection initiation code for "ssh" transport tried to absorb differences between the stock "ssh" and Putty-supplied "plink" and its derivatives, but the logic to tell that we are using "plink" variants were too loose and falsely triggered when "plink" appeared anywhere in the path (e.g. "/home/me/bin/uplink/ssh"). * bc/connect-plink: connect: improve check for plink to reduce false positives t5601: fix quotation error leading to skipped tests connect: simplify SSH connection code path
2015-05-19Merge branch 'bc/connect-plink'Junio C Hamano1-21/+33
The connection initiation code for "ssh" transport tried to absorb differences between the stock "ssh" and Putty-supplied "plink" and its derivatives, but the logic to tell that we are using "plink" variants were too loose and falsely triggered when "plink" appeared anywhere in the path (e.g. "/home/me/bin/uplink/ssh"). * bc/connect-plink: connect: improve check for plink to reduce false positives t5601: fix quotation error leading to skipped tests connect: simplify SSH connection code path
2015-04-28connect: improve check for plink to reduce false positivesbrian m. carlson1-3/+15
The git_connect function has code to handle plink and tortoiseplink specially, as they require different command line arguments from OpenSSH (-P instead of -p for ports; tortoiseplink additionally requires -batch). However, the match was done by checking for "plink" anywhere in the string, which led to a GIT_SSH value containing "uplink" being treated as an invocation of putty's plink. Improve the check by looking for "plink" or "tortoiseplink" (or those names suffixed with ".exe") only in the final component of the path. This has the downside that a program such as "plink-0.63" would no longer be recognized, but the increased robustness is likely worth it. Add tests to cover these cases to avoid regressions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-28connect: simplify SSH connection code pathbrian m. carlson1-20/+20
The code path used in git_connect pushed the majority of the SSH connection code into an else block, even though the if block returns. Simplify the code by eliminating the else block, as it is unneeded. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-27Merge branch 'tb/connect-ipv6-parse-fix' into maintJunio C Hamano1-0/+2
An earlier update to the parser that disects a URL broke an address, followed by a colon, followed by an empty string (instead of the port number), e.g. ssh://example.com:/path/to/repo. * tb/connect-ipv6-parse-fix: connect.c: ignore extra colon after hostname
2015-04-20Merge branch 'tb/connect-ipv6-parse-fix'Junio C Hamano1-0/+2
An earlier update to the parser that disects an address broke an address, followed by a colon, followed by an empty string (instead of the port number). * tb/connect-ipv6-parse-fix: connect.c: ignore extra colon after hostname
2015-04-08connect.c: ignore extra colon after hostnameTorsten Bögershausen1-0/+2
Ignore an extra ':' at the end of the hostname in URL's like "ssh://example.com:/path/to/repo" The colon is meant to separate a port number from the hostname. If the port is empty, the colon should be ignored, see RFC 3986. It had been working for URLs with ssh:// scheme, but was unintentionally broken in 86ceb3, "allow ssh://user@[2001:db8::1]/repo.git" Reported-by: Reid Woodbury Jr. <reidw@rawsound.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-23Merge branch 'tb/connect-ipv6-parse-fix' into maintJunio C Hamano1-44/+70
We did not parse username followed by literal IPv6 address in SSH transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git correctly. * tb/connect-ipv6-parse-fix: t5500: show user name and host in diag-url t5601: add more test cases for IPV6 connect.c: allow ssh://user@[2001:db8::1]/repo.git
2015-03-13Merge branch 'jk/daemon-interpolate' into maintJunio C Hamano1-1/+11
The "interpolated-path" option of "git daemon" inserted any string client declared on the "host=" capability request without checking. Sanitize and limit %H and %CH to a saner and a valid DNS name. * jk/daemon-interpolate: daemon: sanitize incoming virtual hostname t5570: test git-daemon's --interpolated-path option git_connect: let user override virtual-host we send to daemon
2015-03-10connect.c: do not leak "conn" after showing diagnosisStefan Beller1-0/+1
When git_connect() is called to see how the URL is parsed for debugging purposes with CONNECT_DIAG_URL set, the variable conn is leaked. At this point in the codeflow, it only has its memory and no other resource is associated with it, so it is sufficient to clean it up by just freeing it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-06Merge branch 'rs/simple-cleanups' into maintJunio C Hamano1-2/+1
Code cleanups. * rs/simple-cleanups: sha1_name: use strlcpy() to copy strings pretty: use starts_with() to check for a prefix for-each-ref: use skip_prefix() to avoid duplicate string comparison connect: use strcmp() for string comparison
2015-03-05Merge branch 'tb/connect-ipv6-parse-fix'Junio C Hamano1-44/+70
We did not parse username followed by literal IPv6 address in SSH transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git correctly. * tb/connect-ipv6-parse-fix: t5500: show user name and host in diag-url t5601: add more test cases for IPV6 connect.c: allow ssh://user@[2001:db8::1]/repo.git
2015-03-05Merge branch 'rs/simple-cleanups'Junio C Hamano1-2/+1
Code cleanups. * rs/simple-cleanups: sha1_name: use strlcpy() to copy strings pretty: use starts_with() to check for a prefix for-each-ref: use skip_prefix() to avoid duplicate string comparison connect: use strcmp() for string comparison
2015-03-03Merge branch 'jk/daemon-interpolate'Junio C Hamano1-1/+11
The "interpolated-path" option of "git daemon" inserted any string client declared on the "host=" capability request without checking. Sanitize and limit %H and %CH to a saner and a valid DNS name. * jk/daemon-interpolate: daemon: sanitize incoming virtual hostname t5570: test git-daemon's --interpolated-path option git_connect: let user override virtual-host we send to daemon
2015-02-22t5500: show user name and host in diag-urlTorsten Bögershausen1-12/+23
The URL for ssh may have include a username before the hostname, like ssh://user@host/repo. When literal IPV6 addresses are used together with a username, the substring "user@[::1]" must be converted into "user@::1". Make that conversion visible for the user, and write userandhost in the diagnostics Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-22connect.c: allow ssh://user@[2001:db8::1]/repo.gitTorsten Bögershausen1-25/+38
The ssh:// syntax was added in 2386d658 (Add first cut at "git protocol" connect logic., 2005-07-13), it accepted ssh://user@2001:db8::1/repo.git, which is now legacy. Over the years the parser was improved to support [] and port numbers, but the combination of ssh://user@[2001:db8::1]:222/repo.git did never work. The only only way to use a user name, a literall IPV6 address and a port number was ssh://[user@2001:db8::1]:222/repo.git (Thanks to Christian Taube <lists@hcf.yourweb.de> for reporting this long standing issue) New users would use ssh://user@[2001:db8::1]:222/repo.git, so change the parser to handle it correctly. Support the old legacy URLs as well, to be backwards compatible, and avoid regressions for users which upgrade an existing installation to a later Git version. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-22connect: use strcmp() for string comparisonRené Scharfe1-2/+1
Get rid of magic string length constants and simply compare the strings using strcmp(). This makes the intent of the code a bit clearer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17git_connect: let user override virtual-host we send to daemonJeff King1-1/+11
When we connect to a git-daemon at a given host and port, we actually send the string "localhost:9418" to the other side, which allows it to do virtual-hosting lookups. For testing and debugging, we'd like to be able to send arbitrary strings, rather than the hostname we actually connected to. Using "insteadOf" config does not work for this purpose, as the hostname determination happens at a very low level, right before we feed the hostname to our lookup routines. You could use /etc/hosts or similar to get around this, but we cannot do that portably from our test suite. Instead, this patch provides an environment variable that can be used to send an arbitrary string. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'mh/simplify-repack-without-refs'Junio C Hamano1-1/+1
"git remote update --prune" to drop many refs has been optimized. * mh/simplify-repack-without-refs: sort_string_list(): rename to string_list_sort() prune_remote(): iterate using for_each_string_list_item() prune_remote(): rename local variable repack_without_refs(): make the refnames argument a string_list prune_remote(): sort delete_refs_list references en masse prune_remote(): initialize both delete_refs lists in a single loop prune_remote(): exit early if there are no stale references
2014-11-25sort_string_list(): rename to string_list_sort()Michael Haggerty1-1/+1
The new name is more consistent with the names of other string_list-related functions. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-10git_connect: set ssh shell command in GIT_SSH_COMMANDThomas Quinot1-3/+12
It may be impractical to install a wrapper script for GIT_SSH when additional parameters need to be passed. Provide an alternative way of specifying a shell command to be run, including command line arguments, by means of the GIT_SSH_COMMAND environment variable, which behaves like GIT_SSH but is passed to the shell. The special circuitry to modify parameters in the case of using PuTTY's plink/tortoiseplink is activated only when using GIT_SSH; in the case of using GIT_SSH_COMMAND, it is deliberately left up to the user to make any required parameters adaptation before calling the underlying ssh implementation. Signed-off-by: Thomas Quinot <thomas@quinot.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-19Merge branch 'rs/more-uses-of-skip-prefix'Junio C Hamano1-10/+6
Code clean-up. * rs/more-uses-of-skip-prefix: pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt() connect: simplify check_ref() using skip_prefix() and starts_with()
2014-09-02connect: simplify check_ref() using skip_prefix() and starts_with()René Scharfe1-10/+6
Both callers of check_ref() pass in NUL-terminated strings for name. Remove the len parameter and then use skip_prefix() and starts_with() instead of memcmp() to check if it starts with certain strings. This gets rid of several magic string length constants and a strlen() call. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20run-command: introduce child_process_init()René Scharfe1-2/+4
Add a helper function for initializing those struct child_process variables for which the macro CHILD_PROCESS_INIT can't be used. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20run-command: introduce CHILD_PROCESS_INITRené Scharfe1-1/+1
Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-21Merge branch 'maint'Junio C Hamano1-3/+1
* maint: use xmemdupz() to allocate copies of strings given by start and length use xcalloc() to allocate zero-initialized memory
2014-07-21use xmemdupz() to allocate copies of strings given by start and lengthRené Scharfe1-3/+1
Use xmemdupz() to allocate the memory, copy the data and make sure to NUL-terminate the result, all in one step. The resulting code is shorter, doesn't contain the constants 1 and '\0', and avoids duplicating function parameters. For blame, the last copied byte (o->file.ptr[o->file.size]) is always set to NUL by fake_working_tree_commit() or read_sha1_file(), so no information is lost by the conversion to using xmemdupz(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20use skip_prefix to avoid magic numbersJeff King1-5/+6
It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-15git_connect: use argv_arrayJeff King1-18/+10
This avoids magic numbers when we allocate fixed-size argv arrays, and makes it more obvious that we are not overflowing. It is also the first step to fixing a memory leak. When git_connect returns a child_process struct, the argv array in the struct is dynamically allocated, but the individual strings are not (they are either owned elsewhere, or are freed). Later, in finish_connect, we free the array but leave the strings alone. This works for the child_process created by git_connect, but if we use transport_take_over, we may also end up with a child_process created by transport-helper's get_helper. In that case, the strings are freshly allocated, and we would want to free them. However, we have no idea in finish_connect which type we have. By consistently using run-command's internal argv-array, we do not have to worry about this issue at all; finish_command takes care of it for us, and we can drop our manual free entirely. Note that this actually makes the get_helper leak slightly worse; now we are leaking both the strings and the array. But when we adjust it in a future patch, that leak will go away entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-25Merge branch 'nd/indent-fix-connect-c'Junio C Hamano1-1/+1
* nd/indent-fix-connect-c: connect.c: SP after "}", not TAB
2014-03-13connect.c: SP after "}", not TABNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17Merge branch 'nd/shallow-clone'Junio C Hamano1-9/+13
Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
2013-12-17Merge branch 'tb/clone-ssh-with-colon-for-port'Junio C Hamano1-113/+136
Be more careful when parsing remote repository URL given in the scp-style host:path notation. * tb/clone-ssh-with-colon-for-port: git_connect(): use common return point connect.c: refactor url parsing git_connect(): refactor the port handling for ssh git fetch: support host:/~repo t5500: add test cases for diag-url git fetch-pack: add --diag-url git_connect: factor out discovery of the protocol and its parts git_connect: remove artificial limit of a remote command t5601: add tests for ssh t5601: remove clear_ssh, refactor setup_ssh_wrapper
2013-12-10connect.c: teach get_remote_heads to parse "shallow" linesNguyễn Thái Ngọc Duy1-1/+11
No callers pass a non-empty pointer as shallow_points at this stage. As a result, all clients still refuse to talk to shallow repository on the other end. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10remote.h: replace struct extra_have_objects with struct sha1_arrayNguyễn Thái Ngọc Duy1-9/+3
The latter can do everything the former can and is used in many more places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect(): use common return pointTorsten Bögershausen1-58/+50
Use only one return point from git_connect(), doing the free(); return conn; only at one place in the code. There may be a little confusion what the variable "host" is for. At some places it is only the host part, at other places it may include the port number, so change host into hostandport here. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09connect.c: refactor url parsingTorsten Bögershausen1-27/+30
Make the function is_local() in transport.c public, rename it into url_is_local_not_ssh() and use it in both transport.c and connect.c Use a protocol "local" for URLs for the local file system. One note about using file:// under Windows: The (absolute) path on Unix like system typically starts with "/". When the host is empty, it can be omitted, so that a shell scriptlet url=file://$pwd will give a URL like "file:///home/user/repo". Windows does not have the same concept of a root directory located in "/". When parsing the URL allow "file://C:/user/repo" (even if RFC1738 indicates that "file:///C:/user/repo" should be used). Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect(): refactor the port handling for sshTorsten Bögershausen1-34/+13
Use get_host_and_port() even for ssh. Remove the variable port git_connect(), and simplify parse_connect_url() Use only one return point in git_connect(), doing the free() and return conn. t5601 had 2 corner test cases which now pass. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git fetch: support host:/~repoTorsten Bögershausen1-7/+7
The documentation (in urls.txt) says that "ssh://host:/~repo", "host:/~repo" or "host:~repo" specify the repository "repo" in the home directory at "host". This has not been working for "host:/~repo". Before commit 356bec "Support [address] in URLs", the comparison "url != hostname" could be used to determine if the URL had a scheme or not: "ssh://host/host" != "host". However, after 356bec "[::1]" was converted into "::1", yielding url != hostname as well. To fix this regression, don't use "if (url != hostname)", but look at the separator instead. Rename the variable "c" into "separator" to make it easier to read. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git fetch-pack: add --diag-urlTorsten Bögershausen1-0/+28
The main purpose is to trace the URL parser called by git_connect() in connect.c The main features of the parser can be listed as this: - parse out host and path for URLs with a scheme (git:// file:// ssh://) - parse host names embedded by [] correctly - extract the port number, if present - separate URLs like "file" (which are local) from URLs like "host:repo" which should use ssh Add the new parameter "--diag-url" to "git fetch-pack", which prints the value for protocol, host and path to stderr and exits. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect: factor out discovery of the protocol and its partsJohannes Sixt1-27/+53
git_connect has grown large due to the many different protocols syntaxes that are supported. Move the part of the function that parses the URL to connect to into a separate function for readability. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect: remove artificial limit of a remote commandJohannes Sixt1-6/+1
Since day one, function git_connect() had a limit on the command line of the command that is invoked to make a connection. 7a33bcbe converted the code that constructs the command to strbuf. This would have been the right time to remove the limit, but it did not happen. Remove it now. git_connect() uses start_command() to invoke the command; consequently, the limits of the system still apply, but are diagnosed only at execve() time. But these limits are more lenient than the 1K that git_connect() imposed. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05replace {pre,suf}fixcmp() with {starts,ends}_with()Christian Couder1-1/+1
Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-30Merge branch 'jc/upload-pack-send-symref'Junio C Hamano1-1/+62
One long-standing flaw in the pack transfer protocol used by "git clone" was that there was no way to tell the other end which branch "HEAD" points at, and the receiving end needed to guess. A new capability has been defined in the pack protocol to convey this information so that cloning from a repository with more than one branches pointing at the same commit where the HEAD is at now reliably sets the initial branch in the resulting repository. * jc/upload-pack-send-symref: t5570: Update for clone-progress-to-stderr branch t5570: Update for symref capability clone: test the new HEAD detection logic connect: annotate refs with their symref information in get_remote_head() connect.c: make parse_feature_value() static upload-pack: send non-HEAD symbolic refs upload-pack: send symbolic ref information as capability upload-pack.c: do not pass confusing cb_data to mark_our_ref() t5505: fix "set-head --auto with ambiguous HEAD" test
2013-10-14Merge branch 'nd/clone-local-with-colon'Jonathan Nieder1-1/+1
* nd/clone-local-with-colon: clone: tighten "local paths with colons" check a bit
2013-09-27clone: tighten "local paths with colons" check a bitNguyễn Thái Ngọc Duy1-1/+1
commit 6000334 (clone: allow cloning local paths with colons in them - 2013-05-04) made it possible to specify a path that has colons in it without file://, e.g. ../foo:bar/somewhere. But the check was a bit sloppy. Consider the url '[foo]:bar'. The '[]' unwrapping code will turn the string to 'foo\0:bar'. In effect this new string is the same as 'foo/:bar' in the check "path < strchrnul(host, '/')", which mistakes it for a local path (with '/' before the first ':') when it's actually not. So disable the check for '/' before ':' when the URL has been mangled by '[]' unwrapping. [jn: with tests from Jeff King] Noticed-by: Morten Stenshorne <mstensho@opera.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>