aboutsummaryrefslogtreecommitdiffstats
path: root/builtin/grep.c
AgeCommit message (Collapse)AuthorFilesLines
2024-09-23Merge branch 'jc/pass-repo-to-builtins'Junio C Hamano1-2/+5
The convention to calling into built-in command implementation has been updated to pass the repository, if known, together with the prefix value. * jc/pass-repo-to-builtins: add: pass in repo variable instead of global the_repository builtin: remove USE_THE_REPOSITORY for those without the_repository builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h builtin: add a repository parameter for builtin functions
2024-09-13builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.hJohn Cai1-1/+1
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every builtin, remove it from builtin.h and add it to all the builtins that include builtin.h (by definition, that means all builtins/*.c). Also, remove the include statement for repository.h since it gets brought in through builtin.h. The next step will be to migrate each builtin from having to use the_repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13builtin: add a repository parameter for builtin functionsJohn Cai1-1/+4
In order to reduce the usage of the global the_repository, add a parameter to builtin functions that will get passed a repository variable. This commit uses UNUSED on most of the builtin functions, as subsequent commits will modify the actual builtins to pass the repository parameter down. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05builtin/grep: fix leaking object contextPatrick Steinhardt1-0/+1
Even when `get_oid_with_context()` fails it may have allocated some data in the object context. But we do not release it in git-grep(1) when the call fails, leading to a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11object-name: free leaking object contextsPatrick Steinhardt1-2/+2
While it is documented in `struct object_context::path` that this variable needs to be released by the caller, this fact is rather easy to miss given that we do not ever provide a function to release the object context. And of course, while some callers dutifully release the path, many others don't. Introduce a new `object_context_release()` function that releases the path. Convert callsites that used to free the path to use that new function and add missing calls to callsites that were leaking memory. Refactor those callsites as required to have a single return path, only. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-28Merge branch 'eb/hash-transition'Junio C Hamano1-4/+4
Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...
2024-02-14Merge branch 'js/check-null-from-read-object-file'Junio C Hamano1-0/+2
The code paths that call repo_read_object_file() have been tightened to react to errors. * js/check-null-from-read-object-file: Always check the return value of `repo_read_object_file()`
2024-02-06Always check the return value of `repo_read_object_file()`Johannes Schindelin1-0/+2
There are a couple of places in Git's source code where the return value is not checked. As a consequence, they are susceptible to segmentation faults. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-1/+0
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-3/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07Merge branch 'jc/grep-f-relative-to-cwd'Junio C Hamano1-2/+11
"cd sub && git grep -f patterns" tried to read "patterns" file at the top level of the working tree; it has been corrected to read "sub/patterns" instead. * jc/grep-f-relative-to-cwd: grep: -f <path> is relative to $cwd
2023-10-12grep: -f <path> is relative to $cwdJunio C Hamano1-2/+11
Just like OPT_FILENAME() does, "git grep -f <path>" should treat the <path> relative to the original $cwd by paying attention to the prefix the command is given. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-02tree-walk: init_tree_desc take an oid to get the hash algorithmEric W. Biederman1-4/+4
To make it possible for git ls-tree to display the tree encoded in the hash algorithm of the oid specified to git ls-tree, update init_tree_desc to take as a parameter the oid of the tree object. Update all callers of init_tree_desc and init_tree_desc_gently to pass the oid of the tree object. Use the oid of the tree object to discover the hash algorithm of the oid and store that hash algorithm in struct tree_desc. Use the hash algorithm in decode_tree_entry and update_tree_entry_internal to handle reading a tree object encoded in a hash algorithm that differs from the repositories hash algorithm. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18Merge branch 'rs/grep-no-no-or'Junio C Hamano1-1/+1
"git grep -e A --no-or -e B" is accepted, even though the negation of "or" did not mean anything, which has been tightened. * rs/grep-no-no-or: grep: reject --no-or
2023-09-07grep: reject --no-orRené Scharfe1-1/+1
Since 3e230fa1b2 (grep: use parseopt, 2009-05-07) git grep has been accepting the option --no-or. It does the same as --or: nothing. That's confusing and unintended. Forbid negating --or. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-05grep: use OPT_INTEGER_F for --max-depthRené Scharfe1-3/+2
a91f453f64 (grep: Add --max-depth option., 2009-07-22) added the option --max-depth, defining it using a positional struct option initializer of type OPTION_INTEGER. It also sets defval to 1 for some reason, but that value would only be used if the flag PARSE_OPT_OPTARG was given. Use the macro OPT_INTEGER_F instead to standardize the definition and specify only the necessary values. This also normalizes argh to N_("n") as a side-effect, which is OK. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-02Merge branch 'jc/tree-walk-drop-base-offset'Junio C Hamano1-1/+1
Code simplification. * jc/tree-walk-drop-base-offset: tree-walk: drop unused base_offset from do_match() tree-walk: lose base_offset that is never used in tree_entry_interesting
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano1-1/+0
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-07tree-walk: lose base_offset that is never used in tree_entry_interestingJunio C Hamano1-1/+1
The tree_entry_interesting() function takes base_offset, allowing its callers to potentially pass a non-zero number to skip the early part of the path string. The feature is never exercised and we do not even know what bugs are lurking there, as all callers pass 0 to the parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-4/+8
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: pass kvi to die_bad_number()Glen Choo1-1/+1
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-3/+4
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: inline git_color_default_configGlen Choo1-1/+4
git_color_default_config() is a shorthand for calling two other config callbacks. There are no other non-static functions that do this and it will complicate our refactoring of config_fn_t so inline it instead. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren1-0/+1
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren1-2/+1
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21read-cache*.h: move declarations for read-cache.c functions from cache.hElijah Newren1-0/+1
For the functions defined in read-cache.c, move their declarations from cache.h to a new header, read-cache-ll.h. Also move some related inline functions from cache.h to read-cache.h. The purpose of the read-cache-ll.h/read-cache.h split is that about 70% of the sites don't need the inline functions and the extra headers they include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11pager.h: move declarations for pager.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-file.h: move declarations for object-file.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-name.h: move declarations for object-name.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano1-2/+4
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28cocci: apply the "object-store.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-2/+4
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21write-or-die.h: move declarations for write-or-die.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21setup.h: move declarations for setup.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren1-0/+1
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21builtin/{grep,log}.: don't define "USE_THE_INDEX_COMPATIBILITY_MACROS"Ævar Arnfjörð Bjarmason1-1/+0
Adding "USE_THE_INDEX_COMPATIBILITY_MACROS" to these two appears to have been unnecessary from the start, as going back and compiling f8adbec9fea (cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch, 2019-01-24) without that addition works. Let's not have these ask for the compatibility macros from cache.h that they don't need. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-23builtin/grep.c: integrate with sparse indexShaoxuan Yuan1-3/+45
Turn on sparse index and remove ensure_full_index(). Before this patch, `git-grep` utilizes the ensure_full_index() method to expand the index and search all the entries. Because this method requires walking all the trees and constructing the index, it is the slow part within the whole command. To achieve better performance, this patch uses grep_tree() to search the sparse directory entries and get rid of the ensure_full_index() method. Why grep_tree() is a better choice over ensure_full_index()? 1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks into every sparse-directory entry (represented by a tree) recursively when looping over the index, and the result of doing so matches the result of expanding the index. 2) grep_tree() utilizes pathspecs to limit the scope of searching. ensure_full_index() always expands the index, which means it will always walk all the trees and blobs in the repo without caring if the user only wants a subset of the content, i.e. using a pathspec. On the other hand, grep_tree() will only search the contents that match the pathspec, and thus possibly walking fewer trees. 3) grep_tree() does not construct and copy back a new index, while ensure_full_index() does. This also saves some time. ---------------- Performance test - Summary: p2000 tests demonstrate a ~71% execution time reduction for `git grep --cached bogus -- "f2/f1/f1/*"` using tree-walking logic. However, notice that this result varies depending on the pathspec given. See below "Command used for testing" for more details. Test HEAD~ HEAD ------------------------------------------------------- 2000.78: git grep ... (full-v3) 0.35 0.39 (≈) 2000.79: git grep ... (full-v4) 0.36 0.30 (≈) 2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%) 2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%) - Command used for testing: git grep --cached bogus -- "f2/f1/f1/*" The reason for specifying a pathspec is that, if we don't specify a pathspec, then grep_tree() will walk all the trees and blobs to find the pattern, and the time consumed doing so is not too different from using the original ensure_full_index() method, which also spends most of the time walking trees. However, when a pathspec is specified, this latest logic will only walk the area of trees enclosed by the pathspec, and the time consumed is reasonably a lot less. Generally speaking, because the performance gain is acheived by walking less trees, which are specified by the pathspec, the HEAD time v.s. HEAD~ time in sparse-v[3|4], should be proportional to "pathspec enclosed area" v.s. "all area", respectively. Namely, the wider the <pathspec> is encompassing, the less the performance difference between HEAD~ and HEAD, and vice versa. That is, if we don't specify a pathspec, the performance difference [1] is indistinguishable: both methods walk all the trees and take generally same amount of time (even with the index construction time included for ensure_full_index()). [1] Performance test result without pathspec (hence walking all trees): Command used: git grep --cached bogus Test HEAD~ HEAD --------------------------------------------------- 2000.78: git grep ... (full-v3) 6.17 5.19 (≈) 2000.79: git grep ... (full-v4) 6.19 5.46 (≈) 2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈) 2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈) -------------------------- NEEDSWORK about submodules There are a few NEEDSWORKs that belong to improvements beyond this topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for more context. The other two NEEDSWORKs in t1092 are also relative. Suggested-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22grep: add --max-count command line optionCarlos López1-0/+9
This patch adds a command line option analogous to that of GNU grep(1)'s -m / --max-count, which users might already be used to. This makes it possible to limit the amount of matches shown in the output while keeping the functionality of other options such as -C (show code context) or -p (show containing function), which would be difficult to do with a shell pipeline (e.g. head(1)). Signed-off-by: Carlos López 00xc@protonmail.com Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-16Merge branch 'ab/object-file-api-updates'Junio C Hamano1-2/+2
Object-file API shuffling. * ab/object-file-api-updates: object-file API: pass an enum to read_object_with_reference() object-file.c: add a literal version of write_object_file_prepare() object-file API: have hash_object_file() take "enum object_type" object API: rename hash_object_file_literally() to write_*() object-file API: split up and simplify check_object_signature() object API users + docs: check <0, not !0 with check_object_signature() object API docs: move check_object_signature() docs to cache.h object API: correct "buf" v.s. "map" mismatch in *.c and *.h object-file API: have write_object_file() take "enum object_type" object-file API: add a format_object_header() function object-file API: return "void", not "int" from hash_object_file() object-file.c: split up declaration of unrelated variables
2022-02-25object-file API: pass an enum to read_object_with_reference()Ævar Arnfjörð Bjarmason1-2/+2
Change the read_object_with_reference() function to take an "enum object_type". It was not prepared to handle an arbitrary "const char *type", as it was itself calling type_from_string(). Let's change the only caller that passes in user data to use type_from_string(), and convert the rest to use e.g. "OBJ_TREE" instead of "tree_type". The "cat-file" caller is not on the codepath that handles"--allow-unknown", so the type_from_string() there is safe. Its use of type_from_string() doesn't functionally differ from that of the pre-image. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-25Merge branch 'ab/grep-patterntype'Junio C Hamano1-13/+14
Some code clean-up in the "git grep" machinery. * ab/grep-patterntype: grep: simplify config parsing and option parsing grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" grep.h: make "grep_opt.pattern_type_option" use its enum grep API: call grep_config() after grep_init() grep.c: don't pass along NULL callback value built-ins: trust the "prefix" from run_builtin() grep tests: add missing "grep.patternType" config tests grep tests: create a helper function for "BRE" or "ERE" log tests: check if grep_config() is called by "log"-like cmds grep.h: remove unused "regex_t regexp" from grep_opt
2022-02-15grep: simplify config parsing and option parsingÆvar Arnfjörð Bjarmason1-6/+4
Simplify the parsing of "grep.patternType" and "grep.extendedRegexp". This changes no behavior, but gets rid of complex parsing logic that isn't needed anymore. When "grep.patternType" was introduced in 84befcd0a4a (grep: add a grep.patternType configuration setting, 2012-08-03) we promised that: 1. You can set "grep.patternType", and "[setting it to] 'default' will return to the default matching behavior". In that context "the default" meant whatever the configuration system specified before that change, i.e. via grep.extendedRegexp. 2. We'd support the existing "grep.extendedRegexp" option, but ignore it when the new "grep.patternType" option is set. We said we'd only ignore the older "grep.extendedRegexp" option "when the `grep.patternType` option is set to a value other than 'default'". In a preceding commit we changed grep_config() to be called after grep_init(), which means that much of the complexity here can go away. As before both "grep.patternType" and "grep.extendedRegexp" are last-one-wins variable, with "grep.extendedRegexp" yielding to "grep.patternType", except when "grep.patternType=default". Note that as the previously added tests indicate this cannot be done on-the-fly as we see the config variables, without introducing more state keeping. I.e. if we see: -c grep.extendedRegexp=false -c grep.patternType=default -c extendedRegexp=true We need to select ERE, since grep.patternType=default unselects that variable, which normally has higher precedence, but we also need to select BRE in cases of: -c grep.extendedRegexp=true \ -c grep.extendedRegexp=false Which would not be the case for this, which select ERE: -c grep.patternType=extended \ -c grep.extendedRegexp=false Therefore we cannot do this on-the-fly in grep_config without also introducing tracking variables for not only the pattern type, but what the source of that pattern type was. So we need to decide on the pattern after our config was fully parsed. Let's do that by deferring the decision on the pattern type until it's time to compile it in compile_regexp(). By that time we've not only parsed the config, but also handled the command-line options. Those will set "opt.pattern_type_option" (*not* "opt.extended_regexp_option"!). At that point all we need to do is see if "grep.patternType" was UNSPECIFIED in the end (including an explicit "=default"), if so we'll use the "grep.extendedRegexp" configuration, if any. See my 07a3d411739 (grep: remove regflags from the public grep_opt API, 2017-06-29) for addition of the two comments being removed here, i.e. the complexity noted in that commit is now going away. 1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15grep API: call grep_config() after grep_init()Ævar Arnfjörð Bjarmason1-2/+2
The grep_init() function used the odd pattern of initializing the passed-in "struct grep_opt" with a statically defined "grep_defaults" struct, which would be modified in-place when we invoked grep_config(). So we effectively (b) initialized config, (a) then defaults, (c) followed by user options. Usually those are ordered as "a", "b" and "c" instead. As the comments being removed here show the previous behavior needed to be carefully explained as we'd potentially share the populated configuration among different instances of grep_init(). In practice we didn't do that, but now that it can't be a concern anymore let's remove those comments. This does not change the behavior of any of the configuration variables or options. That would have been the case if we didn't move around the grep_config() call in "builtin/log.c". But now that we call "grep_config" after "git_log_config" and "git_format_config" we'll need to pass in the already initialized "struct grep_opt *". See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and 7687a0541e0 (grep: move the configuration parsing logic to grep.[ch], 2012-10-09) for the commits that added the comments. The memcpy() pattern here will be optimized away and follows the convention of other *_init() functions. See 5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT macro, 2021-07-01). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15grep.c: don't pass along NULL callback valueÆvar Arnfjörð Bjarmason1-2/+2
Change grep_cmd_config() to stop passing around the always-NULL "cb" value. When this code was added in 7e8f59d577e (grep: color patterns in output, 2009-03-07) it was non-NULL, but when that changed in 15fabd1bbd4 (builtin/grep.c: make configuration callback more reusable, 2012-10-09) this code was left behind. In a subsequent change I'll start using the "cb" value, this will make it clear which functions we call need it, and which don't. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15built-ins: trust the "prefix" from run_builtin()Ævar Arnfjörð Bjarmason1-5/+8
Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the "prefix" passed from "run_builtin()". The "prefix" we get from setup.c is either going to be NULL or a string of length >0, never "". So we can drop the "prefix && *prefix" checks added for "builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to the current directory, 2006-08-11), and for "builtin/ls-tree.c" in a69dd585fca (ls-tree: chomp leading directories when run from a subdirectory, 2005-12-23). As seen in code in revision.c that was added in cd676a51367 (diff --relative: output paths as relative to the current subdirectory, 2008-02-12) we already have existing code that does away with this assertion. This makes it easier to reason about a subsequent change to the "prefix_length" code in grep.c in a subsequent commit, and since we're going to the trouble of doing that let's leave behind an assert() to promise this to any future callers. For "builtin/grep.c" it would be painful to pass the "prefix" down the callchain of: cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid -> grep_source_name So for the code that needs it in grep_source_name() let's add a "grep_prefix" variable similar to the existing "ls_tree_prefix". While at it let's move the code in cmd_ls_tree() around so that we assign to the "ls_tree_prefix" right after declaring the variables, and stop assigning to "prefix". We only subsequently used that variable later in the function after clobbering it. Let's just use our own "grep_prefix" instead. Let's also add an assert() in git.c, so that we'll make this promise about the "prefix" to any current and future callers, as well as to any readers of the code. Code history: * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep: accept relative paths outside current working directory, 2009-09-05). When that code was added in 0d042fecf2f (git-grep: show pathnames relative to the current directory, 2006-08-11) we used the length. But since 493b7a08d80 we haven't used it for anything except a boolean check that we could have done on the "prefix" member itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-04i18n: factorize more 'incompatible options' messagesJean-Noël Avila1-5/+3
Find more incompatible options to factorize. When more than two options are mutually exclusive, print the ones which are actually on the command line. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23grep: fix a "path_list" memory leakÆvar Arnfjörð Bjarmason1-4/+5
Free the "path_list" used in builtin/grep.c, it was declared as STRING_LIST_INIT_NODUP, let's change it to a STRING_LIST_INIT_DUP since an early user in cmd_grep() appends a string passed via parse-options.c to it, which needs to be duplicated. Let's then convert the remaining callers to use string_list_append_nodup() instead, allowing us to free the list. This makes all the tests in t7811-grep-open.sh pass, 6/10 would fail before this change. The only remaining failure would have been due to a stray "git checkout" (which still leaks memory). In this case we can use a "git reset --hard" instead, so let's do that, and move the test_when_finished() above the code that would modify the relevant file. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23grep: use object_array_clear() in cmd_grep()Ævar Arnfjörð Bjarmason1-0/+1
Free the "struct object_array" before exiting. This makes grep tests (e.g. "t7815-grep-binary.sh") a bit happer under SANITIZE=leak. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23grep: prefer "struct grep_opt" over its "void *" equivalentÆvar Arnfjörð Bjarmason1-2/+2
Stylistically fix up code added in bfac23d9534 (grep: Fix two memory leaks, 2010-01-30). We usually don't use the "arg" at all once we've casted it to the struct we want, let's not do that here when we're freeing it. Perhaps it was thought that a cast to "void *" would otherwise be needed? Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09repository: support unabsorbed in repo_submodule_initJonathan Tan1-4/+1
In preparation for a subsequent commit that migrates code using add_submodule_odb() to repo_submodule_init(), teach repo_submodule_init() to support submodules with unabsorbed gitdirs. (See the documentation for "git submodule absorbgitdirs" for more information about absorbed and unabsorbed gitdirs.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08grep: add repository to OID grep sourcesJonathan Tan1-9/+6
Record the repository whenever an OID grep source is created, and teach the worker threads to explicitly provide the repository when accessing objects. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08grep: allocate subrepos on heapJonathan Tan1-9/+30
Currently, struct repository objects corresponding to submodules are allocated on the stack in grep_submodule(). This currently works because they will not be used once grep_submodule() exits, but a subsequent patch will require these structs to be accessible for longer (perhaps even in another thread). Allocate them on the heap and clear them only at the very end. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08grep: read submodule entry with explicit repoJonathan Tan1-5/+5
Replace an existing parse_object_or_die() call (which implicitly works on the_repository) with a function call that allows a repository to be passed in. There is no such direct equivalent to parse_object_or_die(), but we only need the type of the object, so replace with oid_object_info(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08grep: typesafe versions of grep_source_initJonathan Tan1-2/+2
grep_source_init() can create "struct grep_source" objects and, depending on the value of the type passed, some void-pointer parameters have different meanings. Because one of these types (GREP_SOURCE_OID) will require an additional parameter in a subsequent patch, take the opportunity to increase clarity and type safety by replacing this function with individual functions for each type. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08grep: use submodule-ODB-as-alternate lazy-additionJonathan Tan1-1/+1
In the parent commit, Git was taught to add submodule ODBs as alternates lazily, but grep does not use this because it computes the path to add directly, not going through add_submodule_odb(). Add an equivalent to add_submodule_odb() that takes the exact ODB path and teach grep to use it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01dir.[ch]: replace dir_init() with DIR_INITÆvar Arnfjörð Bjarmason1-2/+1
Remove the dir_init() function and replace it with a DIR_INIT macro. In many cases in the codebase we need to initialize things with a function for good reasons, e.g. needing to call another function on initialization. The "dir_init()" function was not one such case, and could trivially be replaced with a more idiomatic macro initialization pattern. The only place where we made use of its use of memset() was in dir_clear() itself, which resets the contents of an an existing struct pointer. Let's use the new "memcpy() a 'blank' struct on the stack" idiom to do that reset. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10Merge branch 'bc/hash-transition-interop-part-1'Junio C Hamano1-1/+1
SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id
2021-04-30Merge branch 'ds/sparse-index-protections'Junio C Hamano1-0/+2
Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...
2021-04-27hash: provide per-algorithm null OIDsbrian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14grep: ensure full indexDerrick Stolee1-0/+2
Before iterating over all cache entries, ensure that a sparse index is expanded to a full one so we do not miss blobs to scan. Later, this can integrate more carefully with sparse indexes with proper testing. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22Merge branch 'ab/grep-pcre2-allocfix'Junio C Hamano1-1/+0
Updates to memory allocation code around the use of pcre2 library. * ab/grep-pcre2-allocfix: grep/pcre2: move definitions of pcre2_{malloc,free} grep/pcre2: move back to thread-only PCREv2 structures grep/pcre2: actually make pcre2 use custom allocator grep/pcre2: use pcre2_maketables_free() function grep/pcre2: use compile-time PCREv2 version test grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode grep/pcre2: prepare to add debugging to pcre2_malloc() grep/pcre2: correct reference to grep_init() in comment grep/pcre2: drop needless assignment to NULL grep/pcre2: drop needless assignment + assert() on opt->pcre2
2021-03-13use CALLOC_ARRAYRené Scharfe1-1/+1
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25Merge branch 'mt/grep-sparse-checkout'Junio C Hamano1-2/+5
"git grep" has been tweaked to be limited to the sparse checkout paths. * mt/grep-sparse-checkout: grep: honor sparse-checkout on working tree searches
2021-02-17Merge branch 'mt/grep-cached-untracked'Junio C Hamano1-0/+3
"git grep --untracked" is meant to be "let's ALSO find in these files on the filesystem" when looking for matches in the working tree files, and does not make any sense if the primary search is done against the index, or the tree objects. The "--cached" and "--untracked" options have been marked as mutually incompatible. * mt/grep-cached-untracked: grep: error out if --untracked is used with --cached
2021-02-17grep/pcre2: move back to thread-only PCREv2 structuresÆvar Arnfjörð Bjarmason1-1/+0
Change the setup of the "pcre2_general_context" to happen per-thread in compile_pcre2_pattern() instead of in grep_init(). This change brings it in line with how the rest of the pcre2_* members in the grep_pat structure are set up. As noted in the preceding commit the approach 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16) took to allocate the pcre2_general_context seems to have been initially based on a misunderstanding of how PCREv2 memory allocation works. The approach of creating a global context in grep_init() is just added complexity for almost zero gain. On my system it's 24 bytes saved per-thread. For comparison PCREv2 will then go on to allocate at least a kilobyte for its own thread-local state. As noted in 6d423dd542f (grep: don't redundantly compile throwaway patterns under threading, 2017-05-25) the grep code is intentionally not trying to micro-optimize allocations by e.g. sharing some PCREv2 structures globally, while making others thread-local. So let's remove this special case and make all of them thread-local again for simplicity. With this change we could move the pcre2_{malloc,free} functions around to live closer to their current use. I'm not doing that here to keep this change small, that cleanup will be done in a follow-up commit. See also the discussion in 94da9193a6 (grep: add support for PCRE v2, 2017-06-01) about thread safety, and Johannes's comments[1] to the effect that we should be doing what this patch is doing. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09grep: honor sparse-checkout on working tree searchesMatheus Tavares1-2/+5
On a sparse checked out repository, `git grep` (without --cached) ends up searching the cache when an entry matches the search pathspec and has the SKIP_WORKTREE bit set. This is confusing both because the sparse paths are not expected to be in a working tree search (as they are not checked out), and because the output mixes working tree and cache results without distinguishing them. (Note that grep also resorts to the cache on working tree searches that include --assume-unchanged paths. But the whole point in that case is to assume that the contents of the index entry and the file are the same. This does not apply to the case of sparse paths, where the file isn't even expected to be present.) Fix that by teaching grep to honor the sparse-checkout rules for working tree searches. If the user wants to grep paths outside the current sparse-checkout definition, they may either update the sparsity rules to materialize the files, or use --cached to search all blobs registered in the index. Note: it might also be interesting to add a configuration option that allow users to search paths that are present despite having the SKIP_WORKTREE bit set, and/or to restrict searches in the index and past revisions too. These ideas are left as future improvements to avoid conflicting with other sparse-checkout topics currently in flight. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09grep: error out if --untracked is used with --cachedMatheus Tavares1-0/+3
The options --untracked and --cached are not compatible, but if they are used together, grep just silently ignores --cached and searches the working tree. Error out, instead, to avoid any potential confusion. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-26grep/log: remove hidden --debug and --grep-debug optionsÆvar Arnfjörð Bjarmason1-5/+0
Remove the hidden "grep --debug" and "log --grep-debug" options added in 17bf35a3c7b (grep: teach --debug option to dump the parse tree, 2012-09-13). At the time these options seem to have been intended to go along with a documentation discussion and to help the author of relevant tests to perform ad-hoc debugging on them[1]. Reasons to want this gone: 1. They were never documented, and the only (rather trivial) use of them in our own codebase for testing is something I removed back in e01b4dab01e (grep: change non-ASCII -i test to stop using --debug, 2017-05-20). 2. Googling around doesn't show any in-the-wild uses I could dig up, and on the Git ML the only mentions after the original discussion seem to have been when they came up in unrelated diff contexts, or that test commit of mine. 3. An exception to that is c581e4a7499 (grep: under --debug, show whether PCRE JIT is enabled, 2019-08-18) where we added the ability to dump out when PCREv2 has the JIT in effect. The combination of that and my earlier b65abcafc7a (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) means Git prints this out in its most common in-the-wild configuration: $ git log --grep-debug --grep=foo --grep=bar --grep=baz --all-match pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 [all-match] (or pattern_body<body>foo (or pattern_body<body>bar pattern_body<body>baz ) ) $ git grep --debug \( -e foo --and -e bar \) --or -e baz pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 (or (and patternfoo patternbar ) patternbaz ) I.e. for each pattern we're considering for the and/or/--all-match etc. debugging we'll now diligently spew out another identical line saying whether the PCREv2 JIT is on or not. I think that nobody's complained about that rather glaringly obviously bad output says something about how much this is used, i.e. it's not. The need for this debugging aid for the composed grep/log patterns seems to have passed, and the desire to dump the JIT config seems to have been another one-off around the time we had JIT-related issues on the PCREv2 codepath. That the original author of this debugging facility seemingly hasn't noticed the bad output since then[2] is probably some indicator. 1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/ 2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21grep: use designated initializers for `grep_defaults`Martin Ågren1-1/+0
In 15fabd1bbd ("builtin/grep.c: make configuration callback more reusable", 2012-10-09), we learned to fill a `static struct grep_opt grep_defaults` which we can use as a blueprint for other such structs. At the time, we didn't consider designated initializers to be widely useable, but these days, we do. (See, e.g., cbc0f81d96 ("strbuf: use designated initializers in STRBUF_INIT", 2017-07-10).) Use designated initializers to let the compiler set up the struct and so that we don't need to remember to call `init_grep_defaults()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21grep: don't set up a "default" repo for grepMartin Ågren1-1/+1
`init_grep_defaults()` fills a `static struct grep_opt grep_defaults`. This struct is then used by `grep_init()` as a blueprint for other such structs. Notably, `grep_init()` takes a `struct repo *` and assigns it into the target struct. As a result, it is unnecessary for us to take a `struct repo *` in `init_grep_defaults()` as well. We assign it into the default struct and never look at it again. And in light of how we return early if we have already set up the default struct, it's not just unnecessary, but is also a bit confusing: If we are called twice and with different repos, is it a bug or a feature that we ignore the second repo? Drop the repo parameter for `init_grep_defaults()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12grep: handle deref_tag() returning NULLRené Scharfe1-0/+11
deref_tag() can return NULL. Exit gracefully in that case instead of blindly dereferencing the return value. .name shouldn't ever be NULL, but grep_object() handles that case explicitly, so let's be defensive here as well and show the broken object's ID if it happens to lack a name after all. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-10quote_path: give flags parameter to quote_path()Junio C Hamano1-1/+1
The quote_path() function computes a path (relative to its base directory) and c-quotes the result if necessary. Teach it to take a flags parameter to allow its behaviour to be enriched later. No behaviour change intended. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-10quote_path: rename quote_path_relative() to quote_path()Junio C Hamano1-1/+1
There is no quote_path_absolute() or anything that causes confusion, and one of the two large consumers already rename the long name locally with a preprocessor macro. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18dir: fix problematic API to avoid memory leaksElijah Newren1-1/+2
The dir structure seemed to have a number of leaks and problems around it. First I noticed that parent_hashmap and recursive_hashmap were being leaked (though Peff noticed and submitted fixes before me). Then I noticed in the previous commit that clear_directory() was only taking responsibility for a subset of fields within dir_struct, despite the fact that entries[] and ignored[] we allocated internally to dir.c. That, of course, resulted in many callers either leaking or haphazardly trying to free these arrays and their contents. Digging further, I found that despite the pretty clear documentation near the top of dir.h that folks were supposed to call clear_directory() when the user no longer needed the dir_struct, there were four callers that didn't bother doing that at all. However, two of them clearly thought about leaks since they had an UNLEAK(dir) directive, which to me suggests that the method to free the data was too unclear. I suspect the non-obviousness of the API and its holes led folks to avoid it, which then snowballed into further problems with the entries[], ignored[], parent_hashmap, and recursive_hashmap problems. Rename clear_directory() to dir_clear() to be more in line with other data structures in git, and introduce a dir_init() to handle the suggested memsetting of dir_struct to all zeroes. I hope that a name like "dir_clear()" is more clear, and that the presence of dir_init() will provide a hint to those looking at the code that they need to look for either a dir_clear() or a dir_free() and lead them to find dir_clear(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-10Merge branch 'jk/strvec'Junio C Hamano1-1/+1
The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption. * jk/strvec: strvec: rename struct fields strvec: drop argv_array compatibility layer strvec: update documention to avoid argv_array strvec: fix indentation in renamed calls strvec: convert remaining callers away from argv_array name strvec: convert more callers away from argv_array name strvec: convert builtin/ callers away from argv_array name quote: rename sq_dequote_to_argv_array to mention strvec strvec: rename files from argv-array to strvec argv-array: rename to strvec argv-array: use size_t for count and alloc
2020-07-28grep: avoid using oid_to_hex() with parse_object_or_die()René Scharfe1-1/+1
parse_object_or_die() is passed an object ID and a name to show if the object cannot be parsed. If the name is NULL then it shows the hexadecimal object ID. Use that feature instead of preparing and passing the hexadecimal representation to the function proactively. That's shorter and a bit more efficient. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: convert builtin/ callers away from argv_array nameJeff King1-1/+1
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the files in builtin/ to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add builtin/". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-05Merge branch 'dl/opt-callback-cleanup'Junio C Hamano1-13/+13
Code cleanup. * dl/opt-callback-cleanup: Use OPT_CALLBACK and OPT_CALLBACK_F
2020-04-29Merge branch 'en/fill-directory-exponential'Junio C Hamano1-2/+0
The directory traversal code had redundant recursive calls which made its performance characteristics exponential with respect to the depth of the tree, which was corrected. * en/fill-directory-exponential: completion: fix 'git add' on paths under an untracked directory Fix error-prone fill_directory() API; make it only return matches dir: replace double pathspec matching with single in treat_directory() dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory() dir: replace exponential algorithm with a linear one dir: refactor treat_directory to clarify control flow dir: fix confusion based on variable tense dir: fix broken comment dir: consolidate treat_path() and treat_one_path() dir: fix simple typo in comment t3000: add more testcases testing a variety of ls-files issues t7063: more thorough status checking
2020-04-28Use OPT_CALLBACK and OPT_CALLBACK_FDenton Liu1-13/+13
In the codebase, there are many options which use OPTION_CALLBACK in a plain ol' struct definition. However, we have the OPT_CALLBACK and OPT_CALLBACK_F macros which are meant to abstract these plain struct definitions away. These macros are useful as they semantically signal to developers that these are just normal callback option with nothing fancy happening. Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or OPT_CALLBACK_F where applicable. The heavy lifting was done using the following (disgusting) shell script: #!/bin/sh do_replacement () { tr '\n' '\r' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' | sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' | tr '\r' '\n' } for f in $(git ls-files \*.c) do do_replacement <"$f" >"$f.tmp" mv "$f.tmp" "$f" done The result was manually inspected and then reformatted to match the style of the surrounding code. Finally, using `git grep OPTION_CALLBACK \*.c`, leftover results which were not handled by the script were manually transformed. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-20grep: follow conventions for printing paths w/ unusual charsMatheus Tavares1-12/+34
grep does not follow the conventions used by other Git commands when printing paths that contain unusual characters (as double-quotes or newlines). Commands such as ls-files, commit, status and diff will: - Quote and escape unusual pathnames, by default. - Print names verbatim and unquoted when "-z" is used. But grep *never* quotes/escapes absolute paths with unusual chars and *always* quotes/escapes relative ones, even with "-z". Besides being inconsistent in its own output, the deviation from other Git commands can be confusing. So let's make it follow the two rules above and add some tests for this new behavior. Note that, making grep quote/escape all unusual paths by default, also make it fully compliant with the core.quotePath configuration, which is currently ignored for absolute paths. Reported-by: Greg Hurrell <greg@hurrell.net> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-01Fix error-prone fill_directory() API; make it only return matchesElijah Newren1-2/+0
Traditionally, the expected calling convention for the dir.c API was: fill_directory(&dir, ..., pathspec) foreach entry in dir->entries: if (dir_path_match(entry, pathspec)) process_or_display(entry) This may have made sense once upon a time, because the fill_directory() call could use cheap checks to avoid doing full pathspec matching, and an external caller may have wanted to do other post-processing of the results anyway. However: * this structure makes it easy for users of the API to get it wrong * this structure actually makes it harder to understand fill_directory() and the functions it uses internally. It has tripped me up several times while trying to fix bugs and restructure things. * relying on post-filtering was already found to produce wrong results; pathspec matching had to be added internally for multiple cases in order to get the right results (see commits 404ebceda01c (dir: also check directories for matching pathspecs, 2019-09-17) and 89a1f4aaf765 (dir: if our pathspec might match files under a dir, recurse into it, 2019-09-17)) * it's bad for performance: fill_directory() already has to do lots of checks and knows the subset of cases where it still needs to do more checks. Forcing external callers to do full pathspec matching means they must re-check _every_ path. So, add the pathspec matching within the fill_directory() internals, and remove it from external callers. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14Merge branch 'mt/threaded-grep-in-object-store'Junio C Hamano1-47/+46
Traditionally, we avoided threaded grep while searching in objects (as opposed to files in the working tree) as accesses to the object layer is not thread-safe. This limitation is getting lifted. * mt/threaded-grep-in-object-store: grep: use no. of cores as the default no. of threads grep: move driver pre-load out of critical section grep: re-enable threads in non-worktree case grep: protect packed_git [re-]initialization grep: allow submodule functions to run in parallel submodule-config: add skip_if_read option to repo_read_gitmodules() grep: replace grep_read_mutex by internal obj read lock object-store: allow threaded access to object reading replace-object: make replace operations thread-safe grep: fix racy calls in grep_objects() grep: fix race conditions at grep_submodule() grep: fix race conditions on userdiff calls
2020-01-30grep: ignore --recurse-submodules if --no-index is givenPhilippe Blain1-2/+5
Since grep learned to recurse into submodules in 0281e487fd (grep: optionally recurse into submodules, 2016-12-16), using --recurse-submodules along with --no-index makes Git die(). This is unfortunate because if submodule.recurse is set in a user's ~/.gitconfig, invoking `git grep --no-index` either inside or outside a Git repository results in fatal: option not supported with --recurse-submodules Let's allow using these options together, so that setting submodule.recurse globally does not prevent using `git grep --no-index`. Using `--recurse-submodules` should not have any effect if `--no-index` is used inside a repository, as Git will recurse into the checked out submodule directories just like into regular directories. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: use no. of cores as the default no. of threadsMatheus Tavares1-2/+1
When --threads is not specified, git-grep will use 8 threads by default. This fixed number may be too many for machines with fewer cores and too little for machines with more cores. So, instead, use the number of logical cores available in the machine, which seems to result in the best overall performance: The following measurements correspond to the mean elapsed times for 30 git-grep executions in chromium's repository[1] with a 95% confidence interval (each set of 30 were performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is '(static|extern) (int|double) \*'. | Working tree | Object Store ------|-------------------------------|-------------------------------- #ths | Regex 1 | Regex 2 | Regex 1 | Regex 2 ------|---------------|---------------|----------------|--------------- 32 | 2.92s ± 0.01 | 3.72s ± 0.21 | 5.36s ± 0.01 | 6.07s ± 0.01 16 | 2.84s ± 0.01 | 3.57s ± 0.21 | 5.05s ± 0.01 | 5.71s ± 0.01 > 8 | 2.53s ± 0.00 | 3.24s ± 0.21 | 4.86s ± 0.01 | 5.48s ± 0.01 4 | 2.43s ± 0.02 | 3.22s ± 0.20 | 5.22s ± 0.02 | 6.03s ± 0.02 2 | 3.06s ± 0.20 | 4.52s ± 0.01 | 7.52s ± 0.01 | 9.06s ± 0.01 1 | 6.16s ± 0.01 | 9.25s ± 0.02 | 14.10s ± 0.01 | 17.22s ± 0.01 The above tests were performed in a desktop running Debian 10.0 with Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of RAM and a 7200 rpm, SATA 3.1 HDD. Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM: | Working tree | Object Store ------|--------------------------------|-------------------------------- #ths | Regex 1 | Regex 2 | Regex 1 | Regex 2 ------|---------------|----------------|----------------|--------------- 32 | 3.29s ± 0.21 | 4.30s ± 0.01 | 6.30s ± 0.01 | 7.30s ± 0.02 16 | 3.19s ± 0.20 | 4.14s ± 0.02 | 5.91s ± 0.01 | 6.83s ± 0.01 > 8 | 2.90s ± 0.04 | 3.82s ± 0.20 | 5.70s ± 0.02 | 6.53s ± 0.01 4 | 2.84s ± 0.02 | 3.77s ± 0.20 | 6.19s ± 0.02 | 7.18s ± 0.02 2 | 3.73s ± 0.21 | 5.57s ± 0.02 | 9.28s ± 0.01 | 11.22s ± 0.01 1 | 7.48s ± 0.02 | 11.36s ± 0.03 | 17.75s ± 0.01 | 21.87s ± 0.08 [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: move driver pre-load out of critical sectionMatheus Tavares1-4/+4
In builtin/grep.c:add_work() we pre-load the userdiff drivers before adding the grep_source in the todo list. This operation is currently being performed after acquiring the grep_mutex, but as it's already thread-safe, we don't need to protect it here. So let's move it out of the critical section which should avoid thread contention and improve performance. Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's repository[2], I got the following mean times for 30 executions after 2 warmups: Original | 6.2886s -------------------------|----------- Out of critical section | 5.7852s [1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running Manjaro Linux. [2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: re-enable threads in non-worktree caseMatheus Tavares1-1/+1
They were disabled at 53b8d93 ("grep: disable threading in non-worktree case", 12-12-2011), due to observable performance drops (to the point that using a single thread would be faster than multiple threads). But now that zlib inflation can be performed in parallel we can regain the speedup, so let's re-enable threads in non-worktree grep. Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*' ("Regex 2") at chromium's repository[1] I got: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 17.2920s | 20.9624s 2 | 9.6512s | 11.3184s 4 | 6.7723s | 7.6268s 8** | 6.2886s | 6.9843s These are all means of 30 executions after 2 warmup runs. All tests were executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and SSD, running Manjaro Linux. But to make sure the optimization also performs well on HDD, the tests were repeated on another machine with an i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III, 5400 rpm), also running Manjaro Linux: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 18.4035s | 22.5368s 2 | 12.5063s | 14.6409s 4** | 10.9136s | 12.7106s ** Note that in these cases we relied on hyper-threading, and that's probably why we don't see a big difference in time. Unfortunately, multithreaded git-grep might be slow in the non-worktree case when --textconv is used and there're too many text conversions. Probably the reason for this is that the object read lock is used to protect fill_textconv() and therefore there is a mutual exclusion between textconv execution and object reading. Because both are time-consuming operations, not being able to perform them in parallel can cause performance drops. To inform the users about this (and other threading details), let's also add a "NOTES ON THREADS" section to Documentation/git-grep.txt. [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: protect packed_git [re-]initializationMatheus Tavares1-2/+6
Some fields in struct raw_object_store are lazy initialized by the thread-unsafe packfile.c:prepare_packed_git(). Although this function is present in the call stack of git-grep threads, all paths to it are currently protected by obj_read_lock() (and the main thread usually indirectly calls it before firing the worker threads, anyway). However, it's possible that future modifications add new unprotected paths to it, introducing a race condition. Because errors derived from it wouldn't happen often, it could be hard to detect. So to prevent future headaches, let's force eager initialization of packed_git when setting git-grep up. There'll be a small overhead in the cases where we didn't really need to prepare packed_git during execution but this shouldn't be very noticeable. Also, packed_git may be re-initialized by packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep are already protected by obj_read_lock() but it may suffer from the same problem in the future. So let's also internally protect it with obj_read_lock() (which is a recursive mutex). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: allow submodule functions to run in parallelMatheus Tavares1-16/+22
Now that object reading operations are internally protected, the submodule initialization functions at builtin/grep.c:grep_submodule() are very close to being thread-safe. Let's take a look at each call and remove from the critical section what we can, for better performance: - submodule_from_path() and is_submodule_active() cannot be called in parallel yet only because they call repo_read_gitmodules() which contains, in its call stack, operations that would otherwise be in race condition with object reading (for example parse_object() and is_promisor_remote()). However, they only call repo_read_gitmodules() if it wasn't read before. So let's pre-read it before firing the threads and allow these two functions to safely be called in parallel. - repo_submodule_init() is already thread-safe, so remove it from the critical section without other necessary changes. - The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as no other thread is performing object reading operations in the subrepo yet. However, threads might be working in the superproject, and this function calls add_to_alternates_memory() internally, which is racy with object readings in the superproject. So it must be kept protected for now. Let's add a "NEEDSWORK" to it, informing why it cannot be removed from the critical section yet. - Finally, add_to_alternates_memory() must be kept protected for the same reason as the item above. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17submodule-config: add skip_if_read option to repo_read_gitmodules()Matheus Tavares1-1/+1
Currently, submodule-config.c doesn't have an externally accessible function to read gitmodules only if it wasn't already read. But this exact behavior is internally implemented by gitmodules_read_check(), to perform a lazy load. Let's merge this function with repo_read_gitmodules() adding a 'skip_if_read' which allows both internal and external callers to access this functionality. This simplifies a little the code. The added option will also be used in the following patch. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: replace grep_read_mutex by internal obj read lockMatheus Tavares1-30/+16
git-grep uses 'grep_read_mutex' to protect its calls to object reading operations. But these have their own internal lock now, which ensures a better performance (allowing parallel access to more regions). So, let's remove the former and, instead, activate the latter with enable_obj_read_lock(). Sections that are currently protected by 'grep_read_mutex' but are not internally protected by the object reading lock should be surrounded by obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion with object reading operations, keeping the current behavior and avoiding race conditions. Namely, these places are: In grep.c: - fill_textconv() at fill_textconv_grep(). - userdiff_get_textconv() at grep_source_1(). In builtin/grep.c: - parse_object_or_die() and the submodule functions at grep_submodule(). - deref_tag() and gitmodules_config_oid() at grep_objects(). If these functions become thread-safe, in the future, we might remove the locking and probably get some speedup. Note that some of the submodule functions will already be thread-safe (or close to being thread-safe) with the internal object reading lock. However, as some of them will require additional modifications to be removed from the critical section, this will be done in its own patch. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: fix racy calls in grep_objects()Matheus Tavares1-0/+5
deref_tag() calls is_promisor_object() and parse_object(), both of which perform lazy initializations and other thread-unsafe operations. If it was only called by grep_objects() this wouldn't be a problem as the latter is only executed by the main thread. However, deref_tag() is also present in read_object_file()'s call stack. So calling deref_tag() in grep_objects() without acquiring the grep_read_mutex may incur in a race condition with object reading operations (such as the ones internally performed by fill_textconv(), called at fill_textconv_grep()). The same problem happens with the call to gitmodules_config_oid() which also has parse_object() in its call stack. Fix that protecting both calls with the said grep_read_mutex. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-17grep: fix race conditions at grep_submodule()Matheus Tavares1-4/+3
There're currently two function calls in builtin/grep.c:grep_submodule() which might result in race conditions: - submodule_from_path(): it has config_with_options() in its call stack which, in turn, may have read_object_file() in its own. Therefore, calling the first function without acquiring grep_read_mutex may end up causing a race condition with other object read operations performed by worker threads (for example, at the fill_textconv() call in grep.c:fill_textconv_grep()). - parse_object_or_die(): it falls into the same problem, having repo_has_object_file(the_repository, ...) in its call stack. Besides that, parse_object(), which is also called by parse_object_or_die(), is thread-unsafe and also called by object reading functions. It's unlikely to really fall into a data race with these operations as the volume of calls to them is usually very low. But we better protect ourselves against this possibility, anyway. So, to solve these issues, move both of these function calls into the critical section of grep_read_mutex. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-23Merge branch 'cb/pcre2-chartables-leakfix'Junio C Hamano1-0/+1
Leakfix. * cb/pcre2-chartables-leakfix: grep: avoid leak of chartables in PCRE2 grep: make PCRE2 aware of custom allocator grep: make PCRE1 aware of custom allocator
2019-10-18grep: make PCRE2 aware of custom allocatorCarlo Marcelo Arenas Belón1-0/+1
94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include a way to override the system allocator, and so it is incompatible with custom allocators (e.g. nedmalloc). This problem became obvious when we tried to plug a memory leak by `free()`ing a data structure allocated by PCRE2, triggering a segfault in Windows (where we use nedmalloc by default). PCRE2 requires the use of a general context to override the allocator and therefore, there is a lot more code needed than in PCRE1, including a couple of wrapper functions. Extend the grep API with a "destructor" that could be called to cleanup any objects that were created and used globally. Update `builtin/grep.c` to use that new API, but any other future users should make sure to have matching `grep_init()`/`grep_destroy()` calls if they are using the pattern matching functionality. Move some of the logic that was before done per thread (in the workers) into an earlier phase to avoid degrading performance, but as the use of PCRE2 with custom allocators is better understood it is expected more of its functions will be instructed to use the custom allocator as well as was done in the original code[1] this work was based on. [1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/ Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-26grep: use return value of strbuf_detach()René Scharfe1-2/+2
Append the strbuf buffer only after detaching it. There is no practical difference here, as the strbuf is not empty and no strbuf_ function is called between storing the pointer to the still attached buffer and calling strbuf_detach(), so that pointer is valid, but make sure to follow the standard sequence anyway for consistency. Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-22Merge branch 'mt/grep-submodules-working-tree'Junio C Hamano1-4/+6
"git grep --recurse-submodules" that looks at the working tree files looked at the contents in the index in submodules, instead of files in the working tree. * mt/grep-submodules-working-tree: grep: fix worktree case in submodules
2019-07-30grep: fix worktree case in submodulesMatheus Tavares1-4/+6
Running git-grep with --recurse-submodules results in a cached grep for the submodules even when --cached is not used. This makes all modifications in submodules' tracked files be always ignored when grepping. Solve that making git-grep respect the cached option when invoking grep_cache() inside grep_submodule(). Also, add tests to ensure that the desired behavior is performed. Reported-by: Daniel Zaoui <jackdanielz@eyomi.org> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27sha1-file.c: remove the_repo from read_object_with_reference()Nguyễn Thái Ngọc Duy1-2/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'nd/the-index-final'Junio C Hamano1-19/+27
The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument
2019-01-29Merge branch 'bc/tree-walk-oid'Junio C Hamano1-4/+4
The code to walk tree objects has been taught that we may be working with object names that are not computed with SHA-1. * bc/tree-walk-oid: cache: make oidcpy always copy GIT_MAX_RAWSZ bytes tree-walk: store object_id in a separate member match-trees: use hashcpy to splice trees match-trees: compute buffer offset correctly when splicing tree-walk: copy object ID before use
2019-01-29Merge branch 'sb/submodule-recursive-fetch-gets-the-tip'Junio C Hamano1-7/+10
"git fetch --recurse-submodules" may not fetch the necessary commit that is bound to the superproject, which is getting corrected. * sb/submodule-recursive-fetch-gets-the-tip: fetch: ensure submodule objects fetched submodule.c: fetch in submodules git directory instead of in worktree submodule: migrate get_next_submodule to use repository structs repository: repo_submodule_init to take a submodule struct submodule: store OIDs in changed_submodule_names submodule.c: tighten scope of changed_submodule_names struct submodule.c: sort changed_submodule_names before searching it submodule.c: fix indentation sha1-array: provide oid_array_filter
2019-01-24cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switchNguyễn Thái Ngọc Duy1-0/+1
By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-15tree-walk: store object_id in a separate memberbrian m. carlson1-4/+4
When parsing a tree, we read the object ID directly out of the tree buffer. This is normally fine, but such an object ID cannot be used with oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1, there may not be that many bytes to copy. Instead, store the object ID in a separate struct member. Since we can no longer efficiently compute the path length, store that information as well in struct name_entry. Ensure we only copy the object ID into the new buffer if the path length is nonzero, as some callers will pass us an empty path with no object ID following it, and we will not want to read past the end of the buffer. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14Merge branch 'nd/attr-pathspec-in-tree-walk'Junio C Hamano1-1/+2
The traversal over tree objects has learned to honor ":(attr:label)" pathspec match, which has been implemented only for enumerating paths on the filesystem. * nd/attr-pathspec-in-tree-walk: tree-walk: support :(attr) matching dir.c: move, rename and export match_attrs() pathspec.h: clean up "extern" in function declarations tree-walk.c: make tree_entry_interesting() take an index tree.c: make read_tree*() take 'struct repository *'
2019-01-14sha1-name.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy1-1/+2
This kills the_index dependency in get_oid_with_context() but for get_oid() and friends, they still assume the_repository (which also means the_index). Unfortunately the widespread use of get_oid() will make it hard to make the conversion now. We probably will add repo_get_oid() at some point and limit the use of get_oid() in builtin/ instead of forcing all get_oid() call sites to carry struct repository. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14grep: use grep_opt->repo instead of explict repo argumentNguyễn Thái Ngọc Duy1-17/+24
This command is probably the first one that operates on a repository other than the_repository, in f9ee2fcdfa (grep: recurse in-process using 'struct repository' - 2017-08-02). An explicit 'struct repository *' was added in that commit to pass around the repository that we're supposed to grep from. Since 38bbc2ea39 (grep.c: remove implicit dependency on the_index - 2018-09-21). 'struct grep_opt *' carries in itself a repository parameter for grepping. We should now be able to reuse grep_opt to hold the submodule repo instead of a separate argument, which is just room for mistakes. While at there, use the right reference instead of the_repository and the_index in this code. I was a bit careless in my attempt to kick the_repository / the_index out of library code. It's normally safe to just stick the_repository / the_index in bultin/ code, but it's not the case for grep. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-04Merge branch 'jk/loose-object-cache'Junio C Hamano1-1/+1
Code clean-up with optimization for the codepath that checks (non-)existence of loose objects. * jk/loose-object-cache: odb_load_loose_cache: fix strbuf leak fetch-pack: drop custom loose object cache sha1-file: use loose object cache for quick existence check object-store: provide helpers for loose_objects_cache sha1-file: use an object_directory for the main object dir handle alternates paths the same as the main object dir sha1_file_name(): overwrite buffer instead of appending rename "alternate_object_database" to "object_directory" submodule--helper: prefer strip_suffix() to ends_with() fsck: do not reuse child_process structs
2018-12-05repository: repo_submodule_init to take a submodule structStefan Beller1-7/+10
When constructing a struct repository for a submodule for some revision of the superproject where the submodule is not contained in the index, it may not be present in the working tree currently either. In that situation giving a 'path' argument is not useful. Upgrade the repo_submodule_init function to take a struct submodule instead. The submodule struct can be obtained via submodule_from_{path, name} or an artificial submodule struct can be passed in. While we are at it, rename the repository struct in the repo_submodule_init function, which is to be initialized, to a name that is not confused with the struct submodule as easily. Perform such renames in similar functions as well. Also move its documentation into the header file. Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-19tree-walk.c: make tree_entry_interesting() take an indexNguyễn Thái Ngọc Duy1-1/+2
In order to support :(attr) when matching pathspec on a tree, tree_entry_interesting() needs to take an index (because git_check_attr() needs it). This is the preparation step for it. This also makes it clearer what index we fall back to when looking up attributes during an unpack-trees operation: the source index. This also fixes revs->pruning.repo initialization that should have been done in 2abf350385 (revision.c: remove implicit dependency on the_index - 2018-09-21). Without it, skip_uninteresting() will dereference a NULL pointer through this call chain get_revision(revs) get_revision_internal get_revision_1 try_to_simplify_commit rev_compare_tree diff_tree_oid(..., &revs->pruning) ll_diff_tree_oid diff_tree_paths ll_diff_tree skip_uninteresting Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-18Merge branch 'jk/unused-parameter-fixes'Junio C Hamano1-1/+13
Various functions have been audited for "-Wunused-parameter" warnings and bugs in them got fixed. * jk/unused-parameter-fixes: midx: double-check large object write loop assert NOARG/NONEG behavior of parse-options callbacks parse-options: drop OPT_DATE() apply: return -1 from option callback instead of calling exit(1) cat-file: report an error on multiple --batch options tag: mark "--message" option with NONEG show-branch: mark --reflog option as NONEG format-patch: mark "--no-numbered" option with NONEG status: mark --find-renames option with NONEG cat-file: mark batch options with NONEG pack-objects: mark index-version option as NONEG ls-files: mark exclude options as NONEG am: handle --no-patch-format option apply: mark include/exclude options as NONEG
2018-11-18Merge branch 'nd/pthreads'Junio C Hamano1-49/+30
The codebase has been cleaned up to reduce "#ifndef NO_PTHREADS". * nd/pthreads: Clean up pthread_create() error handling read-cache.c: initialize copy_len to shut up gcc 8 read-cache.c: reduce branching based on HAVE_THREADS read-cache.c: remove #ifdef NO_PTHREADS pack-objects: remove #ifdef NO_PTHREADS preload-index.c: remove #ifdef NO_PTHREADS grep: clean up num_threads handling grep: remove #ifdef NO_PTHREADS attr.c: remove #ifdef NO_PTHREADS name-hash.c: remove #ifdef NO_PTHREADS index-pack: remove #ifdef NO_PTHREADS send-pack.c: move async's #ifdef NO_PTHREADS back to run-command.c run-command.h: include thread-utils.h instead of pthread.h thread-utils: macros to unconditionally compile pthreads API
2018-11-13Merge branch 'ao/submodule-wo-gitmodules-checked-out'Junio C Hamano1-3/+14
The submodule support has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree. * ao/submodule-wo-gitmodules-checked-out: t/helper: add test-submodule-nested-repo-config submodule: support reading .gitmodules when it's not in the working tree submodule: add a helper to check if it is safe to write to .gitmodules t7506: clean up .gitmodules properly before setting up new scenario submodule: use the 'submodule--helper config' command submodule--helper: add a new 'config' subcommand t7411: be nicer to future tests and really clean things up t7411: merge tests 5 and 6 submodule: factor out a config_set_in_gitmodules_file_gently function submodule: add a print_config_from_gitmodules() helper
2018-11-13sha1-file: use an object_directory for the main object dirJeff King1-1/+1
Our handling of alternate object directories is needlessly different from the main object directory. As a result, many places in the code basically look like this: do_something(r->objects->objdir); for (odb = r->objects->alt_odb_list; odb; odb = odb->next) do_something(odb->path); That gets annoying when do_something() is non-trivial, and we've resorted to gross hacks like creating fake alternates (see find_short_object_filename()). Instead, let's give each raw_object_store a unified list of object_directory structs. The first will be the main store, and everything after is an alternate. Very few callers even care about the distinction, and can just loop over the whole list (and those who care can just treat the first element differently). A few observations: - we don't need r->objects->objectdir anymore, and can just mechanically convert that to r->objects->odb->path - object_directory's path field needs to become a real pointer rather than a FLEX_ARRAY, in order to fill it with expand_base_dir() - we'll call prepare_alt_odb() earlier in many functions (i.e., outside of the loop). This may result in us calling it even when our function would be satisfied looking only at the main odb. But this doesn't matter in practice. It's not a very expensive operation in the first place, and in the majority of cases it will be a noop. We call it already (and cache its results) in prepare_packed_git(), and we'll generally check packs before loose objects. So essentially every program is going to call it immediately once per program. Arguably we should just prepare_alt_odb() immediately upon setting up the repository's object directory, which would save us sprinkling calls throughout the code base (and forgetting to do so has been a source of subtle bugs in the past). But I've stopped short of that here, since there are already a lot of other moving parts in this patch. - Most call sites just get shorter. The check_and_freshen() functions are an exception, because they have entry points to handle local and nonlocal directories separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-06assert NOARG/NONEG behavior of parse-options callbacksJeff King1-1/+13
When we define a parse-options callback, the flags we put in the option struct must match what the callback expects. For example, a callback which does not handle the "unset" parameter should only be used with PARSE_OPT_NONEG. But since the callback and the option struct are not defined next to each other, it's easy to get this wrong (as earlier patches in this series show). Fortunately, the compiler can help us here: compiling with -Wunused-parameters can show us which callbacks ignore their "unset" parameters (and likewise, ones that ignore "arg" expect to be triggered with PARSE_OPT_NOARG). But after we've inspected a callback and determined that all of its callers use the right flags, what do we do next? We'd like to silence the compiler warning, but do so in a way that will catch any wrong calls in the future. We can do that by actually checking those variables and asserting that they match our expectations. Because this is such a common pattern, we'll introduce some helper macros. The resulting messages aren't as descriptive as we could make them, but the file/line information from BUG() is enough to identify the problem (and anyway, the point is that these should never be seen). Each of the annotated callbacks in this patch triggers -Wunused-parameters, and was manually inspected to make sure all callers use the correct options (so none of these BUGs should be triggerable). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-05grep: clean up num_threads handlingNguyễn Thái Ngọc Duy1-31/+27
When NO_PTHREADS is still used in this file, we have two separate code paths for thread and no thread support. The latter will always have num_threads remain zero while the former uses num_threads zero as "default number of threads". With recent changes blur the line between thread and no-thread support, this num_threads handling becomes a bit strange so let's redefine it like this: - num_threads == 0 means default number of threads and should become positive after all configuration and option parsing is done if multithread is supported. - num_threads <= 1 runs no threads. It does not matter if the platform supports threading or not. - num_threads > 1 will run multiple threads and is invalid if HAVE_THREADS is false. pthread API is only used in this case. PS. a new warning is also added when num_threads is forced back to one because a thread-incompatible option is specified. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-05grep: remove #ifdef NO_PTHREADSNguyễn Thái Ngọc Duy1-37/+22
This is a faithful conversion without attempting to improve anything. That comes later. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31submodule: support reading .gitmodules when it's not in the working treeAntonio Ospite1-3/+14
When the .gitmodules file is not available in the working tree, try using the content from the index and from the current branch. This covers the case when the file is part of the repository but for some reason it is not checked out, for example because of a sparse checkout. This makes it possible to use at least the 'git submodule' commands which *read* the gitmodules configuration file without fully populating the working tree. Writing to .gitmodules will still require that the file is checked out, so check for that before calling config_set_in_gitmodules_file_gently. Add a similar check also in git-submodule.sh::cmd_add() to anticipate the eventual failure of the "git submodule add" command when .gitmodules is not safely writeable; this prevents the command from leaving the repository in a spurious state (e.g. the submodule repository was cloned but .gitmodules was not updated because config_set_in_gitmodules_file_gently failed). Moreover, since config_from_gitmodules() now accesses the global object store, it is necessary to protect all code paths which call the function against concurrent access to the global object store. Currently this only happens in builtin/grep.c::grep_submodules(), so call grep_read_lock() before invoking code involving config_from_gitmodules(). Finally, add t7418-submodule-sparse-gitmodules.sh to verify that reading from .gitmodules succeeds and that writing to it fails when the file is not checked out. NOTE: there is one rare case where this new feature does not work properly yet: nested submodules without .gitmodules in their working tree. This has been documented with a warning and a test_expect_failure item in t7814, and in this case the current behavior is not altered: no config is read. Signed-off-by: Antonio Ospite <ao2@ao2.it> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19Merge branch 'rs/grep-no-recursive'Junio C Hamano1-0/+2
Unlike "grep", "git grep" by default recurses to the whole tree. The command learned "git grep --recursive" option, so that "git grep --no-recursive" can serve as a synonym to setting the max-depth to 0. * rs/grep-no-recursive: grep: add -r/--[no-]recursive
2018-10-03grep: add -r/--[no-]recursiveRené Scharfe1-0/+2
Recognize -r and --recursive as synonyms for --max-depth=-1 for compatibility with GNU grep; it's still the default for git grep. This also adds --no-recursive as synonym for --max-depth=0 for free, which is welcome for completeness and consistency. Fix the description for --max-depth, while we're at it -- negative values other than -1 actually disable recursion, i.e. they are equivalent to --max-depth=0. Requested-by: Christoph Berg <myon@debian.org> Suggested-by: Junio C Hamano <gitster@pobox.com> Initial-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21userdiff.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy1-1/+2
[jc: squashed in missing forward decl in userdiff.h found by Ramsay] Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21grep.c: remove implicit dependency on the_indexNguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'nd/no-the-index'Junio C Hamano1-3/+3
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
2018-08-15Merge branch 'nd/i18n'Junio C Hamano1-6/+6
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-13grep: use the right index instead of the_indexNguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13dir.c: remove an implicit dependency on the_index in pathspec codeNguyễn Thái Ngọc Duy1-3/+3
Make the match_patchspec API and friends take an index_state instead of assuming the_index in dir.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-02Merge branch 'tb/grep-only-matching'Junio C Hamano1-0/+6
"git grep" learned the "--only-matching" option. * tb/grep-only-matching: grep.c: teach 'git grep --only-matching' grep.c: extract show_line_header()
2018-08-02Merge branch 'sb/object-store-lookup'Junio C Hamano1-1/+2
lookup_commit_reference() and friends have been updated to find in-core object for a specific in-core repository instance. * sb/object-store-lookup: (32 commits) commit.c: allow lookup_commit_reference to handle arbitrary repositories commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories tag.c: allow deref_tag to handle arbitrary repositories object.c: allow parse_object to handle arbitrary repositories object.c: allow parse_object_buffer to handle arbitrary repositories commit.c: allow get_cached_commit_buffer to handle arbitrary repositories commit.c: allow set_commit_buffer to handle arbitrary repositories commit.c: migrate the commit buffer to the parsed object store commit-slabs: remove realloc counter outside of slab struct commit.c: allow parse_commit_buffer to handle arbitrary repositories tag: allow parse_tag_buffer to handle arbitrary repositories tag: allow lookup_tag to handle arbitrary repositories commit: allow lookup_commit to handle arbitrary repositories tree: allow lookup_tree to handle arbitrary repositories blob: allow lookup_blob to handle arbitrary repositories object: allow lookup_object to handle arbitrary repositories object: allow object_as_type to handle arbitrary repositories tag: add repository argument to deref_tag tag: add repository argument to parse_tag_buffer tag: add repository argument to lookup_tag ...
2018-07-23builtin/grep.c: mark strings for translationNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23Update messages in preparation for i18nNguyễn Thái Ngọc Duy1-5/+5
Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-09grep.c: teach 'git grep --only-matching'Taylor Blau1-0/+6
Teach 'git grep --only-matching', a new option to only print the matching part(s) of a line. For instance, a line containing the following (taken from README.md:27): (`man gitcvs-migration` or `git help cvs-migration` if git is Is printed as follows: $ git grep --line-number --column --only-matching -e git -- \ README.md | grep ":27" README.md:27:7:git README.md:27:16:git README.md:27:38:git The patch works mostly as one would expect, with the exception of a few considerations that are worth mentioning here. Like GNU grep, this patch ignores --only-matching when --invert (-v) is given. There is a sensible answer here, but parity with the behavior of other tools is preferred. Because a line might contain more than one match, there are special considerations pertaining to when to print line headers, newlines, and how to increment the match column offset. The line header and newlines are handled as a special case within the main loop to avoid polluting the surrounding code with conditionals that have large blocks. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29tag: add repository argument to deref_tagStefan Beller1-1/+2
Add a repository argument to allow the callers of deref_tag to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-22builtin/grep.c: add '--column' option to 'git-grep(1)'Taylor Blau1-0/+1
Teach 'git-grep(1)' a new option, '--column', to show the column number of the first match on a non-context line. This makes it possible to teach 'contrib/git-jump/git-jump' how to seek to the first matching position of a grep match in your editor, and allows similar additional scripting capabilities. For example: $ git grep -n --column foo | head -n3 .clang-format:51:14:# myFunction(foo, bar, baz); .clang-format:64:7:# int foo(); .clang-format:75:8:# void foo() Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-01Merge branch 'nd/use-opt-int-set-f'Junio C Hamano1-3/+3
Code simplification. * nd/use-opt-int-set-f: Use OPT_SET_INT_F() for cmdline option specification
2018-05-30Merge branch 'sb/grep-die-on-unreadable-index'Junio C Hamano1-1/+2
Error behaviour of "git grep" when it cannot read the index was inconsistent with other commands that uses the index, which has been corrected to error out early. * sb/grep-die-on-unreadable-index: grep: handle corrupt index files early
2018-05-24Use OPT_SET_INT_F() for cmdline option specificationNguyễn Thái Ngọc Duy1-3/+3
The only thing these commands need is extra parseopt flag which can be passed in by OPT_SET_INT_F() and it is a bit more compact than full struct initialization. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16grep: handle corrupt index files earlyStefan Beller1-1/+2
Any other caller of 'repo_read_index' dies upon a negative return of it, so grep should, too. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'sb/submodule-move-nested'Junio C Hamano1-8/+6
Moving a submodule that itself has submodule in it with "git mv" forgot to make necessary adjustment to the nested sub-submodules; now the codepath learned to recurse into the submodules. * sb/submodule-move-nested: submodule: fixup nested submodules after moving the submodule submodule-config: remove submodule_from_cache submodule-config: add repository argument to submodule_from_{name, path} submodule-config: allow submodule_free to handle arbitrary repositories grep: remove "repo" arg from non-supporting funcs submodule.h: drop declaration of connect_work_tree_and_git_dir
2018-04-11Merge branch 'sb/object-store'Junio C Hamano1-1/+2
Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...
2018-04-10Merge branch 'bc/object-id'Junio C Hamano1-3/+3
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (36 commits) convert: convert to struct object_id sha1_file: introduce a constant for max header length Convert lookup_replace_object to struct object_id sha1_file: convert read_sha1_file to struct object_id sha1_file: convert read_object_with_reference to object_id tree-walk: convert tree entry functions to object_id streaming: convert istream internals to struct object_id tree-walk: convert get_tree_entry_follow_symlinks internals to object_id builtin/notes: convert static functions to object_id builtin/fmt-merge-msg: convert remaining code to object_id sha1_file: convert sha1_object_info* to object_id Convert remaining callers of sha1_object_info_extended to object_id packfile: convert unpack_entry to struct object_id sha1_file: convert retry_bad_packed_offset to struct object_id sha1_file: convert assert_sha1_type to object_id builtin/mktree: convert to struct object_id streaming: convert open_istream to use struct object_id sha1_file: convert check_sha1_signature to struct object_id sha1_file: convert read_loose_object to use struct object_id builtin/index-pack: convert struct ref_delta_entry to object_id ...
2018-03-29submodule-config: allow submodule_free to handle arbitrary repositoriesStefan Beller1-1/+1
At some point we may want to rename the function so that it describes what it actually does as 'submodule_free' doesn't quite describe that this clears a repository's submodule cache. But that's beyond the scope of this series. While at it remove the extern key word from its declaration. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29grep: remove "repo" arg from non-supporting funcsJonathan Tan1-7/+5
As part of commit f9ee2fcdfa ("grep: recurse in-process using 'struct repository'", 2017-08-02), many functions in builtin/grep.c were converted to also take "struct repository *" arguments. Among them were grep_object() and grep_objects(). However, at least grep_objects() was converted incompletely - it calls gitmodules_config_oid(), which references the_repository. But it turns out that the conversion was extraneous anyway - there has been no user-visible effect - because grep_objects() is never invoked except with the_repository. This is because grepping through objects cannot be done recursively into submodules. Revert the changes to grep_objects() and grep_object() (which conversion is also extraneous) to show that both these functions do not support repositories other than the_repository. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-23repository: introduce raw object store fieldStefan Beller1-1/+2
The raw object store field will contain any objects needed for access to objects in a given repository. This patch introduces the raw object store and populates it with the `objectdir`, which used to be part of the repository struct. As the struct gains members, we'll also populate the function to clear the memory for these members. In a later step, we'll introduce a struct object_parser, that will complement the object parsing in a repository struct: The raw object parser is the layer that will provide access to raw object content, while the higher level object parser code will parse raw objects and keeps track of parenthood and other object relationships using 'struct object'. For now only add the lower level to the repository struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14Merge branch 'nd/parseopt-completion'Junio C Hamano1-5/+8
Teach parse-options API an option to help the completion script, and make use of the mechanism in command line completion. * nd/parseopt-completion: (45 commits) completion: more subcommands in _git_notes() completion: complete --{reuse,reedit}-message= for all notes subcmds completion: simplify _git_notes completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate completion: use __gitcomp_builtin in _git_worktree completion: use __gitcomp_builtin in _git_tag completion: use __gitcomp_builtin in _git_status completion: use __gitcomp_builtin in _git_show_branch completion: use __gitcomp_builtin in _git_rm completion: use __gitcomp_builtin in _git_revert completion: use __gitcomp_builtin in _git_reset completion: use __gitcomp_builtin in _git_replace remote: force completing --mirror= instead of --mirror completion: use __gitcomp_builtin in _git_remote completion: use __gitcomp_builtin in _git_push completion: use __gitcomp_builtin in _git_pull completion: use __gitcomp_builtin in _git_notes completion: use __gitcomp_builtin in _git_name_rev completion: use __gitcomp_builtin in _git_mv completion: use __gitcomp_builtin in _git_merge_base ...
2018-03-14sha1_file: convert read_sha1_file to struct object_idbrian m. carlson1-1/+1
Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert read_object_with_reference to object_idbrian m. carlson1-2/+2
Convert read_object_with_reference to take pointers to struct object_id. Update the internals of the function accordingly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-08Merge branch 'rv/grep-cleanup'Junio C Hamano1-13/+20
Threaded "git grep" has been optimized to avoid allocation in code section that is covered under a mutex. * rv/grep-cleanup: grep: simplify grep_oid and grep_file grep: move grep_source_init outside critical section
2018-02-23grep: simplify grep_oid and grep_fileRasmus Villemoes1-4/+2
In the NO_PTHREADS or !num_threads case, this doesn't change anything. In the threaded case, note that grep_source_init duplicates its third argument, so there is no need to keep [path]buf.buf alive across the call of add_work(). Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-23grep: move grep_source_init outside critical sectionRasmus Villemoes1-9/+18
grep_source_init typically does three strdup()s, and in the threaded case, the call from add_work() happens while holding grep_mutex. We can thus reduce the time we hold grep_mutex by moving the grep_source_init() call out of add_work(), and simply have add_work() copy the initialized structure to the available slot in the todo array. This also simplifies the prototype of add_work(), since it no longer needs to duplicate all the parameters of grep_source_init(). In the callers of add_work(), we get to reduce the amount of code duplicated in the threaded and non-threaded cases slightly (avoiding repeating the long "GREP_SOURCE_OID, pathbuf.buf, path, oid" argument list); a subsequent cleanup patch will make that even more so. Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-14object: rename function 'typename' to 'type_name'Brandon Williams1-1/+1
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-09completion: use __gitcomp_builtin in _git_grepNguyễn Thái Ngọc Duy1-5/+8
The new completable options are: --after-context= --before-context= --color --context --exclude-standard --quiet --recurse-submodules --textconv Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19Merge branch 'bw/pathspec-match-submodule-boundary'Junio C Hamano1-0/+1
An v2.12-era regression in pathspec match logic, which made it look into submodule tree even when it is not desired, has been fixed. * bw/pathspec-match-submodule-boundary: pathspec: only match across submodule boundaries when requested
2017-12-05pathspec: only match across submodule boundaries when requestedBrandon Williams1-0/+1
Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to match across submodule boundaries in the presence of wildcards. This is done by performing literal matching up to the first wildcard and then punting to the submodule itself to perform more accurate pattern matching. Instead of introducing a new flag to request this behavior, commit 74ed43711fd overloaded the already existing 'recursive' flag in 'struct pathspec' to request this behavior. This leads to a bug where whenever any other caller has the 'recursive' flag set as well as a pathspec with wildcards that all submodules will be indicated as matches. One simple example of this is: git init repo cd repo git init submodule git -C submodule commit -m initial --allow-empty touch "[bracket]" git add "[bracket]" git commit -m bracket git add submodule git commit -m submodule git rev-list HEAD -- "[bracket]" Fix this by introducing the new flag 'recurse_submodules' in 'struct pathspec' and using this flag to determine if matches should be allowed to cross submodule boundaries. This fixes https://github.com/git-for-windows/git/issues/1371. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-15Merge branch 'bw/grep-recurse-submodules' into maintJunio C Hamano1-0/+2
A broken access to object databases in recent update to "git grep --recurse-submodules" has been fixed. * bw/grep-recurse-submodules: grep: take the read-lock when adding a submodule
2017-11-06Merge branch 'bw/grep-recurse-submodules'Junio C Hamano1-0/+2
A broken access to object databases in recent update to "git grep --recurse-submodules" has been fixed. * bw/grep-recurse-submodules: grep: take the read-lock when adding a submodule
2017-11-02grep: take the read-lock when adding a submoduleMartin Ågren1-0/+2
With --recurse-submodules, we add each submodule that we encounter to the list of alternate object databases. With threading, our changes to the list are not protected against races. Indeed, ThreadSanitizer reports a race when we call `add_to_alternates_memory()` around the same time that another thread is reading in the list through `read_sha1_file()`. Take the grep read-lock while adding the submodule. The lock is used to serialize uses of non-thread-safe parts of Git's API, including `read_sha1_file()`. Helped-by: Brandon Williams <bmwill@google.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Brandon Williams <bmwill@google.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-18Merge branch 'jk/ref-filter-colors-fix'Junio C Hamano1-1/+1
This is the "theoretically more correct" approach of simply stepping back to the state before plumbing commands started paying attention to "color.ui" configuration variable. Let's run with this one. * jk/ref-filter-colors-fix: tag: respect color.ui config Revert "color: check color.ui in git_default_config()" Revert "t6006: drop "always" color config tests" Revert "color: make "always" the same as "auto" in config"
2017-10-17Revert "color: check color.ui in git_default_config()"Jeff King1-1/+1
This reverts commit 136c8c8b8fa39f1315713248473dececf20f8fe7. That commit was trying to address a bug caused by 4c7f1819b3 (make color.ui default to 'auto', 2013-06-10), in which plumbing like diff-tree defaulted to "auto" color, but did not respect a "color.ui" directive to disable it. But it also meant that we started respecting "color.ui" set to "always". This was a known problem, but 4c7f1819b3 argued that nobody ought to be doing that. However, that turned out to be wrong, and we got a number of bug reports related to "add -p" regressing in v2.14.2. Let's revert 136c8c8b8, fixing the regression to "add -p". This leaves the problem from 4c7f1819b3 unfixed, but: 1. It's a pretty obscure problem in the first place. I only noticed it while working on the color code, and we haven't got a single bug report or complaint about it. 2. We can make a more moderate fix on top by respecting "never" but not "always" for plumbing commands. This is just the minimal fix to go back to the working state we had before v2.14.2. Note that this isn't a pure revert. We now have a test in t3701 which shows off the "add -p" regression. This can be flipped to success. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-26Merge branch 'bw/submodule-config-cleanup'Junio C Hamano1-4/+0
Code clean-up to avoid mixing values read from the .gitmodules file and values read from the .git/config file. * bw/submodule-config-cleanup: submodule: remove gitmodules_config unpack-trees: improve loading of .gitmodules submodule-config: lazy-load a repository's .gitmodules file submodule-config: move submodule-config functions to submodule-config.c submodule-config: remove support for overlaying repository config diff: stop allowing diff to have submodules configured in .git/config submodule: remove submodule_config callback routine unpack-trees: don't respect submodule.update submodule: don't rely on overlayed config when setting diffopts fetch: don't overlay config with submodule-config submodule--helper: don't overlay config in update-clone submodule--helper: don't overlay config in remote_submodule_branch add, reset: ensure submodules can be added or reset submodule: don't use submodule_from_name t7411: check configuration parsing errors
2017-08-23Merge branch 'jk/ref-filter-colors' into maintJunio C Hamano1-1/+1
"%C(color name)" in the pretty print format always produced ANSI color escape codes, which was an early design mistake. They now honor the configuration (e.g. "color.ui = never") and also tty-ness of the output medium. * jk/ref-filter-colors: ref-filter: consult want_color() before emitting colors pretty: respect color settings for %C placeholders rev-list: pass diffopt->use_colors through to pretty-print for-each-ref: load config earlier color: check color.ui in git_default_config() ref-filter: pass ref_format struct to atom parsers ref-filter: factor out the parsing of sorting atoms ref-filter: make parse_ref_filter_atom a private function ref-filter: provide a function for parsing sort options ref-filter: move need_color_reset_at_eol into ref_format ref-filter: abstract ref format into its own struct ref-filter: simplify automatic color reset t: use test_decode_color rather than literal ANSI codes docs/for-each-ref: update pointer to color syntax check return value of verify_ref_format()
2017-08-22Merge branch 'bw/grep-recurse-submodules'Junio C Hamano1-310/+86
"git grep --recurse-submodules" has been reworked to give a more consistent output across submodule boundary (and do its thing without having to fork a separate process). * bw/grep-recurse-submodules: grep: recurse in-process using 'struct repository' submodule: merge repo_read_gitmodules and gitmodules_config submodule: check for unmerged .gitmodules outside of config parsing submodule: check for unstaged .gitmodules outside of config parsing submodule: remove fetch.recursesubmodules from submodule-config parsing submodule: remove submodule.fetchjobs from submodule-config parsing config: add config_from_gitmodules cache.h: add GITMODULES_FILE macro repository: have the_repository use the_index repo_read_index: don't discard the index
2017-08-11Merge branch 'jk/ref-filter-colors'Junio C Hamano1-1/+1
"%C(color name)" in the pretty print format always produced ANSI color escape codes, which was an early design mistake. They now honor the configuration (e.g. "color.ui = never") and also tty-ness of the output medium. * jk/ref-filter-colors: ref-filter: consult want_color() before emitting colors pretty: respect color settings for %C placeholders rev-list: pass diffopt->use_colors through to pretty-print for-each-ref: load config earlier color: check color.ui in git_default_config() ref-filter: pass ref_format struct to atom parsers ref-filter: factor out the parsing of sorting atoms ref-filter: make parse_ref_filter_atom a private function ref-filter: provide a function for parsing sort options ref-filter: move need_color_reset_at_eol into ref_format ref-filter: abstract ref format into its own struct ref-filter: simplify automatic color reset t: use test_decode_color rather than literal ANSI codes docs/for-each-ref: update pointer to color syntax check return value of verify_ref_format()
2017-08-11Merge branch 'bc/object-id'Junio C Hamano1-4/+4
Conversion from uchar[20] to struct object_id continues. * bc/object-id: sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ sha1_name: convert GET_SHA1* flags to GET_OID* sha1_name: convert get_sha1* to get_oid* Convert remaining callers of get_sha1 to get_oid. builtin/unpack-file: convert to struct object_id bisect: convert bisect_checkout to struct object_id builtin/update_ref: convert to struct object_id sequencer: convert to struct object_id remote: convert struct push_cas to struct object_id submodule: convert submodule config lookup to use object_id builtin/merge-tree: convert remaining caller of get_sha1 to object_id builtin/fsck: convert remaining caller of get_sha1 to object_id
2017-08-03submodule: remove gitmodules_configBrandon Williams1-4/+0
Now that the submodule-config subsystem can lazily read the gitmodules file we no longer need to explicitly pre-read the gitmodules by calling 'gitmodules_config()' so let's remove it. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02Merge branch 'bc/object-id' into bw/submodule-config-cleanupJunio C Hamano1-3/+3
* bc/object-id: sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ sha1_name: convert GET_SHA1* flags to GET_OID* sha1_name: convert get_sha1* to get_oid* Convert remaining callers of get_sha1 to get_oid. builtin/unpack-file: convert to struct object_id bisect: convert bisect_checkout to struct object_id builtin/update_ref: convert to struct object_id sequencer: convert to struct object_id remote: convert struct push_cas to struct object_id submodule: convert submodule config lookup to use object_id builtin/merge-tree: convert remaining caller of get_sha1 to object_id builtin/fsck: convert remaining caller of get_sha1 to object_id tag: convert gpg_verify_tag to use struct object_id commit: convert lookup_commit_graft to struct object_id
2017-08-02grep: recurse in-process using 'struct repository'Brandon Williams1-310/+86
Convert grep to use 'struct repository' which enables recursing into submodules to be handled in-process. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17sha1_name: convert GET_SHA1* flags to GET_OID*brian m. carlson1-1/+1
Convert the flags for get_oid_with_context and friends to use "OID" instead of "SHA1" in their names. This transform was made by running the following one-liner on the affected files: perl -pi -e 's/GET_SHA1/GET_OID/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17sha1_name: convert get_sha1* to get_oid*brian m. carlson1-2/+2
Now that all the callers of get_sha1 directly or indirectly use struct object_id, rename the functions starting with get_sha1 to start with get_oid. Convert the internals in sha1_name.c to use struct object_id as well, and eliminate explicit length checks where possible. Convert a use of 40 in get_oid_basic to GIT_SHA1_HEXSZ. Outside of sha1_name.c and cache.h, this transition was made with the following semantic patch: @@ expression E1, E2; @@ - get_sha1(E1, E2.hash) + get_oid(E1, &E2) @@ expression E1, E2; @@ - get_sha1(E1, E2->hash) + get_oid(E1, E2) @@ expression E1, E2; @@ - get_sha1_committish(E1, E2.hash) + get_oid_committish(E1, &E2) @@ expression E1, E2; @@ - get_sha1_committish(E1, E2->hash) + get_oid_committish(E1, E2) @@ expression E1, E2; @@ - get_sha1_treeish(E1, E2.hash) + get_oid_treeish(E1, &E2) @@ expression E1, E2; @@ - get_sha1_treeish(E1, E2->hash) + get_oid_treeish(E1, E2) @@ expression E1, E2; @@ - get_sha1_commit(E1, E2.hash) + get_oid_commit(E1, &E2) @@ expression E1, E2; @@ - get_sha1_commit(E1, E2->hash) + get_oid_commit(E1, E2) @@ expression E1, E2; @@ - get_sha1_tree(E1, E2.hash) + get_oid_tree(E1, &E2) @@ expression E1, E2; @@ - get_sha1_tree(E1, E2->hash) + get_oid_tree(E1, E2) @@ expression E1, E2; @@ - get_sha1_blob(E1, E2.hash) + get_oid_blob(E1, &E2) @@ expression E1, E2; @@ - get_sha1_blob(E1, E2->hash) + get_oid_blob(E1, E2) @@ expression E1, E2, E3, E4; @@ - get_sha1_with_context(E1, E2, E3.hash, E4) + get_oid_with_context(E1, E2, &E3, E4) @@ expression E1, E2, E3, E4; @@ - get_sha1_with_context(E1, E2, E3->hash, E4) + get_oid_with_context(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17submodule: convert submodule config lookup to use object_idbrian m. carlson1-2/+2
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13Merge branch 'ab/grep-lose-opt-regflags'Junio C Hamano1-2/+0
Code cleanup. * ab/grep-lose-opt-regflags: grep: remove redundant REG_NEWLINE when compiling fixed regex grep: remove regflags from the public grep_opt API grep: remove redundant and verbose re-assignments to 0 grep: remove redundant "fixed" field re-assignment to 0 grep: adjust a redundant grep pattern type assignment grep: remove redundant double assignment to 0
2017-07-13color: check color.ui in git_default_config()Jeff King1-1/+1
Back in prehistoric times, our decision on whether or not to show color by default relied on using a config callback that either did or didn't load color config like color.diff. When we introduced color.ui, we put it in the same boat: commands had to manually respect it by using git_color_config() or its git_color_default_config() convenience wrapper. But in 4c7f1819b (make color.ui default to 'auto', 2013-06-10), that changed. Since then, we default color.ui to auto in all programs, meaning that even plumbing commands like "git diff-tree --pretty" might colorize the output. Nobody seems to have complained in the intervening years, presumably because the "is stdout a tty" check does a good job of catching the right cases. But that leaves an interesting curiosity: color.ui defaults to auto even in plumbing, but you can't actually _disable_ the color via config. So if you really hate color and set "color.ui" to false, diff-tree will still show color (but porcelain like git-diff won't). Nobody noticed that either, probably because very few people disable color. One could argue that the plumbing should _always_ disable color unless an explicit --color option is given on the command line. But in practice, this creates a lot of complications for scripts which do want plumbing to show user-visible output. They can't just pass "--color" blindly; they need to check the user's config and decide what to send. Given that nobody has complained about the current behavior, let's assume it's a good path, and follow it to its conclusion: supporting color.ui everywhere. Note that you can create havoc by setting color.ui=always in your config, but that's more or less already the case. We could disallow it entirely, but it is handy for one-offs like: git -c color.ui=always foo >not-a-tty when "foo" does not take a --color option itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-12Merge branch 'rs/use-div-round-up'Junio C Hamano1-1/+1
Code cleanup. * rs/use-div-round-up: use DIV_ROUND_UP
2017-07-10use DIV_ROUND_UPRené Scharfe1-1/+1
Convert code that divides and rounds up to use DIV_ROUND_UP to make the intent clearer and reduce the number of magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05Merge branch 'bw/repo-object'Junio C Hamano1-1/+2
Introduce a "repository" object to eventually make it easier to work in multiple repositories (the primary focus is to work with the superproject and its submodules) in a single process. * bw/repo-object: ls-files: use repository object repository: enable initialization of submodules submodule: convert is_submodule_initialized to work on a repository submodule: add repo_read_gitmodules submodule-config: store the_submodule_cache in the_repository repository: add index_state to struct repo config: read config from a repository object path: add repo_worktree_path and strbuf_repo_worktree_path path: add repo_git_path and strbuf_repo_git_path path: worktree_git_path() should not use file relocation path: convert do_git_path to take a 'struct repository' path: convert strbuf_git_common_path to take a 'struct repository' path: always pass in commondir to update_common_dir path: create path.h environment: store worktree in the_repository environment: place key repository state in the_repository repository: introduce the repository object environment: remove namespace_len variable setup: add comment indicating a hack setup: don't perform lazy initialization of repository state
2017-06-30grep: remove regflags from the public grep_opt APIÆvar Arnfjörð Bjarmason1-2/+0
Refactor calls to the grep machinery to always pass opt.ignore_case & opt.extended_regexp_option instead of setting the equivalent regflags bits. The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log: make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was really just plastering over the code smell which this change fixes. The reason for adding the extensive commentary here is that I discovered some subtle complexity in implementing this that really should be called out explicitly to future readers. Before this change we'd rely on the difference between `extended_regexp_option` and `regflags` to serve as a membrane between our preliminary parsing of grep.extendedRegexp and grep.patternType, and what we decided to do internally. Now that those two are the same thing, it's necessary to unset `extended_regexp_option` just before we commit in cases where both of those config variables are set. See 84befcd0a4 ("grep: add a grep.patternType configuration setting", 2012-08-03) for the code and documentation related to that. The explanation of why the if/else branches in grep_commit_pattern_type() are ordered the way they are exists in that commit message, but I think it's worth calling this subtlety out explicitly with a comment for future readers. Even though grep_commit_pattern_type() is the only caller of grep_set_pattern_type_option() it's simpler to reset the extended_regexp_option flag in the latter, since 2/3 branches in the former would otherwise need to reset it, this way we can do it in one place. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27Spelling fixesVille Skyttä1-1/+1
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-23submodule: convert is_submodule_initialized to work on a repositoryBrandon Williams1-1/+2
Convert 'is_submodule_initialized()' to take a repository object and while we're at it, lets rename the function to 'is_submodule_active()' and remove the NEEDSWORK comment. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23Merge branches 'bw/ls-files-sans-the-index' and 'bw/config-h' into ↵Junio C Hamano1-0/+1
bw/repo-object * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h alias: use the early config machinery to expand aliases t7006: demonstrate a problem with aliases in subdirectories t1308: relax the test verifying that empty alias values are disallowed help: use early config when autocorrecting aliases config: report correct line number upon error discover_git_directory(): avoid setting invalid git_dir
2017-06-19Merge branch 'bw/object-id'Junio C Hamano1-11/+11
Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...
2017-06-19Merge branch 'ab/pcre-v2'Junio C Hamano1-3/+13
Update "perl-compatible regular expression" support to enable JIT and also allow linking with the newer PCRE v2 library. * ab/pcre-v2: grep: add support for PCRE v2 grep: un-break building with PCRE >= 8.32 without --enable-jit grep: un-break building with PCRE < 8.20 grep: un-break building with PCRE < 8.32 grep: add support for the PCRE v1 JIT API log: add -P as a synonym for --perl-regexp grep: skip pthreads overhead when using one thread grep: don't redundantly compile throwaway patterns under threading
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13Merge branch 'sb/submodule-blanket-recursive'Junio C Hamano1-0/+3
Many commands learned to pay attention to submodule.recurse configuration. * sb/submodule-blanket-recursive: builtin/fetch.c: respect 'submodule.recurse' option builtin/push.c: respect 'submodule.recurse' option builtin/grep.c: respect 'submodule.recurse' option Introduce 'submodule.recurse' option for worktree manipulators submodule loading: separate code path for .gitmodules and config overlay reset/checkout/read-tree: unify config callback for submodule recursion submodule test invocation: only pass additional arguments submodule recursing: do not write a config variable twice
2017-06-02Merge branch 'ab/grep-preparatory-cleanup'Junio C Hamano1-4/+19
The internal implementation of "git grep" has seen some clean-up. * ab/grep-preparatory-cleanup: (31 commits) grep: assert that threading is enabled when calling grep_{lock,unlock} grep: given --threads with NO_PTHREADS=YesPlease, warn pack-objects: fix buggy warning about threads pack-objects & index-pack: add test for --threads warning test-lib: add a PTHREADS prerequisite grep: move is_fixed() earlier to avoid forward declaration grep: change internal *pcre* variable & function names to be *pcre1* grep: change the internal PCRE macro names to be PCRE1 grep: factor test for \0 in grep patterns into a function grep: remove redundant regflags assignments grep: catch a missing enum in switch statement perf: add a comparison test of log --grep regex engines with -F perf: add a comparison test of log --grep regex engines perf: add a comparison test of grep regex engines with -F perf: add a comparison test of grep regex engines perf: emit progress output when unpacking & building perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do grep: add tests to fix blind spots with \0 patterns grep: prepare for testing binary regexes containing rx metacharacters grep: add a test helper function for less verbose -f \0 tests ...
2017-06-02Merge branch 'jk/diff-blob'Junio C Hamano1-1/+3
The result from "git diff" that compares two blobs, e.g. "git diff $commit1:$path $commit2:$path", used to be shown with the full object name as given on the command line, but it is more natural to use the $path in the output and use it to look up .gitattributes. * jk/diff-blob: diff: use blob path for blob/file diffs diff: use pending "path" if it is available diff: use the word "path" instead of "name" for blobs diff: pass whole pending entry in blobinfo handle_revision_arg: record paths for pending objects handle_revision_arg: record modes for "a..b" endpoints t4063: add tests of direct blob diffs get_sha1_with_context: dynamically allocate oc->path get_sha1_with_context: always initialize oc->symlink_path sha1_name: consistently refer to object_context as "oc" handle_revision_arg: add handle_dotdot() helper handle_revision_arg: hoist ".." check out of range parsing handle_revision_arg: stop using "dotdot" as a generic pointer handle_revision_arg: simplify commit reference lookups handle_revision_arg: reset "dotdot" consistently
2017-06-02grep: convert to struct object_idBrandon Williams1-11/+11
Convert the remaining parts of grep to use struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-01builtin/grep.c: respect 'submodule.recurse' optionStefan Beller1-0/+3
In builtin/grep.c we parse the config before evaluating the command line options. This makes the task of teaching grep to respect the new config option 'submodule.recurse' very easy by just parsing that option. As an alternative I had implemented a similar structure to treat submodules as the fetch/push command have, including * aligning the meaning of the 'recurse_submodules' to possible submodule values RECURSE_SUBMODULES_* as defined in submodule.h. * having a callback to parse the value and * reacting to the RECURSE_SUBMODULES_DEFAULT state that was the initial state. However all this is not needed for a true boolean value, so let's keep it simple. However this adds another place where "submodule.recurse" is parsed. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'bc/object-id'Junio C Hamano1-1/+1
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...
2017-05-26grep: skip pthreads overhead when using one threadÆvar Arnfjörð Bjarmason1-0/+2
Skip the administrative overhead of using pthreads when only using one thread. Instead take the non-threaded path which would be taken under NO_PTHREADS. The threading support was initially added in commit 5b594f457a ("Threaded grep", 2010-01-25) with a hardcoded compile-time number of 8 threads. Later the number of threads was made configurable in commit 89f09dd34e ("grep: add --threads=<num> option and grep.threads configuration", 2015-12-15). That change did not add any special handling for --threads=1. Now we take a slightly faster path by skipping thread handling entirely when 1 thread is requested. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: don't redundantly compile throwaway patterns under threadingÆvar Arnfjörð Bjarmason1-3/+11
Change the pattern compilation logic under threading so that grep doesn't compile a pattern it never ends up using on the non-threaded code path, only to compile it again N times for N threads which will each use their own copy, ignoring the initially compiled pattern. This redundant compilation dates back to the initial introduction of the threaded grep in commit 5b594f457a ("Threaded grep", 2010-01-25). There was never any reason for doing this redundant work other than an oversight in the initial commit. Jeff King suggested on-list in <20170414212325.fefrl3qdjigwyitd@sigill.intra.peff.net> that this might be needed to check the pattern for sanity before threaded execution commences. That's not the case. The pattern is compiled under threading in start_threads() before any concurrent execution has started by calling pthread_create(), so if the pattern contains an error we still do the right thing. I.e. die with one error before any threaded execution has commenced, instead of e.g. spewing out an error for each N threads, which could be a regression a change like this might inadvertently introduce. This change is not meant as an optimization, any performance gains from this are in the hundreds to thousands of nanoseconds at most. If we wanted more performance here we could just re-use the compiled patterns in multiple threads (regcomp(3) is thread-safe), or partially re-use them and the associated structures in the case of later PCRE JIT changes. Rather, it's just to make the code easier to reason about. It's confusing to debug this under threading & non-threading when the threading codepaths redundantly compile a pattern which is never used. The reason the patterns are recompiled is as a side-effect of duplicating the whole grep_opt structure, which is not thread safe, writable, and munged during execution. The grep_opt structure then points to the grep_pat structure where pattern or patterns are stored. I looked into e.g. splitting the API into some "do & alloc threadsafe stuff", "spawn thread", "do and alloc non-threadsafe stuff", but the execution time of grep_opt_dup() & pattern compilation is trivial compared to actually executing the grep, so there was no point. Even with the more expensive JIT changes to follow the most expensive PCRE patterns take something like 0.0X milliseconds to compile at most[1]. The undocumented --debug mode added in commit 17bf35a3c7 ("grep: teach --debug option to dump the parse tree", 2012-09-13) still works properly with this change. It only emits debugging info during pattern compilation, which is now dumped by the pattern compiled just before the first thread is started. 1. http://sljit.sourceforge.net/pcre.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: assert that threading is enabled when calling grep_{lock,unlock}Ævar Arnfjörð Bjarmason1-4/+4
Change the grep_{lock,unlock} functions to assert that num_threads is true, instead of only locking & unlocking the pthread mutex lock when it is. These functions are never called when num_threads isn't true, this logic has gone through multiple iterations since the initial introduction of grep threading in commit 5b594f457a ("Threaded grep", 2010-01-25), but ever since then they'd only be called if num_threads was true, so this check made the code confusing to read. Replace the check with an assertion, so that it's clear to the reader that this code path is never taken unless we're spawning threads. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: given --threads with NO_PTHREADS=YesPlease, warnÆvar Arnfjörð Bjarmason1-0/+13
Add a warning about missing thread support when grep.threads or --threads is set to a non 0 (default) or 1 (no parallelism) value under NO_PTHREADS=YesPlease. This is for consistency with the index-pack & pack-objects commands, which also take a --threads option & are configurable via pack.threads, and have long warned about the same under NO_PTHREADS=YesPlease. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: catch a missing enum in switch statementÆvar Arnfjörð Bjarmason1-0/+2
Add a die(...) to a default case for the switch statement selecting between grep pattern types under --recurse-submodules. Normally this would be caught by -Wswitch, but the grep_pattern_type type is converted to int by going through parse_options(). Changing the argument type passed to compile_submodule_options() won't work, the value will just get coerced. The -Wswitch-default warning will warn about it, but that produces a lot of noise across the codebase, this potential issue would be drowned in that noise. Thus catching this at runtime is the least bad option. This won't ever trigger in practice, but if a new pattern type were to be added this catches an otherwise silent bug during development. See commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16) for the initial addition of this code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24get_sha1_with_context: dynamically allocate oc->pathJeff King1-1/+3
When a sha1 lookup returns the tree path via "struct object_context", it just copies it into a fixed-size buffer. This means the result can be truncated, and it means our "struct object_context" consumes a lot of stack space. Instead, let's allocate a string on the heap. Because most callers don't care about this information, we'll avoid doing it by default (so they don't all have to start calling free() on the result). There are basically two options for the caller to signal to us that it's interested: 1. By setting a pointer to storage in the object_context. 2. By passing a flag in another parameter. Doing (1) would match the way that sha1_object_info_extended() works. But it would mean that every caller would have to initialize the object_context, which they don't currently have to do. This patch does (2), and adds a new bit to the function's flags field. All of the callers that look at the "path" field are updated to pass the new flag. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08object: convert parse_object* to take struct object_idbrian m. carlson1-1/+1
Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06dir: convert fill_directory to take an indexBrandon Williams1-1/+1
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19Merge branch 'ab/grep-plug-pathspec-leak'Junio C Hamano1-0/+1
Call clear_pathspec() to release resources immediately before the cmd_grep() function returns. * ab/grep-plug-pathspec-leak: grep: plug a trivial memory leak