aboutsummaryrefslogtreecommitdiffstats
path: root/xdiff-interface.c
AgeCommit message (Collapse)AuthorFilesLines
2025-11-18xdiff: use unambiguous types in xdl_hash_record()Ezekiel Newren1-1/+1
Convert the function signature and body to use unambiguous types. char is changed to uint8_t because this function processes bytes in memory. unsigned long to uint64_t so that the hash output is consistent across platforms. `flags` was changed from long to uint64_t to ensure the high order bits are not dropped on platforms that treat long as 32 bits. Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: move Git config parsing into "environment.c"Patrick Steinhardt1-0/+1
In "config.c" we host both the business logic to read and write config files as well as the logic to parse specific Git-related variables. On the one hand this is mixing concerns, but even more importantly it means that we cannot easily remove the dependency on `the_repository` in our config parsing logic. Move the logic into "environment.c". This file is a grab bag of all kinds of global state already, so it is quite a good fit. Furthermore, it also hosts most of the global variables that we're parsing the config values into, making this an even better fit. Note that there is one hidden change: in `parse_fsync_components()` we use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But as -Wsign-compare warnings are enabled in this file this causes a compiler warning. The issue is fixed by using a `size_t` instead. This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE` declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt1-1/+1
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt1-1/+1
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt1-1/+1
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10hash: stop depending on `the_repository` in `null_oid()`Patrick Steinhardt1-1/+1
The `null_oid()` function returns the object ID that only consists of zeroes. Naturally, this ID also depends on the hash algorithm used, as the number of zeroes is different between SHA1 and SHA256. Consequently, the function returns the hash-algorithm-specific null object ID. This is currently done by depending on `the_hash_algo`, which implicitly makes us depend on `the_repository`. Refactor the function to instead pass in the hash algorithm for which we want to retrieve the null object ID. Adapt callsites accordingly by passing in `the_repository`, thus bubbling up the dependency on that global variable by one layer. There are a couple of trivial exceptions for subsystems that already got rid of `the_repository`. These subsystems instead use the repository that is available via the calling context: - "builtin/grep.c" - "grep.c" - "refs/debug.c" There are also two non-trivial exceptions: - "diff-no-index.c": Here we know that we may not have a repository initialized at all, so we cannot rely on `the_repository`. Instead, we adapt `diff_no_index()` to get a `struct git_hash_algo` as parameter. The only caller is located in "builtin/diff.c", where we know to call `repo_set_hash_algo()` in case we're running outside of a Git repository. Consequently, it is fine to continue passing `the_repository->hash_algo` even in this case. - "builtin/ls-files.c": There is an in-flight patch series that drops `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic conflict because we use `null_oid()` in `show_submodule()`. The value is passed to `repo_submodule_init()`, which may use the object ID to resolve a tree-ish in the superproject from which we want to read the submodule config. As such, the object ID should refer to an object in the superproject, and consequently we need to use its hash algorithm. This means that we could in theory just not bother about this edge case at all and just use `the_repository` in "diff-no-index.c". But doing so would feel misdesigned. Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in "hash.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt1-0/+1
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt1-0/+2
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14xdiff-interface: refactor parsing of merge.conflictstylePhillip Wood1-11/+18
Factor out the code that parses of conflict style name so it can be reused in a later commit that wants to parse the name given on the command line. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano1-2/+0
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-2/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: use config_error_nonbool() instead of custom messagesJeff King1-1/+1
A few config callbacks use their own custom messages to report an unexpected implicit bool like: [merge "foo"] driver These should just use config_error_nonbool(), so the user sees consistent messages. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09git_xmerge_config(): prefer error() to die()Jeff King1-3/+4
When parsing merge config, a few code paths die on error. It's preferable for us to call error() here, because the resulting error message from the config parsing code contains much more detail. For example, before: fatal: unknown style 'bogus' given for 'merge.conflictstyle' and after: error: unknown style 'bogus' given for 'merge.conflictstyle' fatal: bad config variable 'merge.conflictstyle' in file '.git/config' at line 7 Since we're touching these lines, I also marked them for translation. There's no reason they shouldn't behave like most other config-parsing errors. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-2/+3
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-2/+3
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+2
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano1-1/+1
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06Merge branch 'ab/remove-implicit-use-of-the-repository'Junio C Hamano1-1/+1
Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28cocci: apply the "object-store.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-1/+1
Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: remove unnecessary cache.h inclusion from several sourcesElijah Newren1-1/+1
A number of files were apparently including cache.h solely to get gettext.h. By making those files explicitly include gettext.h, we can already drop the include of cache.h in these files. On top of that, there were some files using cache.h that didn't need to for any reason. Remove these unnecessary includes. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02Merge branch 'ep/maint-equals-null-cocci' for maint-2.35Junio C Hamano1-1/+1
* ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-01xdiff: implement a zealous diff3, or "zdiff3"Phillip Wood1-0/+2
"zdiff3" is identical to ordinary diff3 except that it allows compaction of common lines on the two sides of history at the beginning or end of the conflict hunk. For example, the following diff3 conflict: 1 2 3 4 <<<<<< A B C D E |||||| 5 6 ====== A X C Y E >>>>>> 7 8 9 has common lines 'A', 'C', and 'E' on the two sides. With zdiff3, one would instead get the following conflict: 1 2 3 4 A <<<<<< B C D |||||| 5 6 ====== X C Y >>>>>> E 7 8 9 Note that the common lines, 'A', and 'E' were moved outside the conflict. Unlike with the two-way conflicts from the 'merge' conflictStyle, the zdiff3 conflict is NOT split into multiple conflict regions to allow the common 'C' lines to be shown outside a conflict, because zdiff3 shows the base version too and the base version cannot be reasonably split. Note also that the removing of lines common to the two sides might make the remaining text inside the conflict region match the base text inside the conflict region (for example, if the diff3 conflict had '5 6 E' on the right side of the conflict, then the common line 'E' would be moved outside and both the base and right side's remaining conflict text would be the lines '5' and '6'). This has the potential to surprise users and make them think there should not have been a conflict, but there definitely was a conflict and it should remain. Based-on-patch-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Co-authored-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13Merge branch 'ab/pickaxe-pcre2'Junio C Hamano1-11/+16
Rewrite the backend for "diff -G/-S" to use pcre2 engine when available. * ab/pickaxe-pcre2: (22 commits) xdiff-interface: replace discard_hunk_line() with a flag xdiff users: use designated initializers for out_line pickaxe -G: don't special-case create/delete pickaxe -G: terminate early on matching lines xdiff-interface: allow early return from xdiff_emit_line_fn xdiff-interface: prepare for allowing early return pickaxe -S: slightly optimize contains() pickaxe: rename variables in has_changes() for brevity pickaxe -S: support content with NULs under --pickaxe-regex pickaxe: assert that we must have a needle under -G or -S pickaxe: refactor function selection in diffcore-pickaxe() perf: add performance test for pickaxe pickaxe/style: consolidate declarations and assignments diff.h: move pickaxe fields together again pickaxe: die when --find-object and --pickaxe-all are combined pickaxe: die when -G and --pickaxe-regex are combined pickaxe tests: add missing test for --no-pickaxe-regex being an error pickaxe tests: test for -G, -S and --find-object incompatibility pickaxe tests: add test for "log -S" not being a regex pickaxe tests: add test for diffgrep_consume() internals ...
2021-05-11xdiff-interface: replace discard_hunk_line() with a flagÆvar Arnfjörð Bjarmason1-6/+0
Remove the dummy discard_hunk_line() function added in 3b40a090fd4 (diff: avoid generating unused hunk header lines, 2018-11-02) in favor of having a new XDL_EMIT_NO_HUNK_HDR flag, for use along with the two existing and similar XDL_EMIT_* flags. Unlike the recently amended xdiff_emit_line_fn interface which'll be called in a loop in xdl_emit_diff(), the hunk header is only emitted once. It makes more sense to pass this as a flag than provide a dummy callback because that function may be able to skip doing certain work if it knows the caller is doing nothing with the hunk header. It would be possible to do so in the case of -U0 now, but the benefit of doing so is so small that I haven't bothered. But this leaves the door open to that, and more importantly makes the API use more intuitive. The reason we're putting a flag in the gap between 1<<0 and 1<<2 is that the old 1<<1 flag was removed in 907681e940d (xdiff: drop XDL_EMIT_COMMON, 2016-02-23) without re-ordering the remaining flags. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11xdiff-interface: allow early return from xdiff_emit_line_fnÆvar Arnfjörð Bjarmason1-4/+14
Finish the change started in the preceding commit and allow an early return from "xdiff_emit_line_fn" callbacks, this will allows diffcore-pickaxe.c to save itself redundant work. Our xdiff interface also had the limitation of not being able to abort early since the beginning, see d9ea73e0564 (combine-diff: refactor built-in xdiff interface., 2006-04-05). Although at that time "xdiff_emit_line_fn" was called "xdiff_emit_consume_fn", and "xdiff_emit_hunk_fn" didn't exist yet. There was some work in this area of xdiff-interface.[ch] recently with 3b40a090fd4 (diff: avoid generating unused hunk header lines, 2018-11-02) and 7c61e25fbf1 (diff: use hunk callback for word-diff, 2018-11-02). In combination those two changes allow us to not do any work on the hunks and diff at all, but didn't change the status quo with regards to consumers that e.g. want the diff lines, but might want to abort early. Whereas now we can abort e.g. on the first "-line" of a 1000 line diff if that's all we needed. This interface is rather scary as noted in the comment to xdiff-interface.h being added here, as noted there a future change could add more exit codes, and hack xdl_emit_diff() and friends to ignore or skip things more selectively as a result. I did not see an inherent reason for why xdl_emit_{diffrec,record}() could not be changed to ferry the "xdiff_emit_line_fn" error code upwards instead of returning -1 on all "ret < 0". But doing so would require corresponding changes in xdl_emit_diff(), xdl_diff(). I didn't see any issue with narrowly doing that to accomplish what I needed here, but it would leave xdiff's own return values in an inconsistent state. Instead I've left it at returning a more conventional (for git's own codebase) 1 for an early return, and translating it (or rather, all non-zero) to -1 for xdiff's consumption. The reason for most of the "stop" complexity in xdiff_outf() is because we want to be able to abort early, but do so in a way that doesn't skip the appropriate strbuf_reset() invocations. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11xdiff-interface: prepare for allowing early returnÆvar Arnfjörð Bjarmason1-1/+2
Change the function prototype of xdiff_emit_line_fn to return an "int" instead of "void". Change all of those functions to "return 0", nothing checks those return values yet, and no behavior is being changed. In subsequent commits the interface will be changed to allow early return via this new return value. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27hash: provide per-algorithm null OIDsbrian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-28xdiff: avoid computing non-zero offset from NULL pointerJeff King1-2/+6
As with the previous commit, clang-11's UBSan complains about computing offsets from a NULL pointer, causing some tests to fail. In this case, though, we're actually computing a non-zero offset, which is even more dubious. From t7810: xdiff-interface.c:268:14: runtime error: applying non-zero offset 1 to null pointer ... not ok 131 - grep -p with userdiff The problem is our parsing of the funcname config. We count the number of lines in the string, allocate an array, and then loop over our allocated entries, parsing each line and moving our cursor to one past the trailing newline for the next iteration. But the final line will not generally have a trailing newline (since it's a config value), and hence we go to one past NULL. In practice this is OK, since our loop should terminate before we look at the value. But even computing such an invalid pointer technically violates the standard. We can fix it by leaving the pointer at NULL if we're at the end, rather than one-past. And while we're thinking about it, we can also document the variant by asserting that our initial line-count matches the second-pass of parsing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-28avoid computing zero offsets from NULL pointerJeff King1-2/+2
The Undefined Behavior Sanitizer in clang-11 seems to have learned a new trick: it complains about computing offsets from a NULL pointer, even if that offset is 0. This causes numerous test failures. For example, from t1090: unpack-trees.c:1355:41: runtime error: applying zero offset to null pointer ... not ok 6 - in partial clone, sparse checkout only fetches needed blobs The code in question looks like this: struct cache_entry **cache_end = cache + nr; ... while (cache != cache_end) and we sometimes pass in a NULL and 0 for "cache" and "nr". This is conceptually fine, as "cache_end" would be equal to "cache" in this case, and we wouldn't enter the loop at all. But computing even a zero offset violates the C standard. And given the fact that UBSan is noticing this behavior, this might be a potential problem spot if the compiler starts making unexpected assumptions based on undefined behavior. So let's just avoid it, which is pretty easy. In some cases we can just switch to iterating with a numeric index (as we do in sequencer.c here). In other cases (like the cache_end one) the use of an end pointer is more natural; we can keep that by just explicitly checking for the NULL/0 case when assigning the end pointer. Note that there are two ways you can write this latter case, checking for the pointer: cache_end = cache ? cache + nr : cache; or the size: cache_end = nr ? cache + nr : cache; For the case of a NULL/0 ptr/len combo, they are equivalent. But writing it the second way (as this patch does) has the property that if somebody were to incorrectly pass a NULL pointer with a non-zero length, we'd continue to notice and segfault, rather than silently pretending the length was zero. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-20completion: add more parameter value completionNguyễn Thái Ngọc Duy1-0/+4
This adds value completion for a couple more paramters. To make it easier to maintain these hard coded lists, add a comment at the original list/code to remind people to update git-completion.bash too. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-13Merge branch 'jk/xdiff-interface'Junio C Hamano1-45/+31
The interface into "xdiff" library used to discover the offset and size of a generated patch hunk by first formatting it into the textual hunk header "@@ -n,m +k,l @@" and then parsing the numbers out. A new interface has been introduced to allow callers a more direct access to them. * jk/xdiff-interface: xdiff-interface: drop parse_hunk_header() range-diff: use a hunk callback diff: convert --check to use a hunk callback combine-diff: use an xdiff hunk callback diff: use hunk callback for word-diff diff: discard hunk headers for patch-ids earlier diff: avoid generating unused hunk header lines xdiff-interface: provide a separate consume callback for hunks xdiff: provide a separate emit callback for hunks
2018-11-05xdiff-interface: drop parse_hunk_header()Jeff King1-45/+0
This function was used only for parsing the hunk headers generated by xdiff. Now that we can use hunk callbacks to get that information directly, it has outlived its usefulness. Note to anyone who wants to resurrect it: the "len" parameter was totally unused, meaning that the function could read past the end of the "line" array. In practice this never happened, because we only used it to parse xdiff's generated header lines. But it would be dangerous to use it for other cases without fixing this defect. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-05diff: use hunk callback for word-diffJeff King1-0/+3
Our word-diff does not look at the -/+ lines generated by xdiff at all (because they are not real lines to show the user, but just the tokenized words split into lines). Instead we use the line numbers from the hunk headers to index our own data structure. As a result, our xdi_diff_outf() callback throws away all lines except hunk headers. We can instead use a hunk callback, which has two benefits: 1. We don't have to re-parse the generated hunk header line, but can use the passed parameters directly. 2. By setting our line callback to NULL, we can tell xdiff-interface that it does not even need to bother generating the other lines, saving a small amount of work. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-05diff: avoid generating unused hunk header linesJeff King1-0/+6
Some callers of xdi_diff_outf() do not look at the generated hunk header lines at all. By plugging in a no-op hunk callback, this tells xdiff not to even bother formatting them. This patch introduces a stock no-op callback and uses it with a few callers whose line callbacks explicitly ignore hunk headers (because they look only for +/- lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02xdiff-interface: provide a separate consume callback for hunksJeff King1-4/+26
The previous commit taught xdiff to optionally provide the hunk header data to a specialized callback. But most users of xdiff actually use our more convenient xdi_diff_outf() helper, which ensures that our callbacks are always fed whole lines. Let's plumb the special hunk-callback through this interface, too. It will follow the same rule as xdiff when the hunk callback is NULL (i.e., continue to pass a stringified hunk header to the line callback). Since we add NULL to each caller, there should be no behavior change yet. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02xdiff: provide a separate emit callback for hunksJeff King1-1/+1
The xdiff library always emits hunk header lines to our callbacks as formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're going to output a diff, but less so if we actually need to compute using those numbers, which requires re-parsing the line. In preparation for moving away from this, let's teach xdiff a new callback function which gets the broken-out hunk information. To help callers that don't want to use this new callback, if it's NULL we'll continue to format the hunk header into a string. Note that this function renames the "outf" callback to "out_line", as well. This isn't strictly necessary, but helps in two ways: 1. Now that there are two callbacks, it's nice to use more descriptive names. 2. Many callers did not zero the emit_callback_data struct, and needed to be modified to set ecb.out_hunk to NULL. By changing the name of the existing struct member, that guarantees that any new callers from in-flight topics will break the build and be examined manually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "oidcmp() == 0" to oideq()Jeff King1-1/+1
Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16object-store: move object access functions to object-store.hStefan Beller1-0/+1
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert read_sha1_file to struct object_idbrian m. carlson1-1/+1
Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-26xdiff-interface: export comparing and hashing stringsStefan Beller1-0/+12
This will turn out to be useful in a later patch. xdl_recmatch is exported in xdiff/xutils.h, to be used by various xdiff/*.c files, but not outside of xdiff/. This one makes it available to the outside, too. While at it, add documentation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26print errno when reporting a system call errorNguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26Merge branch 'js/regexec-buf'Junio C Hamano1-9/+4
Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND extension. * js/regexec-buf: regex: use regexec_buf() regex: add regexec_buf() that can work on a non NUL-terminated string regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
2016-09-21regex: use regexec_buf()Johannes Schindelin1-9/+4
The new regexec_buf() function operates on buffers with an explicitly specified length, rather than NUL-terminated strings. We need to use this function whenever the buffer we want to pass to regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated). Note: the original motivation for this patch was to fix a bug where `git diff -G <regex>` would crash. This patch converts more callers, though, some of which allocated to construct NUL-terminated strings, or worse, modified buffers to temporarily insert NULs while calling regexec(3). By converting them to use regexec_buf(), the code has become much cleaner. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07Convert read_mmblob to take struct object_id.brian m. carlson1-4/+4
Since all of its callers have been updated, convert read_mmblob to take a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31xdiff: don't trim common tail with -WRené Scharfe1-6/+4
The function trim_common_tail() exits early if context lines are requested. If -U0 and -W are specified together then it can still trim context lines that might belong to a changed function. As a result that function is shown incompletely. Fix that by calling trim_common_tail() only if no function context or fixed context is requested. The parameter ctx is no longer needed now; remove it. While at it fix an outdated comment as well. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22convert trivial cases to ALLOC_ARRAYJeff King1-1/+1
Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-28xdiff: reject files larger than ~1GBJeff King1-0/+3
The xdiff code is not prepared to handle extremely large files. It uses "int" in many places, which can overflow if we have a very large number of lines or even bytes in our input files. This can cause us to produce incorrect diffs, with no indication that the output is wrong. Or worse, we may even underallocate a buffer whose size is the result of an overflowing addition. We're much better off to tell the user that we cannot diff or merge such a large file. This patch covers both cases, but in slightly different ways: 1. For merging, we notice the large file and cleanly fall back to a binary merge (which is effectively "we cannot merge this"). 2. For diffing, we make the binary/text distinction much earlier, and in many different places. For this case, we'll use the xdi_diff as our choke point, and reject any diff there before it hits the xdiff code. This means in most cases we'll die() immediately after. That's not ideal, but in practice we shouldn't generally hit this code path unless the user is trying to do something tricky. We already consider files larger than core.bigfilethreshold to be binary, so this code would only kick in when that is circumvented (either by bumping that value, or by using a .gitattribute to mark a file as diffable). In other words, we can avoid being "nice" here, because there is already nice code that tries to do the right thing. We are adding the suspenders to the nice code's belt, so notice when it has been worked around (both to protect the user from malicious inputs, and because it is better to die() than generate bogus output). The maximum size was chosen after experimenting with feeding large files to the xdiff code. It's just under a gigabyte, which leaves room for two obvious cases: - a diff3 merge conflict result on files of maximum size X could be 3*X plus the size of the markers, which would still be only about 3G, which fits in a 32-bit int. - some of the diff code allocates arrays of one int per record. Even if each file consists only of blank lines, then a file smaller than 1G will have fewer than 1G records, and therefore the int array will fit in 4G. Since the limit is arbitrary anyway, I chose to go under a gigabyte, to leave a safety margin (e.g., we would not want to overflow by allocating "(records + 1) * sizeof(int)" or similar. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09xdiff: remove emit_func() and xdi_diff_hunks()René Scharfe1-44/+0
The functions are unused now, remove them. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-14add, merge, diff: do not use strcasecmp to compare config variable namesJonathan Nieder1-1/+1
The config machinery already makes section and variable names lowercase when parsing them, so using strcasecmp for comparison just feels wasteful. No noticeable change intended. Noticed-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-26Merge branch 'rs/maint-diff-fd-leak' into maintJunio C Hamano1-1/+3
* rs/maint-diff-fd-leak: close file on error in read_mmfile()
2010-12-26close file on error in read_mmfile()René Scharfe1-1/+3
Reported in http://qa.debian.org/daca/cppcheck/sid/git_1.7.2.3-2.2.html and in http://thread.gmane.org/gmane.comp.version-control.git/123042. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-09xdiff-interface.c: always trim trailing space from xfuncname matchesBrandon Casey1-3/+2
Generally, trailing space is removed from the string matched by the xfuncname patterns. The exception is when the matched string exceeds the length of the fixed-size buffer that it will be copied in to. But, a string that exceeds the buffer can still contain trailing space in the portion of the string that will be copied into the buffer. So, simplify this code slightly, and just perform the trailing space removal always. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04Merge branch 'maint-1.7.0' into maintJunio C Hamano1-5/+6
* maint-1.7.0: remove ecb parameter from xdi_diff_outf()
2010-05-04remove ecb parameter from xdi_diff_outf()René Scharfe1-5/+6
xdi_diff_outf() overrides the structure members of its last parameter, ignoring any value that callers pass in. It's no surprise then that all callers pass a pointer to an uninitialized structure. They also don't read it after the call, so the parameter is neither used for input nor for output. Turn it into a local variable of xdi_diff_outf(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-17refactor duplicated fill_mm() in checkout and merge-recursiveMichael Lukashov1-0/+17
The following function is duplicated: fill_mm Move it to xdiff-interface.c and rename it 'read_mmblob', as suggested by Junio C Hamano. Also, change parameters order for consistency with read_mmfile(). Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01userdiff: add xdiff_clear_find_func()René Scharfe1-0/+15
xdiff_set_find_func() is used to set user defined regular expressions for finding function signatures. Add xdiff_clear_find_func(), which frees the memory allocated by the former, making the API complete. Also, use the new function in diff.c (the only call site of xdiff_set_find_func()) to clean up after ourselves. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07Remove unused function scope local variablesBenjamin Kramer1-2/+1
These variables were unused and can be removed safely: builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote builtin-fetch-pack.c::find_common(): len builtin-remote.c::mv(): symref diff.c::show_stats():show_stats(): total diffcore-break.c::should_break(): base_size fast-import.c::validate_raw_date(): date, sign fsck.c::fsck_tree(): o_sha1, sha1 xdiff-interface.c::parse_num(): read_some Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-26xdiff-interface.c: remove 10 duplicated linesJim Meyering1-11/+0
Remove an accidentally duplicated sequence of 10 lines. This happens to plug a leak, too. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-08Merge branch 'rs/blame'Junio C Hamano1-1/+48
* rs/blame: blame: use xdi_diff_hunks(), get rid of struct patch add xdi_diff_hunks() for callers that only need hunk lengths Allow alternate "low-level" emit function from xdl_diff Always initialize xpparam_t to 0 blame: inline get_patch()
2008-10-25add xdi_diff_hunks() for callers that only need hunk lengthsRené Scharfe1-1/+48
Based on a patch by Brian Downing, this uses the xdiff emit_func feature to implement xdi_diff_hunks(). It's a function that calls a callback for each hunk of a diff, passing its lengths. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-17Merge branch 'maint'Junio C Hamano1-0/+10
* maint: t1301-shared-repo.sh: don't let a default ACL interfere with the test git-check-attr(1): add output and example sections xdiff-interface.c: strip newline (and cr) from line before pattern matching t4018-diff-funcname: demonstrate end of line funcname matching flaw t4018-diff-funcname: rework negated last expression test Typo "does not exists" when git remote update remote. remote.c: correct the check for a leading '/' in a remote name Add testcase to ensure merging an early part of a branch is done properly Conflicts: t/t7600-merge.sh
2008-10-16xdiff-interface.c: strip newline (and cr) from line before pattern matchingBrandon Casey1-1/+11
POSIX doth sayeth: "In the regular expression processing described in IEEE Std 1003.1-2001, the <newline> is regarded as an ordinary character and both a period and a non-matching list can match one. ... Those utilities (like grep) that do not allow <newline>s to match are responsible for eliminating any <newline> from strings before matching against the RE." Thus far git has not been removing the trailing newline from strings matched against regular expression patterns. This has the effect that (quoting Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by a newline) will be matched by the pattern '^(FUNCNAME.$)' but not '^(FUNCNAME$)'", and more simply not '^FUNCNAME$'. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02xdiff-interface.c: strip newline (and cr) from line before pattern matchingBrandon Casey1-1/+11
POSIX doth sayeth: "In the regular expression processing described in IEEE Std 1003.1-2001, the <newline> is regarded as an ordinary character and both a period and a non-matching list can match one. ... Those utilities (like grep) that do not allow <newline>s to match are responsible for eliminating any <newline> from strings before matching against the RE." Thus far git has not been removing the trailing newline from strings matched against regular expression patterns. This has the effect that (quoting Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by a newline) will be matched by the pattern '^(FUNCNAME.$)' but not '^(FUNCNAME$)'", and more simply not '^FUNCNAME$'. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-29Merge branch 'bc/master-diff-hunk-header-fix'Shawn O. Pearce1-9/+12
* bc/master-diff-hunk-header-fix: Clarify commit error message for unmerged files Use strchrnul() instead of strchr() plus manual workaround Use remove_path from dir.c instead of own implementation Add remove_path: a function to remove as much as possible of a path git-submodule: Fix "Unable to checkout" for the initial 'update' Clarify how the user can satisfy stash's 'dirty state' check. t4018-diff-funcname: test syntax of builtin xfuncname patterns t4018-diff-funcname: test syntax of builtin xfuncname patterns make "git remote" report multiple URLs diff hunk pattern: fix misconverted "\{" tex macro introducers diff: fix "multiple regexp" semantics to find hunk header comment diff: use extended regexp to find hunk headers diff: use extended regexp to find hunk headers diff.*.xfuncname which uses "extended" regex's for hunk header selection diff.c: associate a flag with each pattern and use it for compiling regex diff.c: return pattern entry pointer rather than just the hunk header pattern Conflicts: builtin-merge-recursive.c t/t7201-co.sh xdiff-interface.h
2008-09-29Merge branch 'jc/better-conflict-resolution'Shawn O. Pearce1-0/+20
* jc/better-conflict-resolution: Fix AsciiDoc errors in merge documentation git-merge documentation: describe how conflict is presented checkout --conflict=<style>: recreate merge in a non-default style checkout -m: recreate merge when checking out of unmerged index git-merge-recursive: learn to honor merge.conflictstyle merge.conflictstyle: choose between "merge" and "diff3 -m" styles rerere: understand "diff3 -m" style conflicts with the original rerere.c: use symbolic constants to keep track of parsing states xmerge.c: "diff3 -m" style clips merge reduction level to EAGER or less xmerge.c: minimum readability fixups xdiff-merge: optionally show conflicts in "diff3 -m" style xdl_fill_merge_buffer(): separate out a too deeply nested function checkout --ours/--theirs: allow checking out one side of a conflicting merge checkout -f: allow ignoring unmerged paths when checking out of the index Conflicts: Documentation/git-checkout.txt builtin-checkout.c builtin-merge-recursive.c t/t7201-co.sh
2008-09-20diff: fix "multiple regexp" semantics to find hunk header commentJunio C Hamano1-7/+10
When multiple regular expressions are concatenated with "\n", they were traditionally AND'ed together, and only a line that matches _all_ of them is taken as a match. This however is unwieldy when multiple regexp feature is used to specify alternatives. This fixes the semantics to take the first match. A nagative pattern, if matches, makes the line to fail as before. A match with a positive pattern will be the final match, and what it captures in $1 is used as the hunk header comment. We could write alternatives using "|" in ERE, but the machinery can only use captured $1 as the hunk header comment (or $0 if there is no match in $1), so you cannot write: "junk ( A | B ) | garbage ( C | D )" and expect both "junk" and "garbage" to get stripped with the existing code. With this fix, you can write it as: "junk ( A | B ) \n garbage ( C | D )" and the way capture works would match the user expectation more naturally. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-18Merge branch 'bc/maint-diff-hunk-header-fix' into bc/master-diff-hunk-header-fixJunio C Hamano1-2/+2
* bc/maint-diff-hunk-header-fix: diff.*.xfuncname which uses "extended" regex's for hunk header selection diff.c: associate a flag with each pattern and use it for compiling regex diff.c: return pattern entry pointer rather than just the hunk header pattern Cosmetical command name fix Start conforming code to "git subcmd" style part 3 t9700/test.pl: remove File::Temp requirement t9700/test.pl: avoid bareword 'STDERR' in 3-argument open() GIT 1.6.0.2 Fix some manual typos. Use compatibility regex library also on FreeBSD Use compatibility regex library also on AIX Update draft release notes for 1.6.0.2 Use compatibility regex library for OSX/Darwin git-svn: Fixes my() parameter list syntax error in pre-5.8 Perl Git.pm: Use File::Temp->tempfile instead of ->new t7501: always use test_cmp instead of diff Start conforming code to "git subcmd" style part 2 diff: Help "less" hide ^M from the output checkout: do not check out unmerged higher stages randomly Conflicts: Documentation/git.txt Documentation/gitattributes.txt Makefile diff.c t/t7201-co.sh
2008-09-18diff.c: associate a flag with each pattern and use it for compiling regexBrandon Casey1-2/+2
This is in preparation for allowing extended regular expression patterns. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30merge.conflictstyle: choose between "merge" and "diff3 -m" stylesJunio C Hamano1-0/+20
This teaches "git merge-file" to honor merge.conflictstyle configuration variable, whose value can be "merge" (default) or "diff3". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-14xdiff-interface: hide the whole "xdiff_emit_state" business from the callerJunio C Hamano1-5/+18
This further enhances xdi_diff_outf() interface so that it takes two common parameters: the callback function that processes one line at a time, and a pointer to its application specific callback data structure. xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes these two away inside it, which is used by the lowest level output function in the xdiff_outf() callchain, consume_one(), to call back to the application layer. With this restructuring, we lift the requirement that the caller supplied callback data structure embeds xdiff_emit_state structure as its first member. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13Use strbuf for struct xdiff_emit_state's remainderBrian Downing1-22/+10
Continually xreallocing and freeing the remainder member of struct xdiff_emit_state was a noticeable performance hit. Use a strbuf instead. This yields a decent performance improvement on "git blame" on certain repositories. For example, before this commit: $ time git blame -M -C -C -p --incremental server.c >/dev/null 101.52user 0.17system 1:41.73elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+39561minor)pagefaults 0swaps With this commit: $ time git blame -M -C -C -p --incremental server.c >/dev/null 80.38user 0.30system 1:20.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+50979minor)pagefaults 0swaps Signed-off-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13Make xdi_diff_outf interface for running xdiff_outf diffsBrian Downing1-1/+12
To prepare for the need to initialize and release resources for an xdi_diff with the xdiff_outf output function, make a new function to wrap this usage. Old: ecb.outf = xdiff_outf; ecb.priv = &state; ... xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb); New: xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb); Signed-off-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-14Merge branch 'maint'Junio C Hamano1-2/+2
* maint: merge-file: handle empty files gracefully merge-recursive: handle file mode changes Minor wording changes in the keyboard descriptions in git-add --interactive. git fetch: Take '-n' to mean '--no-tags' quiltimport: fix misquoting of parsed -p<num> parameter git-quiltimport: better parser to grok "enhanced" series files.
2008-03-13merge-file: handle empty files gracefullyJohannes Schindelin1-2/+2
Earlier, it would error out while trying to read and/or writing them. Now, calling merge-file with empty files is neither interesting nor useful, but it is a bug that needed fixing. Noticed by Clemens Buchacher. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2008-02-22Avoid unnecessary "if-before-free" tests.Jim Meyering1-2/+1
This change removes all obvious useless if-before-free tests. E.g., it replaces code like this: if (some_expression) free (some_expression); with the now-equivalent: free (some_expression); It is equivalent not just because POSIX has required free(NULL) to work for a long time, but simply because it has worked for so long that no reasonable porting target fails the test. Here's some evidence from nearly 1.5 years ago: http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html FYI, the change below was prepared by running the following: git ls-files -z | xargs -0 \ perl -0x3b -pi -e \ 's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s' Note however, that it doesn't handle brace-enclosed blocks like "if (x) { free (x); }". But that's ok, since there were none like that in git sources. Beware: if you do use the above snippet, note that it can produce syntactically invalid C code. That happens when the affected "if"-statement has a matching "else". E.g., it would transform this if (x) free (x); else foo (); into this: free (x); else foo (); There were none of those here, either. If you're interested in automating detection of the useless tests, you might like the useless-if-before-free script in gnulib: [it *does* detect brace-enclosed free statements, and has a --name=S option to make it detect free-like functions with different names] http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free Addendum: Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-20Re(-re)*fix trim_common_tail()Linus Torvalds1-4/+7
The tar-ball and the git archive itself is fine, but yes, the diff from 2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization that has caused way too much pain. Very interesting breakage. The patch was actually "correct" in a (rather limited) technical sense, but the context at the end was missing because while the trim_common_tail() code made sure to keep enough common context to allow a valid diff to be generated, the diff machinery itself could decide that it could generate the diff differently than the "obvious" solution. Thee sad fact is that the git optimization (which is very important for "git blame", which needs no context), is only really valid for that one case where we really don't need any context. [jc: since this is shared with "git diff -U0" codepath, context recovery to the end of line needs to be done even for zero context case.] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-16Re-re-re-fix common tail optimizationJunio C Hamano1-1/+1
We need to be extra careful recovering the removed common section, so that we do not break context nor the changed incomplete line (i.e. the last line that does not end with LF). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-16trim_common_tail: brown paper bag fix.Jeff King1-5/+4
The recovered context lines were not LF terminated due to off-by-one error, which also caused the outer loop to count the number of recovered lines to terminate after running only once. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-14xdiff tail trimming: use correct type.Junio C Hamano1-3/+2
Inside xdiff library, the number of context lines is represented in long, not int. Noticed by Peter Baumann. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13xdi_diff: trim common trailing linesJunio C Hamano1-1/+33
This implements earlier Linus's optimization to trim common lines at the end before passing them down to low level xdiff interface for all of our xdiff users. We could later enhance this to also trim common leading lines, but that would need tweaking the output function to add the number of lines trimmed at the beginning to line numbers that appear in the hunk headers. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13xdl_diff: identify call sites.Junio C Hamano1-0/+5
This inserts a new function xdi_diff() that currently does not do anything other than calling the underlying xdl_diff() to the callchain of current callers of xdl_diff() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-06Per-path attribute based hunk header selection.Junio C Hamano1-0/+71
This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) *.java funcname=java *.perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano1-2/+0
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-04Move buffer_is_binary() to xdiff-interface.hJohannes Schindelin1-0/+8
We already have two instances where we want to determine if a buffer contains binary data as opposed to text. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-03-07Cast 64 bit off_t to 32 bit size_tShawn O. Pearce1-3/+5
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-21move read_mmfile() into xdiff-interfaceJohannes Schindelin1-0/+19
read_file() was a useful function if you want to work with the xdiff stuff, so it was renamed and put into a more central place. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26Use xrealloc instead of reallocJonas Fonseca1-6/+6
Change places that use realloc, without a proper error path, to instead use xrealloc. Drop an erroneous error path in the daemon code that used errno in the die message in favour of the simpler xrealloc. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-06Match ofs/cnt types in diff interface.Junio C Hamano1-4/+4
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-05combine-diff: move the code to parse hunk-header into common library.Junio C Hamano1-0/+46
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-05combine-diff: refactor built-in xdiff interface.Junio C Hamano1-0/+58
This refactors the line-by-line callback mechanism used in combine-diff so that other programs can reuse it more easily. Signed-off-by: Junio C Hamano <junkio@cox.net>