aboutsummaryrefslogtreecommitdiffstats
path: root/diff-lib.c
AgeCommit message (Collapse)AuthorFilesLines
2016-10-24commit: fix empty commit creation when there's no changes but ita entriesNguyễn Thái Ngọc Duy1-1/+3
If i-t-a entries are present and there is no change between the index and HEAD i-t-a entries, index_differs_from() still returns "dirty, new entries" (aka, the resulting commit is not empty), but cache-tree will skip i-t-a entries and produce the exact same tree of current commit. index_differs_from() is supposed to catch this so we can abort git-commit (unless --no-empty is specified). Update it to optionally ignore i-t-a entries when doing a diff between the index and HEAD so that it would return "no change" in this case and abort commit. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24diff-lib: allow ita entries treated as "not yet exist in index"Nguyễn Thái Ngọc Duy1-0/+14
When comparing the index and the working tree to show which paths are new, and comparing the tree recorded in the HEAD and the index to see if committing the contents recorded in the index would result in an empty commit, we would want the former comparison to say "these are new paths" and the latter to say "there is no change" for paths that are marked as intent-to-add. We made a similar attempt at d95d728a ("diff-lib.c: adjust position of i-t-a entries in diff", 2015-03-16), which redefined the semantics of these two comparison modes globally, which was a disaster and had to be reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a entries in diff"", 2015-06-23). To make sure we do not repeat the same mistake, introduce a new internal diffopt option so that this different semantics can be asked for only by callers that ask it, while making sure other unaudited callers will get the same comparison result. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07cache: convert struct cache_entry to use struct object_idbrian m. carlson1-13/+18
Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20Remove get_object_hash.brian m. carlson1-1/+1
Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Add several uses of get_object_hash.brian m. carlson1-1/+1
Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-06-25Merge branch 'nd/diff-i-t-a'Junio C Hamano1-12/+0
* nd/diff-i-t-a: Revert "diff-lib.c: adjust position of i-t-a entries in diff"
2015-06-23Revert "diff-lib.c: adjust position of i-t-a entries in diff"Junio C Hamano1-12/+0
This reverts commit d95d728aba06a34394d15466045cbdabdada58a2. It turns out that many other commands that need to interact with the result of running diff-files and diff-index, e.g. "git apply", "git rm", etc., need to be adjusted to the new world order it brings in. For example, it would break this sequence to correct a whitespace breakage in the parts you changed: git add -N file git diff --cached file | git apply --cached --whitespace=fix git checkout file In the old world order, "diff" showed a patch to modify an existing empty file by adding its full contents, and "apply" updated the index by modifying the existing empty blob (which is what an Intent-to-Add entry records in the index) with that patch. In the new world order, "diff" shows a patch to create a new file with its full contents, but because "apply" thinks that the i-t-a entry already exists in the index, it refused to accept a creation. Adjusting "apply" to this new world order is easy, but we need to assess the extent of the damage to the rest of the system the new world order brought in before going forward and adjust them all, after which we can resurrect the commit being reverted here. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-19Merge branch 'nd/diff-i-t-a'Junio C Hamano1-0/+12
After "git add -N", the path appeared in output of "git diff HEAD" and "git diff --cached HEAD", leading "git status" to classify it as "Changes to be committed". Such a path, however, is not yet to be scheduled to be committed. "git diff" showed the change to the path as modification, not as a "new file", in the header of its output. Treat such paths as "yet to be added to the index but Git already know about them"; "git diff HEAD" and "git diff --cached HEAD" should not talk about them, and "git diff" should show them as new files yet to be added to the index. * nd/diff-i-t-a: diff-lib.c: adjust position of i-t-a entries in diff
2015-05-05Merge branch 'bc/object-id'Junio C Hamano1-5/+5
Identify parts of the code that knows that we use SHA-1 hash to name our objects too much, and use (1) symbolic constants instead of hardcoded 20 as byte count and/or (2) use struct object_id instead of unsigned char [20] for object names. * bc/object-id: apply: convert threeway_stage to object_id patch-id: convert to use struct object_id commit: convert parts to struct object_id diff: convert struct combine_diff_path to object_id bulk-checkin.c: convert to use struct object_id zip: use GIT_SHA1_HEXSZ for trailers archive.c: convert to use struct object_id bisect.c: convert leaf functions to use struct object_id define utility functions for object IDs define a structure for object IDs
2015-03-23diff-lib.c: adjust position of i-t-a entries in diffNguyễn Thái Ngọc Duy1-0/+12
Entries added by "git add -N" are reminder for the user so that they don't forget to add them before committing. These entries appear in the index even though they are not real. Their presence in the index leads to a confusing "git status" like this: On branch master Changes to be committed: new file: foo Changes not staged for commit: modified: foo If you do a "git commit", "foo" will not be included even though "status" reports it as "to be committed". This patch changes the output to become On branch master Changes not staged for commit: new file: foo no changes added to commit The two hunks in diff-lib.c adjust "diff-index" and "diff-files" so that i-t-a entries appear as new files in diff-files and nothing in diff-index. Due to this change, diff-files may start to report "new files" for the first time. "add -u" needs to be told about this or it will die in denial, screaming "new files can't exist! Reality is wrong." Luckily, it's the only one among run_diff_files() callers that needs fixing. Now in the new world order, a hierarchy in the index that contain i-t-a paths is written out as a tree object as if these i-t-a entries do not exist, and comparing the index with such a tree object that would result from writing out the hierarchy will result in no difference. Update a test in t2203 that expected the i-t-a entries to appear as "added to the index" in the comparison to instead expect no output. An earlier change eec3e7e4 (cache-tree: invalidate i-t-a paths after generating trees, 2012-12-16) becomes an unnecessary pessimization in the new world order---a cache-tree in the index that corresponds to a hierarchy with i-t-a paths can now be marked as valid and record the object name of the tree that results from writing a tree object out of that hierarchy, as it will compare equal to that tree. Reverting the commit is left for the future, though, as it is purely a performance issue and no longer affects correctness. Helped-by: Michael J Gruber <git@drmicha.warpmail.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13diff: convert struct combine_diff_path to object_idbrian m. carlson1-5/+5
Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-04run_diff_files(): clarify computation of sha1 validityJunio C Hamano1-2/+6
Remove the need to have duplicated "if there is a change then feed null_sha1 and otherwise sha1 from the cache entry" for the "new" side of the diff by introducing two temporary variables to point at the object name of the old and the new side of the blobs. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-16Merge branch 'jk/diff-files-assume-unchanged'Junio C Hamano1-12/+21
* jk/diff-files-assume-unchanged: run_diff_files: do not look at uninitialized stat data
2014-05-15run_diff_files: do not look at uninitialized stat dataJeff King1-12/+21
If we try to diff an index entry marked CE_VALID (because it was marked with --assume-unchanged), we do not bother even running stat() on the file to see if it was removed. This started long ago with 540e694 (Prevent diff machinery from examining assume-unchanged entries on worktree, 2009-08-11). However, the subsequent code may look at our "struct stat" and expect to find actual data; currently it will find whatever cruft was left on the stack. This can cause problems in two situations: 1. We call match_stat_with_submodule with the stat data, so a submodule may be erroneously marked as changed. 2. If --find-copies-harder is in effect, we pass all entries, even unchanged ones, to diff_change, so it can list them as rename/copy sources. Since we found no change, we assume that function will realize it and not actually display any diff output. However, we end up feeding it a bogus mode, leading it to sometimes claim there was a mode change. We can fix both by splitting the CE_VALID and regular code paths, and making sure only to look at the stat information in the latter. Furthermore, we push the declaration of our "struct stat" down into the code paths that actually set it, so we cannot accidentally access it uninitialized in future code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-07Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion'Junio C Hamano1-3/+0
Remove a confusing and deprecated "-q" option from "git diff-files"; "git diff-files --diff-filter=d" can be used instead.
2014-03-05Merge branch 'ks/combine-diff'Junio C Hamano1-2/+0
Teach combine-diff to honour the path-output-order imposed by diffcore-order, and optimize how matching paths are found in the N-way diffs made with parents. * ks/combine-diff: tests: add checking that combine-diff emits only correct paths combine-diff: simplify intersect_paths() further combine-diff: combine_diff_path.len is not needed anymore combine-diff: optimize combine_diff_path sets intersection diff test: add tests for combine-diff with orderfile diffcore-order: export generic ordering interface
2014-02-24combine-diff: combine_diff_path.len is not needed anymoreKirill Smelkov1-2/+0
The field was used in order to speed-up name comparison and also to mark removed paths by setting it to 0. Because the updated code does significantly less strcmp and also just removes paths from the list and free right after we know a path will not be needed, it is not needed anymore. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24pathspec: convert some match_pathspec_depth() to ce_path_match()Nguyễn Thái Ngọc Duy1-2/+3
This helps reduce the number of match_pathspec_depth() call sites and show how match_pathspec_depth() is used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-09Merge branch 'jl/submodule-mv'Junio C Hamano1-2/+1
"git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
2013-09-09Merge branch 'jc/diff-filter-negation'Junio C Hamano1-5/+3
Teach "git diff --diff-filter" to express "I do not want to see these classes of changes" more directly by listing only the unwanted ones in lowercase (e.g. "--diff-filter=d" will show everything but deletion) and deprecate "diff-files -q" which did the same thing as "--diff-filter=d". * jc/diff-filter-negation: diff: deprecate -q option to diff-files diff: allow lowercase letter to specify what change class to exclude diff: reject unknown change class given to --diff-filter diff: preparse --diff-filter string argument diff: factor out match_filter() diff: pass the whole diff_options to diffcore_apply_filter()
2013-07-19diff: remove "diff-files -q" in a version of Git in a distant futureJunio C Hamano1-3/+0
This was inherited from "show-diff -q" that was invented to tell comparison between the index and the working tree to ignore only removals in 2005. These days, it is spelled as "--diff-filter=d". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19diff: deprecate -q option to diff-filesJunio C Hamano1-5/+3
This reimplements the ancient "-q" option to "git diff-files" that was inherited from "show-diff -q" in terms of "--diff-filter=d". We will be deprecating the "-q" option, so let's issue a warning when we do so. Incidentally this also tentatively fixes "git diff --no-index" to honor "-q" and hide deletions; the use will get the same warning. We should remove the support for "-q" in a future version but it is not that urgent. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15remove init_pathspec() in favor of parse_pathspec()Nguyễn Thái Ngọc Duy1-1/+1
While at there, move free_pathspec() to pathspec.c Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15parse_pathspec: add special flag for max_depth featureNguyễn Thái Ngọc Duy1-1/+0
match_pathspec_depth() and tree_entry_interesting() check max_depth field in order to support "git grep --max-depth". The feature activation is tied to "recursive" field, which led to some unwanted activation, e.g. 5c8eeb8 (diff-index: enable recursive pathspec matching in unpack_trees - 2012-01-15). This patch decouples the activation from "recursive" field, puts it in "magic" field instead. This makes sure that only "git grep" can activate this feature. And because parse_pathspec knows when the feature is not used, it does not need to sort pathspec (required for max_depth to work correctly). A small win for non-grep cases. Even though a new magic flag is introduced, no magic syntax is. The magic can be only enabled by parse_pathspec() caller. We might someday want to support ":(maxdepth:10)src." It all depends on actual use cases. max_depth feature cannot be enabled via init_pathspec() anymore. But that's ok because init_pathspec() is on its way to /dev/null. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02diff-lib, read-tree, unpack-trees: mark cache_entry array paramters constRené Scharfe1-1/+2
Change the type merge_fn_t to accept the array of cache_entry pointers as const pointers to const pointers. This documents the fact that the merge functions don't modify the cache_entry contents or replace any of the pointers in the array. Only a single cast is necessary in unpack_nondirectories because adding two const modifiers at once is not allowed in C. The cast is safe in that it doesn't mask any modfication; call_unpack_fn only needs the array for reading. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02diff-lib, read-tree, unpack-trees: mark cache_entry pointers constRené Scharfe1-11/+12
Add const to struct cache_entry pointers throughout the tree which are only used for reading. This allows callers to pass in const pointers. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29diff: do not use null sha1 as a sentinel valueJeff King1-8/+12
The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-16diff-index: enable recursive pathspec matching in unpack_treesNguyen Thai Ngoc Duy1-0/+2
The pathspec structure has a few bits of data to drive various operation modes after we unified the pathspec matching logic in various codepaths. For example, max_depth field is there so that "git grep" can limit the output for files found in limited depth of tree traversal. Also in order to show just the surface level differences in "git diff-tree", recursive field stops us from descending into deeper level of the tree structure when it is set to false, and this also affects pathspec matching when we have wildcards in the pathspec. The diff-index has always wanted the recursive behaviour, and wanted to match pathspecs without any depth limit. But we forgot to do so when we updated tree_entry_interesting() logic to unify the pathspec matching logic. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-05Merge branch 'cn/eradicate-working-copy'Junio C Hamano1-1/+1
* cn/eradicate-working-copy: Remove 'working copy' from the documentation and C code
2011-10-05Merge branch 'jc/diff-index-unpack'Junio C Hamano1-0/+1
* jc/diff-index-unpack: diff-index: pass pathspec down to unpack-trees machinery unpack-trees: allow pruning with pathspec traverse_trees(): allow pruning with pathspec
2011-09-21Remove 'working copy' from the documentation and C codeCarlos Martín Nieto1-1/+1
The git term is 'working tree', so replace the most public references to 'working copy'. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-29diff-index: pass pathspec down to unpack-trees machineryJunio C Hamano1-0/+1
And finally, pass the pathspec down through unpack_trees() to traverse_trees() callchain. Before and after applying this series, looking for changes in the kernel repository with a fairly narrow pathspec becomes somewhat faster. (without patch) $ /usr/bin/time git diff --raw v2.6.27 -- net/ipv6 >/dev/null 0.48user 0.05system 0:00.53elapsed 100%CPU (0avgtext+0avgdata 163296maxresident)k 0inputs+952outputs (0major+11163minor)pagefaults 0swaps (with patch) $ /usr/bin/time git diff --raw v2.6.27 -- net/ipv6 >/dev/null 0.01user 0.00system 0:00.02elapsed 104%CPU (0avgtext+0avgdata 43856maxresident)k 0inputs+24outputs (0major+3688minor)pagefaults 0swaps Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08Merge branch 'jc/diff-index-refactor'Junio C Hamano1-52/+19
* jc/diff-index-refactor: diff-lib: refactor run_diff_index() and do_diff_cache() diff-lib: simplify do_diff_cache()
2011-08-01Merge branch 'jc/maint-reset-unmerged-path'Junio C Hamano1-1/+2
* jc/maint-reset-unmerged-path: reset [<commit>] paths...: do not mishandle unmerged paths
2011-07-13diff-lib: refactor run_diff_index() and do_diff_cache()Junio C Hamano1-28/+19
The latter is meant to be an API for internal callers that want to inspect the resulting diff-queue, while the former is an implementation of "git diff-index" command. Extract the common logic into a single helper function and make them thin wrappers around it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13diff-lib: simplify do_diff_cache()Junio C Hamano1-25/+1
Since 34110cd (Make 'unpack_trees()' have a separate source and destination index, 2008-03-06), we can run unpack_trees() without munging the index at all, but do_diff_cache() tried ever so carefully to work around the old behaviour of the function. We can just tell unpack_trees() not to touch the original index and there is no need to clean-up whatever the previous round has done. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13reset [<commit>] paths...: do not mishandle unmerged pathsJunio C Hamano1-1/+2
Because "diff --cached HEAD" showed an incorrect blob object name on the LHS of the diff, we ended up updating the index entry with bogus value, not what we read from the tree. Noticed by John Nowak. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-06Merge branch 'jk/diff-not-so-quick'Junio C Hamano1-3/+1
* jk/diff-not-so-quick: diff: futureproof "stop feeding the backend early" logic diff_tree: disable QUICK optimization with diff filter Conflicts: diff.c
2011-05-31diff-index --quiet: learn the "stop feeding the backend early" logicJunio C Hamano1-1/+6
A negative return from the unpack callback function usually means unpack failed for the entry and signals the unpack_trees() machinery to fail the entire merge operation, immediately and there is no other way for the callback to tell the machinery to exit early without reporting an error. This is what we usually want to make a merge all-or-nothing operation, but the machinery is also used for diff-index codepath by using a custom unpack callback function. And we do sometimes want to exit early without failing, namely when we are under --quiet and can short-cut the diff upon finding the first difference. Add "exiting_early" field to unpack_trees_options structure, to signal the unpack_trees() machinery that the negative return value is not signaling an error but an early return from the unpack_trees() machinery. As this by definition hasn't unpacked everything, discard the resulting index just like the failure codepath. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31Merge remote-tracking branch 'ko/maint' into jc/diff-index-quick-exit-earlyJunio C Hamano1-63/+78
* ko/maint: (4352 commits) git-submodule.sh: separate parens by a space to avoid confusing some shells Documentation/technical/api-diff.txt: correct name of diff_unmerge() read_gitfile_gently: use ssize_t to hold read result remove tests of always-false condition rerere.c: diagnose a corrupt MERGE_RR when hitting EOF between TAB and '\0' Git 1.7.5.3 init/clone: remove short option -L and document --separate-git-dir do not read beyond end of malloc'd buffer git-svn: Fix git svn log --show-commit Git 1.7.5.2 provide a copy of the LGPLv2.1 test core.gitproxy configuration copy_gecos: fix not adding nlen to len when processing "&" Update draft release notes to 1.7.5.2 Documentation/git-fsck.txt: fix typo: unreadable -> unreachable send-pack: avoid deadlock on git:// push with failed pack-objects connect: let callers know if connection is a socket connect: treat generic proxy processes like ssh processes sideband_demux(): fix decl-after-stmt t3503: test cherry picking and reverting root commits ... Conflicts: diff.c
2011-05-31diff: futureproof "stop feeding the backend early" logicJunio C Hamano1-3/+1
Refactor the "do not stop feeding the backend early" logic into a small helper function and use it in both run_diff_files() and diff_tree() that has the stop-early optimization. We may later add other types of diffcore transformation that require to look at the whole result like diff-filter does, and having the logic in a single place is essential for longer term maintainability. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-13Merge branch 'jc/fix-diff-files-unmerged' into maintJunio C Hamano1-4/+11
* jc/fix-diff-files-unmerged: diff-files: show unmerged entries correctly diff: remove often unused parameters from diff_unmerge() diff.c: return filepair from diff_unmerge() test: use $_z40 from test-lib
2011-05-06Merge branch 'jc/fix-diff-files-unmerged'Junio C Hamano1-4/+11
* jc/fix-diff-files-unmerged: diff-files: show unmerged entries correctly diff: remove often unused parameters from diff_unmerge() diff.c: return filepair from diff_unmerge() test: use $_z40 from test-lib
2011-04-23diff-files: show unmerged entries correctlyJunio C Hamano1-2/+8
Earlier, e9c8409 (diff-index --cached --raw: show tree entry on the LHS for unmerged entries., 2007-01-05) taught the command to show the object name and the mode from the entry coming from the tree side when comparing a tree with an unmerged index. This is a belated companion patch that teaches diff-files to show the mode from the entry coming from the working tree side, when comparing an unmerged index and the working tree. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23diff: remove often unused parameters from diff_unmerge()Junio C Hamano1-3/+4
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for unmerged entries., 2007-01-05) added a <mode, object name> pair as parameters to this function, to store them in the pre-image side of an unmerged file pair. Now the function is fixed to return the filepair it queued, we can make the caller on the special case codepath to do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-23Merge branch 'jc/maint-diff-q-filter'Junio C Hamano1-1/+2
* jc/maint-diff-q-filter: diff --quiet: disable optimization when --diff-filter=X is used
2011-03-16diff --quiet: disable optimization when --diff-filter=X is usedJunio C Hamano1-1/+2
The code notices that the caller does not want any detail of the changes and only wants to know if there is a change or not by specifying --quiet. And it breaks out of the loop when it knows it already found any change. When you have a post-process filter (e.g. --diff-filter), however, the path we found to be different in the previous round and set HAS_CHANGES bit may end up being uninteresting, and there may be no output at the end. The optimization needs to be disabled for such case. Note that the f245194 (diff: change semantics of "ignore whitespace" options, 2009-05-22) already disables this optimization by refraining from setting HAS_CHANGES when post-process filters that need to inspect the contents of the files (e.g. -S, -w) in diff_change() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03Convert ce_path_match() to use struct pathspecNguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03struct rev_info: convert prune_data to struct pathspecNguyễn Thái Ngọc Duy1-3/+3
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03Convert struct diff_options to use struct pathspecNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09Submodules: Add the new "ignore" config option for diff and statusJens Lehmann1-5/+10
The new "ignore" config option controls the default behavior for "git status" and the diff family. It specifies under what circumstances they consider submodules as modified and can be set separately for each submodule. The command line option "--ignore-submodules=" has been extended to accept the new parameter "none" for both status and diff. Users that chose submodules to get rid of long work tree scanning times might want to set the "dirty" option for those submodules. This brings back the pre 1.7.0 behavior, where submodule work trees were never scanned for modifications. By using "--ignore-submodules=none" on the command line the status and diff commands can be told to do a full scan. This option can be set to the following values (which have the same name and meaning as for the "--ignore-submodules" option of status and diff): "all": All changes to the submodule will be ignored. "dirty": Only differences of the commit recorded in the superproject and the submodules HEAD will be considered modifications, all changes to the work tree of the submodule will be ignored. When using this value, the submodule will not be scanned for work tree changes at all, leading to a performance benefit on large submodules. "untracked": Only untracked files in the submodules work tree are ignored, a changed HEAD and/or modified files in the submodule will mark it as modified. "none" (which is the default): Either untracked or modified files in a submodules work tree or a difference between the subdmodules HEAD and the commit recorded in the superproject will make it show up as changed. This value is added as a new parameter for the "--ignore-submodules" option of the diff family and "git status" so the user can override the settings in the configuration. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11Add optional parameters to the diff option "--ignore-submodules"Jens Lehmann1-0/+1
In some use cases it is not desirable that the diff family considers submodules that only contain untracked content as dirty. This may happen e.g. when the submodule is not under the developers control and not all build generated files have been added to .gitignore by the upstream developers. Using the "untracked" parameter for the "--ignore-submodules" option disables checking for untracked content and lets git diff report them as changed only when they have new commits or modified content. Sometimes it is not wanted to have submodules show up as changed when they just contain changes to their work tree. An example for that are scripts which just want to check for submodule commits while ignoring any changes to the work tree. Also users having large submodules known not to change might want to use this option, as the - sometimes substantial - time it takes to scan the submodule work tree(s) is saved. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-24Merge branch 'jl/submodule-diff-dirtiness'Junio C Hamano1-18/+27
* jl/submodule-diff-dirtiness: git status: ignoring untracked files must apply to submodules too git status: Fix false positive "new commits" output for dirty submodules Refactor dirty submodule detection in diff-lib.c git status: Show detailed dirty status of submodules in long format git diff --submodule: Show detailed dirty status of submodules
2010-03-13git status: ignoring untracked files must apply to submodules tooJens Lehmann1-1/+1
Since 1.7.0 submodules are considered dirty when they contain untracked files. But when git status is called with the "-uno" option, the user asked to ignore untracked files, so they must be ignored in submodules too. To achieve this, the new flag DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES is introduced. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12git status: Fix false positive "new commits" output for dirty submodulesJens Lehmann1-4/+2
Testing if the output "new commits" should appear in the long format of "git status" is done by comparing the hashes of the diffpair. This always resulted in printing "new commits" for submodules that contained untracked or modified content, even if they did not contain new commits. The reason was that match_stat_with_submodule() did set the "changed" flag for dirty submodules, resulting in two->sha1 being set to the null_sha1 at the call sites, which indicates that new commits are present. This is changed so that when no new commits are present, the same object names are in the sha1 field for both sides of the filepair, and the working tree side will have the "dirty_submodule" flag set when appropriate. For a submodule to be seen as modified even when it just has a dirty work tree, some conditions had to be extended to also check for the "dirty_submodule" flag. Unfortunately the test case that should have found this bug had been changed incorrectly too. It is fixed and extended to test for other combinations too. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12Refactor dirty submodule detection in diff-lib.cJens Lehmann1-18/+27
Moving duplicated code into the new function match_stat_with_submodule(). Replacing the implicit activation of detailed checks for the dirtiness of submodules when DIFF_FORMAT_PATCH was selected with explicitly setting the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done(). Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-09revision: introduce setup_revision_optJunio C Hamano1-1/+4
So far the last parameter to setup_revisions() was to specify the default ref when the command line did not give any (typically "HEAD"). This changes it to take a pointer to a structure so that we can add other information without touching too many codepaths in later patches. There is no functionality change. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08git status: Show detailed dirty status of submodules in long formatJens Lehmann1-2/+4
Since 1.7.0 there are three reasons a submodule is considered modified against the work tree: It contains new commits, modified content or untracked content. Lets show all reasons in the long format of git status, so the user can better asses the nature of the modification. This change does not affect the short and porcelain formats. Two new members are added to "struct wt_status_change_data" to store the information gathered by run_diff_files(). wt-status.c uses the new flag DIFF_OPT_DIRTY_SUBMODULES to tell diff-lib.c it wants to get detailed dirty information about submodules. A hint line for submodules is printed in the dirty header when dirty submodules are present. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-04git diff --submodule: Show detailed dirty status of submodulesJens Lehmann1-8/+8
When encountering a dirty submodule while doing "git diff --submodule" print an extra line for new untracked content and another for modified but already tracked content. And if the HEAD of the submodule is equal to the ref diffed against in the superproject, drop the output which would just show the same SHA1s and no commit message headlines. To achieve that, the dirty_submodule bitfield is expanded to two bits. The output of "git status" inside the submodule is parsed to set the according bits. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-26Merge branch 'jl/diff-submodule-ignore'Junio C Hamano1-5/+7
* jl/diff-submodule-ignore: Teach diff --submodule that modified submodule directory is dirty git diff: Don't test submodule dirtiness with --ignore-submodules Make ce_uptodate() trustworthy again
2010-01-24git diff: Don't test submodule dirtiness with --ignore-submodulesJens Lehmann1-4/+6
The diff family suppresses the output of submodule changes when requested but checks them nonetheless. But since recently submodules get examined for their dirtiness, which is rather expensive. There is no need to do that when the --ignore-submodules option is used, as the gathered information is never used anyway. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-24Merge branch 'jc/fix-tree-walk'Junio C Hamano1-18/+1
* jc/fix-tree-walk: read-tree --debug-unpack unpack-trees.c: look ahead in the index unpack-trees.c: prepare for looking ahead in the index Aggressive three-way merge: fix D/F case traverse_trees(): handle D/F conflict case sanely more D/F conflict tests tests: move convenience regexp to match object names to test-lib.sh Conflicts: builtin-read-tree.c unpack-trees.c unpack-trees.h
2010-01-24Make ce_uptodate() trustworthy againJunio C Hamano1-1/+1
The rule has always been that a cache entry that is ce_uptodate(ce) means that we already have checked the work tree entity and we know there is no change in the work tree compared to the index, and nobody should have to double check. Note that false ce_uptodate(ce) does not mean it is known to be dirty---it only means we don't know if it is clean. There are a few codepaths (refresh-index and preload-index are among them) that mark a cache entry as up-to-date based solely on the return value from ie_match_stat(); this function uses lstat() to see if the work tree entity has been touched, and for a submodule entry, if its HEAD points at the same commit as the commit recorded in the index of the superproject (a submodule that is not even cloned is considered clean). A submodule is no longer considered unmodified merely because its HEAD matches the index of the superproject these days, in order to prevent people from forgetting to commit in the submodule and updating the superproject index with the new submodule commit, before commiting the state in the superproject. However, the patch to do so didn't update the codepath that marks cache entries up-to-date based on the updated definition and instead worked it around by saying "we don't trust the return value of ce_uptodate() for submodules." This makes ce_uptodate() trustworthy again by not marking submodule entries up-to-date. The next step _could_ be to introduce a few "in-core" flag bits to cache_entry structure to record "this entry is _known_ to be dirty", call is_submodule_modified() from ie_match_stat(), and use these new bits to avoid running this rather expensive check more than once, but that can be a separate patch. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18Performance optimization for detection of modified submodulesJens Lehmann1-15/+31
In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16Show submodules as modified when they contain a dirty work treeJens Lehmann1-2/+6
Until now a submodule only then showed up as modified in the supermodule when the last commit in the submodule differed from the one in the index or the diffed against commit of the superproject. A dirty work tree containing new untracked or modified files in a submodule was undetectable when looking at it from the superproject. Now git status and git diff (against the work tree) in the superproject will also display submodules as modified when they contain untracked or modified files, even if the compared ref matches the HEAD of the submodule. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-13Merge branch 'nd/sparse'Junio C Hamano1-2/+3
* nd/sparse: (25 commits) t7002: test for not using external grep on skip-worktree paths t7002: set test prerequisite "external-grep" if supported grep: do not do external grep on skip-worktree entries commit: correctly respect skip-worktree bit ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID tests: rename duplicate t1009 sparse checkout: inhibit empty worktree Add tests for sparse checkout read-tree: add --no-sparse-checkout to disable sparse checkout support unpack-trees(): ignore worktree check outside checkout area unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout unpack-trees.c: generalize verify_* functions unpack-trees(): add CE_WT_REMOVE to remove on worktree alone Introduce "sparse checkout" dir.c: export excluded_1() and add_excludes_from_file_1() excluded_1(): support exclude files in index unpack-trees(): carry skip-worktree bit over in merged_entry() Read .gitignore from index if it is skip-worktree Avoid writing to buffer in add_excludes_from_file_1() ... Conflicts: .gitignore Documentation/config.txt Documentation/git-update-index.txt Makefile entry.c t/t7002-grep.sh
2010-01-07unpack-trees.c: look ahead in the indexJunio C Hamano1-0/+1
This makes the traversal of index be in sync with the tree traversal. When unpack_callback() is fed a set of tree entries from trees, it inspects the name of the entry and checks if the an index entry with the same name could be hiding behind the current index entry, and (1) if the name appears in the index as a leaf node, it is also fed to the n_way_merge() callback function; (2) if the name is a directory in the index, i.e. there are entries in that are underneath it, then nothing is fed to the n_way_merge() callback function; (3) otherwise, if the name comes before the first eligible entry in the index, the index entry is first unpacked alone. When traverse_trees_recursive() descends into a subdirectory, the cache_bottom pointer is moved to walk index entries within that directory. All of these are omitted for diff-index, which does not even want to be fed an index entry and a tree entry with D/F conflicts. This fixes 3-way read-tree and exposes a bug in other parts of the system in t6035, test #5. The test prepares these three trees: O = HEAD^ 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/b-2/c/d 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/b/c/d 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/x A = HEAD 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/b-2/c/d 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/b/c/d 100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb a/x B = master 120000 blob a36b77384451ea1de7bd340ffca868249626bc52 a/b 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/b-2/c/d 100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 a/x With a clean index that matches HEAD, running git read-tree -m -u --aggressive $O $A $B now yields 120000 a36b77384451ea1de7bd340ffca868249626bc52 3 a/b 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 a/b-2/c/d 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 1 a/b/c/d 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 2 a/b/c/d 100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0 a/x which is correct. "master" created "a/b" symlink that did not exist, and removed "a/b/c/d" while HEAD did not do touch either path. Before this series, read-tree did not notice the situation and resolved addition of "a/b" and removal of "a/b/c/d" independently. If A = HEAD had another path "a/b/c/e" added, this merge should conflict but instead it silently resolved "a/b" and then immediately overwrote it to add "a/b/c/e", which was quite bogus. Tests in t1012 start to work with this. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07unpack-trees.c: prepare for looking ahead in the indexJunio C Hamano1-18/+0
This prepares but does not yet implement a look-ahead in the index entries when traverse-trees.c decides to give us tree entries in an order that does not match what is in the index. A case where a look-ahead in the index is necessary happens when merging branch B into branch A while the index matches the current branch A, using a tree O as their common ancestor, and these three trees looks like this: O A B t t t-i t-i t-i t-j t-j t/1 t/2 The traverse_trees() function gets "t", "t-i" and "t" from trees O, A and B first, and notices that A may have a matching "t" behind "t-i" and "t-j" (indeed it does), and tells A to give that entry instead. After unpacking blob "t" from tree B (as it hasn't changed since O in B and A removed it, it will result in its removal), it descends into directory "t/". The side that walked index in parallel to the tree traversal used to be implemented with one pointer, o->pos, that points at the next index entry to be processed. When this happens, the pointer o->pos still points at "t-i" that is the first entry. We should be able to skip "t-i" and "t-j" and locate "t/1" from the index while the recursive invocation of traverse_trees() walks and match entries found there, and later come back to process "t-i". While that look-ahead is not implemented yet, this adds a flag bit, CE_UNPACKED, to mark the entries in the index that has already been processed. o->pos pointer has been renamed to o->cache_bottom and it points at the first entry that may still need to be processed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-26Merge branch 'jc/1.7.0-diff-whitespace-only-status'Junio C Hamano1-2/+2
* jc/1.7.0-diff-whitespace-only-status: diff.c: fix typoes in comments Make test case number unique diff: Rename QUIET internal option to QUICK diff: change semantics of "ignore whitespace" options Conflicts: diff.h
2009-10-11diff-lib.c: fix misleading comments on oneway_diff()Junio C Hamano1-1/+1
20a16eb (unpack_trees(): fix diff-index regression., 2008-03-10) adjusted diff-index to the new world order since 34110cd (Make 'unpack_trees()' have a separate source and destination index, 2008-03-06). Callbacks are expected to return anything non-negative as "success", and instead of reporting how many index entries they have processed, they are expected to advance o->pos themselves. The code did so, but a stale comment was left behind. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28Merge branch 'jc/shortstatus'Junio C Hamano1-20/+2
* jc/shortstatus: git commit --dry-run -v: show diff in color when asked Documentation/git-commit.txt: describe --dry-run wt-status: collect untracked files in a separate "collect" phase Make git_status_config() file scope static to builtin-commit.c wt-status: move wt_status_colors[] into wt_status structure wt-status: move many global settings to wt_status structure commit: --dry-run status: show worktree status of conflicted paths separately wt-status.c: rework the way changes to the index and work tree are summarized diff-index: keep the original index intact diff-index: report unmerged new entries
2009-08-23Teach Git to respect skip-worktree bit (reading part)Nguyễn Thái Ngọc Duy1-2/+3
grep: turn on --cached for files that is marked skip-worktree ls-files: do not check for deleted file that is marked skip-worktree update-index: ignore update request if it's skip-worktree, while still allows removing diff*: skip worktree version Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-11Prevent diff machinery from examining assume-unchanged entries on worktreeNguyễn Thái Ngọc Duy1-2/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-05diff-index: keep the original index intactJunio C Hamano1-18/+0
When comparing the index and a tree, we used to read the contents of the tree into stage #1 of the index and compared them with stage #0. In order not to lose sight of entries originally unmerged in the index, we hoisted them to stage #3 before reading the tree. Commit d1f2d7e (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19) changed all this. These days, we instead use unpack_trees() API to traverse the tree and compare the contents with the index, without modifying the index at all. There is no reason to hoist the unmerged entries to stage #3 anymore. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-05diff-index: report unmerged new entriesJunio C Hamano1-2/+2
Since an earlier change to diff-index by d1f2d7e (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19), we stopped reporting an unmerged path that does not exist in the tree, but we should. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29diff: Rename QUIET internal option to QUICKJunio C Hamano1-2/+2
The option "QUIET" primarily meant "find if we have _any_ difference as quick as possible and report", which means we often do not even have to look at blobs if we know the trees are different by looking at the higher level (e.g. "diff-tree A B"). As a side effect, because there is no point showing one change that we happened to have found first, it also enables NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet. Rename the internal option to QUICK to reflect this better; it also makes grepping the source tree much easier, as there are other kinds of QUIET option everywhere. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-20Merge branch 'jc/cache-tree'Junio C Hamano1-0/+3
* jc/cache-tree: Avoid "diff-index --cached" optimization under --find-copies-harder Optimize "diff-index --cached" using cache-tree t4007: modernize the style cache-tree.c::cache_tree_find(): simplify internal API write-tree --ignore-cache-tree
2009-05-25Avoid "diff-index --cached" optimization under --find-copies-harderJunio C Hamano1-2/+3
When find-copies-harder is in effect, the diff frontends are expected to feed all paths, not just changed paths, to the diffcore, so that copy sources can be picked up. In such a case, not descending into subtrees using the cache-tree information is simply wrong. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25Optimize "diff-index --cached" using cache-treeJunio C Hamano1-0/+2
When running "diff-index --cached" after making a change to only a small portion of the index, there is no point unpacking unchanged subtrees into the index recursively, only to find that all entries match anyway. Tweak unpack_trees() logic that is used to read in the tree object to catch the case where the tree entry we are looking at matches the index as a whole by looking at the cache-tree. As an exercise, after modifying a few paths in the kernel tree, here are a few numbers on my Athlon 64X2 3800+: (without patch, hot cache) $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+9407minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2446minor)pagefaults 0swaps Cold cache numbers are very impressive, but it does not matter very much in practice: (without patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time git diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k 247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps (with patch, cold cache) $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches' $ /usr/bin/time ../git.git/git-diff --cached --raw :100644 100644 b57e1f5... e69de29... M Makefile :100644 000000 8c86b72... 0000000... D arch/x86/Makefile :000000 100644 0000000... e69de29... A arche 0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k 18440inputs+0outputs (79major+2369minor)pagefaults 0swaps This of course helps "git status" as well. (without patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+10970minor)pagefaults 0swaps (with patch, hot cache) $ /usr/bin/time ../git.git/git-status >/dev/null 0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+5336outputs (0major+3921minor)pagefaults 0swaps Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-23Merge branch 'lt/maint-diff-reduce-lstat'Junio C Hamano1-1/+1
* lt/maint-diff-reduce-lstat: Teach 'git checkout' to preload the index contents Avoid unnecessary 'lstat()' calls in 'get_stat_data()'
2009-05-09Avoid unnecessary 'lstat()' calls in 'get_stat_data()'Linus Torvalds1-1/+1
When we ask get_stat_data() to get the mode and size of an index entry, we can avoid the lstat() call if we have marked the index entry as being uptodate due to earlier lstat() calls. This avoids a lot of unnecessary lstat() calls in eg 'git checkout', where the last phase shows the differences to the working tree (requiring a diff), but earlier phases have already verified the index. On the kernel repo (with a fast machine and everything cached), this changes timings of a nul 'git checkout' from - Before (best of ten): 0.14user 0.05system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+13237minor)pagefaults 0swaps - After 0.11user 0.03system 0:00.15elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+13235minor)pagefaults 0swaps so it can obviously be noticeable, although equally obviously it's not a show-stopper on this particular machine. The difference is likely larger on slower machines, or with operating systems that don't do as good a job of name caching. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-17Merge branch 'kb/checkout-optim'Junio C Hamano1-1/+1
* kb/checkout-optim: Revert "lstat_cache(): print a warning if doing ping-pong between cache types" checkout bugfix: use stat.mtime instead of stat.ctime in two places Makefile: Set compiler switch for USE_NSEC Create USE_ST_TIMESPEC and turn it on for Darwin Not all systems use st_[cm]tim field for ns resolution file timestamp Record ns-timestamps if possible, but do not use it without USE_NSEC write_index(): update index_state->timestamp after flushing to disk verify_uptodate(): add ce_uptodate(ce) test make USE_NSEC work as expected fix compile error when USE_NSEC is defined check_updates(): effective removal of cache entries marked CE_REMOVE lstat_cache(): print a warning if doing ping-pong between cache types show_patch_diff(): remove a call to fstat() write_entry(): use fstat() instead of lstat() when file is open write_entry(): cleanup of some duplicated code create_directories(): remove some memcpy() and strchr() calls unlink_entry(): introduce schedule_dir_for_removal() lstat_cache(): swap func(length, string) into func(string, length) lstat_cache(): generalise longest_match_lstat_cache() lstat_cache(): small cleanup and optimisation
2009-02-10Generalize and libify index_is_dirty() to index_differs_from(...)Stephan Beyer1-0/+15
index_is_dirty() in builtin-revert.c checks if the index is dirty. This patch generalizes this function to check if the index differs from a revision, i.e. the former index_is_dirty() behavior can now be achieved by index_differs_from("HEAD", 0). The second argument "diff_flags" allows to set further diff option flags like DIFF_OPT_IGNORE_SUBMODULES. See DIFF_OPT_* macros in diff.h for a list. index_differs_from() seems to be useful for more than builtin-revert.c, so it is moved into diff-lib.c and also used in builtin-commit.c. Yet to mention: - "rev.abbrev = 0;" can be safely removed. This has no impact on performance or functioning of neither setup_revisions() nor run_diff_index(). - rev.pending.objects is free()d because this fixes a leak. (Also see 295dd2ad "Fix memory leak in traverse_commit_list") Mentored-by: Daniel Barkalow <barkalow@iabervon.org> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09lstat_cache(): swap func(length, string) into func(string, length)Kjetil Barvik1-1/+1
Swap function argument pair (length, string) into (string, length) to conform with the commonly used order inside the GIT source code. Also, add a note about this fact into the coding guidelines. Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11Cleanup of unused symcache variable inside diff-lib.cKjetil Barvik1-29/+11
Commit c40641b77b0274186fd1b327d5dc3246f814aaaf, 'Optimize symlink/directory detection' by Linus Torvalds, removed the 'char *symcache' parameter to the has_symlink_leading_path() function. This made all variables currently named 'symcache' inside diff-lib.c unnecessary. This also let us throw away the 'struct oneway_unpack_data', and instead directly use the 'struct rev_info *revs' member, which was the only member left after removal of the 'symcache[] array' member. The 'struct oneway_unpack_data' was introduced by the following commit: 948dd346 "diff-files: careful when inspecting work tree items" Impact: cleanup PATH_MAX bytes less memory stack usage in some cases Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30diff: vary default prefix depending on what are comparedJunio C Hamano1-0/+3
With a new configuration "diff.mnemonicprefix", "git diff" shows the differences between various combinations of preimage and postimage trees with prefixes different from the standard "a/" and "b/". Hopefully this will make the distinction stand out for some people. "git diff" compares the (i)ndex and the (w)ork tree; "git diff HEAD" compares a (c)ommit and the (w)ork tree; "git diff --cached" compares a (c)ommit and the (i)ndex; "git-diff HEAD:file1 file2" compares an (o)bject and a (w)ork tree entity; "git diff --no-index a b" compares two non-git things (1) and (2). Because these mnemonics now have meanings, they are swapped when reverse diff is in effect and this feature is enabled. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-16Fix buffer overflow in git diffDmitry Potapov1-4/+4
If PATH_MAX on your system is smaller than a path stored, it may cause buffer overflow and stack corruption in diff_addremove() and diff_change() functions when running git-diff Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-24"git diff": do not ignore index without --no-indexJunio C Hamano1-323/+0
Even if "foo" and/or "bar" does not exist in index, "git diff foo bar" should not change behaviour drastically from "git diff foo bar baz" or "git diff foo". A feature that "sometimes works and is handy" is an unreliable cute hack. "git diff foo bar" outside a git repository continues to work as a more colourful alternative to "diff -u" as before. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10Optimize symlink/directory detectionLinus Torvalds1-5/+5
This is the base for making symlink detection in the middle fo a pathname saner and (much) more efficient. Under various loads, we want to verify that the full path leading up to a filename is a real directory tree, and that when we successfully do an 'lstat()' on a filename, we don't get a false positive due to a symlink in the middle of the path that git should have seen as a symlink, not as a normal path component. The 'has_symlink_leading_path()' function already did this, and cached a single level of symlink information, but didn't cache the _lack_ of a symlink, so the normal behaviour was actually the wrong way around, and we ended up doing an 'lstat()' on each path component to check that it was a real directory. This caches the last detected full directory and symlink entries, and speeds up especially deep directory structures a lot by avoiding to lstat() all the directories leading up to each entry in the index. [ This can - and should - probably be extended upon so that we eventually never do a bare 'lstat()' on any path entries at *all* when checking the index, but always check the full path carefully. Right now we do not generally check the whole path for all our normal quick index revalidation. We should also make sure that we're careful about all the invalidation, ie when we remove a link and replace it by a directory we should invalidate the symlink cache if it matches (and vice versa for the directory cache). But regardless, the basic function needs to be sane to do that. The old 'has_symlink_leading_path()' was not capable enough - or indeed the code readable enough - to really do that sanely. So I'm pushing this as not just an optimization, but as a base for further work. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10Merge branch 'py/diff-submodule'Junio C Hamano1-7/+26
* py/diff-submodule: is_racy_timestamp(): do not check timestamp for gitlinks diff-lib.c: rename check_work_tree_entity() diff: a submodule not checked out is not modified Add t7506 to test submodule related functions for git-status t4027: test diff for submodule with empty directory
2008-05-05Merge branch 'jc/lstat'Junio C Hamano1-2/+5
* jc/lstat: diff-files: mark an index entry we know is up-to-date as such write_index(): optimize ce_smudge_racily_clean_entry() calls with CE_UPTODATE
2008-05-04diff-lib.c: rename check_work_tree_entity()Junio C Hamano1-4/+4
The function is about checking for removed work tree item, so name it accordingly to avoid future confusion. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04diff: a submodule not checked out is not modifiedJunio C Hamano1-3/+22
948dd34 (diff-index: careful when inspecting work tree items, 2008-03-30) made the work tree check careful not to be fooled by a new directory that exists at a place the index expects a blob. For such a change to be a typechange from blob to submodule, the new directory has to be a repository. However, if the index expects a submodule there, we should not insist the work tree entity to be a repository --- a simple directory that is not a full fledged repository (even an empty directory would do) should be considered an unmodified subproject, because that is how a superproject with a submodule is checked out sparsely by default. This makes the function check_work_tree_entity() even more careful not to report a submodule that is not checked out as removed. It fixes the recently added test in t4027. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-29git-svn: detect and fail gracefully when dcommitting to a voidMatthieu Moy1-0/+3
The command git svn clone (URL of an empty SVN repo here) works, creates an empty git repository. I can perform the initial commit there, but then, "git svn dcommit" says : Use of uninitialized value in concatenation (.) or string at .../git-svn line 414. Committing to ... Unable to determine upstream SVN information from HEAD history I guess a correct management of the initial commit in git-svn would be hard to implement, but at least, the error message can be improved. First step is something like the patch below, and better would be for "git svn clone" to warn that it won't be able to do much with the cloned repo. Acked-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-12diff-files: mark an index entry we know is up-to-date as suchJunio C Hamano1-2/+5
This does not make any difference when running diff-files alone, but if you internally run run_diff_files() and then run other operations further on the index, we do not have to run lstat(2) again on entries we already have checked. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-30diff-files: careful when inspecting work tree itemsJunio C Hamano1-6/+11
This fixes the same breakage in diff-files. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-30diff-index: careful when inspecting work tree itemsJunio C Hamano1-14/+55
Earlier, if you changed a staged path into a directory in the work tree, we happily ran lstat(2) on it and found that it exists, and declared that the user changed it to a gitlink. This is wrong for two reasons: (1) It may be a directory, but it may not be a submodule, and in the latter case, the change we need to report is "the blob at the path has disappeared". We need to check with resolve_gitlink_ref() to be consistent with what "git add" and "git update-index --add" does. (2) lstat(2) may have succeeded only because a leading component of the path was turned into a symbolic link that points at something that exists in the work tree. In such a case, the path itself does not exist anymore, as far as the index is concerned. This fixes these breakages in diff-index that the previous patch has exposed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-10unpack_trees(): fix diff-index regression.Linus Torvalds1-0/+18
When skip_unmerged option is not given, unpack_trees() should not just skip unmerged cache entries but keep them in the result for the caller to sort them out. For callers other than diff-index, the incoming index should never be unmerged, but diff-index is a special case caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09Make 'unpack_trees()' have a separate source and destination indexLinus Torvalds1-41/+8
We will always unpack into our own internal index, but we will take the source from wherever specified, and we will optionally write the result to a specified index (optionally, because not everybody even _wants_ any result: the index diffing really wants to just walk the tree and index in parallel). This ends up removing a fair number more lines than it adds, for the simple reason that we can now skip all the crud that tried to be oh-so-careful about maintaining our position in the index as we were traversing and modifying it. Since we don't actually modify the source index any more, we can just update the 'o->pos' pointer without worrying about whether an index entry got removed or replaced or added to. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09Make 'unpack_trees()' take the index to work on as an argumentLinus Torvalds1-0/+2
This is just a very mechanical conversion, and makes everybody set it to '&the_index' before calling, but at least it makes it more explicit where we work with the index. The next stage would be to split that index usage up into a 'source' and a 'destination' index, so that we can unpack into a different index than we started out from. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-02diff-lib.c: constness strengtheningJunio C Hamano1-7/+6
The internal implementation of diff-index codepath used to use non const pointer to pass sha1 around, but it did not have to. With this, we can also lose the private no_sha1[] array, as we can use the public null_sha1[] array that exists exactly for the same purpose. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-09Allow callers of unpack_trees() to handle failureDaniel Barkalow1-2/+4
Return an error from unpack_trees() instead of calling die(), and exit with an error in read-tree, builtin-commit, and diff-lib. merge-recursive already expected an error return from unpack_trees, so it doesn't need to be changed. The merge function can return negative to abort. This will be used in builtin-checkout -m. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-01-21Also use unpack_trees() in do_diff_cache()Johannes Schindelin1-79/+13
As in run_diff_index(), we call unpack_trees() with the oneway_diff() function in do_diff_cache() now. This makes the function diff_cache() obsolete. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21Make run_diff_index() use unpack_trees(), not read_tree()Linus Torvalds1-14/+137
A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with *both* the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is *much* more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21Make on-disk index representation separate from in-core oneLinus Torvalds1-18/+14
This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-26Use is_absolute_path() in diff-lib.c, lockfile.c, setup.c, trace.cSteffen Prohaska1-1/+1
Using the helper function to test for absolute paths makes porting easier. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18Merge branch 'ph/diffopts'Junio C Hamano1-11/+13
* ph/diffopts: Reorder diff_opt_parse options more logically per topics. Make the diff_options bitfields be an unsigned with explicit masks. Use OPT_BIT in builtin-pack-refs Use OPT_BIT in builtin-for-each-ref Use OPT_SET_INT and OPT_BIT in builtin-branch parse-options new features.
2007-11-11Make the diff_options bitfields be an unsigned with explicit masks.Pierre Habouzit1-11/+13
reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10git-add: make the entry stat-clean after re-adding the same contentsJunio C Hamano1-1/+3
Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10ce_match_stat, run_diff_files: use symbolic constants for readabilityJunio C Hamano1-7/+9
ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-14diff --no-index: do not forget to run diff_setup_done()Junio C Hamano1-0/+2
Code inspection by Linus found this. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14diff: squelch empty diffs even moreRené Scharfe1-2/+6
When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-07diff-lib.c: don't strdup twiceRené Scharfe1-1/+1
The static function read_directory in diff-lib.c is only ever called with struct path_list lists with .strdup_paths turned on, i.e. path_list_insert will strdup the paths for us (again). Let's take advantage of that and stop doing it twice. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano1-1/+1
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-04-21Merge branch 'lt/gitlink'Junio C Hamano1-12/+3
* lt/gitlink: Tests for core subproject support Expose subprojects as special files to "git diff" machinery Fix some "git ls-files -o" fallout from gitlinks Teach "git-read-tree -u" to check out submodules as a directory Teach git list-objects logic to not follow gitlinks Fix gitlink index entry filesystem matching Teach "git-read-tree -u" to check out submodules as a directory Teach git list-objects logic not to follow gitlinks Don't show gitlink directories when we want "other" files Teach git-update-index about gitlinks Teach directory traversal about subprojects Fix thinko in subproject entry sorting Teach core object handling functions about gitlinks Teach "fsck" not to follow subproject links Add "S_IFDIRLNK" file mode infrastructure for git links Add 'resolve_gitlink_ref()' helper function Avoid overflowing name buffer in deep directory structures diff-lib: use ce_mode_from_stat() rather than messing with modes manually
2007-04-13Do not default to --no-index when given two directories.Junio C Hamano1-10/+26
git-diff -- a/ b/ always defaulted to --no-index, primarily because the function is_in_index() was implemented quite incorrectly. Noticed by Patrick Maaß and Simon Schubert independently, initial patch was provided by Patrick but I fixed it differently. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-09diff-lib: use ce_mode_from_stat() rather than messing with modes manuallyLinus Torvalds1-12/+3
The diff helpers used to do the magic mode canonicalization and all the other special mode handling by hand ("trust executable bit" and "has symlink support" handling). That's bogus. Use "ce_mode_from_stat()" that does this all for us. This is also going to be required when we add support for links to other git repositories. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14Teach --quiet to diff backends.Junio C Hamano1-0/+6
This teaches git-diff-files, git-diff-index and git-diff-tree backends to exit early under --quiet option. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14Allow git-diff exit with codes similar to diff(1)Alex Riesen1-1/+4
This introduces a new command-line option: --exit-code. The diff programs will return 1 for differences, return 0 for equality, and something else for errors. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-10Merge branch 'js/diff-ni'Junio C Hamano1-15/+27
* js/diff-ni: Get rid of the dependency to GNU diff in the tests diff --no-index: support /dev/null as filename diff-ni: fix the diff with standard input diff: support reading a file from stdin via "-"
2007-03-04Merge branch 'js/symlink'Junio C Hamano1-0/+3
* js/symlink: Tell multi-parent diff about core.symlinks. Handle core.symlinks=false case in merge-recursive. Add core.symlinks to mark filesystems that do not support symbolic links.
2007-03-04diff --no-index: support /dev/null as filenameJohannes Schindelin1-17/+17
This allows us to create "new file" and "delete file" patches. It also cleans up the code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-04diff-ni: fix the diff with standard inputJunio C Hamano1-5/+11
The earlier commit to read from stdin was full of problems, and this corrects them. - The mode bits should have been set to satisify S_ISREG(); we forgot to the S_IFREG bits and hardcoded 0644; - We did not give escape hatch to name a path whose name is really "-". Allow users to say "./-" for that; - Use of xread() was not prepared to see short read (e.g. reading from tty) nor handing read errors. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03diff: support reading a file from stdin via "-"Johannes Schindelin1-5/+11
This allows you to say echo Hello World | git diff x - to compare the contents of file "x" with the line "Hello World". This automatically switches to --no-index mode. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03diff-ni: allow running from a subdirectory.Junio C Hamano1-1/+13
When run from a subdirectory of a repository, the command forgot to adjust paths given to it with prefix. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02Add core.symlinks to mark filesystems that do not support symbolic links.Johannes Sixt1-0/+3
Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28diff: make more cases implicit --no-indexJohannes Schindelin1-0/+54
When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-26diff --no-index: also imitate the exit status of diff(1)Johannes Schindelin1-3/+8
diff sets the exit status to 0 when no changes were found, to 1 when changes were found, and 2 means error. We imitate this to be able to use "git diff" in the test scripts. (Actually, keeping in line with the rest of git, -1 is returned on error, which corresponds to an exit status 255). To find out if the diff is not empty, a member called "found_changes" was introduced in struct diff_options, which is set in builtin_diff() and fn_out_consume(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-26Merge branch 'master' into js/diff-niJunio C Hamano1-3/+13
* master: (201 commits) Documentation: link in 1.5.0.2 material to the top documentation page. Documentation: document remote.<name>.tagopt GIT 1.5.0.2 git-remote: support remotes with a dot in the name Documentation: describe "-f/-t/-m" options to "git-remote add" diff --cc: fix display of symlink conflicts during a merge. merge-recursive: fix longstanding bug in merging symlinks merge-index: fix longstanding bug in merging symlinks diff --cached: give more sensible error message when HEAD is yet to be created. Update tests to use test-chmtime Add test-chmtime: a utility to change mtime on files Add Release Notes to prepare for 1.5.0.2 Allow arbitrary number of arguments to git-pack-objects rerere: do not deal with symlinks. rerere: do not skip two conflicted paths next to each other. Don't modify CREDITS-FILE if it hasn't changed. diff-patch: Avoid emitting double-slashes in textual patch. Reword git-am 3-way fallback failure message. Limit filename for format-patch core.legacyheaders: Use the description used in RelNotes-1.5.0 ...
2007-02-25diff --cc: fix display of symlink conflicts during a merge.Junio C Hamano1-3/+13
"git-diff-files --cc" to show conflicts during merge did not pass the correct mode information for the working tree down, and showed bogus combined diff. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-25Fix typo: do not show name1 when name2 failsJohannes Schindelin1-1/+1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-24Evil Merge branch 'jc/status' (early part) into js/diff-niJunio C Hamano1-9/+5
* 'jc/status' (early part): run_diff_{files,index}(): update calling convention. update-index: do not die too early in a read-only repository. git-status: do not be totally useless in a read-only repository. This is to resolve semantic conflict (which is not textual) that changes the calling convention of run_diff_files() early.
2007-02-22Teach git-diff-files the new option `--no-index`Johannes Schindelin1-0/+207
With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-22run_diff_{files,index}(): update calling convention.Junio C Hamano1-9/+1
They used to open and read index themselves, but they now expect their callers to do so. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-16Do not take mode bits from index after type change.Junio C Hamano1-3/+1
When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-05git-blame: no rev means start from the working tree file.Junio C Hamano1-1/+43
Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06diff-index --cached --raw: show tree entry on the LHS for unmerged entries.Junio C Hamano1-3/+6
This updates the way diffcore represents an unmerged pair somewhat. It used to be that entries with mode=0 on both sides were used to represent an unmerged pair, but now it has an explicit flag. This is to allow diff-index --cached to report the entry from the tree when the path is unmerged in the index. This is used in updating "git reset <tree> -- <path>" to restore absense of the path in the index from the tree. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-05diff-index --cc shows a 3-way diff between HEAD, index and working tree.Paul Mackerras1-0/+25
This implements a 3-way diff between the HEAD commit, the state in the index, and the working directory. This is like the n-way diff for a merge, and uses much of the same code. It is invoked with the -c flag to git-diff-index, which it already accepted and did nothing with. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23Convert memset(hash,0,20) to hashclr(hash).Junio C Hamano1-1/+1
In the same spirit as hashcmp() and hashcpy(). Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23Convert memcpy(a,b,20) to hashcpy(a,b).Shawn Pearce1-2/+1
This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparsion. A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void*. This is a reasonable tradeoff as most call sites already use unsigned char* and the existing hashcmp is also declared to be unsigned char*. [jc: Splitted the patch to "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.David Rientjes1-1/+1
Introduces global inline: hashcmp(const unsigned char *sha1, const unsigned char *sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-19Merge branch 'lt/objlist' into nextJunio C Hamano1-2/+2
* lt/objlist: Add "named object array" concept xdiff: minor changes to match libxdiff-0.21 fix rfc2047 formatter. Fix t8001-annotate and t8002-blame for ActiveState Perl Add specialized object allocator
2006-06-19Add "named object array" conceptLinus Torvalds1-2/+2
We've had this notion of a "object_list" for a long time, which eventually grew a "name" member because some users (notably git-rev-list) wanted to name each object as it is generated. That object_list is great for some things, but it isn't all that wonderful for others, and the "name" member is generally not used by everybody. This patch splits the users of the object_list array up into two: the traditional list users, who want the list-like format, and who don't actually use or want the name. And another class of users that really used the list as an extensible array, and generally wanted to name the objects. The patch is fairly straightforward, but it's also biggish. Most of it really just cleans things up: switching the revision parsing and listing over to the array makes things like the builtin-diff usage much simpler (we now see exactly how many members the array has, and we don't get the objects reversed from the order they were on the command line). One of the main reasons for doing this at all is that the malloc overhead of the simple object list was actually pretty high, and the array is just a lot denser. So this patch brings down memory usage by git-rev-list by just under 3% (on top of all the other memory use optimizations) on the mozilla archive. It does add more lines than it removes, and more importantly, it adds a whole new infrastructure for maintaining lists of objects, but on the other hand, the new dynamic array code is pretty obvious. The change to builtin-diff-tree.c shows a fairly good example of why an array interface is sometimes more natural, and just much simpler for everybody. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-18Don't instantiate structures with FAMs.Florian Forster1-19/+22
Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22Libified diff-index: backward compatibility fix.Junio C Hamano1-1/+9
"diff-index -m" does not mean "do not ignore merges", but means "pretend missing files match the index". The previous round tried to address this, but failed because setup_revisions() ate "-m" flag before the caller had a chance to intervene. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22Libify diff-index.Junio C Hamano1-0/+203
The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22Libify diff-files.Junio C Hamano1-1762/+100
This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-19diff --stat: do not drop rename information.Junio C Hamano1-9/+68
When a verbatim rename or copy is detected, we did not show anything on the "diff --stat" for the filepair. This makes it to show the rename information. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-19diff: move diff.c to diff-lib.c to make room.Junio C Hamano1-0/+1736
Now I am not doing any real "git-diff in C" yet, but this would help before doing so. Signed-off-by: Junio C Hamano <junkio@cox.net>