path: root/tree-diff.c
Age | Commit message | Author | Files | Lines
2017-11-01 | diff: make struct diff_flags members lowercase | Brandon Williams | 1 | -8/+8
Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase.

This conversion is done using the following semantic patch:

    @@
    expression E;
    @@
    - E.RECURSIVE
    + E.recursive

    @@
    expression E;
    @@
    - E.TREE_IN_RECURSIVE
    + E.tree_in_recursive

    @@
    expression E;
    @@
    - E.BINARY
    + E.binary

    @@
    expression E;
    @@
    - E.TEXT
    + E.text

    @@
    expression E;
    @@
    - E.FULL_INDEX
    + E.full_index

    @@
    expression E;
    @@
    - E.SILENT_ON_REMOVE
    + E.silent_on_remove

    @@
    expression E;
    @@
    - E.FIND_COPIES_HARDER
    + E.find_copies_harder

    @@
    expression E;
    @@
    - E.FOLLOW_RENAMES
    + E.follow_renames

    @@
    expression E;
    @@
    - E.RENAME_EMPTY
    + E.rename_empty

    @@
    expression E;
    @@
    - E.HAS_CHANGES
    + E.has_changes

    @@
    expression E;
    @@
    - E.QUICK
    + E.quick

    @@
    expression E;
    @@
    - E.NO_INDEX
    + E.no_index

    @@
    expression E;
    @@
    - E.ALLOW_EXTERNAL
    + E.allow_external

    @@
    expression E;
    @@
    - E.EXIT_WITH_STATUS
    + E.exit_with_status

    @@
    expression E;
    @@
    - E.REVERSE_DIFF
    + E.reverse_diff

    @@
    expression E;
    @@
    - E.CHECK_FAILED
    + E.check_failed

    @@
    expression E;
    @@
    - E.RELATIVE_NAME
    + E.relative_name

    @@
    expression E;
    @@
    - E.IGNORE_SUBMODULES
    + E.ignore_submodules

    @@
    expression E;
    @@
    - E.DIRSTAT_CUMULATIVE
    + E.dirstat_cumulative

    @@
    expression E;
    @@
    - E.DIRSTAT_BY_FILE
    + E.dirstat_by_file

    @@
    expression E;
    @@
    - E.ALLOW_TEXTCONV
    + E.allow_textconv

    @@
    expression E;
    @@
    - E.TEXTCONV_SET_VIA_CMDLINE
    + E.textconv_set_via_cmdline

    @@
    expression E;
    @@
    - E.DIFF_FROM_CONTENTS
    + E.diff_from_contents

    @@
    expression E;
    @@
    - E.DIRTY_SUBMODULES
    + E.dirty_submodules

    @@
    expression E;
    @@
    - E.IGNORE_UNTRACKED_IN_SUBMODULES
    + E.ignore_untracked_in_submodules

    @@
    expression E;
    @@
    - E.IGNORE_DIRTY_SUBMODULES
    + E.ignore_dirty_submodules

    @@
    expression E;
    @@
    - E.OVERRIDE_SUBMODULE_CONFIG
    + E.override_submodule_config

    @@
    expression E;
    @@
    - E.DIRSTAT_BY_LINE
    + E.dirstat_by_line

    @@
    expression E;
    @@
    - E.FUNCCONTEXT
    + E.funccontext

    @@
    expression E;
    @@
    - E.PICKAXE_IGNORE_CASE
    + E.pickaxe_ignore_case

    @@
    expression E;
    @@
    - E.DEFAULT_FOLLOW_RENAMES
    + E.default_follow_renames

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 | diff: remove DIFF_OPT_SET macro | Brandon Williams | 1 | -2/+2
Remove the `DIFF_OPT_SET` macro and instead set the flags directly. This conversion is done using the following semantic patch:

    @@
    expression E;
    identifier fld;
    @@
    - DIFF_OPT_SET(&E, fld)
    + E.flags.fld = 1

    @@
    type T;
    T *ptr;
    identifier fld;
    @@
    - DIFF_OPT_SET(ptr, fld)
    + ptr->flags.fld = 1

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 | diff: remove DIFF_OPT_TST macro | Brandon Williams | 1 | -6/+6
Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch:

    @@
    expression E;
    identifier fld;
    @@
    - DIFF_OPT_TST(&E, fld)
    + E.flags.fld

    @@
    type T;
    T *ptr;
    identifier fld;
    @@
    - DIFF_OPT_TST(ptr, fld)
    + ptr->flags.fld

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
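The macro removals in these entries amount to replacing macro-mediated flag access with plain struct member access. A minimal sketch of the resulting style follows; the two-member `struct diff_flags` and the helper names `mark_changed()` and `can_quit_early()` are trimmed-down stand-ins invented for this illustration, not the real declarations from diff.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for git's struct diff_flags and
 * struct diff_options; the real structs carry many more members. */
struct diff_flags {
	unsigned recursive : 1;
	unsigned has_changes : 1;
};

struct diff_options {
	struct diff_flags flags;
};

/* After the conversion, a flag is set through plain member access. */
static void mark_changed(struct diff_options *opt)
{
	assert(opt != NULL);
	opt->flags.has_changes = 1;	/* was: DIFF_OPT_SET(opt, HAS_CHANGES) */
}

/* Likewise, a flag is tested by reading the member directly. */
static int can_quit_early(const struct diff_options *opt)
{
	assert(opt != NULL);
	return opt->flags.has_changes;	/* was: DIFF_OPT_TST(opt, HAS_CHANGES) */
}
```

With a zero-initialized `struct diff_options`, `can_quit_early()` returns 0 until `mark_changed()` has been called on it.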
2017-08-14 | tree-walk: convert fill_tree_descriptor() to object_id | René Scharfe | 1 | -3/+2
All callers of fill_tree_descriptor() have been converted to object_id already, so convert that function as well. As a nice side-effect we get rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already does them for us. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11 | Merge branch 'bw/object-id' | Junio C Hamano | 1 | -2/+3
Conversion from uchar[20] to struct object_id continues.

* bw/object-id:
  receive-pack: don't access hash of NULL object_id pointer
  notes: don't access hash of NULL object_id pointer
  tree-diff: don't access hash of NULL object_id pointer
2017-07-17 | tree-diff: don't access hash of NULL object_id pointer | René Scharfe | 1 | -2/+3
The object_id pointers can be NULL for invalid entries. Don't try to dereference them and pass NULL along to fill_tree_descriptor() instead, which handles them just fine. Found with Clang's UBSan. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
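The fix described here is a guard on the pointer before reaching for its `hash` member. A minimal sketch, assuming a simplified `struct object_id` and a hypothetical helper name `hash_or_null()` (neither is copied from git's sources):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for git's struct object_id. */
struct object_id {
	unsigned char hash[20];
};

/*
 * Hypothetical helper: return the raw hash of a valid id, or NULL for an
 * invalid (NULL) entry, without ever evaluating oid->hash on a NULL
 * pointer. A callee that accepts NULL can then treat it as "no tree".
 */
static const unsigned char *hash_or_null(const struct object_id *oid)
{
	return oid ? oid->hash : NULL;
}
```

Evaluating `oid->hash` on a NULL `oid` is undefined behaviour even when the resulting address is never read, which is the kind of thing UBSan reports.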
2017-06-24 | Merge branch 'ab/free-and-null' | Junio C Hamano | 1 | -4/+2
A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
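The pattern named in this merge, `free(ptr); ptr = NULL`, can be sketched as a statement-like macro. The macro body below follows the description in the entry; `struct path_buf` and `path_buf_release()` are hypothetical names used only to show a call site.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the wrapper described above: free(ptr); ptr = NULL. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical owner of a heap buffer, for illustration only. */
struct path_buf {
	char *data;
	size_t len;
};

/* Release the buffer and leave the struct in a safely reusable state. */
static void path_buf_release(struct path_buf *b)
{
	assert(b != NULL);
	FREE_AND_NULL(b->data);	/* was: free(b->data); b->data = NULL; */
	b->len = 0;
}
```

Because the pointer is reset, calling `path_buf_release()` a second time is harmless: `free(NULL)` is a no-op.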
2017-06-16 | coccinelle: make use of the "type" FREE_AND_NULL() rule | Ævar Arnfjörð Bjarmason | 1 | -4/+2
Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05 | tree-diff: convert path_appendnew to object_id | Brandon Williams | 1 | -3/+3
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05 | tree-diff: convert diff_tree_paths to struct object_id | Brandon Williams | 1 | -31/+32
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05 | tree-diff: convert try_to_follow_renames to struct object_id | Brandon Williams | 1 | -3/+5
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05 | diff-tree: convert diff_tree_sha1 to struct object_id | Brandon Williams | 1 | -5/+7
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 | tree-diff: convert diff_root_tree_sha1 to struct object_id | Brandon Williams | 1 | -2/+2
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 | diff: convert diff_change to struct object_id | Brandon Williams | 1 | -1/+1
Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 | diff: convert diff_addremove to struct object_id | Brandon Williams | 1 | -4/+4
Convert diff_addremove to take a struct object_id. In addition convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 | Merge branch 'jk/avoid-unbounded-alloca' | Junio C Hamano | 1 | -6/+16
* jk/avoid-unbounded-alloca:
  tree-diff: avoid alloca for large allocations
2016-06-07 | tree-diff: avoid alloca for large allocations | Jeff King | 1 | -6/+16
Commit 72441af (tree-diff: rework diff_tree() to generate diffs for multiparent cases as well, 2014-04-07) introduced the use of alloca so that the common cases of commits with 1 or 2 parents would not be adversely affected by going through the multi-parent code. However, our xalloca is not ideal when the number of parents grows very large:

  1. If the requested size is too large for our stack, alloca() has no way to tell us, and we simply segfault while trying to access the memory.

  2. It does not use our usual memory_limit_check() logic.

I measured, and alloca is indeed buying us a very small speedup over xmalloc()/free(). So we'd want to keep something like it.

This patch simply puts a conditional in place at each callsite: we use alloca for common known-small numbers of parents, and otherwise use the heap. We are technically still vulnerable to (1), but no more so than if we simply put a few dozen bytes on the stack, which we must do all the time anyway. And likewise, we technically miss a memory limit check if it is tiny, but such a limit is pointless.

An alternative to this would be to implement something like:

    struct tree *tp, tp_fallback[2];
    if (nparent <= ARRAY_SIZE(tp_fallback))
        tp = tp_fallback;
    else
        ALLOC_ARRAY(tp, nparent);
    ...
    if (tp != tp_fallback)
        free(tp);

That would let us drop our xalloca() portability code entirely. But in my measurements, this seemed to perform slightly worse than the xalloca solution.

Note in the example above, and in the patch below, I've used ALLOC_ARRAY() to replace the manual xmalloc(nr * sizeof(*x)). Besides being shorter, this has the bonus that one cannot accidentally overflow a size_t during that computation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
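The fallback-array alternative mentioned in this message can be sketched in portable C as follows. Note that this illustrates the alternative the author decided against, not the xalloca-based conditional the commit adopted; `sum_parent_sizes()` and its `long` payload are hypothetical, and plain `malloc()` with an explicit overflow check stands in for git's ALLOC_ARRAY().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/*
 * Hypothetical worker that needs nparent scratch slots. Small counts use
 * a fixed on-stack array; larger counts fall back to the heap, where an
 * allocation failure can at least be detected.
 */
static long sum_parent_sizes(const long *sizes, size_t nparent)
{
	long fallback[2];
	long *tp;
	long total = 0;
	size_t i;

	assert(sizes != NULL || nparent == 0);
	if (nparent <= ARRAY_SIZE(fallback)) {
		tp = fallback;		/* common case: 1 or 2 parents */
	} else {
		/* Guard the multiplication, as ALLOC_ARRAY() is said to do. */
		if (nparent > SIZE_MAX / sizeof(*tp))
			abort();
		tp = malloc(nparent * sizeof(*tp));
		if (!tp)
			abort();
	}

	for (i = 0; i < nparent; i++)
		tp[i] = sizes[i];
	for (i = 0; i < nparent; i++)
		total += tp[i];

	if (tp != fallback)
		free(tp);
	return total;
}
```

The design trade-off is the one the message spells out: the fixed array removes the need for alloca portability wrappers, at the price of a branch before every release.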
2016-06-02 | pathspec: rename free_pathspec() to clear_pathspec() | Junio C Hamano | 1 | -2/+2
The function takes a pointer to a pathspec structure, and releases the resources held by it, but does not free() the structure itself. Such a function should be called "clear", not "free". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-25 | tree-walk: convert tree_entry_extract() to use struct object_id | brian m. carlson | 1 | -1/+1
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-25 | struct name_entry: use struct object_id instead of unsigned char sha1[20] | brian m. carlson | 1 | -3/+3
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-16 | tree-diff: catch integer overflow in combine_diff_path allocation | Jeff King | 1 | -2/+2
A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
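The size computation described here, a header plus a NUL-terminated pathname plus one slot per parent, can be sketched with overflow-checked `size_t` arithmetic. The helpers below are hypothetical and return an error code on overflow; that behaviour is a choice made for this example so that it can be tested, and it should not be read as the interface of git's st_add helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checked addition: 0 on success, -1 if a + b would wrap. */
static int checked_add(size_t a, size_t b, size_t *out)
{
	assert(out != NULL);
	if (a > SIZE_MAX - b)
		return -1;
	*out = a + b;
	return 0;
}

/* Hypothetical checked multiplication: 0 on success, -1 on wrap. */
static int checked_mul(size_t a, size_t b, size_t *out)
{
	assert(out != NULL);
	if (b && a > SIZE_MAX / b)
		return -1;
	*out = a * b;
	return 0;
}

/*
 * Total bytes for one record with two variable-length tails:
 * header + (path_len + 1 for the NUL) + nparent * parent_size.
 */
static int combined_alloc_size(size_t header, size_t path_len,
			       size_t nparent, size_t parent_size,
			       size_t *out)
{
	size_t parents, path, total;

	if (checked_mul(nparent, parent_size, &parents))
		return -1;
	if (checked_add(path_len, 1, &path))
		return -1;
	if (checked_add(header, path, &total))
		return -1;
	return checked_add(total, parents, out);
}
```

Doing the same sum in an `int` would wrap silently for a sufficiently long pathname and lead to an undersized allocation, which is the failure mode the entry describes.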
2016-02-19 | tree-diff: catch integer overflow in combine_diff_path allocation | Jeff King | 1 | -2/+2
A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13 | diff: convert struct combine_diff_path to object_id | brian m. carlson | 1 | -5/+5
Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 | tree-diff: rework diff_tree() to generate diffs for multiparent cases as well | Kirill Smelkov | 1 | -64/+440
Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces.

In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once.

For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c .

There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker .

That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits.

That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1).

That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition:

    D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)      (w.r.t. paths)

where

    D(A,P1...Pn) is combined diff between commit A, and parents Pi

and

    D(A,Pi) is usual two-tree diff Pi..A

So if any of that D(A,Pi) is huge, treating 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow.

And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem.

The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work.

Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex:

    D(T,P1...Pn) calculation scheme
    -------------------------------

    D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn)  (regarding resulting paths set)

        D(T,Pj)             - diff between T..Pj
        D(T,P1...Pn)        - combined diff from T to parents P1,...,Pn

    We start from all trees, which are sorted, and compare their entries in
    lock-step:

         T     P1       Pn
         -     -        -
        |t|   |p1|     |pn|
        |-|   |--| ... |--|      imin = argmin(p1...pn)
        | |   |  |     |  |
        |-|   |--|     |--|
        |.|   |. |     |. |
         .     .        .
         .     .        .

    at any time there could be 3 cases:

        1)  t < p[imin];
        2)  t > p[imin];
        3)  t = p[imin].

    Schematic deduction of what every case means, and what to do, follows:

    1)  t < p[imin]  ->  ∀j t ∉ Pj  ->  "+t" ∈ D(T,Pj)  ->  D += "+t";  t↓

    2)  t > p[imin]

        2.1) ∃j: pj > p[imin]  ->  "-p[imin]" ∉ D(T,Pj)  ->  D += ø;  ∀ pi=p[imin]  pi↓

        2.2) ∀i  pi = p[imin]  ->  pi ∉ T  ->  "-pi" ∈ D(T,Pi)  ->  D += "-p[imin]";  ∀i pi↓

    3)  t = p[imin]

        3.1) ∃j: pj > p[imin]  ->  "+t" ∈ D(T,Pj)  ->  only pi=p[imin] remains to investigate
        3.2) pi = p[imin]  ->  investigate δ(t,pi)
         |
         |
         v

        3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø  ->

                          ⎧δ(t,pi)  - if pi=p[imin]
                 ->  D += ⎨
                          ⎩"+t"     - if pi>p[imin]

        in any case t↓  ∀ pi=p[imin]  pi↓

~

For comparison, here is how diff_tree() works:

    D(A,B) calculation scheme
    -------------------------

        A     B
        -     -
       |a|   |b|    a < b   ->  a ∉ B   ->   D(A,B) +=  +a    a↓
       |-|   |-|    a > b   ->  b ∉ A   ->   D(A,B) +=  -b    b↓
       | |   | |    a = b   ->  investigate δ(a,b)            a↓ b↓
       |-|   |-|
       |.|   |.|
        .     .
        .     .

~~~~~~~~

This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff

    D(A,B)

is by definition the same as combined diff

    D(A,[B]),

so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff.

What we do is as follows:

1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it.

2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only())

3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES), to work, that exporting has to happen not in bulk, but incrementally, one diff path at a time.

   For such consumers, there is a new callback in diff_options introduced:

       ->pathchange(opt, struct combine_diff_path *)

   which, if set to !NULL, is called for every generated path.

   (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup)

4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above:

   On the start we allocate [nparent] arrays in place of what was earlier just for one parent tree.

   then we just generalize loops, and comparison according to the algorithm.

Some notes(*):

1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem.

2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance.

3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown.

4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure.

Also

- we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it.

  The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources.

  The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal.

~~~~~~~~

Timings for

    # without -c, i.e. testing only nparent=1 case
    `git log --raw --no-abbrev --no-renames`

before and after the patch are as follows:

                navy.git    linux.git v3.10..v3.11

    before      0.611s      1.889s
    after       0.619s      1.907s
    slowdown    1.3%        0.9%

These timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same.

The output also stayed the same.

(*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule.

For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there.

P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with

    `git log --reverse --raw --no-abbrev --no-renames -c`

to extract log of what was created/changed when, as a result building a map

    {}  sha1    ->  in which commit (and date) a content was added

that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it.

That case was my initial motivation for combined diffs speedup.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
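The D(T,P1...Pn) scheme in this message can be illustrated with a small lock-step walk over sorted entry lists. This is a sketch under simplifying assumptions, not the code in tree-diff.c: entries are flat (no recursion into subtrees), names are ordered with `strcmp()`, an `int id` stands in for the sha1 and mode, and all identifiers (`struct entry`, `struct tree`, `combined_paths()`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entry { const char *name; int id; };	/* id stands in for sha1+mode */
struct tree { const struct entry *e; size_t n, pos; };

/* Current entry of a tree, or NULL once the tree is exhausted. */
static const struct entry *cur(const struct tree *t)
{
	return t->pos < t->n ? &t->e[t->pos] : NULL;
}

/* Name order; an exhausted tree (NULL) sorts after every real entry. */
static int entry_cmp(const struct entry *a, const struct entry *b)
{
	if (!a || !b)
		return (a == NULL) - (b == NULL);
	return strcmp(a->name, b->name);
}

/*
 * Collect the names of paths that differ between t and every parent.
 * Returns the number of names written to out (at most cap).
 */
static size_t combined_paths(struct tree *t, struct tree *p, size_t nparent,
			     const char **out, size_t cap)
{
	size_t nout = 0, i;

	assert(nparent >= 1);
	for (;;) {
		const struct entry *te = cur(t);
		const struct entry *pmin = cur(&p[0]);
		size_t nmin = 0;	/* parents sitting at p[imin] */
		int differs = 1;	/* t differs from every parent at p[imin] */
		int cmp;

		for (i = 1; i < nparent; i++)	/* imin = argmin(p1...pn) */
			if (entry_cmp(cur(&p[i]), pmin) < 0)
				pmin = cur(&p[i]);
		if (!te && !pmin)
			break;			/* every tree is exhausted */

		cmp = entry_cmp(te, pmin);
		for (i = 0; i < nparent; i++) {
			const struct entry *pe = cur(&p[i]);

			if (entry_cmp(pe, pmin) != 0)
				continue;	/* pi > p[imin]: not looked at yet */
			nmin++;
			if (cmp == 0 && pe->id == te->id)
				differs = 0;	/* delta(t,pi) is empty */
			if (cmp >= 0)
				p[i].pos++;	/* cases 2 and 3: pi advances */
		}

		/* case 1: "+t"; case 2.2: "-p[imin]"; case 3: all deltas non-empty */
		if (cmp < 0 || (cmp > 0 && nmin == nparent) || (cmp == 0 && differs)) {
			assert(nout < cap);
			out[nout++] = cmp > 0 ? pmin->name : te->name;
		}
		if (cmp <= 0)
			t->pos++;		/* cases 1 and 3: t advances */
	}
	return nout;
}
```

With a single parent the function degenerates to the ordinary two-tree D(A,B) walk, which matches the message's observation that D(A,B) is the same as D(A,[B]). A parent that never had a removed path suppresses the "-p[imin]" entry, as in case 2.1.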
2014-03-27 | tree-diff: reuse base str(buf) memory on sub-tree recursion | Kirill Smelkov | 1 | -19/+19
Instead of allocating it all the time for every subtree in ll_diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all callee just use it in stacking style, without memory allocations.

This should be faster, and for me this change gives the following slight speedups for

    git log --raw --no-abbrev --no-renames --format='%H'

                navy.git    linux.git v3.10..v3.11

    before      0.618s      1.903s
    after       0.611s      1.889s
    speedup     1.1%        0.7%

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
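The "stacking style" reuse of one base buffer can be sketched as: remember the current length, append the component and a '/', recurse, then cut the buffer back to the remembered length. The fixed-size `struct pathbuf`, the `struct dir` shape, and `walk()` are hypothetical stand-ins; git itself uses a growable strbuf here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-capacity stand-in for a growable string buffer. */
struct pathbuf { char buf[256]; size_t len; };

/* Hypothetical in-memory directory shape used to drive the walk. */
struct dir { const char *name; const struct dir *sub; size_t nsub; };

/*
 * Visit d and its subdirectories, recording each full path in out.
 * All levels share one base buffer; no per-level allocation happens.
 */
static size_t walk(const struct dir *d, struct pathbuf *base,
		   char out[][64], size_t nout, size_t cap)
{
	size_t old_len = base->len;	/* remember where this level starts */
	size_t nlen = strlen(d->name);
	size_t i;

	assert(old_len + nlen + 2 <= sizeof(base->buf));
	memcpy(base->buf + old_len, d->name, nlen);
	base->buf[old_len + nlen] = '/';
	base->len = old_len + nlen + 1;
	base->buf[base->len] = '\0';

	assert(nout < cap && base->len < 64);
	memcpy(out[nout++], base->buf, base->len + 1);

	for (i = 0; i < d->nsub; i++)
		nout = walk(&d->sub[i], base, out, nout, cap);

	base->len = old_len;		/* pop: restore the caller's view */
	base->buf[old_len] = '\0';
	return nout;
}
```

After the outermost call returns, the buffer length is back at its starting value, so the same buffer can serve the next top-level walk.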
2014-03-27 | tree-diff: no need to call "full" diff_tree_sha1 from show_path() | Kirill Smelkov | 1 | -2/+6
As described in previous commit, when recursing into sub-trees, we can use lower-level tree walker, since its interface is now sha1 based. The change is ok, because diff_tree_sha1() only invokes ll_diff_tree_sha1(), and also, if base is empty, try_to_follow_renames(). But base is not empty here, as we have added a path and '/' before recursing. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-27 | tree-diff: rework diff_tree interface to be sha1 based | Kirill Smelkov | 1 | -32/+28
In the next commit this will allow reducing intermediate calls, when recursing into subtrees - at that stage we know only subtree sha1, and it is natural for tree walker to start from that phase. For now we do

    diff_tree
        show_path
            diff_tree_sha1
                diff_tree
                    ...

and the change will allow reducing it to

    diff_tree
        show_path
            diff_tree

Also, it will allow omitting the strbuf allocation for each subtree, and just reusing the common strbuf via playing with its len.

The above-mentioned improvements go in the next 2 patches.

The downside is that try_to_follow_renames(), if active, will cause re-reading of 2 initial trees, which was negligible based on my timings, and which is clearly outweighed by the upsides.

NOTE To keep with the current interface and semantics, I needed to rename the function from diff_tree() to diff_tree_sha1(). As diff_tree_sha1() was already used, and the function we are talking about here is its more low-level helper, let's use the convention of prefixing such helpers with "ll_". So the final renaming is

    diff_tree() -> ll_diff_tree_sha1()

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-26 | tree-diff: diff_tree() should now be static | Kirill Smelkov | 1 | -2/+2
We reworked all its users to use the functionality through diff_tree_sha1 variant in recent patches (see "tree-diff: allow diff_tree_sha1 to accept NULL sha1" and what comes next). diff_tree() is now not used outside tree-diff.c - make it static. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-26 | tree-diff: remove special-case diff-emitting code for empty-tree cases | Kirill Smelkov | 1 | -12/+14
While walking trees, we iterate their entries from lowest to highest in sort order, so an empty tree means all entries have already been gone over.

If we artificially assign +infinity value to such tree "entry", it will go after all usual entries, and through the usual driver loop we will be taking the same actions, which were hand-coded for special cases, i.e.

    t1 empty, t2 non-empty
        pathcmp(+∞, t2) -> +1
        show_path(/*t1=*/NULL, t2);     /* = t1 > t2 case in main loop */

    t1 non-empty, t2-empty
        pathcmp(t1, +∞) -> -1
        show_path(t1, /*t2=*/NULL);     /* = t1 < t2 case in main loop */

In other words when we have t1 and t2, we return a sign that tells the caller to indicate the "earlier" one to be emitted, and by returning the sign that causes the non-empty side to be emitted, we will automatically cause the entries from the remaining side to be emitted, without attempting to touch the empty side at all. We can teach tree_entry_pathcmp() to pretend that an empty tree has an element that sorts after anything else to achieve this.

Right now we never reach the case where both compared tree descriptors are infinity, as this condition is checked at the loop beginning as the finishing criterion, but we will in the future, when several parents are iterated simultaneously and some pair of them runs to the end.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
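The "+infinity" trick can be sketched with a two-tree driver in which an exhausted side compares greater than any real entry, so the ordinary "t1 < t2" and "t1 > t2" branches drain whichever side remains. The names `struct desc`, `desc_pathcmp()` and `diff_names()` are hypothetical, entries are plain names ordered by `strcmp()`, and equal names are treated as unchanged because this sketch carries no content ids.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tree descriptor: a sorted list of names plus a cursor. */
struct desc { const char **names; size_t n, pos; };

/* An exhausted descriptor behaves like an entry that sorts last. */
static int desc_pathcmp(const struct desc *t1, const struct desc *t2)
{
	int e1 = t1->pos >= t1->n;
	int e2 = t2->pos >= t2->n;

	if (e1 || e2)
		return e1 - e2;	/* +1: t1 is +inf, -1: t2 is +inf, 0: both */
	return strcmp(t1->names[t1->pos], t2->names[t2->pos]);
}

/* Write "-name" for entries only in t1 and "+name" for entries only in t2. */
static void diff_names(struct desc *t1, struct desc *t2, char *out, size_t cap)
{
	size_t used = 0;

	assert(cap > 0);
	out[0] = '\0';
	while (t1->pos < t1->n || t2->pos < t2->n) {
		int cmp = desc_pathcmp(t1, t2);
		int w = 0;

		if (cmp < 0) {		/* t1 < t2: present only in t1 */
			w = snprintf(out + used, cap - used, "-%s\n",
				     t1->names[t1->pos++]);
		} else if (cmp > 0) {	/* t1 > t2: present only in t2 */
			w = snprintf(out + used, cap - used, "+%s\n",
				     t2->names[t2->pos++]);
		} else {		/* same name on both sides */
			t1->pos++;
			t2->pos++;
		}
		assert(w >= 0 && (size_t)w < cap - used);
		used += (size_t)w;
	}
}
```

No branch in the loop body tests for emptiness; the comparison function alone decides which side gets emitted, which is the simplification the message argues for.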
2014-03-20 | tree-diff: simplify tree_entry_pathcmp | Kirill Smelkov | 1 | -11/+6
Since an earlier "Finally switch over tree descriptors to contain a pre-parsed entry", we can safely access all tree_desc->entry fields directly instead of first "extracting" them through tree_entry_extract. Use it. The code generated stays the same - only it now visually looks cleaner. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 | tree-diff: show_path prototype is not needed anymore | Kirill Smelkov | 1 | -3/+0
We moved all action-taking code below show_path() in recent HEAD~~ (tree-diff: move all action-taking code out of compare_tree_entry). Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 | tree-diff: rename compare_tree_entry -> tree_entry_pathcmp | Kirill Smelkov | 1 | -6/+9
Since previous commit, this function does not compare entry hashes, and mode are compared fully outside of it. So what it does is compare entry names and DIR bit in modes. Reflect this in its name. Add documentation stating the semantics, and move the note about files/dirs comparison to it. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 | tree-diff: move all action-taking code out of compare_tree_entry() | Kirill Smelkov | 1 | -16/+12
- let it do only comparison. This way the code is cleaner and more structured - cmp function only compares, and the driver takes action based on comparison result. There should be no change in performance, as effectively, we just move the if series from one place into another, and merge it into the switch/if that was already there, so the result is maybe a little bit faster. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 | tree-diff: don't assume compare_tree_entry() returns -1,0,1 | Kirill Smelkov | 1 | -8/+14
It does, but we'll be reworking it in the next patch, after which it won't, and besides it is better to stick to the standard strcmp/memcmp/base_name_compare/etc... convention, where a comparison function returns <0, =0, >0.

Regarding performance, comparing for <0, =0, >0 should be a little bit faster than switch, because it is just 1 test-without-immediate instruction and then up to 3 conditional branches, and in switch you have up to 3 tests with immediate and up to 3 conditional branches.

No worry that update_tree_entry(t2) is duplicated for =0 and >0 - it will be good after we add support for the multiparent walker and will stay that way.

=0 case goes first, because it happens more often in real diffs - i.e. paths are the same.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 | tree-diff: consolidate code for emitting diffs and recursion in one place | Kirill Smelkov | 1 | -30/+82
Currently both compare_tree_entry() and show_entry() invoke opt diff callbacks (opt->add_remove() and opt->change()), and also they both have code which decides whether to recurse into sub-tree, and whether to emit a tree as separate entry if DIFF_OPT_TREE_IN_RECURSIVE is set. I.e. we have code duplication and logic scattered across two places. Let's consolidate it - all diff emitting code and recursion logic moves to show_entry, which is now named show_path, because it shows diff for a path, based on up to two tree entries, with actual diff emitting code being kept in new helper emit_diff() for clarity. What we have as the result, is that compare_tree_entry is now free from code with logic for diff generation, and also performance is not affected as timings for `git log --raw --no-abbrev --no-renames` for navy.git and `linux.git v3.10..v3.11`, just like in previous patch, stay the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-04 | tree-diff: show_tree() is not needed | Kirill Smelkov | 1 | -32/+3
We don't need special code for showing added/removed subtree, because we can do the same via diff_tree_sha1, just passing NULL for absent tree.

And compared to show_tree(), which was calling show_entry() for every tree entry, that would lead to the same show_entry() callings:

    show_tree(t):
        for e in t.entries:
            show_entry(e)

    diff_tree_sha1(NULL, new):  /* the same applies to (old, NULL) */
        diff_tree(t1=NULL, t2)
            ...
            if (!t1->size)
                show_entry(t2)
            ...

and possible overhead is negligible, since after the patch, timing for

    `git log --raw --no-abbrev --no-renames`

for navy.git and `linux.git v3.10..v3.11` is practically the same.

So let's say goodbye to show_tree() - it removes some code, but also, and what is important, consolidates more code for showing/recursing into trees into one place.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 | tree-diff: no need to pass match to skip_uninteresting() | Kirill Smelkov | 1 | -9/+8
It is neither used there as input, nor the output written through it, is used outside. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 | tree-diff: no need to manually verify that there is no mode change for a path | Kirill Smelkov | 1 | -10/+5
Because if there is, such two tree entries would never be compared as equal - the code in base_name_compare() explicitly compares modes, if there is a change for dir bit, even for equal paths, entries would compare as different. The code I'm removing here is from 2005 April 262e82b4 (Fix diff-tree recursion), which pre-dates base_name_compare() introduction in 958ba6c9 (Introduce "base_name_compare()" helper function) by a month. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
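The argument here rests on the name comparison itself taking the directory bit into account. A sketch of such a comparison is given below; it is modelled on the idea described in the message rather than copied from git's base_name_compare(), takes a boolean `is_dir` instead of a full mode, and the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Compare two entry names as if every directory name carried a trailing
 * '/'. Returns <0, 0 or >0. Two entries with equal names but a different
 * directory bit therefore never compare as equal.
 */
static int name_compare_with_dirs(const char *name1, int is_dir1,
				  const char *name2, int is_dir2)
{
	size_t len1, len2, len;
	unsigned char c1, c2;
	int cmp;

	assert(name1 != NULL && name2 != NULL);
	len1 = strlen(name1);
	len2 = strlen(name2);
	len = len1 < len2 ? len1 : len2;

	cmp = memcmp(name1, name2, len);
	if (cmp)
		return cmp;

	/* First byte past the common prefix; NUL if that name ended. */
	c1 = (unsigned char)name1[len];
	c2 = (unsigned char)name2[len];
	if (!c1 && is_dir1)
		c1 = '/';
	if (!c2 && is_dir2)
		c2 = '/';
	return (c1 > c2) - (c1 < c2);
}
```

A file "foo" sorts before "foo.c", while a directory "foo" sorts after it, because '/' is greater than '.'. This is why a separate check for a file/directory mode change on "equal" entries is redundant in this sketch.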
2014-02-05 | tree-diff: convert diff_root_tree_sha1() to just call diff_tree_sha1 with old=NULL | Kirill Smelkov | 1 | -14/+1
Now since diff_tree_sha1 understands NULL for both old and new, we could indicate an empty tree for root commit by providing just NULL for old sha1. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-05 | tree-diff: allow diff_tree_sha1 to accept NULL sha1 | Kirill Smelkov | 1 | -8/+4
which would mean that corresponding tree - old or new - is empty. As followup patches will show, that functionality was already needed in several places of Git codebase, but there, we were preparing empty tree_desc objects by hand, with some code duplication. For handling sha1 = NULL case, let's reuse fill_tree_descriptor() which returns just empty tree_desc in that case. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-28 | pathspec: stop --*-pathspecs impact on internal parse_pathspec() uses | Nguyễn Thái Ngọc Duy | 1 | -1/+3
Normally parse_pathspec() is used on command line arguments where it can do fancy things like parsing magic on each argument or adding magic for all pathspecs based on --*-pathspecs options.

There's another use of parse_pathspec(), where pathspec is needed, but the input is known to be pure paths. In this case we usually don't want --*-pathspecs to interfere. And we definitely do not want to parse magic in these paths, regardless of --literal-pathspecs.

Add new flag PATHSPEC_LITERAL_PATH for this purpose. When it's set, --*-pathspecs are ignored, no magic is parsed. And if the caller allows PATHSPEC_LITERAL (i.e. the next calls can take literal magic), then PATHSPEC_LITERAL will be set.

This fixes cases where git chokes when GIT_*_PATHSPECS are set because parse_pathspec() indicates it won't take any magic. But GIT_*_PATHSPECS add them anyway. These are

    export GIT_LITERAL_PATHSPECS=1
    git blame -- something
    git log --follow something
    git log --merge

"git ls-files --with-tree=path" (aka parse_pathspec() in overlay_tree_on_cache()) is safe because the input is empty, and producing one pathspec due to PATHSPEC_PREFER_CWD does not take any magic into account.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15pathspec: support :(literal) syntax for noglob pathspecNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15tree-diff: remove the use of pathspec's raw[] in follow-rename codepathNguyễn Thái Ngọc Duy1-2/+2
Put a checkpoint to guard unsupported pathspec features in future. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15remove init_pathspec() in favor of parse_pathspec()Nguyễn Thái Ngọc Duy1-5/+5
While at it, move free_pathspec() to pathspec.c. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15remove diff_tree_{setup,release}_pathsNguyễn Thái Ngọc Duy1-14/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15guard against new pathspec magic in pathspec matching codeNguyễn Thái Ngọc Duy1-0/+19
GUARD_PATHSPEC() marks pathspec-sensitive code, basically all those that touch anything in 'struct pathspec' except fields "nr" and "original". GUARD_PATHSPEC() is not supposed to fail. It's mainly to help the designers catch unsupported codepaths. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15parse_pathspec: add special flag for max_depth featureNguyễn Thái Ngọc Duy1-1/+0
match_pathspec_depth() and tree_entry_interesting() check max_depth field in order to support "git grep --max-depth". The feature activation is tied to "recursive" field, which led to some unwanted activation, e.g. 5c8eeb8 (diff-index: enable recursive pathspec matching in unpack_trees - 2012-01-15). This patch decouples the activation from "recursive" field, puts it in "magic" field instead. This makes sure that only "git grep" can activate this feature. And because parse_pathspec knows when the feature is not used, it does not need to sort pathspec (required for max_depth to work correctly). A small win for non-grep cases. Even though a new magic flag is introduced, no magic syntax is. The magic can be only enabled by parse_pathspec() caller. We might someday want to support ":(maxdepth:10)src." It all depends on actual use cases. max_depth feature cannot be enabled via init_pathspec() anymore. But that's ok because init_pathspec() is on its way to /dev/null. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27Merge branch 'jk/maint-null-in-trees'Junio C Hamano1-4/+4
We do not want a link to 0{40} object stored anywhere in our objects.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-08-03diff_setup_done(): return voidThomas Rast1-2/+1
diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29diff: do not use null sha1 as a sentinel valueJeff King1-4/+4
The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
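The sentinel confusion described above can be illustrated with a minimal sketch. The struct and function names below (filespec_sketch, fill_by_sentinel, fill_with_flag) are hypothetical stand-ins for illustration only and are not git's actual diff_filespec API; the sketch only assumes a 20-byte SHA-1.

```c
#include <assert.h>
#include <string.h>

#define HASH_LEN 20 /* SHA-1 size in bytes */

/* Hypothetical, simplified stand-in for a diff filespec. */
struct filespec_sketch {
	unsigned char sha1[HASH_LEN];
	int sha1_valid; /* 0 means "no object; look at the working tree" */
};

static int is_null_hash(const unsigned char *sha1)
{
	static const unsigned char zero[HASH_LEN];
	return memcmp(sha1, zero, HASH_LEN) == 0;
}

/* Old style: the all-zero hash doubles as the "not valid" sentinel. */
static void fill_by_sentinel(struct filespec_sketch *s,
			     const unsigned char *sha1)
{
	assert(s != NULL && sha1 != NULL);
	memcpy(s->sha1, sha1, HASH_LEN);
	s->sha1_valid = !is_null_hash(sha1);
}

/* New style: the caller states validity explicitly. */
static void fill_with_flag(struct filespec_sketch *s,
			   const unsigned char *sha1, int sha1_valid)
{
	assert(s != NULL && sha1 != NULL);
	memcpy(s->sha1, sha1, HASH_LEN);
	s->sha1_valid = sha1_valid;
}
```

With the sentinel style, a null sha1 read from a corrupt tree is indistinguishable from "use the working tree file"; with the explicit flag, the tree walker can pass the null sha1 and still mark it as coming from a tree.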
2011-12-16use custom rename score during --followJeff King1-0/+1
If you provide a custom rename score on the command line, like:

    git log -M50 --follow foo.c

it is completely ignored, and there is no way to --follow with a looser rename score. Instead, let's use the same rename score that will be used for generating diffs. This is convenient, and mirrors what we do with the break-score.

You can see an example of it being useful in git.git:

    $ git log --oneline --summary --follow \
          Documentation/technical/api-string-list.txt
    86d4b52 string-list: Add API to remove an item from an unsorted list
    1d2f80f string_list: Fix argument order for string_list_append
    e242148 string-list: add unsorted_string_list_lookup()
    0dda1d1 Fix two leftovers from path_list->string_list
    c455c87 Rename path_list to string_list
     create mode 100644 Documentation/technical/api-string-list.txt

    $ git log --oneline --summary -M40 --follow \
          Documentation/technical/api-string-list.txt
    86d4b52 string-list: Add API to remove an item from an unsorted list
    1d2f80f string_list: Fix argument order for string_list_append
    e242148 string-list: add unsorted_string_list_lookup()
    0dda1d1 Fix two leftovers from path_list->string_list
    c455c87 Rename path_list to string_list
     rename Documentation/technical/{api-path-list.txt => api-string-list.txt} (47%)
    328a475 path-list documentation: document all functions and data structures
    530e741 Start preparing the API documents.
     create mode 100644 Documentation/technical/api-path-list.txt

You could have two separate rename scores, one for following and one for diff. But almost nobody is going to want that, and it would just be unnecessarily confusing. Besides which, we re-use the diff results from try_to_follow_renames for the actual diff output, which means having them as separate scores is actively wrong.
E.g., with the current code, you get:

    $ git log --oneline --diff-filter=R --name-status \
          -M90 --follow git.spec.in
    27dedf0 GIT 0.99.9j aka 1.0rc3
    R084    git-core.spec.in    git.spec.in
    f85639c Rename the RPM from "git" to "git-core"
    R098    git.spec.in    git-core.spec.in

The first one should not be considered a rename by the -M score we gave, but we print it anyway, since we blindly re-use the diff information from the follow (which uses the default score).

So this could also be considered simply a bug-fix, as with the current code "-M" is completely ignored when using "--follow".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-27tree_entry_interesting(): give meaningful names to return valuesNguyễn Thái Ngọc Duy1-7/+9
It is basic code hygiene to avoid unnamed magic constants. Besides, this helps extend the set of values later on for "interesting, but cannot decide if the entry truly matches yet" (i.e. prefix matches). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-27tree-walk.c: do not leak internal structure in tree_entry_len()Nguyễn Thái Ngọc Duy1-3/+3
tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return an incorrect value. Force callers to pass struct name_entry instead of two pointers (with the hope that they don't manually construct struct name_entry themselves). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-06Merge branch 'jk/diff-not-so-quick'Junio C Hamano1-2/+1
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
  diff.c
2011-05-31diff: futureproof "stop feeding the backend early" logicJunio C Hamano1-3/+1
Refactor the "do not stop feeding the backend early" logic into a small helper function and use it in both run_diff_files() and diff_tree() that has the stop-early optimization. We may later add other types of diffcore transformation that require to look at the whole result like diff-filter does, and having the logic in a single place is essential for longer term maintainability. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31diff_tree: disable QUICK optimization with diff filterJeff King1-0/+1
We stop looking for changes early with QUICK, so our diff queue contains only a subset of the changes. However, we don't apply diff filters until later; it will appear at that point as though there are no changes matching our filter, when in reality we simply didn't keep looking for changes long enough. Commit 2cfe8a6 (diff --quiet: disable optimization when --diff-filter=X is used, 2011-03-16) fixes this in some cases by disabling the optimization when a filter is present. However, it only tweaked run_diff_files, missing the similar case in diff_tree. Thus the fix worked only for diffing the working tree and index, but not between trees. Noticed by Yasushi SHOJI. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
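The interaction between the early exit and a later filter can be sketched as follows. This is a minimal illustration under stated assumptions: opts_sketch, can_quit_early() and queue_changes() are hypothetical names, not git's real structures or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down option set for illustration. */
struct opts_sketch {
	int quick;       /* caller only asks "is there any change?" */
	int has_changes; /* set once at least one change is queued */
	int has_filter;  /* a post-processing filter will run later */
};

/*
 * The early exit is only safe when nothing downstream needs to see
 * the full set of changes, so an active filter disables it.
 */
static int can_quit_early(const struct opts_sketch *o)
{
	assert(o != NULL);
	return o->quick && !o->has_filter && o->has_changes;
}

/* Walks n candidate changes and returns how many were queued. */
static size_t queue_changes(struct opts_sketch *o, size_t n)
{
	size_t queued = 0;

	for (size_t i = 0; i < n; i++) {
		if (can_quit_early(o))
			break;
		queued++;
		o->has_changes = 1;
	}
	return queued;
}
```

Keeping the decision in one helper means every frontend loop that wants to stop early asks the same question, which is the maintainability point the neighbouring "futureproof" commit makes.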
2011-05-06Merge branch 'nd/struct-pathspec'Junio C Hamano1-33/+20
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
2011-03-25Improve tree_entry_interesting() handling codeNguyễn Thái Ngọc Duy1-33/+20
t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this:

    int ret = 0;

    while (...) {
        if (ret != 2) {
            ret = t_e_i();
            if (ret < 0) /* no longer interesting */
                break;
            if (ret == 0) /* skip this round */
                continue;
        }
        /* ret > 0, interesting */
    }

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
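The loop shape from this message can be exercised with a small runnable sketch. Here the verdicts array is a hypothetical stand-in for successive t_e_i() results (-1, 0, 1 or 2), and walk_sketch() is an illustrative name, not a function in git.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walks n entries. verdicts[i] plays the role of what t_e_i() would
 * return for entry i. Returns how many entries were treated as
 * interesting and stores in *calls how often the verdict was consulted.
 */
static size_t walk_sketch(const int *verdicts, size_t n, size_t *calls)
{
	size_t shown = 0;
	int ret = 0;

	assert(verdicts != NULL && calls != NULL);
	*calls = 0;
	for (size_t i = 0; i < n; i++) {
		if (ret != 2) {
			ret = verdicts[i];
			(*calls)++;
			if (ret < 0) /* no longer interesting */
				break;
			if (ret == 0) /* skip this round */
				continue;
		}
		/* ret > 0, interesting */
		shown++;
	}
	return shown;
}
```

Once a verdict of 2 is seen, ret keeps that value, so later entries are accepted without consulting the matcher again; a single variable carries both the per-entry result and the "everything from here on matches" state.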
2011-03-22Remove unused variablesJohannes Schindelin1-2/+1
Noticed by gcc 4.6.0. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03grep: drop pathspec_matches() in favor of tree_entry_interesting()Nguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03tree_entry_interesting(): support depth limitNguyễn Thái Ngọc Duy1-0/+4
This is needed to replace pathspec_matches() in builtin/grep.c. max_depth == -1 means infinite depth. Depth limit is only effective when pathspec.recursive == 1. When pathspec.recursive == 0, the behavior depends on match functions: non-recursive for tree_entry_interesting() and recursive for match_pathspec{,_depth} Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
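As a rough illustration of the "-1 means infinite depth" convention, here is a minimal sketch. The helper name path_within_depth() is hypothetical, and counting depth as the number of '/' separators in the path is an assumption made for this example rather than a statement about git's exact rule.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 when `path` lies within `max_depth` directory levels.
 * Depth is the number of '/' separators (assumption for this sketch);
 * max_depth == -1 means there is no limit.
 */
static int path_within_depth(const char *path, int max_depth)
{
	int depth = 0;

	assert(path != NULL);
	if (max_depth == -1)
		return 1;
	for (const char *p = path; *p; p++) {
		if (*p == '/' && ++depth > max_depth)
			return 0;
	}
	return 1;
}
```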
2011-02-03diff-tree: convert base+baselen to writable strbufNguyễn Thái Ngọc Duy1-68/+52
In traversing trees, a full path is split into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, does two memcpy() calls, uses it, then releases it. Instead this patch turns "base" into a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
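The append, use, truncate pattern can be sketched with a tiny growable buffer. The pathbuf type and its helpers below are hypothetical and written only for this illustration; git's own strbuf API is richer and is deliberately not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal growable string buffer. */
struct pathbuf {
	char *buf;
	size_t len, alloc;
};

static void pathbuf_append(struct pathbuf *pb, const char *s)
{
	size_t n = strlen(s);

	if (pb->len + n + 1 > pb->alloc) {
		size_t want = (pb->len + n + 1) * 2;
		char *p = realloc(pb->buf, want);
		if (!p)
			abort();
		pb->buf = p;
		pb->alloc = want;
	}
	memcpy(pb->buf + pb->len, s, n + 1); /* copies the NUL too */
	pb->len += n;
}

static void pathbuf_setlen(struct pathbuf *pb, size_t len)
{
	assert(len <= pb->len);
	pb->len = len;
	if (pb->buf)
		pb->buf[len] = '\0';
}

/*
 * Visit one entry: extend base to the full path, use it, then restore
 * base so the caller sees it unchanged. Returns the full path length.
 */
static size_t visit_entry(struct pathbuf *base, const char *entry)
{
	size_t old_len = base->len;
	size_t full_len;

	pathbuf_append(base, entry);
	full_len = base->len; /* the full path would be used here */
	pathbuf_setlen(base, old_len);
	return full_len;
}
```

Because the buffer is only ever extended and cut back to its previous length, the invariant from the message ("base" unchanged before and after a call) holds without a fresh allocation per entry.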
2011-02-03Move tree_entry_interesting() to tree-walk.c and export itNguyễn Thái Ngọc Duy1-112/+0
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03tree_entry_interesting(): remove dependency on struct diff_optionsNguyễn Thái Ngọc Duy1-16/+10
This function can be potentially used in more places than just tree-diff.c. "struct diff_options" does not make much sense outside diff_tree_sha1(). While removing the use of diff_options, it also removes tree_entry_extract() call, which means S_ISDIR() uses the entry->mode directly, without being filtered by canon_mode() (called internally inside tree_entry_extract). The only use of the mode information in this function is to check the type of the entry by giving it to S_ISDIR() macro, and the result does not change with or without canon_mode(), so it is ok to bypass tree_entry_extract(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03Convert struct diff_options to use struct pathspecNguyễn Thái Ngọc Duy1-35/+13
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-26Merge branch 'en/tree-walk-optim'Junio C Hamano1-14/+15
* en/tree-walk-optim:
  diff_tree(): Skip skip_uninteresting() when all remaining paths interesting
  tree_entry_interesting(): Make return value more specific
  tree-walk: Correct bitrotted comment about tree_entry()
  Document pre-condition for tree_entry_interesting
2010-08-26diff_tree(): Skip skip_uninteresting() when all remaining paths interestingElijah Newren1-13/+12
In 1d848f6 (tree_entry_interesting(): allow it to say "everything is interesting" 2007-03-21), both show_tree() and skip_uninteresting() were modified to determine if all remaining tree entries were interesting. However, the latter returns as soon as it finds the first interesting path, without any way to signal to its caller (namely, diff_tree()) that all remaining paths are interesting, making these extra checks useless. Pass whether all remaining entries are interesting back to diff_tree(), and whenever they are, have diff_tree() skip subsequent calls to skip_uninteresting(). With this change, I measure speedups of 3-4% for the commands $ git rev-list --quiet HEAD -- Documentation/ $ git rev-list --quiet HEAD -- t/ in git.git. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-26tree_entry_interesting(): Make return value more specificElijah Newren1-1/+1
tree_entry_interesting() can signal to its callers not only if the given entry matches one of the specified paths, but whether all remaining paths will (or will not) match. When no paths are specified, all paths are considered interesting, so instead of returning 1 (this path is interesting) return 2 (all paths are interesting). This will allow the caller to avoid calling tree_entry_interesting() again, which theoretically should speed up tree walking. I am not able to measure any actual gains in practice, but it certainly cannot hurt and seems to make the code more readable to me. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-26Document pre-condition for tree_entry_interestingElijah Newren1-0/+2
tree_entry_interesting will fail to find appropriate matches if the base directory path is not terminated with a slash. Knowing this earlier would have saved me some debugging time. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13diff --follow: do call diffcore_std() as necessaryJunio C Hamano1-0/+11
Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this:

    diff-tree frontend
    -> diff_tree_sha1()
       . populate diff_queued_diff
       . if --follow is in effect and there is only one change that
         creates the target path, then
         -> try_to_follow_renames()
            -> diff_tree_sha1() with no pathspec but with -C
            -> diffcore_std() to find renames
            . if rename is found, tweak diff_queued_diff and put a
              single filepair that records the found rename there
    -> diffcore_std()
       . tweak elements on diff_queued_diff by
         - rename detection
         - path ordering
         - pickaxe filtering

We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13diff --follow: do not waste cycles while recursingJunio C Hamano1-1/+1
The "--follow" logic is called from diff_tree_sha1() function, but the input trees to diff_tree_sha1() are not necessarily the top-level trees (compare_tree_entry() calls it while it recursively descends into subtrees). When a newly created path lives somewhere deep in the source hierarchy, e.g. "platform/", but the rename source is in a totally different place in the destination hierarchy, e.g. "lang-api/src/com/...", running "try_to_follow_renames()" while base is set to "platform/" is a wasted call. We only need to run the rename following at the very top level. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-07Make git log --follow find copies among unmodified files.Bo Yang1-1/+1
'git log --follow <path>' doesn't track copies from unmodified files, and this patch fixes it. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18Performance optimization for detection of modified submodulesJens Lehmann1-4/+4
In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output, the call to is_submodule_modified() is not necessary when the submodule's HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29diff: Rename QUIET internal option to QUICKJunio C Hamano1-1/+1
The option "QUIET" primarily meant "find if we have _any_ difference as quick as possible and report", which means we often do not even have to look at blobs if we know the trees are different by looking at the higher level (e.g. "diff-tree A B"). As a side effect, because there is no point showing one change that we happened to have found first, it also enables NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet. Rename the internal option to QUICK to reflect this better; it also makes grepping the source tree much easier, as there are other kinds of QUIET option everywhere. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29diff: change semantics of "ignore whitespace" optionsJunio C Hamano1-1/+2
Traditionally, the --ignore-whitespace* options have merely meant to tell the diff output routine that some class of differences are not worth showing in the textual diff output, so that the end user has easier time to review the remaining (presumably more meaningful) changes. These options never affected the outcome of the command, given as the exit status when the --exit-code option was in effect (either directly or indirectly). When you have only whitespace changes, however, you might expect git diff -b --exit-code to report that there is _no_ change with zero exit status. Change the semantics of --ignore-whitespace* options to mean more than "omit showing the difference in text". The exit status, when --exit-code is in effect, is computed by checking if we found any differences at the path level, while diff frontends feed filepairs to the diffcore engine. When "ignore whitespace" options are in effect, we defer this determination until the very end of diffcore transformation. We simply do not know until the textual diff is generated, which comes very late in the pipeline. When --quiet is in effect, various diff frontends optimize by breaking out early from the loop that enumerates the filepairs, when we find the first path level difference; when --ignore-whitespace* is used the above change automatically disables this optimization. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01Merge branch 'ne/maint-1.6.0-diff-tree-t-r-show-directory'Junio C Hamano1-0/+6
* ne/maint-1.6.0-diff-tree-t-r-show-directory: diff-tree -r -t: include added/removed directories in the output
2009-06-13diff-tree -r -t: include added/removed directories in the outputNick Edelen1-0/+6
We used to include only the modified and typechanged directories in the output, but for consistency's sake, we should also include added and removed ones as well. This makes the output more consistent, but it may break existing scripts that expect to see the current output which has long been the established behaviour. Signed-off-by: Nick Edelen <sirnot@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-22Fix typos / spelling in commentsMike Ralphson1-1/+1
Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-01tree_entry_interesting: a pathspec only matches at directory boundaryBjörn Steinbrink1-3/+9
Previously the code did a simple prefix match, which means that a path in a directory "frotz/" would have matched with pathspec "f". Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
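The difference between a plain prefix match and a match at a directory boundary can be shown with a short sketch. matches_at_boundary() is a hypothetical helper written for this illustration; it ignores wildcards and everything else a real pathspec matcher handles.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 when `spec` matches `path` as a whole path or as a
 * leading directory, 0 otherwise. Plain string prefixes such as
 * "f" against "frotz/nitfol" are rejected.
 */
static int matches_at_boundary(const char *path, const char *spec)
{
	size_t plen, slen;

	assert(path != NULL && spec != NULL);
	plen = strlen(path);
	slen = strlen(spec);
	if (slen == 0)
		return 1; /* an empty spec matches everything */
	if (plen < slen || memcmp(path, spec, slen) != 0)
		return 0;
	/* The prefix matched; now require a directory boundary. */
	return spec[slen - 1] == '/' ||
	       path[slen] == '\0' ||
	       path[slen] == '/';
}
```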
2008-08-31'git foo' program identifies itself without dash in die() messagesJunio C Hamano1-1/+1
This is a mechanical conversion of all '*.c' files with: s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/; The result was manually inspected and no false positive was found. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-16Fix buffer overflow in git diffDmitry Potapov1-5/+22
If PATH_MAX on your system is smaller than a path stored, it may cause buffer overflow and stack corruption in diff_addremove() and diff_change() functions when running git-diff. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-12Fix small memory leaks induced by diff_tree_setup_pathsMike Hommey1-0/+2
Run diff_tree_release_paths in the appropriate places, and add a test to avoid NULL dereference. Better safe than sorry. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11Make the diff_options bitfields be an unsigned with explicit masks.Pierre Habouzit1-7/+7
reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-21Fix diffcore-break total breakageLinus Torvalds1-0/+1
Ok, so on the kernel list, some people noticed that "git log --follow" doesn't work too well with some files in the x86 merge, because a lot of files got renamed in very special ways.

In particular, there was a pattern of doing single commits with renames that looked basically like

 - rename "filename.h" -> "filename_64.h"
 - create new "filename.c" that includes "filename_32.h" or "filename_64.h" depending on whether we're 32-bit or 64-bit.

which was preparatory for smushing the two trees together.

Now, there's two issues here:

 - "filename.c" *remained*. Yes, it was a rename, but there was a new file created with the old name in the same commit. This was important, because we wanted each commit to compile properly, so that it was bisectable, so splitting the rename into one commit and the "create helper file" into another was *not* an option.

   So we need to break associations where the contents change too much. Fine. We have the -B flag for that. When we break things up, then the rename detection will be able to figure out whether there are better alternatives.

 - "git log --follow" didn't work with -B.

Now, the second case was really simple: we use a different "diffopt" structure for the rename detection than the basic one (which we use for showing the diffs). So that second case is trivially fixed by a trivial one-liner that just copies the break_opt values from the "real" diffopts to the one used for rename following.
So now "git log -B --follow" works fine:

    diff --git a/tree-diff.c b/tree-diff.c
    index 26bdbdd..7c261fd 100644
    --- a/tree-diff.c
    +++ b/tree-diff.c
    @@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
     	diff_opts.detect_rename = DIFF_DETECT_RENAME;
     	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
     	diff_opts.single_follow = opt->paths[0];
    +	diff_opts.break_opt = opt->break_opt;
     	paths[0] = NULL;
     	diff_tree_setup_paths(paths, &diff_opts);
     	if (diff_setup_done(&diff_opts) < 0)

however, the end result does *not* work. Because our diffcore-break.c logic is totally bogus!

In particular:

 - it used to do

	if (base_size < MINIMUM_BREAK_SIZE)
		return 0; /* we do not break too small filepair */

   which basically says "don't bother to break small files". But that "base_size" is the *smaller* of the two sizes, which means that if some large file was rewritten into one that just includes another file, we would look at the (small) result, and decide that it's smaller than the break size, so it cannot be worth it to break it up! Even if the other side was ten times bigger and looked *nothing* like the small file!

   That's clearly bogus. I replaced "base_size" with "max_size", so that we compare the *bigger* of the filepair with the break size.

 - It calculated a "merge_score", which was the score needed to merge it back together if nothing else wanted it. But even if it was *so* different that we would never want to merge it back, we wouldn't consider it a break! That makes no sense. So I added

	if (*merge_score_p > break_score)
		return 1;

   to make it clear that if we wouldn't want to merge it at the end, it was *definitely* a break.
 - It compared the whole "extent of damage", counting all inserts and deletes, but it based this score on the "base_size", and generated the damage score with

	delta_size = src_removed + literal_added;
	damage_score = delta_size * MAX_SCORE / base_size;

   but that makes no sense either, since quite often, this will result in a number that is *bigger* than MAX_SCORE! Why? Because base_size is (again) the smaller of the two files we compare, and when you start out from a small file and add a lot (or start out from a large file and remove a lot), the base_size is going to be much smaller than the damage!

   Again, the fix was to replace "base_size" with "max_size", at which point the damage actually becomes a sane percentage of the whole.

With these changes in place, not only does "git log -B --follow" work for the case that triggered this in the first place, ie now

	git log -B --follow arch/x86/kernel/vmlinux_64.lds.S

actually gives reasonable results. But I also wanted to verify it in general, by doing a full-history

	git log --stat -B -C

on my kernel tree with the old code and the new code.

There's some tweaking to be done, but generally, the new code generates much better results wrt breaking up files (and then finding better rename candidates).
Here's a few examples of the "--stat" output:

 - This:

	include/asm-x86/Kbuild        | 2 -
	include/asm-x86/debugreg.h    | 79 +++++++++++++++++++++++++++++++++++------
	include/asm-x86/debugreg_32.h | 64 ---------------------------------
	include/asm-x86/debugreg_64.h | 65 ---------------------------------
	4 files changed, 68 insertions(+), 142 deletions(-)

   Becomes:

	include/asm-x86/Kbuild                        | 2 -
	include/asm-x86/{debugreg_64.h => debugreg.h} | 9 +++-
	include/asm-x86/debugreg_32.h                 | 64 -------------------------
	3 files changed, 7 insertions(+), 68 deletions(-)

 - This:

	include/asm-x86/bug.h    | 41 +++++++++++++++++++++++++++++++++++++++--
	include/asm-x86/bug_32.h | 37 -------------------------------------
	include/asm-x86/bug_64.h | 34 ----------------------------------
	3 files changed, 39 insertions(+), 73 deletions(-)

   Becomes

	include/asm-x86/{bug_64.h => bug.h} | 20 +++++++++++++-----
	include/asm-x86/bug_32.h            | 37 -----------------------------------
	2 files changed, 14 insertions(+), 43 deletions(-)

Now, in some other cases, it does actually turn a rename into a real "delete+create" pair, and then the diff is usually bigger, so truth in advertising: it doesn't always generate a nicer diff. But for what -B was meant for, I think this is a big improvement, and I suspect those cases where it generates a bigger diff are tweakable.

So I think this diff fixes a real bug, but we might still want to tweak the default values and perhaps the exact rules for when a break happens.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
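The base_size versus max_size point in this commit message can be checked with a little arithmetic. The sketch below reuses the damage formula quoted in the message; the constant MAX_SCORE_SKETCH (60000) and the helper name damage_score() are assumptions for illustration, and the byte counts in the usage note are made-up example numbers.

```c
#include <assert.h>

/* Assumed full-scale score, chosen for this sketch only. */
#define MAX_SCORE_SKETCH 60000UL

/*
 * Damage score as quoted in the message:
 *   delta_size = src_removed + literal_added;
 *   damage_score = delta_size * MAX_SCORE / <denominator>;
 * The denominator is passed in so both variants can be compared.
 */
static unsigned long damage_score(unsigned long src_removed,
				  unsigned long literal_added,
				  unsigned long denom)
{
	unsigned long delta_size = src_removed + literal_added;

	assert(denom != 0);
	return delta_size * MAX_SCORE_SKETCH / denom;
}
```

Take a 10000-byte file rewritten into a 2000-byte one, with 8500 bytes removed and 500 bytes of new text. Dividing by the smaller size (2000) gives 270000, far above the full scale of 60000; dividing by the larger size (10000) gives 54000, which reads as "90% damaged".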
2007-06-22Fix up "git log --follow" a bit..Linus Torvalds1-9/+28
This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by *not* using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-22Finally implement "git log --follow"Linus Torvalds1-0/+59
Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show *that* particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! 
That's what people want to do, but I just wanted to point out that this patch is *not* carrying around a "commit,pathname" kind of pair and it's *not* going to be able to notice the file coming from multiple *different* files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really matter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
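The "single identity over a linearized history" model can be sketched as follows. This is only an illustration under stated assumptions: `struct rename_event` and `follow_path()` are hypothetical, and real rename detection works on tree diffs rather than on a precomputed list.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical record for one commit in a linear history:
 * `to` was created at this commit by renaming (or copying) `from`.
 * `from` is NULL when the commit involves no rename.
 */
struct rename_event {
    const char *from;
    const char *to;
};

/*
 * Walk the history from newest to oldest while tracking exactly one
 * pathname. When the tracked name turns out to be a rename target,
 * the older name is used from then on. Returns the name the file had
 * at the oldest commit.
 */
const char *follow_path(const struct rename_event *history, size_t n,
                        const char *path)
{
    size_t i;

    assert(path != NULL);
    for (i = 0; i < n; i++) {
        const struct rename_event *ev = &history[i];

        if (ev->from && !strcmp(ev->to, path))
            path = ev->from; /* ever after, use the previous pathname */
    }
    return path;
}
```

Only one pathname is carried at any time, which is why this model cannot notice a file that came from several different files in earlier history.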
2007-03-22tree_entry_interesting(): allow it to say "everything is interesting"Junio C Hamano1-5/+28
In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net>
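A rough sketch of the "always matches" test, assuming pathspecs are plain directory prefixes: `all_interesting()` is a hypothetical helper, not the function changed by this commit.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 when every entry below `base` (the directory currently
 * being traversed, e.g. "arch/i386/") is covered by `spec`
 * (e.g. "arch/"), so per-entry checks could be skipped.
 */
int all_interesting(const char *base, size_t baselen,
                    const char *spec, size_t speclen)
{
    assert(base != NULL && spec != NULL);
    if (speclen > baselen)
        return 0; /* spec is more specific than where we are */
    if (memcmp(base, spec, speclen) != 0)
        return 0;
    if (speclen == 0 || speclen == baselen)
        return 1;
    /* the prefix must end on a directory boundary */
    return spec[speclen - 1] == '/' || base[speclen] == '/';
}
```

The boundary check keeps a pathspec such as "arch" from swallowing an unrelated directory such as "archive/".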
2007-03-22tree-diff: avoid strncmp()Junio C Hamano1-23/+37
If we already know that some of the pathspecs can match later entries in the tree we are looking at, we do not have to do more expensive strncmp() upfront before comparing the length of the match pattern and the path, as a path longer than the match pattern will not match it, and a path shorter than the match pattern will match only if the path is a directory-component wise prefix of the match pattern. Signed-off-by: Junio C Hamano <junkio@cox.net>
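The length-first rule stated above could look roughly like this. `may_lead_to()` is a hypothetical name, and the sketch restates only the rule from the message, not the surrounding function.

```c
#include <assert.h>
#include <string.h>

/*
 * Decide whether `path` can match or lead to `match`, looking at the
 * lengths before touching the bytes:
 *  - a path longer than the pattern cannot match it;
 *  - a shorter path can only match if it stops on a directory
 *    boundary of the pattern.
 * Only when the cheap tests pass are the bytes compared.
 */
int may_lead_to(const char *path, size_t pathlen,
                const char *match, size_t matchlen)
{
    assert(path != NULL && match != NULL);
    if (pathlen > matchlen)
        return 0;
    if (pathlen < matchlen && match[pathlen] != '/')
        return 0;
    return memcmp(path, match, pathlen) == 0;
}
```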
2007-03-22Teach tree_entry_interesting() that the tree entries are sorted.Junio C Hamano1-6/+35
When we are looking at a tree entry with pathspecs, if all the pathspecs sort strictly earlier than the entry we are currently looking at, there is no way later entries in the same tree would match our pathspecs, because the entries are sorted. Signed-off-by: Junio C Hamano <junkio@cox.net>
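The sortedness argument can be sketched like this. It is a simplification: `no_later_match()` is hypothetical, names are single path components, and plain `strcmp()` ordering is assumed (git's tree ordering treats directory names specially, which this sketch ignores).

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 when every pathspec sorts strictly before `entry`.
 * Since later entries in the same tree sort after `entry`, none of
 * them can match either, and the caller may stop scanning the tree.
 */
int no_later_match(const char *entry, const char **specs, size_t nr)
{
    size_t i;

    assert(entry != NULL);
    for (i = 0; i < nr; i++)
        if (strcmp(specs[i], entry) >= 0)
            return 0; /* this spec may still match here or later */
    return 1;
}
```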
2007-03-21Initialize tree descriptors with a helper function rather than by hand.Linus Torvalds1-10/+12
This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-19Set up for better tree diff optimizationsLinus Torvalds1-10/+34
This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninteresting crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had *another* function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. 
It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
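What a caller can do with a three-way answer might be sketched as below. The enum values, `classify()` and `scan_sorted()` are all hypothetical, and a single exact-name pathspec over sorted names is assumed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical ternary verdict; the real return values may differ. */
enum interest { NO_MORE = -1, NOT_YET = 0, MATCH = 1 };

static enum interest classify(const char *name, const char *wanted)
{
    int cmp = strcmp(name, wanted);

    if (cmp == 0)
        return MATCH;
    return cmp > 0 ? NO_MORE : NOT_YET;
}

/*
 * Scan sorted entry names for `wanted` and stop as soon as the verdict
 * says nothing further can match. Returns how many entries were
 * classified; *found is set to 1 if the name was seen.
 */
size_t scan_sorted(const char **names, size_t n, const char *wanted,
                   int *found)
{
    size_t visited = 0, i;

    assert(found != NULL);
    *found = 0;
    for (i = 0; i < n; i++) {
        enum interest verdict = classify(names[i], wanted);

        visited++;
        if (verdict == NO_MORE)
            break; /* short-circuit the rest of the tree */
        if (verdict == MATCH)
            *found = 1;
    }
    return visited;
}
```

The verdict is stored in a local variable because a plain `if (classify(...))` can no longer distinguish "match" from "no more matches".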
2007-03-18Merge branch 'ar/diff'Junio C Hamano1-0/+2
* ar/diff:
  Add tests for --quiet option of diff programs
  try-to-simplify-commit: use diff-tree --quiet machinery.
  revision.c: explain what tree_difference does
  Teach --quiet to diff backends.
  diff --quiet
  Remove unused diffcore_std_no_resolve
  Allow git-diff exit with codes similar to diff(1)
2007-03-18Avoid unnecessary strlen() callsLinus Torvalds1-27/+29
This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down with the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
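Why the name length needs no strlen() can be shown with a small sketch. It assumes the raw tree entry layout "mode, space, name, NUL, 20-byte hash"; `entry_name_len()` is a hypothetical helper, not necessarily how "tree_entry_len()" is written.

```c
#include <assert.h>
#include <stddef.h>

/*
 * `name` points at the first byte of the entry name and `sha1` at the
 * first hash byte, which directly follows the name's terminating NUL.
 * The name length is therefore plain pointer arithmetic.
 */
size_t entry_name_len(const char *name, const unsigned char *sha1)
{
    assert((const char *)sha1 > name);
    return (size_t)((const char *)sha1 - name - 1);
}
```

Once decoding an entry has located both pointers, the length comes for free and can be passed down alongside the string.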
2007-03-14Teach --quiet to diff backends.Junio C Hamano1-0/+2
This teaches git-diff-files, git-diff-index and git-diff-tree backends to exit early under --quiet option. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27convert object type handling from a string to a numberNicolas Pitre1-3/+3
We currently have two parallel notations for dealing with object types in the code: a string and a numerical value. One of them is obviously redundant, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it into smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
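A sketch of the numeric notation, with hypothetical enum and function names: the string is parsed once at the boundary, and everything after that compares integers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical numeric object types; names and values are illustrative. */
enum obj_type_sketch {
    T_BAD = -1,
    T_COMMIT = 1,
    T_TREE = 2,
    T_BLOB = 3,
    T_TAG = 4
};

/* Convert a type name to its number; the only place strcmp() is needed. */
enum obj_type_sketch type_from_name(const char *name)
{
    static const char *const names[] = {
        NULL, "commit", "tree", "blob", "tag"
    };
    int i;

    assert(name != NULL);
    for (i = 1; i <= 4; i++)
        if (!strcmp(name, names[i]))
            return (enum obj_type_sketch)i;
    return T_BAD;
}
```

Callers can then write `type == T_TREE` instead of a string comparison, and no char array has to live on the stack.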
2006-10-26Make git-cherry handle root treesRene Scharfe1-0/+18
This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.David Rientjes1-2/+1
Introduces a global inline function: hashcmp(const unsigned char *sha1, const unsigned char *sha2). It uses memcmp for the comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
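A minimal sketch of such a wrapper, assuming a fixed 20-byte hash for now: `HASH_LEN_SKETCH` and the `_sketch` suffix are hypothetical, chosen to avoid claiming this is the exact definition.

```c
#include <assert.h>
#include <string.h>

/* Assumed hash length; keeping it in one place is the point. */
#define HASH_LEN_SKETCH 20

/* Compare two raw hashes; <0, 0 or >0, exactly like memcmp(). */
static inline int hashcmp_sketch(const unsigned char *sha1,
                                 const unsigned char *sha2)
{
    assert(sha1 != NULL && sha2 != NULL);
    return memcmp(sha1, sha2, HASH_LEN_SKETCH);
}
```

With callers going through the wrapper, a later change of the hash length touches one definition instead of every memcmp() call site.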
2006-08-14Make show_entry voidDavid Rientjes1-6/+6
Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10tree-diff: do not assume we use only one pathspecJunio C Hamano1-21/+25
The way tree-diff was set up assumed we would use only one set of pathspecs during the entire life of the program. Move the pathspec related static variables out to the diff_options structure so that we can filter commits with one set of paths while showing the actual diffs using a different set of paths. I suspect this breaks blame.c, and makes "git log paths..." default to --full-diff; the latter is dealt with in the next commit. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-04Use blob_, commit_, tag_, and tree_type throughout.Peter Eriksen1-3/+4
This replaces occurrences of "blob", "commit", "tag", and "tree", where they're really used as type specifiers, which we already have defined global constants for. Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-29tree/diff header cleanup.Junio C Hamano1-28/+0
Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: Junio C Hamano <junkio@cox.net>
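The canonicalization described above might look roughly like the sketch below. The octal type constants are the conventional file-type bits, defined locally so the example does not depend on platform headers; reducing permissions to 0755 or 0644 based on the owner-execute bit is an assumption about what "canonicalized" means here, and the function name is hypothetical.

```c
#include <assert.h>

/* File-type bits, defined locally for the sketch. */
#define M_FMT 0170000u
#define M_REG 0100000u
#define M_LNK 0120000u
#define M_DIR 0040000u

/*
 * Reduce an st_mode-like value to a canonical form for files,
 * symlinks and directories, in host byte order.
 */
unsigned int canon_mode_sketch(unsigned int mode)
{
    unsigned int type = mode & M_FMT;

    assert(type == M_REG || type == M_LNK || type == M_DIR);
    if (type == M_REG)
        return M_REG | ((mode & 0100u) ? 0755u : 0644u);
    return type; /* symlinks and directories keep only their type bits */
}
```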
2006-01-31Make the "struct tree_desc" operations available to othersLinus Torvalds1-6/+6
We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-26avoid asking ?alloc() for zero bytes.Junio C Hamano1-0/+4
Avoid asking for zero bytes when that change simplifies overall logic. Later we would change the wrapper to ask for 1 byte on platforms that return NULL for zero byte request. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22Split up tree diff functions into tree-diff.c libraryLinus Torvalds1-0/+270
This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become parameterized by the diff_options structure, allowing the library functionality to do something other than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>