I'm writing a git filter-branch --tree-filter command that uses git log --follow to check if certain files should be kept or deleted during the filtering.
Basically, I want to keep commits that contain a filename, even if this file was renamed and/or moved.
This is the filter I'm running:
git filter-branch --prune-empty --tree-filter '~/preserve.sh' -- --all
This is the command I'm using inside preserve.sh:
git log --pretty=format:'%H' --name-only --follow --all -- "$f"
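For reference, the output of that command interleaves commit hashes with the path the file had at each commit. Below is a minimal sketch of reducing it to the set of names the file has had. The helper name paths_from_log is hypothetical (it is not part of preserve.sh), and it assumes 40-character SHA-1 hashes and that no tracked file is itself named like a hash:

```shell
# Hypothetical helper: read `git log --pretty=format:%H --name-only` output
# on stdin and print the unique set of paths it mentions.
paths_from_log() {
  # Drop blank lines and 40-hex-character commit hashes, then de-duplicate.
  # Assumption: SHA-1 object names; a SHA-256 repository would need {64}.
  grep -v -E '^([0-9a-f]{40})?$' | sort -u
}

# Usage, run in the original repository with "$f" set to the current path:
#   git log --pretty=format:'%H' --name-only --follow -- "$f" | paths_from_log
```

For the example further down, I would expect this to print both bar/hello.txt and foo/hello.txt when run from a checkout that contains the rename commit.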
The result is that a commit that creates a file that is later moved to another path is stripped out of history when I'm searching for the file in the new path, which shouldn't happen. For example:
commit 1: creates foo/hello.txt;
commit 2: moves foo/hello.txt to bar/hello.txt;
using git filter-branch passing bar/hello.txt yields a history with only commit 2.
At first, I thought the problem was that I wasn't passing --all to git log: when analyzing commit 1, it wouldn't find foo/hello.txt because it was only looking at past history, where bar/hello.txt isn't mentioned anywhere. But then I added --all, which looks at all commits (including the "future" ones), and nothing changed.
I checked out the commit where the file is created, ran that log command, and it worked (it listed both foo/hello.txt and bar/hello.txt), so there's nothing wrong with the command itself. I also logged the output of the log command when it's run by filter-branch, and in that case I can see that at commit 1 the file is not found (only bar/hello.txt is listed).
I think this problem happens because, internally, git is copying each commit to a "new repo" structure, so by the time it's analyzing commit 1 the newer commits don't exist yet.
Is there a way to fix this, or another way to approach the problem of re-writing history while preserving renames/moves?
I'm running a modified version of the script found in this answer.