Before I even start on an answer, I will say: it is not clear to me why you are doing any of this at all. You could, for instance, use git archive to create a tar or zip file of any given commit. For instance:
git archive -o foo.tar v2.3.1
makes a foo.tar file out of the revision tagged v2.3.1. To make many tar or zip files out of all the revisions reachable from master, you could write:
git rev-list master | while read -r hash; do
    git archive -o "/path/to/$hash.zip" "$hash"
done
and be done with it.
Might this error be because of a merge?
Yes, it might.
If so, how do I get around it such that I can get all the commits in the master branch of the repo?
Beware: the commits in master likely include many commits that are also in other branches.
When you do this:
commits = list(repo.iter_commits('master'))
you get a full list of every commit that is reachable from the name master, starting with the most recent. Suppose, for instance, that master points to commit G in a graph that looks like the one below. Instead of actual commit hash IDs, I'll use single uppercase letters to represent the commits:
A--B--C------G   <-- master
    \       /
     D--E--F   <-- develop
This repository has seven (count them!) commits. All seven commits are on, i.e., reachable from, branch master. Five of the seven commits (A, B, D, E, and F) are on branch develop. The name master identifies commit G, which is a merge commit. The name develop identifies commit F, which is not.
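To check those counts, the graph can be modelled as a small Python dict that maps each commit to its parents. This is only an illustrative sketch: the names PARENTS and reachable are made up here and are not part of Git or GitPython, and the model assumes that D's parent is B and that G's first parent is C, as the diagrams imply.

```python
# Toy model of the graph above: commit -> list of parents, first parent first.
PARENTS = {
    "A": [],          # root commit: no parents
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],       # assumption: develop forked off at B
    "E": ["D"],
    "F": ["E"],
    "G": ["C", "F"],  # merge commit: first parent C, second parent F
}


def reachable(tip, parents=PARENTS):
    """Return the set of commits reachable from tip by following every parent link."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


print(sorted(reachable("G")))  # commits "on" master
print(sorted(reachable("F")))  # commits "on" develop
```

Being "on" a branch means nothing more than being in the set this walk returns for the branch's tip commit.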
When you do this:
repo.head.reset('HEAD~1', index = True, working_tree = True)
you have Python tell Git to resolve the current commit, which is one of these seven, to its first parent, and then change the repository's idea of "current commit" to the commit you just found. Let's say that you start out with HEAD (the current commit) being commit G. Then HEAD~1 is commit C.
Here things get a bit complicated. The repo.head object represents Git's own HEAD, which is always one of two things: a symbolic reference to a branch name (the normal, "attached" case) or a raw commit hash (a "detached HEAD"). In this case, it's pretty clearly a symbolic reference, pointing to master. I have not tested this, but it seems virtually certain that GitPython faithfully reproduces Git's own behavior here and does the equivalent of git reset with one of --soft, --mixed, or --hard depending on your parameters, and yours are those for --hard. (Curiously, the command shown failing here uses --mixed; either your code doesn't match your posting or, more likely, GitPython uses an extra step.) So what this ends up doing is making the name master point to the newly selected commit C:
A--B--C   <-- master
    \
     D--E--F   <-- develop
Where did commit G go? Well, nowhere really, but it's now "lost": it is hard to find, and after an expiration period, it will be removed entirely. So commit G is effectively gone. (It could be resurrected if we know its hash: we could force master to point to it again with another git reset or equivalent. Your list of commits in the variable commits still lists its hash, so that's one of many ways we could find and resurrect it.)
You now do your main loop body code, working with commit C:
sha = c.name_rev.split()[0]
shutil.copytree(repo_path, destination_path)
You've gone through one of the seven commits in your list, making a copy of commit C while thinking it was commit G (the first commit in repo.iter_commits('master') is commit G, since that's the one master points to).
You are now ready to loop around to work on the second. The repository, however, now has just six commits, and master points to commit C. You now do another git reset --hard, erasing commit C from the picture, leaving us with:
A--B   <-- master
    \
     D--E--F   <-- develop
Now you do something with commit B, while the c in for c in commits is on the second commit of the seven, listed in some order. (It's not clear what order repo.iter_commits uses, but it probably runs git rev-list and hence gets the default order; if so, see the git rev-list documentation.)
Now you do another git reset --hard. This time, commit B is not forgotten: commit D remembers it. But master winds up pointing to commit A:
A   <-- master
 \
  B--D--E--F   <-- develop
You do your thing with commit A, while the for c in commits is on the third commit of seven.
Now you ask Git to find A's first parent commit ... but A doesn't have a first parent, or any parent at all. Commit A is the first commit ever made; it's a root commit. At this point, git reset simply fails. You've iterated over the four commits that are reachable from master by following only the first-parent links. The other three commits that are reachable from master require, at one point, following the second parent. You have also removed two of the four commits you visited; two remain only because they're reachable from another name.
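The walk just described can be reproduced on a toy model of the graph. This is a sketch under the same assumptions as the diagrams (D's parent is B, G's first parent is C); PARENTS and first_parent_chain are illustrative names, not Git or GitPython APIs.

```python
# Toy model of the graph: commit -> list of parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["D"],
    "F": ["E"],
    "G": ["C", "F"],  # merge commit: first parent C, second parent F
}


def first_parent_chain(tip, parents=PARENTS):
    """Follow only first-parent links, which is all that repeated HEAD~1 steps can do."""
    chain = [tip]
    while parents[chain[-1]]:          # stop at the root commit (no parents)
        chain.append(parents[chain[-1]][0])
    return chain


chain = first_parent_chain("G")
never_visited = sorted(set(PARENTS) - set(chain))

print(chain)          # the only commits HEAD~1 can ever land on, starting from G
print(never_visited)  # commits that need a second-parent step to be reached
```

The chain ends at the root commit, which is exactly where the next HEAD~1 has nothing to resolve to.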
Note that you could have the same graph but without the name develop any more:
A--B--C------G   <-- master
    \       /
     D--E--F
In this case, the first git reset that wipes out G also wipes out access to the D-E-F chain, because G was the key to that access: it is G^2, the second parent of commit G, that finds F. It's F that finds E, and E that finds D, so losing G loses all of these, and this winds up leaving just:
A--B--C <-- master
visible. (As before, all the "erased" commits stick around for a grace period, and can be resurrected as long as you can find them again.)
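The difference the name develop makes can be seen by asking which commits are still findable from the remaining names once master has been moved to C. Again this is a toy model with made-up names (PARENTS, visible), assuming the parent links shown in the diagrams; it ignores reflogs and the grace period entirely.

```python
# Toy model of the graph: commit -> list of parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["D"],
    "F": ["E"],
    "G": ["C", "F"],
}


def visible(refs, parents=PARENTS):
    """Return every commit reachable from at least one ref (branch name -> tip commit)."""
    seen, stack = set(), list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


# In both cases master has already been reset from G to C.
with_develop = sorted(visible({"master": "C", "develop": "F"}))
without_develop = sorted(visible({"master": "C"}))

print(with_develop)     # develop keeps the D-E-F chain findable
print(without_develop)  # with no other name, only master's own chain is left
```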
... how do I get around it
Use a completely different algorithm, and/or choose your commits wisely. Just because there are seven (or whatever other number of) commits reachable from some branch name does not mean that all seven (or whatever) are linked as first parents.
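One such alternative, sketched in Python: list the commits once with git rev-list, then export each one by hash with git archive, never moving HEAD or any branch name. The function names archive_argv and export_all and the paths in the usage note are hypothetical. The sketch assumes git is on the PATH and that out_dir is an existing absolute directory, since git runs with the repository as its working directory.

```python
import subprocess
from pathlib import Path


def archive_argv(commit, out_dir):
    """Build the `git archive` command line for one commit (no side effects)."""
    # The .zip suffix lets git archive infer the output format from the file name.
    return ["git", "archive", "-o", str(Path(out_dir) / f"{commit}.zip"), commit]


def export_all(repo_path, rev, out_dir):
    """Archive every commit reachable from rev; no branch or HEAD is ever moved."""
    listing = subprocess.run(
        ["git", "rev-list", rev],
        cwd=repo_path, check=True, capture_output=True, text=True,
    )
    hashes = listing.stdout.split()
    for commit in hashes:
        subprocess.run(archive_argv(commit, out_dir), cwd=repo_path, check=True)
    return hashes
```

A call such as export_all("/path/to/repo", "master", "/path/to/output") would then write one zip file per commit, named by hash, and leave the repository exactly as it was.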
Note that even in a completely linear setup, such as:
A--B <-- master
you will have a list of two commits (in the order B then A), but you can only run git reset HEAD~1 once, to step back from B to A. Once you are on A, you cannot step back again. In this situation, you must step back one fewer time than the number of commits you process. You should also do your thing, whatever it is, with the current commit first, and only then step back.
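The ordering can be sketched without touching a real repository. Here visit_linear is a hypothetical helper, and the process and step_back callbacks stand in for "copy the tree" and "reset to HEAD~1"; it only makes sense for a strictly linear history.

```python
def visit_linear(chain, process, step_back):
    """Process every commit in chain (newest first), stepping back only between commits."""
    for i, commit in enumerate(chain):
        process(commit)              # act on the current commit first
        if i < len(chain) - 1:       # the root commit has no parent to step back to
            step_back()


processed, steps = [], []
visit_linear(["B", "A"], processed.append, lambda: steps.append("HEAD~1"))

print(processed)  # every commit was handled
print(steps)      # one fewer step back than commits
```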
It's not immediately obvious to me how GitPython deals with a "detached HEAD", though if you want to access files directly from Python code there's not that much point to using a detached HEAD. But if you're going to run shutil.copytree, you might as well write this whole thing as a shell script, which is far simpler. Git is full of shell scripts and is designed to work well with them, and it requires a shell interpreter in order to function at all, so if you have Git, you have a shell interpreter.