The second approach doesn't have to produce a lot of ugly, unnecessary merges and commits. The approach I prefer is often called semi-linear merging:
- create a new topic branch
- make a bunch of commits
- just before merging back to the parent branch, clean up the commits:
  - rebase onto the latest version of the parent branch
  - squash typo-fix commits
  - split commits doing multiple things at once into separate commits
  - reorder the commits to make it easier for a reviewer to understand the sequence of changes
  - etc.
- merge with --no-ff into the parent branch
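The steps above can be sketched as a runnable script. This is only an illustration in a throwaway repository: the branch names (main, topic), the file names, and the author identity are placeholders I made up, and the cleanup step uses a fixup commit with --autosquash as one possible way to squash a typo fix without hand-editing the rebase todo list.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # pin the branch name regardless of local defaults
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# 1. Create a new topic branch.
git checkout -q -b topic

# 2. Make a bunch of commits, including a typo fix recorded as a fixup.
echo "helo" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"
echo "hello" > greeting.txt
git commit -q -a --fixup=HEAD                # creates "fixup! add greeting"
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Meanwhile the parent branch moves on.
git checkout -q main
echo more >> base.txt
git commit -q -a -m "unrelated change on main"

# 3. Clean up: rebase onto the latest parent and fold the fixup into its target.
#    GIT_SEQUENCE_EDITOR=true accepts the generated todo list without opening an editor.
git checkout -q topic
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

# 4. Merge with --no-ff so a merge commit is always recorded.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'topic'" topic

git log --graph --oneline
```

In real use you would normally run git rebase -i by hand at step 3, since splitting and reordering commits needs human judgment.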
The above steps result in a history that looks like this:
* 354b644 Merge branch 'topic3'
|\
| * 54527e0 remove foo now that it is no longer used
| * 1ef3dad stop linking against foo
| * 7dfc7e5 wrap lines longer than 80 characters, no other changes
| * b45fbcf delete end-of-line whitespace, fix indentation
|/
* db13612 Merge branch 'topic2'
|\
| * 961eebf unbreak build by adding a missing semicolon
|/
* a5b6b16 Merge branch 'topic1'
|\
... (more history not shown)
The above graph has all the same advantages of approach #1:
- You can easily revert an entire topic by reverting the merge commit (e.g., git revert -m 1 354b644).
- You can use the --first-parent argument to git log to get a concise summary that resembles what you would get with approach #1:
* 354b644 Merge branch 'topic3'
* db13612 Merge branch 'topic2'
* a5b6b16 Merge branch 'topic1'
... (more history not shown)
- You can still easily examine the entirety of changes made in a topic branch. For example, git diff 354b644^..354b644 will show you what was changed for topic #3.
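To make those three commands concrete, here is a small sketch that builds a similar history in a scratch repository and then runs them. The merge_topic helper, the file names, and the author identity are hypothetical; the commit hashes will differ from the ones shown above, so the script refers to the latest merge as HEAD.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # pin the branch name regardless of local defaults
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "initial commit"

# Hypothetical helper: create a topic branch with one commit per file,
# then merge it back into main with --no-ff.
merge_topic() {
    topic=$1; shift
    git checkout -q -b "$topic"
    for f in "$@"; do
        echo "$topic" > "$f"
        git add "$f"
        git commit -q -m "$topic: add $f"
    done
    git checkout -q main
    git merge -q --no-ff -m "Merge branch '$topic'" "$topic"
}

merge_topic topic1 a.txt
merge_topic topic2 b.txt
merge_topic topic3 c.txt d.txt

# Concise summary: only the commits on the first-parent path.
summary=$(git log --first-parent --format=%s)
echo "$summary"

# Everything topic3 changed, as a diff against the merge's first parent.
changed=$(git diff --name-only HEAD^..HEAD)
echo "$changed"

# Undo the whole topic by reverting the merge relative to its first parent.
git revert --no-edit -m 1 HEAD
```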
But you get benefits that approach #1 can't give you:
- The history is much easier to review: commits b45fbcf and 7dfc7e5 (for the topic3 branch) introduce a lot of noise but no actual logic changes. Someone trying to answer the question, "What logic changes were made for topic #3?" might have a hard time digging through the noise if all of those commits were squashed into one.
- The merge commits nicely identify the context for the series of commits on the merged branch (e.g., this group of commits was made to address topic #3).
- The finer granularity of commits makes it easier to figure out why a particular change was made, which can help distinguish accidental changes from intentional-but-subtle ones.
- If multiple people collaborated on the branch, you can see who they all were and how much each person contributed.
- The number of commits on the merged topic branch gives you a rough idea about how much was changed.
- The time range of the commits can provide useful context.
- You can easily cherry-pick a specific change onto a different branch (e.g., cherry-pick the minimal change needed to fix a bug onto a release branch).
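The cherry-pick case from the last bullet might look like the sketch below. The release branch, the file names, and the commit messages are invented for illustration; the point is only that the fix exists as its own commit, so it can be picked without dragging the rest of the topic along.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # pin the branch name regardless of local defaults
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "initial commit"
git branch release                           # hypothetical release branch cut here

# Topic branch with a small bug fix followed by a larger feature.
git checkout -q -b topic
echo "bug fix" > fix.txt
git add fix.txt
git commit -q -m "fix crash on empty input"
fix=$(git rev-parse HEAD)                    # remember the fix commit
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add new feature"

git checkout -q main
git merge -q --no-ff -m "Merge branch 'topic'" topic

# Only the minimal fix goes onto the release branch.
# -x records the original commit hash in the new commit message.
git checkout -q release
git cherry-pick -x "$fix"
```

Had the topic been squashed into a single commit, the fix and the feature could not be separated this way.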
There is one disadvantage I can think of: it may be hard to configure your software development tools to follow only the first-parent path and ignore all of those intermediate commits. For example, git bisect had no --first-parent option until Git 2.29. Also, I'm not familiar enough with Jenkins to know how easy it is to configure it to prioritize building and testing the first-parent path over all the other commits.
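For Git versions that do support it (2.29 or later, as far as I know), bisecting along the first-parent path only might look like the following sketch. The merge_topic helper, the known-good tag, and the "bug.txt exists" check are stand-ins I made up for a real test command.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # pin the branch name regardless of local defaults
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "initial commit"
git tag known-good                           # hypothetical last known good point

# Hypothetical helper: one commit per file on a topic branch, merged with --no-ff.
merge_topic() {
    topic=$1; shift
    git checkout -q -b "$topic"
    for f in "$@"; do
        echo "$topic" > "$f"
        git add "$f"
        git commit -q -m "$topic: add $f"
    done
    git checkout -q main
    git merge -q --no-ff -m "Merge branch '$topic'" "$topic"
}

merge_topic topic1 a.txt
merge_topic topic2 bug.txt b.txt             # this topic introduces the "bug"
merge_topic topic3 c.txt

# Bisect between HEAD (bad) and known-good, testing only first-parent commits.
git bisect start --first-parent HEAD known-good
# The test command exits 0 (good) while bug.txt is absent, 1 (bad) once it exists.
git bisect run sh -c 'test ! -e bug.txt'

culprit=$(git rev-parse refs/bisect/bad)     # first bad commit found by bisect
git log -1 --format=%s "$culprit"
git bisect reset
```

With --first-parent the search stops at the merge commit that brought the problem in, rather than descending into the individual commits of the topic.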
--first-parent option of git log to hide individual commits on merged-in branches. Not messy at all.