Merge vs Rebase: Do They Produce the Same Result?

December 21, 2017

I get asked quite a lot whether I recommend a merge-based workflow or one where people rebase onto master. But to be quite honest, I couldn't possibly care less. Your workflow is your workflow, after all; it's up to your team to work in the way that's most productive for you. For some teams that's merging, for some teams that's rebasing... In the end, the code gets integrated and the end result is the same either way, whether you merge or rebase it, right?

Right?

If you're a rebase fan, you've probably run into cases where you get conflicts during a rebase that you wouldn't get during a merge. But that's not very interesting... is there a case where merge and rebase both finish and produce a result, but a different tree?

Is git-merge guaranteed to produce the same results as git-rebase?

No!

It's actually not a guarantee; in fact, you can create two branches that merge differently than they rebase. To avoid spoilers, pause here if you want to think about this on your own. 🤔 The details follow.

You can follow along with this GitHub repository.

Two branches

Imagine that you have two branches: one is master, and the other is the unimaginatively named branch branch. They're both based off a common ancestor, 0d7088f. Further, imagine that your branch has two commits based off that common ancestor:

Ancestor 0d7088f    branch 3f3ca4f    branch 09d3ac4
One                 One               One
Two                 2                 Two
Three               Three             Three
Four                Four              Four
Five                Five              Five
Six                 Six               Six
Seven               Seven             7
Eight               Eight             Eight

Finally, imagine that your master branch has a single commit based off the common ancestor:

Ancestor 0d7088f    master f2e864b
One                 One
Two                 2
Three               Three
Four                Four
Five                Five
Six                 Six
Seven               Seven
Eight               Eight

What happens when you try to merge or rebase these?

Merge

When Git merges two branches, it only looks at the tip commit in each branch, and compares them to their common ancestor. It does not look at any intermediate commits. In the above example, when we merge branch into master, the algorithm looks at the changes made in branch by comparing commit 09d3ac4 to the common ancestor commit 0d7088f. It also looks at the changes made in master by comparing commit f2e864b to the common ancestor commit.

The merge algorithm takes each line[1] in the common ancestor and compares it to the same line in branch and in master. If the line is unchanged in both branches, then there's no problem: that line is brought into the merge result. In this example, line 1 is unchanged in both branches, so line 1 of the merge result will be One.

If a line is changed in only one branch, then that change is brought forward into the merge result. In this example, line 7 is changed only in branch. So in the resulting merge, line 7 will have the contents from branch, which is the digit 7. Also, line 2 is changed only in master, so in the merge result it will be the digit 2.

Merge Result
One
2
Three
Four
Five
Six
7
Eight

Remember that merge only looks at the tip commits. So when it compares the common ancestor to branch, line two appears unchanged, since that line is identical in the ancestor and in the tip, 09d3ac4. The intermediate change in 3f3ca4f is never seen.
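To make the per-line rules above concrete, here is a minimal sketch in Python. It is a toy model, not Git's merge engine: it assumes all three files have the same number of lines and that edits only change lines in place, and the names (three_way_merge, ANCESTOR, MASTER_TIP, BRANCH_TIP) are made up for this illustration.

```python
# Toy line-by-line three-way merge. Real Git works on hunks, not single lines.
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 0d7088f
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # f2e864b
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]      # 09d3ac4


def three_way_merge(base, ours, theirs):
    """Merge two tips against their common ancestor, one line at a time."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:
            merged.append(o)   # unchanged in both, or changed identically
        elif o == b:
            merged.append(t)   # only "theirs" changed this line
        elif t == b:
            merged.append(o)   # only "ours" changed this line
        else:
            raise ValueError(f"conflict on line {number}")  # changed differently
    return merged


merge_result = three_way_merge(ANCESTOR, MASTER_TIP, BRANCH_TIP)
print(merge_result)
```

Note that the intermediate commit 3f3ca4f never appears in the input. Only the two tips and the ancestor do, which is why line two of branch looks untouched to the merge.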

Rebase

Rebase works a bit differently: instead of doing a three-way merge between the tip commits on each branch, it tries to replay the commits on one branch onto another. In the above example, if we want to rebase branch onto master, then Git will create a patch for each commit on branch and apply those patches onto master.[2]

When you rebase, Git will switch you to the master branch, checking out f2e864b. Then Git will apply the differences between the common ancestor and the first commit on branch. In this example, the patch between the common ancestor and the first commit on branch, 3f3ca4f, changes line two from Two to 2. But that's already the value of that line in master. So there's nothing to do, and the patch for 3f3ca4f applies cleanly.

Then a patch for the second commit on the branch, 09d3ac4, is applied: it changes line two back to the text representation, and changes line seven to a digit. So the rebase result is:

Rebase Result
One
Two
Three
Four
Five
Six
7
Eight

So on line two, rebase preserves the change from branch, while merge preserves the change from master.
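The replay described above can be sketched the same way. This is a simplified model in Python, not how Git actually builds or applies patches: it assumes in-place line edits only, treats a patch as a map from line index to an (old, new) pair, and skips a line whose new value is already present. The names make_patch, apply_patch and rebase are hypothetical helpers for this illustration.

```python
# Toy model of rebase: replay each commit on branch as a patch onto master.
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]   # 0d7088f
BRANCH_COMMITS = [
    ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"],           # 3f3ca4f
    ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"],             # 09d3ac4
]
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]   # f2e864b


def make_patch(parent, commit):
    """Return {line index: (old, new)} for every line the commit changed."""
    return {i: (old, new)
            for i, (old, new) in enumerate(zip(parent, commit))
            if old != new}


def apply_patch(target, patch):
    """Apply a patch to a copy of target, skipping lines already changed."""
    result = list(target)
    for i, (old, new) in patch.items():
        if result[i] == old:
            result[i] = new                  # the patch applies as written
        elif result[i] != new:
            raise ValueError(f"conflict on line {i + 1}")
        # otherwise the line already has the new value: nothing to do
    return result


def rebase(base, commits, onto):
    """Replay commits (oldest first) on top of onto, one patch per commit."""
    result, parent = list(onto), base
    for commit in commits:
        result = apply_patch(result, make_patch(parent, commit))
        parent = commit
    return result


rebase_result = rebase(ANCESTOR, BRANCH_COMMITS, MASTER_TIP)
print(rebase_result)
```

Because each commit is replayed in turn, the second patch sees line two as 2 and changes it back to Two. A merge never gets the chance to do that, since it only compares the tips.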

Conclusion

Generally these sorts of changes will cause a conflict instead of different results. It was key that in branch we changed the contents of line 2 back to the contents in the common ancestor. That allowed the merge engine to consider that the line in branch was unchanged.

Merge Result    Rebase Result
One             One
2               Two
Three           Three
Four            Four
Five            Five
Six             Six
7               7
Eight           Eight

So... is this a problem?

It might seem concerning that this comes up when there was an apparent revert of your changes. Logically, both branch and master changed line two, but then branch changed it back. So although this seems rather contrived, it's not that unlikely.

But whether you prefer a merge workflow or a rebase workflow, you should be careful with your integration and follow good development practices:

  1. Code review, ideally using pull requests, so that your team members have visibility into changes before they're integrated into master.

  2. Continuous integration builds and tests, as part of your integration workflow. Ideally, with build policies to ensure that builds succeed and tests pass.

So make sure to do proper code reviews, which keep this an interesting difference instead of an actual problem in your workflow.

Footnotes

  1. Strictly speaking, the merge engine doesn't actually look at lines, it looks at groups of lines, or "hunks". But it's easier to reason about individual lines for this example. ↩

  2. By default, rebase will create and then apply patches, but when invoked as git rebase --merge it will cherry-pick the changes instead. This uses the merge engine instead of patch application, but in this example, the results are the same. ↩