code review illustration

Cleaning History for GitHub PRs

At CyberArk, code reviews are a part of daily life. Getting input from peers is essential to maintaining high code quality. Our code is hosted on GitHub, and engineers submit a GitHub pull request to make changes. The PR must be approved by another engineer before it can be merged to master.

The code review process is an iterative one. The reviewer typically asks for changes, the submitter makes the changes, the reviewer reviews the updates, and the cycle continues. The end result can be multiple changes to each of the commits that make up the PR.

Having a clean commit history is also very important for code quality. Ensuring that others can make sense of how a project has changed spreads the knowledge of the change beyond the submitter and reviewer.

Lately, I’ve been experimenting with ways to combine these two processes using Git’s interactive rebase feature. Note that we don’t share development branches, so it’s ok to rewrite history this way.

My typical daily workflow looks like this:

  1. Pick an issue to work on.
  2. Create and checkout a topic branch for the work (git checkout --branch), then
    • hack
    • commit small change
    • hack some more
    • commit another small change
    • push my messy branch to GH so others can see what I’m doing, git push (with --set-upstream the first time)
    • repeat until done
  3. Clean up my history locally (git rebase --interactive), push the updates to GitHub.
  4. Open a PR, ask for a review

This is where things get interesting. Assuming that my PR contains more than a single commit, and that my updates are non-trivial, it’s likely that the reviewer will suggest changes to more than a single commit.

From here, my flow looks like this:

  1. Respond to each comment on the PR. Acknowledge changes that I agree need to be made, ask questions about others. If I’ve changed the same file in multiple commits, decide which commit the requested update belongs in.
  2. Once we’ve agreed on a set of changes, start making them.
  3. Open the “Files Changed” tab on the GitHub PR.
  4. Open the “Changes from all commits” dropdown, pick the newest commit with a review comment (say it’s hash is abcd123). For this flow, it’s easiest to do the commits in reverse-chronological order, so the hashes of the older ones don’t change as you go.
  5. Use an interactive rebase to update it:
    • git rebase --interactive abcd123^
    • choose to edit abcd123
    • make updates
    • run the automated tests to ensure the changes are good
    • git add the changes, git commit --amend to update the commit
    • git rebase --continue
    • git push --force-with-lease the updates. This pushes rewritten history, so the push needs to be forced.
  6. Repeat from 4 until all updates are made.

I use some of the variants of these commands so frequently, I’ve added git aliases for them. I’ve got git ammend for git commit --amend, and git pushf for git pushf --force-with-leasepushf was particularly helpful, because command line completion for git push stops at --force, which is rarely what I want.

I use --force-with-lease, rather than --force, because it offers some protection against overwriting changes to the remote branch. I switched to using it after accidentally force-pushing changes from an out-of-date local branch. It does come with some caveats, though.

This process works well when the reviewer requests straightforward changes to commits. Sometimes, though, the changes are complicated, and require many modifications to multiple commits. In that case, you may be better off abandoning your current commits and creating new history altogether.

You can see examples of some of our attempts at keeping our history clean in the repositories for Conjur and Summon.