Streamline your TFS to Git migration with Gitflow

As a long-time TFS user, and a VSS user before that, I knew only what is most commonly referred to as centralized version control (CVCS).

I "bound" my code to a remote repository, which let me get the latest changes and begin making modifications locally. My local modifications would tell the server that I had "checked out" a file, either exclusively or non-exclusively, to avoid conflicts with other users. Essentially, my local copy of the repository was a live-connected representation of what was on the server in real time. This model works reasonably well when you have a small on-location team with an on-location server, because you can collaborate very closely with your team members. It's how we used to assemble our workforce: everyone under one roof.

Where this model starts to fall apart is in how software teams work in the modern distributed workforce. In this brave new remote world I've lived in for the past decade, centralized anything can be challenging. If I'm trying to edit a locked file at 2 a.m. while the team member who locked it is asleep, my options are to either go to bed or make my changes offline and hope I can reconcile them when the file becomes available. Now, I'm not saying there are no processes you can put on top of CVCS to avoid those types of conflicts, but I've found that most of them are pretty expensive to implement in terms of resources and productivity.

Imagine branching a large project for a minor feature. In TFS, a complete physical copy of the entire repository is made on the server and then pulled down locally before you can begin making changes. Now imagine 10 developers all working on features for that same project. Expensive, complicated, and slow all come to mind. Not to mention that without the server connection, all your branches and copies mean nothing. You have no change history and no idea of how the remote branches are tied together; you can't even diff or merge.

I'm just touching on a few topics here, but I think you get the point. My team and I had simply reached a place where we needed to find a better way.

So Git to the rescue? Yes and no. Git, arguably the most popular distributed version control system (DVCS), doesn't quite get it done on its own. For all its power, Git can present an entirely new series of headaches which, in some ways, can be much more painful than any centralized horror story I've encountered. Git is so powerful that a single developer can rewrite (rebase) the change history of an entire repository. They could then overwrite (force push) any number of 'remotes' with their version of history, wreaking havoc on everyone who copied (cloned) that repo before the rewrite. This power is available because when you clone or create a Git repository, its entire lifetime of changes is stored on disk in the inconspicuous .git folder. Powerful? Yep. Awesome!? Yep. Scary as hell? Yep. It's peer-to-peer coding; it's decentralized; it's both connected and disconnected at the same time; it's almost anything you want it to be in any process you want to use it with. "I see your version of history, but this is how it really went down": git push origin master -f
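To make the force-push scenario concrete, here is a minimal sketch that plays it out in throwaway directories. The names alice and bob, the temporary paths, and the commit messages are placeholders for illustration; a bare repository on local disk stands in for the shared remote.

```shell
#!/bin/sh
set -eu

# Everything happens in a temporary directory (assumption: mktemp and git are installed).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # stands in for the shared server

# Alice creates two commits and publishes them.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master    # name the first branch "master"
git config user.name "Alice"
git config user.email "alice@example.com"
git config commit.gpgsign false
git remote add origin "$work/origin.git"
echo "one" > file.txt
git add file.txt
git commit -q -m "first"
echo "two" >> file.txt
git commit -q -am "second"
git push -q origin master

# Bob clones: the whole history now lives in his own .git folder.
git clone -q --branch master "$work/origin.git" "$work/bob"
bob_before=$(git -C "$work/bob" rev-parse HEAD)

# Alice rewrites the latest commit and overwrites the remote with her version.
git commit -q --amend -m "second (rewritten)"
git push -q --force origin master

# Bob fetches and finds that his master and the remote's master have diverged.
git -C "$work/bob" fetch -q origin
ahead=$(git -C "$work/bob" rev-list --count origin/master..master)
behind=$(git -C "$work/bob" rev-list --count master..origin/master)
echo "bob is $ahead ahead and $behind behind origin/master"
```

Bob did nothing wrong, yet his copy no longer lines up with the remote: the commit he cloned has been replaced by a different one, and he now has to reconcile the two versions of history himself.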

This brings me to Gitflow. Gitflow is essentially a branching model that gives you the safety and control of a centralized system while still keeping most of the decentralized power of Git. You still have the freedom of having the repository and all its history locally, but release code, merges, and collaboration all happen on the server. The server keeps the 'true' version of what's being worked on (the develop branch) and the 'true' version of what's in production (the master branch). Beyond those two trunks can be any number of branches that match a specific workflow, all kept sane by standard branch naming conventions. So feature/my-dev-feature would branch directly off the 'origin' develop branch, so that I could check it out and begin my work locally while 'pushing' to the server (origin) every so often for safety and potential collaboration. It's process, rules, and code of conduct. If Git were life, then Gitflow would be the laws we follow to maintain a civilized society. Louis C.K. says it best: "The number one thing preventing murder is the law against murder"...dark, funny, and probably true ;)
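The branch layout described above can be sketched with plain Git commands. This is only an illustration under stated assumptions: a bare repository in a temporary directory plays the role of the server (origin), and the file names and commit messages are made up.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q --bare "$work/origin.git"                     # stand-in for the server
git clone -q "$work/origin.git" "$work/dev" 2>/dev/null   # hides the "empty repository" warning
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master                   # first branch will be "master"
git config user.name "Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# master: the 'true' version of what is in production.
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial release"
git push -q origin master

# develop: the 'true' version of what is being worked on.
git checkout -q -b develop
git push -q origin develop

# A feature branches directly off develop and is pushed every so often.
git checkout -q -b feature/my-dev-feature develop
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Start my-dev-feature"
git push -q origin feature/my-dev-feature

# List what the server now knows about.
git branch -r
```

The naming convention is doing most of the work here: anything under feature/ is understood to start from develop and to return there, which is what lets tools and teammates reason about a branch from its name alone.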

Merging in Gitflow

Depending on how you want to design your workflow, you've got a few options here. Before we go there, though, as the developer, you have some excellent options to control how the changeset history will look before it enters the remote origin branch. With an interactive rebase (git rebase -i), the developer can review each commit on their branch and choose which ones to leave intact (pick) and which ones to combine (squash). So commit small and often, then clean up when you're ready to merge. It's a very powerful way to organize changes so that others can easily understand them.
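Here is a small sketch of the pick/squash idea. Normally git rebase -i opens the todo list in your editor and you change the words by hand; to keep the example runnable without an editor, it assumes sed is available and uses the GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables to make the same edit automatically. The commit names are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Helper: append a line to notes.txt and commit it under the given message.
save() {
  echo "$1" >> notes.txt
  git add notes.txt
  git commit -q -m "$1"
}

save "base"
save "wip-1"
save "wip-2"
save "wip-3"

# Rewrite the last three commits. The sequence editor keeps the first todo
# line as "pick" and turns every later "pick" into "squash"; GIT_EDITOR=true
# accepts the combined commit message that Git proposes.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$ s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -q -i HEAD~3

git log --oneline
commit_count=$(git rev-list --count HEAD)
echo "commits after squash: $commit_count"
```

The three small work-in-progress commits collapse into one, while the file content is unchanged. Done by hand, you would also take the opportunity to write a single clear message for the combined commit.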

Pull request

The best way for me to think of a pull request in terms of workflow is that it's a code review. When a formal review is something your process requires, like being the gatekeeper of a large project with many contributors, a pull request model is a great way to collaborate on a proposed merge. In all of my years doing code reviews, this has been the most satisfying part of my enterprise migration. And when you combine this with a CI server that automatically builds each branch, you have yet another powerful form of quality control when a pull request comes in. It doesn't get better than that when it comes to a formal review and merge process. Tools matter here, though. Products like GitHub, Bitbucket, Stash, and others are there to make this as easy and as second nature as possible. The commit comment is only the first part of the discussion. It's the reasoning, the collaboration, and the line-by-line comments that make this such a powerful way to integrate code.

Fast-forward vs 3-way merging

In TFS, everything is a 3-way merge, meaning that when merging changes from one branch to another, you need a dedicated merge commit to resolve potential conflicts, even if there aren't any to resolve. Git streamlines this a bit by doing a fast-forward merge when your changes remain on a linear path. For example, if you branch from develop, make your changes on feature, then merge back in before any other changes have been made on develop, your commits are simply tacked on top of develop. No 3-way merge is required because the path remained linear from your feature back to develop, which allows the develop branch to "fast-forward" to the end of your merged commits. Now, you could have easily just made the changes right on the develop branch, but why would you when you have this kind of safety? You can cheaply branch, just in case changes are committed while you're making yours, and then merge as if the branch never happened. Safe, clean, and even one click with the right tools. When Git cannot perform a fast-forward merge, it falls back to a standard 3-way merge, which allows you to resolve any conflicts and creates a dedicated merge commit in your repo history. You could, of course, force a merge commit even if a fast-forward is available.
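The three situations just described (a fast-forward, a fallback 3-way merge, and a forced merge commit) can be reproduced in a scratch repository. This is a sketch with made-up branch and file names; it counts merge commits after each step so the difference is visible.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/develop   # start on a branch named "develop"
git config user.name "Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Helper: create a file named after the argument and commit it.
save() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

save base

# Case 1: develop has not moved, so the merge is a fast-forward.
git checkout -q -b feature/one
save one
git checkout -q develop
git merge -q --ff-only feature/one
merges_ff=$(git rev-list --count --merges HEAD)

# Case 2: develop moved while the feature was in progress, so Git
# falls back to a 3-way merge and records a merge commit.
git checkout -q -b feature/two
save two
git checkout -q develop
save hotfix
git merge -q --no-edit feature/two
merges_3way=$(git rev-list --count --merges HEAD)

# Case 3: a fast-forward is possible, but --no-ff forces a merge commit.
git checkout -q -b feature/three
save three
git checkout -q develop
git merge -q --no-ff --no-edit feature/three
merges_noff=$(git rev-list --count --merges HEAD)

echo "merge commits: after ff=$merges_ff, after 3-way=$merges_3way, after --no-ff=$merges_noff"
git log --oneline --graph
```

The --ff-only flag in the first case is a safety net rather than a requirement: it makes Git refuse the merge instead of silently creating a merge commit if the history turns out not to be linear.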

Rebasing

Sometimes, I consider this method of merging my changes the best of both worlds. For example, say you have a long-running feature that you branched off the main branch. Over the course of development, changes by other team members have been merged into the main branch, leaving your feature branch out of sync with the latest changes to the code base. Before feature completion, you have two basic options: you can 3-way merge the latest changes onto your feature branch, creating a dedicated merge commit, or you can rebase your branch onto the main branch, creating a linear history for a potential fast-forward merge. That doesn't mean there won't be conflicts, but with a rebase you can deal with those conflicts and history changes on your local branch before the Gitflow merge onto the remote develop branch. This method of rewriting history is safe because you haven't yet pushed your changes to the main remote branch, so the fact that they've been rewritten locally will not affect others who've cloned it.

When viewing your history, this ultimately looks like each change was made one after the other. You've effectively stuffed all of the latest changes behind your new ones, making it look like they'd been there all along. That is the best-of-both-worlds part. You can avoid unnecessary merge commits when you bring new changes into a long-running feature and still get the one controlled merge commit onto the main branch when you complete the feature. You could, of course, fast-forward merge your feature, but then you would have no visual record of when that code was branched and when it was integrated into the main code base. This ultimately means that I have a choice, which is really what this is about. It's less about the "right" way and more about the power of choice and what makes the most sense for your project or enterprise.

On our team, we rebase onto develop before the merge, use an interactive rebase to organize the feature's new history, and finally create a single merge commit back onto develop. Once complete, the local and remote feature branches are removed to reduce clutter.
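As a rough sketch of that sequence (rebase onto develop, one merge commit, then cleanup), the following runs end to end in a temporary directory. The branch name feature/report and the commit names are hypothetical, a local bare repository stands in for the server, and the interactive tidy-up step is left out because it needs an editor.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # stand-in for the server
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/develop
git config user.name "Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false
git remote add origin "$work/origin.git"

# Helper: create a file named after the argument and commit it.
save() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

save base
git push -q origin develop

# Long-running feature with a couple of commits.
git checkout -q -b feature/report
save report-part-1
save report-part-2
git push -q origin feature/report

# Meanwhile, another change lands on develop.
git checkout -q develop
save teammate-change
git push -q origin develop

# 1. Rebase the feature onto the latest develop.
git checkout -q feature/report
git fetch -q origin
git rebase -q origin/develop

# 2. (An interactive rebase to tidy the commits would go here.)

# 3. One controlled merge commit back onto develop.
git checkout -q develop
git merge -q --no-ff --no-edit feature/report
git push -q origin develop

# 4. Remove the remote and local feature branches.
git push -q origin --delete feature/report
git branch -q -d feature/report

merge_count=$(git rev-list --count --merges develop)
git log --oneline --graph develop
```

The resulting graph shows a single bubble: develop's own line of history, with the feature's commits hanging off one merge commit that records when the work was integrated.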

Bringing it all together

These are just some of the ways Git and Gitflow have impacted how my team works together. A big part of using a workflow like this is adopting tools that embrace it. In my particular setup, I'm using both SourceTree and the Visual Studio Git tools as my Git clients, Stash and Visual Studio Online as my Git servers, Jira for issue tracking, and both Bamboo and TeamCity as my CI servers. Each tool in the chain enforces the workflow and links everything together as you move through the process. For example, a feature in Jira becomes a feature branch on Stash with the id of the issue in the branch name. When that feature branch is pushed to the remote server, the CI server detects the branch, checks it out, and builds it. It all ultimately leads back to the issue in Jira. Powerful stuff with a killer audit trail for quality control. With Git, though, no tool can (or should) completely replace the command line. If you're going to embrace Git, then embrace the CLI in all its glory. Not a day goes by when I don't fall back to the shell. Some things are faster and easier that way, so the sooner you get used to it, the more empowered you'll feel and the more productive you'll be. I still learn something new every day, which is probably what I enjoy most.

But wait, there's more! Here are some great resources to get you Gitflowing in no time: