Git Policies

We use git for development work.

If anyone is new to git, I strongly recommend they read (at least the first couple of chapters of) Pro Git - it is available online so won't cost you a penny. Reading the first couple of chapters will take you less than an hour, and it will give you a firm grounding in the concepts behind git. Attempting to treat git as 'a better cvs/svn' etc will just lead to you getting frustrated and confused.

The big features of git are:

  • its ability to work offline (the entire history of a project is contained on your machine)
  • its speed
  • its use of compression
  • its powerful use of branches (and merging in work from multiple places)

With this power, however, comes the ability to easily turn development projects into a horrible mess of branches and merges.

We therefore have certain policies in place that hopefully allow us to benefit from git's strengths, without surrendering the readability of our project history.

Repositories

Our central git repository is git.ghostscript.com. This allows anyone to read the repos shown there, but writing to those repos can only be done via ssh to casper.

Central "Golden" repos.

All our projects have central repositories on casper (ghostscript.com). For example:

   /home/git/ghostpdl.git
   /home/git/mupdf.git

For the purposes of this discussion I will refer to these as the golden repos (or just golden in git commands).

These golden repos are the public-facing repos - this is where outside users can (and do) download our code from. Accordingly we are very careful about what goes into these repos. Essentially code only goes into golden once it's been tested (and ideally reviewed).

We never force push to golden. Well, OK, we sometimes force push to golden, but it's a rare occurrence and we only do it when we have to.

These are bare repos. (If you don't know what a bare repo is, then you need to go and read up on git).
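For anyone unsure of the distinction, here is a minimal sketch that creates one bare and one non-bare repo and asks git which is which. The directory names are made up for illustration, and everything happens in a throwaway temporary directory rather than on casper.

```shell
#!/bin/sh
# Sketch only: the paths below are hypothetical, not real casper paths.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# A bare repo holds only the history (no checked-out files); it is what you push to.
git init --quiet --bare "$demo/example.git"
# A non-bare repo has a working tree plus a .git directory; it is what you develop in.
git init --quiet "$demo/example"

bare_flag=$(git -C "$demo/example.git" rev-parse --is-bare-repository)
work_flag=$(git -C "$demo/example" rev-parse --is-bare-repository)
echo "example.git is bare: $bare_flag"
echo "example is bare:     $work_flag"
```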

We also have some private git repositories for our closed source software (like SOT). I'm not going to put their locations on a public twiki, but they are approached in the same manner.

Personal repos on casper.

Every "core" developer (by which I mean Artifex staff member, contractor, trusted friend etc) should have a set of personal repos on casper. The ones for public code (Ghostscript, MuPDF etc) are generally held in a repos directory within their home directory.

These are bare repos, and can be generated by something like:

   cd ~/repos
   git clone --bare /home/git/ghostpdl.git ghostpdl.git

These repos will automatically appear in the git.ghostscript.com listing.

Accordingly, we tend to have a private-repos directory in each of our home directories to contain personal checkouts of the non-public code projects. These (obviously) do not appear on the web view.

Your personal (and private) repos are yours to do with as you please. Make as many branches as you like. Rewrite history. Force push to your heart's content. While other people can see your repo, this is basically your sandbox to do what you want.

The key purposes of these personal repos are:

  • Backup (anything put here will be backed up as part of casper's normal backup procedure, protecting you if your laptop dies).
  • Collaboration (anything put here can be viewed (and 'reviewed') by co-workers).

For the purposes of this discussion I will refer to these repos as personal repos (or just personal in git commands).

Personal repos on your local machine.

Obviously, you'll need a repo on your local machine within which you'll do your development. You can set it up using a command like:

   git clone USERNAME@ghostscript.com:/home/USERNAME/repos/ghostpdl.git ghostpdl.git

(This will leave origin pointing to your personal repo.)

This will be a non-bare repo, and is once again yours to do with as you wish.

Generally you'll want to have a set of 'remotes' set up to allow you to get access to code written by others:

   git remote add golden USERNAME@ghostscript.com:/home/git/ghostpdl.git
   git remote add robin USERNAME@ghostscript.com:/home/robin/repos/ghostpdl.git
   git remote add ken USERNAME@ghostscript.com:/home/ken/repos/ghostpdl.git
   git remote add ray USERNAME@ghostscript.com:/home/ray/repos/ghostpdl.git

etc.
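Note that a plain clone names its remote origin, while the commands later on this page push to a remote called personal. One way to reconcile the two, offered as an assumption rather than as documented policy, is to rename origin after cloning. The sketch below tries this out with local temporary directories standing in for the repos on casper; all paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: local directories stand in for casper; nothing here touches a real server.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# Stand-in for /home/git/ghostpdl.git (golden).
git init --quiet "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
git -C "$demo/seed" commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/golden.git"

# Stand-in for ~/repos/ghostpdl.git (personal), made as described earlier.
git clone --quiet --bare "$demo/golden.git" "$demo/personal.git"

# Local working clone: the remote starts out as "origin".
git clone --quiet "$demo/personal.git" "$demo/work"
cd "$demo/work"

# Assumption: rename it so that "personal" works in later commands.
git remote rename origin personal
git remote add golden "$demo/golden.git"
git fetch --quiet golden

git remote -v
```

An alternative with the same effect is to pass -o personal to git clone in the first place.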

Policies

Be Excellent To Golden

We never (OK, almost never) force push to golden.

Every commit on golden/master is tested automatically by the cluster, and we do our very best to ensure that these tests never fail by cluster testing the code before we push it.

Of course accidents do happen, but by and large we really want every commit on golden/master to be a good one. In particular if we ever have to bisect through history looking for where things went wrong, it's a real pain if commits are broken.

Merges Are Bad (M'Kay)

In general, we avoid ever pushing merge commits to golden.

We rebase rather than merge (again, if you don't know what this means, please read enough of the git book so you do).

Merges are never required for simple single commit bug fixes. They are never required for branches which have just had a single developer working on them.

In very rare cases, when 2 or more developers have been collaborating on a branch for a long period of time, a merge at the end may be acceptable.

See the Workflow section below for how we avoid merges.

If you get this wrong, you will get whinged at. If you get this wrong repeatedly, we'll get really annoyed with you.

Write Lucid Commit Messages

All commit messages should be clear. This has not been the case in the past with commits we've inherited from pre-git times.

The first line should be a potted description of the bug (e.g. "Bug 695140: Problem with CMYK colorspace in tif device"). Try to keep this below about 70 chars to avoid it being truncated in certain git commands.

Then leave a blank line and write full details of the fix. Please do mention bug numbers. Please do mention test files. Please do mention implementation decisions.

The goal is that when a developer looks back at your commit in 4 years' time, they should be able to understand what your fix does, and why you did it in this way.
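As a rough illustration of the shape described above, the sketch below commits with a short subject line, a blank line, and a body. The subject is the example from this page; the body text and file name are invented placeholders, not a real fix.

```shell
#!/bin/sh
# Sketch only: throwaway repo; body text and file name are hypothetical.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

git init --quiet "$demo/repo"
cd "$demo/repo"
echo "placeholder change" > fix.txt
git add fix.txt

# -F - reads the whole message from standard input.
git commit --quiet -F - <<'EOF'
Bug 695140: Problem with CMYK colorspace in tif device

(Placeholder body.) Say what was going wrong and which test file
showed it, describe how the fix works, and note why this approach
was chosen over the alternatives.
EOF

# Check the subject line stays under the suggested limit of about 70 characters.
subject=$(git log -1 --format=%s)
subject_len=${#subject}
echo "subject ($subject_len chars): $subject"
```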

Workflow

Day to day updating.

In general, you can pull any changes from upstream into your git repo using:

git pull --rebase golden master

This will pull in changes from the master branch in the golden repo into your current branch, and rebase any of your outstanding commits on to the end of that.

This has the effect of avoiding merges.
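To see the effect, here is a small sketch in which two hypothetical developers (dev1 and dev2) work against a local stand-in for golden. dev2 has an unpushed commit when dev1 publishes one; after git pull --rebase, dev2's history is still a straight line with no merge commits.

```shell
#!/bin/sh
# Sketch only: local directories stand in for golden and for two developers' machines.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

# Helper: commit_file <repo> <file> <message>
commit_file() {
    echo "$3" >> "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit --quiet -m "$3"
}

git init --quiet "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
git -C "$demo/seed" commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/golden.git"

git clone --quiet -o golden "$demo/golden.git" "$demo/dev1"
git clone --quiet -o golden "$demo/golden.git" "$demo/dev2"

# dev1 publishes a commit to golden.
commit_file "$demo/dev1" a.txt "dev1: fix A"
git -C "$demo/dev1" push --quiet golden master

# dev2 has local work that golden has not seen yet.
commit_file "$demo/dev2" b.txt "dev2: fix B"

# Fetch golden/master and replay dev2's commit on top of it.
git -C "$demo/dev2" pull --quiet --rebase golden master

merges=$(git -C "$demo/dev2" rev-list --merges --count HEAD)
echo "merge commits in dev2 history: $merges"
git -C "$demo/dev2" log --oneline
```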

Simple fixes

To do a simple fix, just change the code, and commit it to your local repo. Push it up to your personal repo on casper. Cluster test it.

If the cluster tests fail, fix the commit (git commit --amend), push the fixed version up to your personal repo (git push -f personal master), and retest. Rinse and repeat.

You can then ask a colleague to review your code. This is an informal process in which another developer looks over your code (either by pulling it from your personal repo, or looking at it using the web view at http://git.ghostscript.com). This captures a huge number of silly typos/thinkos etc. For some projects (MuPDF, SOT) this is a compulsory step, for others (like Ghostscript) it's just recommended.

Finally you can push your work to golden (git push golden master). If the push fails, it's probably because someone else has committed before you. Just pull in their changes (git pull --rebase golden master) and then try pushing again. Never force push to golden.
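The whole simple-fix cycle can be sketched end to end as follows. This is a rehearsal under stated assumptions: local directories stand in for golden and personal, the cluster test and review steps are only comments, and the bug number and file names are placeholders.

```shell
#!/bin/sh
# Sketch only: nothing here talks to casper or the cluster.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

git init --quiet "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
git -C "$demo/seed" commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/golden.git"
git clone --quiet --bare "$demo/golden.git" "$demo/personal.git"
git clone --quiet -o personal "$demo/personal.git" "$demo/work"
cd "$demo/work"
git remote add golden "$demo/golden.git"

# 1. Make the fix and push it to the personal repo (cluster test from there).
echo "first attempt" > fix.txt
git add fix.txt
git commit --quiet -m "Bug NNNN: placeholder fix"
git push --quiet personal master

# 2. Tests failed: amend the commit and force push, but only to personal.
echo "second attempt" > fix.txt
git commit --quiet --amend --no-edit -a
git push --quiet -f personal master

# Meanwhile, someone else publishes to golden.
git clone --quiet -o golden "$demo/golden.git" "$demo/other"
echo "other work" > "$demo/other/other.txt"
git -C "$demo/other" add other.txt
git -C "$demo/other" commit --quiet -m "Someone else's change"
git -C "$demo/other" push --quiet golden master

# 3. A plain push to golden is rejected (not a fast-forward). No -f here.
if git push --quiet golden master 2>/dev/null; then
    pushed_first_try=yes
else
    pushed_first_try=no
fi

# 4. Rebase onto their work, then push again.
git pull --quiet --rebase golden master
git push --quiet golden master
echo "first push accepted: $pushed_first_try"
git log --oneline
```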

More major development work

If you have more major development work to do, it can be worth using a local branch. This enables you to easily step away from the work and come back to it some time later as priorities change.

Create a local branch (git checkout -b foo-dev-branch). Do all the development you want, making as many commits onto your branch as you want.

Ideally every commit should build/run/test as normal, and you can cluster test as you go along to be sure this is the case. If build breakages are unavoidable, then see the next section.

Conventional git workflow would have you periodically merge master into your development branch to pull in changes. We do not do that. Instead, we prefer to rebase development branches (repeatedly) onto the end of master (git pull --rebase golden master).

This avoids merges and keeps the history clean.

Once development work is complete, we do one more rebase (git pull --rebase golden master). This leaves your new branch at the end of master.

You can then push your branch up onto your personal repo (git push personal foo-dev-branch), or, if you have a previous version there already, force push it (git push -f personal foo-dev-branch).

A neat trick is to also do:

git push personal foo-dev-branch:master

This puts foo-dev-branch onto your personal repo as master (i.e. all your commits are pushed onto the end of master). You can then pull it back down to your local machine using git pull --rebase personal master.

Your reviewer can then look over it, and when they're happy (and all the cluster tests have passed), you can publish it to golden (git push golden master).

Again, if the push fails, it's probably because someone else has committed before you. Just pull in their changes (git pull --rebase golden master) and then try pushing again. Never force push to golden.
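Here is a sketch of the branch workflow from start to finish, again with local directories standing in for golden and personal. One assumption is made explicit: the final git pull --rebase personal master is run on the local master branch, so that git push golden master has the new commits to send.

```shell
#!/bin/sh
# Sketch only: hypothetical file names and commit messages; no real servers involved.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

git init --quiet "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
git -C "$demo/seed" commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/golden.git"
git clone --quiet --bare "$demo/golden.git" "$demo/personal.git"
git clone --quiet -o personal "$demo/personal.git" "$demo/work"
cd "$demo/work"
git remote add golden "$demo/golden.git"

# Develop on a local branch, one commit per step.
git checkout --quiet -b foo-dev-branch
for step in 1 2 3; do
    echo "step $step" >> feature.txt
    git add feature.txt
    git commit --quiet -m "Feature work, step $step"
done

# Meanwhile golden/master moves on.
git clone --quiet -o golden "$demo/golden.git" "$demo/other"
echo "other work" > "$demo/other/other.txt"
git -C "$demo/other" add other.txt
git -C "$demo/other" commit --quiet -m "Someone else's change"
git -C "$demo/other" push --quiet golden master

# Rebase the branch onto the end of golden/master (instead of merging).
git pull --quiet --rebase golden master

# Publish to the personal repo for testing and review, as a branch and as master.
git push --quiet personal foo-dev-branch
git push --quiet personal foo-dev-branch:master

# After review: bring local master up to date, then publish to golden.
git checkout --quiet master
git pull --quiet --rebase personal master
git push --quiet golden master
git log --oneline
```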

Major refactors in which the build or tests break.

Sometimes during development it's very hard to ensure that the build or tests do not get broken along the way. (Indeed during refactors, it can be quite deliberate that certain things break). The traditional git workflow for such things is to finish the entire branch of development, merge it back to master, and then test the result there. If it works, commit it.

We do not like this, as it causes problems. While the cluster is smart enough not to test every entry on a branch, git bisect is not.

When using git bisect to hunt for where a particular problem was introduced it will look at all the parents of a merge commit. This means that bisects through a repo that includes merges of branches that include build or test breakages can be very painful.
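For context, this is roughly what a bisect looks like when history is linear and every commit can be tested. The sketch builds six commits in a throwaway repo, plants a marker line ("bug") in the fourth, and lets git bisect run find it. The file name, the marker and the grep-based test are all stand-ins for a real build-and-test script.

```shell
#!/bin/sh
# Sketch only: a toy repo in which "commit 4" introduces the problem.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

git init --quiet "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master

for n in 1 2 3 4 5 6; do
    if [ "$n" -eq 4 ]; then
        echo "bug" >> code.txt        # the change we will hunt for
    else
        echo "line $n" >> code.txt
    fi
    git add code.txt
    git commit --quiet -m "commit $n"
done

# Known bad: HEAD. Known good: the very first commit.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" >/dev/null

# The test command exits 0 for a good commit and non-zero for a bad one.
git bisect run sh -c '! grep -q bug code.txt' >/dev/null

# While the bisect is still active, refs/bisect/bad names the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"
```

If some of the intermediate commits could not be built at all, the test command would have no meaningful answer for them, which is the pain described above.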

On the other hand, if we were simply to squash the entire branch history down to a single commit (known not to break anything) valuable development history would be lost.

We therefore propose a compromise solution that keeps git bisect working nicely, but also maintains the development history.

As before, create a local branch (git checkout -b foo-dev-branch). Do all the development you want, making as many commits onto your branch as you want.

These commits can break the build/tests etc, but ideally should still form lucid steps of development with sane commit messages.

Again, we (repeatedly) rebase the branch onto the end of master to pull in any new developments (git pull --rebase golden master).

Once development is complete, we test and review as usual.

We then do 2 things. Firstly, we publish the entire development branch to golden (git push golden foo-dev-branch). It is therefore important to pick a decent name for your development branch, or to rename it before pushing; xxx-dev-branch seems a reasonable template to follow.

Next, we squash the entirety of the branch to a single commit (using git rebase -i master). We ensure that the squashed commit message includes text such as "Full development history of this commit can be seen on foo-dev-branch."

This single commit is then pushed to golden/master (git push golden master).

Some things to note about this method of working:

  • Only a single commit ends up on golden/master; one that is known to build and pass tests.
  • In the event of a git bisect none of the (potentially broken) commits along foo-dev-branch are tested, hence git bisect continues smoothly.
  • In the event that someone wants to see the full development history of the branch, the 'squashed' commit clearly points them to it.
  • The squashed commit has exactly the same content as a merge commit would have, just with the troublesome parent (the development branch) removed.
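The publish-then-squash procedure can be rehearsed with the sketch below. Note one substitution: the text above squashes with git rebase -i master, which is interactive, so this sketch uses git merge --squash on master instead, which gives a single commit with the same content without needing an editor. Directory names, file names and commit subjects are placeholders, and golden does not move during the demo, so the rebase step is omitted.

```shell
#!/bin/sh
# Sketch only: a local directory stands in for golden; merge --squash replaces rebase -i.
set -eu
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.invalid"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.invalid"

git init --quiet "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
git -C "$demo/seed" commit --quiet --allow-empty -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/golden.git"
git clone --quiet -o golden "$demo/golden.git" "$demo/work"
cd "$demo/work"

# Development branch whose intermediate commits may not build or pass tests.
git checkout --quiet -b foo-dev-branch
for step in 1 2 3; do
    echo "refactor step $step" >> refactor.txt
    git add refactor.txt
    git commit --quiet -m "Refactor step $step (may not build)"
done

# 1. Publish the full development history as a branch on golden.
git push --quiet golden foo-dev-branch

# 2. Squash the whole branch into one commit on master.
git checkout --quiet master
git merge --squash foo-dev-branch >/dev/null
git commit --quiet -m "Refactor foo (placeholder subject)" \
    -m "Full development history of this commit can be seen on foo-dev-branch."

# 3. Publish the single, tested commit.
git push --quiet golden master

# The squashed commit and the branch tip contain exactly the same files.
master_tree=$(git rev-parse "master^{tree}")
branch_tree=$(git rev-parse "foo-dev-branch^{tree}")
echo "master tree: $master_tree"
echo "branch tree: $branch_tree"
git log --oneline master
```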

-- Robin Watts - 2015-01-14

Topic revision: r3 - 2018-07-02 - RobinWatts
 
Copyright 2014 Artifex Software Inc