Version control

To keep our code transparent and findable, our preferred code hosting platform is GitHub and our version control system is git. Repositories should preferably be public from the start.

GitHub

The Netherlands eScience Center uses GitHub (http://www.github.com) for version control.

By default, an eScience Research Engineer is expected to create a new GitHub organization for each project and to create the project's repositories there. However, when a repository is used in multiple projects, it should be created in the Netherlands eScience Center GitHub organization (https://github.com/NLeSC).

Policy

Version control from the beginning of the project

It is highly recommended to start using version control on day one of the project.

Use git as version control system

Other version control systems can be used if the project did not start at the eScience Center and does not use git, or when the prevailing version control system in the particular community is not git. Even then, switching to git should be considered (especially if Subversion or another centralised system is used).

Git documentation:

Choose one branching model

Make the choice explicit in the contribution guidelines, and link to documentation on how to get started with it. Our default choice is the GitHub flow branching model.

GitHub flow is a very simple and sane branching model. It supports collaboration and is based on pull requests, so it relies heavily on GitHub. The chapter Contributing to a Project of the Pro Git book describes in detail how to collaborate on a project using git branches, forks and GitHub. Other, more complicated models could be used if necessary, but we should strive for simplicity and uniformity within the eScience Center, since that will enhance collaboration between the engineers. Learning a new branching model should not stand in the way of contributions. You can learn more about those other models from the Atlassian page.
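As an illustration, one round of GitHub flow can be sketched as a short sequence of git commands, written here as a small Python helper. This is a minimal sketch, not a prescribed tool: the function names, the base branch name main, the remote name origin and the example branch name are all assumptions, and the script only prints the commands unless dry_run is set to False. Opening and merging the pull request still happens on GitHub.

```python
import subprocess


def github_flow_commands(branch, message, base="main", remote="origin"):
    """Return the git commands for one GitHub flow round, in order.

    The base branch "main" and the remote "origin" are assumed defaults;
    adjust them to match your repository.
    """
    return [
        ["git", "switch", base],                # start from the base branch
        ["git", "pull", remote, base],          # bring it up to date
        ["git", "switch", "-c", branch],        # create the topic branch
        # ... edit files here ...
        ["git", "add", "--all"],                # stage the changes
        ["git", "commit", "-m", message],       # commit them
        ["git", "push", "--set-upstream", remote, branch],  # publish the branch
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical branch name and commit message, shown as a dry run.
    run_all(github_flow_commands("fix-typo-in-readme", "Fix typo in README"))
```

After the push, open a pull request from the topic branch on GitHub, discuss and review it there, and merge it into the base branch.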

Repositories should be public

Unless code cannot be open (e.g. when working with commercial partners, or when there are competitiveness issues), it should be in a public online repository. If the code uses data that cannot be open, an engineer should try to keep the sensitive parts outside of the main codebase. If you accidentally included copyrighted files in your repository, you need to remove them from HEAD as well as from the history. There is a gist here that explains how.

Meaningful commit messages

Commit messages are the way for other developers to understand changes in the codebase. When using the GitHub flow model, commit messages can be very short, but pull request comments should explain all the changes. It is very important to explain the why behind implementation choices. To learn more about writing good commit messages, read tpope’s guide and this post.
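A few formatting conventions for commit messages can be checked automatically. The sketch below assumes the widely used 50/72 convention (a subject line of at most 50 characters, a blank second line, and body lines of at most 72 characters); the function name and the thresholds are illustrative assumptions, not eScience Center policy. It cannot judge whether a message explains the why, which remains the author's job.

```python
def check_commit_message(message, subject_max=50, body_max=72):
    """Return a list of formatting problems found in a commit message.

    The 50/72 limits are an assumed convention; pass other values if your
    project uses different ones.
    """
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["subject line is empty"]

    problems = []
    if len(lines[0]) > subject_max:
        problems.append(f"subject line is longer than {subject_max} characters")
    # A body, if present, should be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_max:
            problems.append(f"line {number} is longer than {body_max} characters")
    return problems


if __name__ == "__main__":
    example = "Cache parsed config\n\nParsing on every request was slow."
    print(check_commit_message(example))
```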

GitHub has some interesting features that allow you to close issues directly from commit messages.
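For example, GitHub recognises keywords such as close, fix and resolve (and their variants, e.g. fixes or closed) followed by an issue number, and closes the referenced issue once the commit lands on the default branch. The sketch below shows which issue numbers a message would reference; the regular expression is a simplified approximation written for illustration, not GitHub's own parser, and it ignores cross-repository references.

```python
import re

# Simplified approximation of GitHub's closing keywords, for illustration only.
CLOSING_ISSUE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(message):
    """Return the issue numbers that a commit message asks to close."""
    return [int(number) for number in CLOSING_ISSUE.findall(message)]


if __name__ == "__main__":
    # Hypothetical commit message referencing two issues.
    print(issues_closed_by("Handle empty input, fixes #12 and closes #7"))
```

A plain mention such as "See #5" links the commit to the issue without closing it.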

Code snippets library

Sometimes, we develop small snippets of code that can be directly reused in other projects, but that are too small to put in a library. We store these code snippets in git, in GitHub Gists.