Basic ideas

Any time multiple versions of a document exist, whether because the document changes over time or because multiple authors are working on it, some kind of version control is needed.

Version control allows:

  • accessing any version from the original to the most recent;
  • seeing what has changed from one version to the next;
  • giving a label of some kind to distinguish a particular version.
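
As a rough illustration of these three capabilities, here is a minimal Python sketch of a toy version history for a single document. The History class, its method names and the labels are all hypothetical, invented for this example; real version control systems store and compare versions far more efficiently.

```python
import difflib


class History:
    """A toy in-memory version history for one document (illustration only)."""

    def __init__(self):
        self.versions = []  # versions[0] is the original, versions[-1] the most recent
        self.labels = {}    # label -> index into self.versions

    def save(self, text, label=None):
        """Store a new version and optionally attach a label to it."""
        self.versions.append(text)
        index = len(self.versions) - 1
        if label is not None:
            self.labels[label] = index
        return index

    def get(self, ref):
        """Return a version by index or by label."""
        index = self.labels[ref] if isinstance(ref, str) else ref
        return self.versions[index]

    def changes(self, old, new):
        """Return the differences between two versions as unified-diff lines."""
        a = self.get(old).splitlines()
        b = self.get(new).splitlines()
        return list(difflib.unified_diff(a, b, "old", "new", lineterm=""))


history = History()
history.save("Methods\nWe measured X.", label="first-draft")
history.save("Methods\nWe measured X and Y.", label="submitted")

# Show what changed between the two labelled versions.
for line in history.changes("first-draft", "submitted"):
    print(line)
```

Lines starting with "-" in the printed diff were removed and lines starting with "+" were added, which is the same convention that tools such as Git use when displaying changes.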

Examples of version control systems

The simplest method of version control is probably the most widely used in science: changing the file name.

[Cartoon from “Piled Higher and Deeper” by Jorge Cham, www.phdcomics.com]

Other examples include:
  • “track changes” in Microsoft Word
  • Time Machine in Mac OS X
  • versioning in Dropbox and Google Drive
  • formal version control systems such as CVS, Subversion, Mercurial and Git

The importance of tracking projects, not individual files

Early version control systems, such as CVS, track each file separately: each file has its own version number. The same is true of Dropbox and Microsoft Word.

This is a problem when you change several files at once and the changes in one file depend on changes in another, because nothing records which versions of the different files belong together.

In modern version control systems, and in backup-based systems such as Time Machine, entire directory trees are tracked as a unit, which means that each version corresponds to the state of an entire project at a point in time.
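
The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the file names, contents and the commit function are hypothetical, and real systems do not store full copies of every file for each version.

```python
# Per-file tracking: each file carries its own independent list of versions.
per_file = {
    "analysis.py": ["v1 code", "v2 code (expects new column)"],
    "data.csv": ["v1 data", "v2 data (adds new column)"],
}

# Nothing records that analysis.py v2 needs data.csv v2, so a mismatched
# combination is just as easy to retrieve as a consistent one.
mismatched = (per_file["analysis.py"][1], per_file["data.csv"][0])

# Project-level tracking: each version is a snapshot of every file together.
snapshots = []


def commit(files):
    """Record the state of the whole project and return its version index."""
    snapshots.append(dict(files))  # copy, so later edits do not alter history
    return len(snapshots) - 1


project = {"analysis.py": "v1 code", "data.csv": "v1 data"}
first = commit(project)

# Change both files together, then record them as a single new version.
project["analysis.py"] = "v2 code (expects new column)"
project["data.csv"] = "v2 data (adds new column)"
second = commit(project)

print(snapshots[first])
print(snapshots[second])
```

Retrieving a snapshot always gives a set of files that were saved together, so the dependent changes to the two files cannot be separated by accident.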

Advantages of formal version control systems

  • explicit version number or identifier for each version
  • easy to switch between versions
  • easy to see changes between versions
  • tools to help merge incompatible changes

In the next sections, we will use Git, probably the most widely used modern version control system, to introduce the principles of version control. We will demonstrate both Git’s command-line interface and a graphical interface, SourceTree.