Much of this tutorial is borrowed from Jenny Bryan at the University of British Columbia.

Installation of git

If you are using your own local installation of RStudio, you may need to install the git program. This should not be necessary if you are on Linux, or a Mac running OS X 10.9 or later, as git is typically already available there. If you are on Windows or an earlier version of OS X, just use the RStudio server, at least for now.
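One quick way to find out whether git is available on your machine is to ask for its version in a terminal. This is only a minimal sketch; the variable name `git_status` is made up for illustration.

```shell
# Check whether git is on the PATH and, if so, report its version.
if command -v git >/dev/null 2>&1; then
  git_status="installed"
  git --version
else
  git_status="missing"
  echo "git was not found on this machine"
fi
```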

Configuring RStudio to work with git

  1. From RStudio, go to “Tools > Global options”, select Git/SVN, and make sure that “Enable version control interface …” is checked.

Creating a GitHub account

You can use git to track changes in your own work without using GitHub. But if you want to use git collaboratively, and in particular, if you want to use it to turn in work for this class, then you need a copy of your repo to exist “in the cloud”. GitHub is by far the most popular choice for this.

  1. Visit http://www.github.com and, if you do not already have an account, create one.

Linking your RStudio and git accounts

Git associates your name and email address with each commit, which helps when multiple people collaborate on a project. To configure your name and email address in git, go to Tools > Shell…, and type the following (substituting your name and the email address you used to register your GitHub account):

git config --global user.name 'Your Name'
git config --global user.email 'your@email.com'
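
To confirm that the settings were saved, you can read them back from the same shell. A small sketch, assuming the two commands above have been run; the "(not set)" fallback text and the variable names are illustrative, not something git prints.

```shell
# Read back the identity stored in the global git configuration.
# If a value is missing, fall back to a placeholder instead of failing.
name=$(git config --global --get user.name || echo "(not set)")
email=$(git config --global --get user.email || echo "(not set)")
echo "user.name:  $name"
echo "user.email: $email"
```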

Creating a new project

You can either create a new project first in RStudio and then “upload” it to GitHub, or you can create a project in GitHub and “download” it to RStudio. The latter is a bit easier.

  1. Log in to your GitHub account, and then go to the following URL to create a new repository which is linked to our course organization in GitHub: https://classroom.github.com/a/6xNV7qsd

Downloading your new repo to the RStudio server

You now have a minimal project with two files: the README.md documentation (this file is a Markdown file and you can use the same syntax to format it that you use to format RMarkdown documents, minus the code chunks), and a special file called .gitignore that tells git to ignore certain temporary files when recording changes to a repo.

But if you actually want to do anything with this project, you need a copy of the repo where you use RStudio (for most people, this is the Oberlin RStudio server).

  1. From RStudio, select File > New Project… (or use the Project dropdown menu in the upper right). Save any unsaved changes in your workspace that you want to keep, and then choose “Version Control”, followed by “Git”. Paste in the URL of the repo you created on GitHub (it should look something like https://github.com/ocstats/stat209-s2021-git-intro-crdawson.git, but with your GitHub account name at the end instead of mine). Then provide a name for the directory that will be created in your RStudio account to store your project files, as we’ve done in the past for labs (you might want to use a shorter name like git-intro).
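
Behind the scenes, this step amounts to roughly a `git clone` of the repo URL into the directory you named. The sketch below is an assumption-laden stand-in: so that it runs without a network connection, it builds a throwaway local repo (`remote.git`, a made-up name) to play the role of GitHub. With a real project you would put your GitHub URL in its place.

```shell
# Build a throwaway stand-in for the GitHub repo so this runs offline.
demo=$(mktemp -d)
git init -q "$demo/seed"
cd "$demo/seed"
git config user.name "Your Name"          # identity for this demo repo only
git config user.email "your@email.com"
echo "# git-intro" > README.md
git add README.md
git commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/remote.git"   # plays the role of GitHub

# The clone itself: git clone <repository URL> <directory name>.
# With a real project, replace "$demo/remote.git" with your GitHub URL.
cd "$demo"
git clone -q "$demo/remote.git" git-intro
ls -A git-intro     # the project files plus the hidden .git directory
```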

Making local changes and committing them

  1. Open the README.md file in RStudio, add a short description of the project, and save the file. Now select the “Git” tab in the upper right pane. You should see README.md appear in the list with an “M” next to it (this tells us that the file has been Modified since the last commit). To tell git that we want to commit these changes, check the box next to the file to put it in the “staging” area (the changes are not committed yet, but they are ready to be committed when you give the go-ahead). This is equivalent to the add step at the command line. To see the differences between the last committed version and the current version of the file, you can click “Diff”. When you’re ready to commit, you will need to add a commit message in the text box. This should be a brief but informative statement about what you have added or changed since your last commit. Then click “Commit”. Now the current state of your project is registered in your local repo. (Your changes are not yet uploaded to GitHub, but you have created a snapshot of your project that you can revert to if needed.)

You should commit changes often: every time you finish some meaningful chunk of work.
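
For reference, the modify, stage, diff, and commit cycle looks roughly like this at the command line. This is a sketch in a throwaway repo (created with `mktemp -d`), not your actual project; the file contents and commit messages are placeholders.

```shell
# Throwaway repo with one committed file (a stand-in for your project).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"        # identity for this demo repo only
git config user.email "your@email.com"
echo "# git-intro" > README.md
git add README.md
git commit -q -m "Initial commit"

# Edit the file, then inspect, stage, and commit the change.
echo "A short description of the project." >> README.md
git status --short          # the modified file is flagged with "M"
git --no-pager diff         # line-by-line changes since the last commit
git add README.md           # put the change in the staging area
git commit -q -m "Add project description to README"
```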

“Pushing” the changes to GitHub

If you are collaborating on a project, your collaborators can only see changes that you have “uploaded” or pushed to the shared, remote copy of the repo.

Caution: Before you push, you should always pull! This way, if someone else has made changes since your last push, you will be informed of any conflicts that need to be resolved. If you try to push while your local repo is out of date, you will get an error which can be non-trivial to resolve.

  1. Click the “Pull” button in the Git tab in RStudio (the blue down arrow). Nothing will happen, except that you will get a message saying that your local repo is already up-to-date, but it is good to establish this habit. Now you are ready to “Push” (the green up arrow).
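
The pull-then-push habit can be sketched at the command line as follows. To keep the example runnable offline, a local bare repo (`remote.git`, a hypothetical name) stands in for GitHub and `work` stands in for your RStudio project; in your real project only the last two commands are needed.

```shell
# Offline stand-in: a bare repo plays GitHub, "work" plays your project.
demo=$(mktemp -d)
git init -q "$demo/seed"
cd "$demo/seed"
git config user.name "Your Name"        # identity for this demo repo only
git config user.email "your@email.com"
echo "# git-intro" > README.md
git add README.md
git commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/remote.git"
git clone -q "$demo/remote.git" "$demo/work"

# Make and commit a local change in the working copy.
cd "$demo/work"
git config user.name "Your Name"
git config user.email "your@email.com"
echo "A short description." >> README.md
git commit -q -a -m "Describe the project"

# Pull first, then push.
git pull     # nothing new on the remote, so there is nothing to merge
git push     # uploads the new commit to the remote copy
```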

Adding an RMarkdown document to your project

  1. Create a new .Rmd file (File > New File > R Markdown…), change the title and author, save it, and Knit to .html. You should see the new .Rmd and .html files show up in the Git tab. “Stage” them, then commit, pull and push.

  2. Replace the plot command in the RMarkdown template document with a ggplot() command that plots speed against distance using the cars data. Knit, stage, commit, pull and push. (In practice you will not necessarily push every time you commit, nor will you commit every time you change a single line, but we are developing muscle memory here.)

Reverting undesired changes

  1. Introduce a typo or some other undesirable change to your Markdown file. In the “Git” tab, do a Diff on the file to see the change reflected. Then click “Revert” to restore the file to the state it was in the last time you committed.
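
At the command line, discarding uncommitted edits to a file can be done with `git checkout -- <file>`, which is roughly comparable to the step above (whether RStudio runs exactly this command is an assumption). A sketch in a throwaway repo:

```shell
# Throwaway repo with one committed file.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"        # identity for this demo repo only
git config user.email "your@email.com"
echo "# git-intro" > README.md
git add README.md
git commit -q -m "Initial commit"

# Introduce an unwanted change, inspect it, then discard it.
echo "an acidental typo" >> README.md
git --no-pager diff          # shows the added line
git checkout -- README.md    # restore the last committed version
git status --short           # prints nothing: the working copy is clean
```

Note that discarded edits were never committed, so they cannot be recovered afterwards.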

Deleting a file

  1. Create a new R script called scratchpad.R (this is a file that just contains R code; no Markdown). Add a comment, and save it. Stage the file, commit, pull and push. Now suppose you didn’t actually want to keep that file. Delete it by checking the box next to the file in the “Files” tab and clicking “Delete”. In the “Git” tab, you should now see a red “D” for “Deleted”. Stage the deletion; the next time you commit, the latest snapshot in the repo will have that file removed. (Since it had previously been committed, however, it can still be recovered from an earlier snapshot if needed.)
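
The claim that a deleted file can still be recovered can be checked with a short command-line sketch. It runs in a throwaway repo; `git rm` deletes the file and stages the deletion in one step, and `HEAD~1` refers to the commit just before the most recent one.

```shell
# Throwaway repo in which scratchpad.R has been committed.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"        # identity for this demo repo only
git config user.email "your@email.com"
echo "# scratch notes" > scratchpad.R
git add scratchpad.R
git commit -q -m "Add scratchpad"

# Delete the file and record the deletion in a new commit.
git rm -q scratchpad.R
git status --short                 # "D" marks the staged deletion
git commit -q -m "Remove scratchpad"

# The file still exists in the earlier snapshot and can be restored from it.
git checkout HEAD~1 -- scratchpad.R
cat scratchpad.R
```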

Reviewing your recent work

  1. Click the “clock” icon in the Git tab. This is the git “History” command (not to be confused with the History tab in RStudio), and will show you a record of all the commits you’ve made so far.
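
The command-line counterpart is `git log`. A brief sketch in a throwaway repo with two placeholder commits; the `--oneline` option prints one line per commit, newest first.

```shell
# Throwaway repo with two commits.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Your Name"        # identity for this demo repo only
git config user.email "your@email.com"
echo "# git-intro" > README.md
git add README.md
git commit -q -m "Initial commit"
echo "A short description." >> README.md
git commit -q -a -m "Describe the project"

# One line per commit: abbreviated hash followed by the commit message.
git --no-pager log --oneline
```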