Compartmentalized | Documented | Extendible | Reproducible | Robust
I will introduce you to RStudio and GitHub/GitLab. I will show you a variety of approaches for organizing your projects that involve code and show you how to use GitHub/GitLab without having to learn Git. If you already use RStudio and Git, skip this week as this will be at an introductory level. If you have tried RStudio or Git and gotten frustrated or thought ‘I don’t have time to learn this’, this is for you. If you want to learn how to easily keep track of changes in your code, this is for you. If you have never used RStudio, this is for you.
I’m going to show you how to work with Git/GitHub/GitLab with no command-line interface. The goal is to futz with Git as little as possible. See the Links tab above for a nice online workshop on Git/GitHub if you want to learn more.
When you open RStudio you will see 4 panels:
That should give you a new project.
require(graphics)
## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
opar <- par(mfrow = c(2,2), oma = c(0, 0, 1.1, 0))
plot(lm.D9, las = 1) # Residuals, Fitted, ...
par(opar)
Keep your projects in separate folders with a uniform set of folder names. For example,
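One possible layout, sketched in R. The folder names (data, R, output, docs) and the helper make_project_skeleton() are hypothetical illustrations, not the list from the original page; substitute whatever uniform set of names you settle on.

```r
# Create a project folder with a uniform set of subfolders.
# The default folder names are hypothetical; adapt them to your conventions.
make_project_skeleton <- function(project_dir,
                                  folders = c("data", "R", "output", "docs")) {
  dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)
  for (f in folders) {
    dir.create(file.path(project_dir, f), showWarnings = FALSE)
  }
  # Return the paths of the subfolders invisibly
  invisible(file.path(project_dir, folders))
}

# Example: build the skeleton in a temporary location
demo_dir <- file.path(tempdir(), "my_project")
created <- make_project_skeleton(demo_dir)
list.dirs(demo_dir, full.names = FALSE, recursive = FALSE)
```

Using the same helper for every new project is one way to keep the folder names consistent from project to project.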
The top right corner allows you to create projects and switch between projects.
The key one is to not save your .RData (your environment) when you are done for the day. Set this under Tools > Project Options… > General.
R for Data Science is a great book to introduce you to working with data in R. Read through the following sections of the R for Data Science book and work through the examples.
Tracking your code (and project) changes. I am focusing on Git for individuals not teams. If you aren’t using a change tracker (version control), then start just with a personal project and track only your changes. I will not cover branches.
For NOAA staff, GitHub can be used for publishing public projects: NOAA GitHub and a NWFSC example NWFSC Timeseries. NWFSC has a GitLab server if you want a repository server for your non-public projects and for internal collaboration.
The Git info is stored in the hidden folder .git (so if you wanted to get rid of the history and other Git info, you could delete that folder). You have a local repo and a remote repo (on GitHub/GitLab).
Workflow (to get started):
Goals today:
I am going to show a workflow that is usually robust. Connecting Git on your computer and GitHub (or GitLab) is a source of much misery, and in my experience creating the repo on GitHub (or GitLab) first eliminates the problems. This is really important the first time you connect your computer to the remote repository server (GitHub or GitLab). Start with a new repo created on GitHub or GitLab.
New Project (upper right, blue cube with R)
Git tab in the upper right.
Commit. Add a comment: first line is subject, newline, description (optional).
Repeat 1-4 a few times.
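The shape of a commit comment can be illustrated with a small R helper. This is only a sketch of the format: make_commit_message() and the example text are hypothetical, and in RStudio you simply type the comment into the commit message box. It assumes the common Git convention of a blank line between the subject and the description.

```r
# Build a commit comment: subject line, blank line, optional description.
make_commit_message <- function(subject, description = NULL) {
  subject <- trimws(subject)
  # With no description, the comment is just the subject line
  if (is.null(description) || !nzchar(trimws(description))) {
    return(subject)
  }
  paste(subject, "", trimws(description), sep = "\n")
}

msg <- make_commit_message(
  "Add plant weight regression example",
  "Fits lm(weight ~ group) and plots the diagnostics."
)
cat(msg, "\n")
```

A short subject on the first line keeps the history view readable, since the history list typically shows only that line.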
Now look at the history: click the little clock-like icon (or History in the Git window).
filter by file to see just the changes to one file.
View file @...
Push changes up to GitHub or GitLab
Don’t Fork. That would be if you are contributing to their repo. If you just want to copy it and then adapt it for your purposes, do this.
+ in top right and click import repository. Paste in the url and give your repo a name.
New Project on right, then Import Repository tab, then click Repo by URL. Paste in url and give repo a name.
New Project. Then select Version Control and paste in the url of your repository. For example, https://github.com/<youraccount>/Test or https://gitlab.com/<youraccount>/Test
Do not use branches (wait till you are friends with Git).
Do not use Git at the command line.
If you use Dropbox or iCloud on multiple computers to keep folders synced up across different computers, don’t put your Git repos in those folders.
The Git info is in the hidden folder .git. If you need to get rid of the repository data (like history), delete that. Don’t copy that folder into another repo.
Start by making a blank repo on GitHub or GitLab and select the box to add a Readme file. Then clone that with RStudio or GitHub Desktop.
Despite the name, GitHub Desktop works with GitLab too.
You can commit your changes and push them with RStudio only, but many people use a Git GUI to interact with Git and their remote repository. GitHub Desktop is simple and works well. Why not just use RStudio? If you are happy with that, fine. Many people prefer a separate Git GUI, though Git with RStudio has improved and serves many just fine.
To use your local repository with GitHub Desktop, you need to add it.
How does it know what the remote repository (GitHub/GitLab) is? It is stored in the .git information which was set when you cloned the repo from GitHub/GitLab.
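To see this for yourself without touching Git, you can read the stored remote address from R. The sketch below is a minimal illustration, assuming the usual layout where the remote is recorded as a `url = ...` line in the file .git/config; get_remote_urls() is a hypothetical helper, and the mock repo folder exists only so the example runs anywhere.

```r
# Return the url entries recorded in a repo's .git/config file.
get_remote_urls <- function(repo_dir) {
  config_file <- file.path(repo_dir, ".git", "config")
  if (!file.exists(config_file)) {
    stop("No .git/config found in ", repo_dir)
  }
  config_lines <- readLines(config_file, warn = FALSE)
  # Keep only lines of the form "url = <address>"
  url_lines <- grep("^[[:space:]]*url[[:space:]]*=", config_lines, value = TRUE)
  trimws(sub("^[[:space:]]*url[[:space:]]*=[[:space:]]*", "", url_lines))
}

# Mock repo folder (hypothetical) that mimics what cloning writes
mock_repo <- file.path(tempdir(), "Test")
dir.create(file.path(mock_repo, ".git"), recursive = TRUE, showWarnings = FALSE)
writeLines(c('[remote "origin"]',
             "\turl = https://github.com/youraccount/Test",
             "\tfetch = +refs/heads/*:refs/remotes/origin/*"),
           file.path(mock_repo, ".git", "config"))

get_remote_urls(mock_repo)
```

For a real project, pass the path of your project folder instead of mock_repo. The function only reads the file and changes nothing in the repository.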