Compartmentalized | Documented | Extendible | Reproducible | Robust |
This week I will cover some more features of using Git, GitHub, and RStudio. Although I’ll be using GitHub, the same workflow applies to GitLab. I am going to be using GitHub Desktop to interact with GitHub since I find that GitHub Desktop helps me deal with merge conflicts and reverts (getting rid of changes I have made).
I’ll be illustrating this with this repo: https://github.com/eeholmes/MyNewPackage
You can see a real-world example of this on my MARSS package repo: https://github.com/nwfsc-timeseries/MARSS
Merge conflicts happen when there are changes to a file on your remote repository (GitHub or GitLab) but also changes to that same file on your local repository. Git doesn’t know how to resolve the conflicting changes and needs your help.
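To make the idea concrete, here is a minimal command-line sketch that manufactures a conflict in a throwaway repository. It is an illustration only, not part of the GitHub Desktop workflow used here. The directory name demo, the branch name remote_copy (a local branch standing in for the copy on GitHub), and the contents of hello.R are all assumptions made up for the example.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

# Starting point shared by both sides
printf 'hello <- function() print("Hello, world!")\n' > hello.R
git add hello.R
git commit -q -m "Add hello.R"

# "Remote" side: a branch standing in for the copy on GitHub
git checkout -q -b remote_copy
printf 'hello <- function() print("Hello from GitHub!")\n' > hello.R
git commit -q -am "Edit hello.R on the remote side"

# "Local" side: a different edit to the same line
git checkout -q master
printf 'hello <- function() print("Hello from my laptop!")\n' > hello.R
git commit -q -am "Edit hello.R locally"

# Git cannot pick a winner, so the merge stops and marks up hello.R
git merge remote_copy || echo "Merge stopped because of a conflict"
cat hello.R
```

The final cat shows hello.R with both versions of the line wrapped in conflict markers. Running git merge --abort at this point would put hello.R back to the local version.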
GitHub Desktop makes resolving these pretty easy.
hello.R and show where the conflicts are. You then edit hello.R in RStudio to fix the conflicts.
hello.R are still there.
hello.R and fix the conflict. Git won’t have marked it so it might be hard to find.
Unfortunately, when you hit ‘Push’ in the Git tab in RStudio, it will immediately change hello.R to include the conflicts. RStudio won’t give you the chance to abandon the merge or pick one of the files.
But you can fix and then merge.
hello.R and get rid of all the merge conflict code (denoted with <<<<<<<, =======, and >>>>>>>).
Why use branches?
When you start, keep it simple. Use a branch for one file or two. Work on the file and then merge it back into master. Then get rid of the branch. It’s not necessary to use branches but if you do a lot of coding or work on packages, then getting comfortable with them will help you out.
Click the new branch icon and give your branch a name. Give it an INFORMATIVE name. tmp and foo are bad. hello_branch is good as it tells what this branch is for (working on the hello.R file).
Now that you have a branch, it is critical that you pay attention to the Git tab and know where you are working. RStudio will remember what branch you are on.
Let’s make a change to hello.R on hello_branch, push to GitHub, and see what the two branches look like.
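For readers who want to see the same steps at the command line, here is a rough sketch in a throwaway repository. The directory name demo and the contents of hello.R are invented for the example, and the push is only shown as a comment because the sketch has no remote.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

printf 'hello <- function() print("Hello, world!")\n' > hello.R
git add hello.R
git commit -q -m "Add hello.R"

# Create hello_branch and switch to it
git checkout -q -b hello_branch
git branch                                # the * marks the branch you are working on

# Change hello.R on hello_branch only
printf 'hello <- function() print("Hello from hello_branch!")\n' > hello.R
git commit -q -am "Change greeting in hello.R"

# With a real remote you would now publish the branch:
#   git push -u origin hello_branch

# Commits that are on hello_branch but not yet on master
git --no-pager log --oneline master..hello_branch
# Line-by-line difference between the two branches for hello.R
git --no-pager diff master hello_branch -- hello.R
```

The log command lists the one commit that only exists on hello_branch. That commit is what a pull request would offer to merge into master.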
There are a few ways to do a pull request.
You can do it from GitHub Desktop, although it will just redirect you to GitHub.
You can do it from GitHub.
Once you have created the pull request, you'll see that the pull request tab (in GitHub) shows that there is a request.
Click on the request. You have 2 options.
You have done the merge on GitHub. You still need to do a Pull to get that change into your local repository.
Delete your branch when you are done with it. All the history is saved. There is no reason to keep branches that you are done with.
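To see that deleting a merged branch loses nothing, here is a small command-line sketch in a throwaway repository. The directory name demo and the file contents are assumptions for the example, and the merge is done locally instead of through a pull request to keep the sketch self-contained.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

printf 'hello <- function() print("Hello, world!")\n' > hello.R
git add hello.R
git commit -q -m "Add hello.R"

# Do some work on a branch
git checkout -q -b hello_branch
printf 'hello <- function() print("Hello from hello_branch!")\n' > hello.R
git commit -q -am "Change greeting in hello.R"

# Merge it back into master (local stand-in for merging a pull request)
git checkout -q master
git merge -q hello_branch

# Delete the branch; -d refuses if the branch has unmerged commits
git branch -d hello_branch
# With a real remote, the copy on GitHub would be removed with:
#   git push origin --delete hello_branch

# The commit made on the branch is still in master's history
git --no-pager log --oneline
```

Using -d instead of -D is the cautious choice, since Git will stop you from deleting a branch whose work has not been merged yet.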
The branch toolbar in GitHub Desktop lets you keep branches up to date with each other.
Let’s say I am working on littleforecast.R in the master branch while working on hello.R in the hello_branch. I want to keep these synced up.
This is similar to a pull request but happening locally. When a team is working on different branches, they would use pull requests.
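At the command line, keeping hello_branch up to date with master amounts to merging master into it, which is roughly what the branch tools in GitHub Desktop do for you. The sketch below runs in a throwaway repository. The directory name demo and the contents of both files are invented for the example.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

printf 'hello <- function() print("Hello, world!")\n' > hello.R
git add hello.R
git commit -q -m "Add hello.R"

# Work on hello.R in hello_branch
git checkout -q -b hello_branch
printf 'hello <- function() print("Hello from hello_branch!")\n' > hello.R
git commit -q -am "Change greeting in hello.R"

# Meanwhile, work on littleforecast.R in master
git checkout -q master
printf 'littleforecast <- function(x) mean(x)\n' > littleforecast.R
git add littleforecast.R
git commit -q -m "Add littleforecast.R"

# Bring hello_branch up to date with master
git checkout -q hello_branch
git merge -q -m "Merge master into hello_branch" master

# hello_branch now has both its own hello.R and master's littleforecast.R
ls
```

Because the two branches touched different files, the merge goes through without a conflict.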
You can do the same actions from GitHub.
Say you made a change and you need to get rid of it. The temptation (for me) is to jump onto the Git command line and clobber my repository with reset and revert commands. Don’t do this. Here are some strategies that are less prone to leaving your code a mess.
No? Easy: in the Git panel in RStudio, right-click on the file and click ‘Revert’. Note that this will take things all the way back to your last commit!! If you have been making a bunch of changes without committing them, then you are out of luck.
Yes? Go to History in the GitHub Desktop window, click on the commit and click ‘Revert’. This will get rid of all the changes that went with that commit. So if you changed multiple files, all those files will be reverted. If you have pushed the changes to GitHub, then you can push the revert and it’ll show up on GitHub too.
Yes, but you just want to revert one file in a multi-file commit? Ok, you can do this at the Git command line, but I find that to be a huge time suck, and in my early Git days I sometimes left my repository with a horrible problem that I could not fix and had to completely rebuild my repo. Since I don’t need to be a Git wizard, this is what I do when I want to go ‘back in time’ for a single file.
Assuming you have already pushed the changes up to GitHub
< > to browse your repo at the state in time where your file was ok.
If you have not pushed the changes up to GitHub.
Ok, here’s the Git command to get a single file back. This works whether or not you have pushed to GitHub. The problem with this, and why I don’t do it, is that I usually need to look at the file. So I am scrolling back through the status of my repo in the past until I find the status that I want. Then I stare a bit and think and think. Then get a coffee and think some more. Then I scroll back through the status of the repo in the past some more and THEN I do the copy and paste. It is rarely the case that I know exactly which commit I need to get rid of, and it is even rarer that I want to go completely to a status in the past.
Use git log to find the commit hash (the long number).
git checkout 1d0f8c2eb4e66db0a7123588ae2fad26a6338303~1 -- ./R/test.R would reset test.R to the version one commit before that commit. This part 1d0f8c2eb4e66db0a7123588ae2fad26a6338303 is the bad commit hash and this part ~1 means what the file was like 1 commit before that.
If you accidentally leave off the file name and Git says you have a detached head, use git checkout master to reattach your head.
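The commands above can be tried safely in a throwaway repository first. The sketch below is one way to do that. The directory name demo, the second file R/other.R, and the file contents are assumptions for the example, and the bad commit hash is read with git rev-parse instead of being copied out of git log by hand.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

# A good commit with two files
mkdir R
printf 'test <- function() "good version"\n' > R/test.R
printf 'other <- function() 1\n' > R/other.R
git add R
git commit -q -m "Add test.R and other.R"

# A multi-file commit; only the change to test.R turns out to be bad
printf 'test <- function() "bad version"\n' > R/test.R
printf 'other <- function() 2\n' > R/other.R
git commit -q -am "Change both files"
bad=$(git rev-parse HEAD)                 # hash of the bad commit

# Restore only test.R to what it was one commit before the bad commit
git checkout "$bad~1" -- ./R/test.R
git status --short                        # test.R shows up as a staged change
git commit -q -m "Restore R/test.R to its state before $bad"

# Leaving off the file name detaches HEAD instead
git checkout -q "$bad~1"
git rev-parse --abbrev-ref HEAD           # prints HEAD when detached
git checkout -q master                    # reattach
cat R/test.R R/other.R
```

After this, R/test.R is back to its earlier contents while the change to R/other.R from the same commit is kept.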
Once doing commits and push/pulls is familiar and you are no longer messing up your repository or making merge conflicts,
file_xyz should only be in one branch.