Do you ever face a blank script and not know where to start? This is one reason I find it useful to have a template to follow for my R scripts.

The three main reasons I use a template are as follows:

  1. a template gets me started, so I rarely suffer from writer’s (or data-scientist-coder’s) block. I can just start thinking and writing.
  2. a template ensures I capture all the relevant details for the script up front, so that I can remember why I wrote it when I dig it up in 3 or 4 weeks. (Sometimes even 3 or 4 days is enough for me to forget.)
  3. a template stops me having to remember all the elements that I typically forget (or would not make the time) to include.

This post focuses on the script header template I am using. There is obviously a lot of detail that you can go into in your script header, but I try to keep it simple enough that I will actually fill it in each time. You can always add more detail to your own template if you like.

Here are the components of the header template that I use:

Basic Header Information:

The first part of my template is all commented out (with #’s), and contains the basic bibliographic information and notes. It includes:

  1. Script name: I try to use descriptive names. Something like summarising_soil_hazards.R is almost always going to be better than an uninformative name like script1.R.
  2. Purpose of the script: Just writing this down often helps clarify what exactly I am trying to do.
  3. Author(s): I share many of my scripts with my students, colleagues and clients – it is good to know where the script originated if they have any questions, comments or wish to suggest improvements.
  4. Date Created: This date is automatically filled in with my template script. (Some people also like to include an updated field, but I tend to forget to fill this in and would rather get this data from version control.)
  5. Copyright statement / Usage Restrictions: For intellectual property (IP) reasons it is good to know where the copyright resides, and how you are happy for people to use your work. You may wish to use a formal type of licence.
  6. Contact Information: It helps if people know who to contact, and how best to reach you. It’s not uncommon for me to include both my personal and work email accounts. I do this in case one of them stops working in the future for whatever reason.
  7. Notes: This is a free-text space which I use to jot down any thoughts or more detailed notes about the script, or even work to do.

Script / Project Set up Code

The second part includes a few R / RStudio commands that get my R session up and running. I include:

  1. Setting the working directory: I include paths for both Mac and PC (see the difference here).
  2. Setting the numeric display options: Unlike my brother, and some of my students, I prefer non-scientific notation (e.g. I prefer to see 0.0000002 rather than 2e-07), but this is just my preference. Please don’t hate me for it. I also tend to limit the display to 4 significant digits, just to keep it slightly cleaner. Obviously, I keep the full level of detail in the actual data.
  3. Setting the memory limit (for Windows):  I tend to work with large data files, and commonly run into memory issues on my PC.
  4. Load up the packages I’ll be bound to use: There are some packages I use most days. I load them up by default. If the script uses lots of packages, then I tend to load them all up by calling another file (packages.R), using the source() function.
  5. Load up the functions I’ll need into memory: I sometimes will reuse functions I have written in other analyses. I load them up here.
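
As a quick illustration of the display options in item 2, here is a minimal sketch of what scipen and digits change. The variable names are made up for the example. Note that digits counts significant digits rather than decimal places, and that only the printed form changes, not the stored value.

```r
# Apply the template's display options, remembering the previous settings.
old <- options(scipen = 6, digits = 4)

x <- 123456.789

small_fixed <- format(0.0000002)  # "0.0000002" instead of "2e-07"
pi_short    <- format(pi)         # "3.142": four significant digits
x_shown     <- format(x)          # "123457": the integer part is never cut
x_unchanged <- x == 123456.789    # TRUE: the data keep their full precision

print(c(small_fixed, pi_short, x_shown))

# Restore the previous settings so the example leaves no trace.
options(old)
```

Saving the old options and restoring them afterwards is optional in a script header, but it is handy when experimenting at the console.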

This is what my template header looks like:

## ---------------------------
##
## Script name: 
##
## Purpose of script:
##
## Author: Dr. Timothy Farewell
##
## Date Created: 2018-04-11
##
## Copyright (c) Timothy Farewell, 2018
## Email: hello@timfarewell.co.uk
##
## ---------------------------
##
## Notes:
##   
##
## ---------------------------

## set working directory for Mac and PC

setwd("~/Google Drive/")      # Tim's working directory (mac)
setwd("C:/Users/tim/Google Drive/")    # Tim's working directory (PC)

## ---------------------------

options(scipen = 6, digits = 4) # I prefer to view outputs in non-scientific notation
memory.limit(30000000)     # raises the memory allowance on some Windows PCs; no effect on Macs, or in R 4.2.0 and later, where memory.limit() is only a stub

## ---------------------------

## load up the packages we will need:  (uncomment as required)

require(tidyverse)
require(data.table)
# source("functions/packages.R")       # loads up all the packages we need

## ---------------------------

## load up our functions into memory

# source("functions/summarise_data.R") 

## ---------------------------
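
One thing to be aware of with the template above: if the whole script is run with source(), the setwd() call for the other operating system will stop with an error, because that folder does not exist on the current machine. A minimal sketch of one way around this follows. It is a suggestion rather than part of the template, and pick_wd is a hypothetical helper name invented for the example.

```r
# Hypothetical helper: choose a directory based on the operating system type.
# .Platform$OS.type is "windows" on PCs and "unix" on Macs and Linux.
pick_wd <- function(mac_dir, pc_dir, os_type = .Platform$OS.type) {
  if (identical(os_type, "windows")) pc_dir else mac_dir
}

wd <- pick_wd("~/Google Drive/", "C:/Users/tim/Google Drive/")

# Try to change directory, but report the problem instead of halting the script.
tryCatch(
  setwd(wd),
  error = function(e) message("Could not set working directory: ", wd)
)

# memory.limit() only ever applied to Windows, and is a stub from R 4.2.0 on,
# so only call it where it can still do something.
if (identical(.Platform$OS.type, "windows") && getRversion() < "4.2.0") {
  memory.limit(30000000)
}
```

The trade-off is a few more lines in the header in exchange for a script that runs top to bottom on either machine.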

How I use the template (snippets)

You can use header templates in a number of ways. You could save it as a text file, then copy and paste it in each time. But this is not the best approach, particularly if you think you will use this a lot. Instead, I like to use something called snippets!

With snippets, all you need to do is type “header” and then hit tab, and RStudio will paste in your template for you. It’s really easy.

Here is what you need to do in RStudio to enable the use of snippets:

  1. Open RStudio
  2. Go to: Tools -> Global Options -> Code -> Editing tab -> Snippets -> “Edit Snippets”. This will bring up a block of code. Scroll to the bottom of this.
  3. Modify the code below to suit your needs and then copy/paste it in at the bottom of the snippets code block. (The indents / tabs are important: every line under “snippet header” must start with a tab.)
  4. Click Save and close the window.

snippet header
	## ---------------------------
	##
	## Script name: 
	##
	## Purpose of script:
	##
	## Author: Dr. Timothy Farewell
	##
	## Date Created: `r paste(Sys.Date())`
	##
	## Copyright (c) Timothy Farewell, `r paste(format(Sys.Date(), "%Y"))`
	## Email: hello@timfarewell.co.uk
	##
	## ---------------------------
	##
	## Notes:
	##   
	##
	## ---------------------------
	
	## set working directory for Mac and PC
	
	setwd("~/Google Drive/")  		# Tim's working directory (mac)
	setwd("C:/Users/tim/Google Drive/")  	# Tim's working directory (PC)
	
	## ---------------------------
	
	options(scipen = 6, digits = 4) # I prefer to view outputs in non-scientific notation
	memory.limit(30000000)   	# raises the memory allowance on some Windows PCs; no effect on Macs, or in R 4.2.0 and later, where memory.limit() is only a stub
	
	## ---------------------------
	
	## load up the packages we will need:  (uncomment as required)
	
	require(tidyverse)
	require(data.table)
	# source("functions/packages.R")       # loads up all the packages we need
	
	## ---------------------------
	
	## load up our functions into memory
	
	# source("functions/summarise_data.R") 
	
	## ---------------------------
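
The commented-out source("functions/packages.R") line assumes a separate file that loads everything the project needs. I have not shown that file here, so the following is only a sketch of what it might contain. load_packages is a hypothetical helper name, and the base packages in the final call are just placeholders so the example runs anywhere.

```r
# Hypothetical contents for functions/packages.R.
# Checks that every package is installed before attaching any of them,
# so a missing package gives one clear message up front.
load_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  not_installed <- pkgs[!installed]
  if (length(not_installed) > 0) {
    stop("Please install: ", paste(not_installed, collapse = ", "),
         call. = FALSE)
  }
  for (pkg in pkgs) {
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Placeholder call using packages that ship with R; in a real project this
# would list the packages you rely on, e.g. c("tidyverse", "data.table").
loaded <- load_packages(c("stats", "utils"))
```

Unlike require(), which only warns and returns FALSE when a package is missing, this stops straight away, so the problem does not surface halfway through an analysis.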

How to start each script with the new header

From now on, when you open a new script that needs this header, you can just type:

  • header{snippet}

or just

  • header (and hit tab)

and it will magically fill in the header for you, dates and all. You can then just start working, which is the real aim here.
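
The dates are filled in by the two backtick-r expressions in the snippet, which RStudio evaluates when the snippet is expanded. If you are curious what they produce, here is a small sketch you can run at the console. The variable names are only for illustration.

```r
today <- Sys.Date()

# Same expressions as in the snippet's "Date Created" and "Copyright" lines.
date_created   <- paste(today)          # today's date as text, e.g. "2018-04-11"
copyright_year <- format(today, "%Y")   # four-digit year as text

cat("## Date Created:", date_created, "\n")
cat("## Copyright (c) Timothy Farewell,", copyright_year, "\n")
```

The snippet wraps the year in paste() as well, which is harmless but not needed, since format() already returns text.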

What else do you include in your headers?

Let me know if you think I’ve missed anything out of my header, or if there is any way you think this could be improved.

