Back to Repository Home


Using code chunks in R markdown

Use ctrl+alt+i to start a new code chunk in R markdown.

#while inside a code chunk, you still need to use hashtags to insert comments, as you would during a standard R script.

#the basic act of creating an object uses the arrow sign <-.
thing1 <-5

#once you run the line above, you will have an object named thing1 in your environment, and it will be equal to 5.

#you can add more objects and then use them in tandem.
thing2 <-2

#adding them will give you a value in the console.
thing1+thing2
## [1] 7
#you can save the added value as a new object too.
thing3 <- thing1+thing2

#it's a good idea to clear your environment after you are done coding, and then re-run your entire script, to make sure you didn't overwrite anything along the way. It's easy to accidentally change objects along the way.

Using functions

R is made up of packages and functions. Both make common actions easier to repeat.

Examples of functions include:

#if you need a list of numbers from 1 to 15, you can hard code it like this:
number_list <-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)

#this can get tedious, but fortunately there is a function that already does this - the seq() function.

#to see what seq() is (or any other function), just search for info using the  question mark and the name, like this:
?seq()

#we can use tihs to create another list, from 1 to 1000, in increments of 5).
number_list2<-seq(from=1, to=1000, by =5)

#if you need a function but don't have the package, you can install it as long as you are connected to the internet. use the install.packages() command.
?install.packages()

#some other helpful functions include standard descriptive statistics, such as finding means, meadians, and SDs.

mean(number_list2)
## [1] 498.5
median(number_list2)
## [1] 498.5
sd(number_list2)
## [1] 289.3959
#you can randomly sample from a Z distribution using the rnorm() function.
rnorm(25, mean=0, sd=1)
##  [1] -0.464093529 -0.700992138  1.111846214  0.639013885  0.600186657
##  [6] -1.683091424 -0.613919761 -0.819677021  1.118852116  0.134088278
## [11] -0.548526504  1.365144368  0.003980502  0.389794466 -0.821133806
## [16]  0.128191413  0.120221943  0.705647881 -0.739867245 -0.518064745
## [21]  1.078756896 -1.353828345 -0.273965641  0.472367311 -0.454228836
#you can also generate random matrices with placeholder values, using the matrix() command.
m <-matrix(NA, 50,3) #creates a matrix of "NA"s with 50 rows and 3 columns

#sample() is a useful way to randomly draw from a pre-existing data array, matrix, list, etc., either with replacement or without.

sample(number_list2,8,replace=T) #sample 8 values from number list 2, with replacement.
## [1] 826 606 611 456 416 246 346 561

Writing your own Functions.

This is a great way to steamline your code and check that it is doing what you think it is doing.

#writing a function.

#this is a simple function to calculate means. Give it a name and define it using the function() command.
Fred_mean <-function(x) {sum(x)/length(x)} #add all Xs and divide by the length (i.e., the number of elements in the list).

#now you can call your function and use it with any vector of numbers that you have in your environment.
Fred_mean(x=number_list2)
## [1] 498.5
Fred_mean(x=c(1,3,6,22,5,3,6,7,88,30))
## [1] 17.1
Fred_mean(x=5)
## [1] 5

Introducing for Loops

You might also want to iterate something across multiple values. You can write a function for this as well, using For loops. They can be very useful, but can process REALLY slowly if you have a larger array (say 50 observations or more).

#lets say we want to calculate means for multiple columns.

#create a new matrix 3x3, sampling from number list 2, using the cbind() function. cbind() will bind multiple columns together in sequence.
number_list3 <-cbind(sample(number_list2,8,replace=T),sample(number_list2,8,replace=T),sample(number_list2,8,replace=T))

#now use a for loop to calculate means for each column in number list 3.
#for loops need a place to put the values they calculate, so we create an empty vector of NAs.
means_numlist3 <-c(NA,NA,NA)

for(i in 1:nrow(number_list3)){
  means_numlist3[i] <-Fred_mean(number_list3[i,])
}

#the above tells R to loop i times, where i goes from 1 to the number of rows in the matrix we want to calculate means for. We calculate the means using the custom function Fred_mean() we created above [telling it how many rows and columns to create]!

#calling the object will show us the means we want to see.
means_numlist3
## [1] 571.0000 112.6667 661.0000 412.6667 332.6667 747.6667 777.6667 676.0000