Lecture Goals

  • Learn basic elements of the R language
    • Data types
    • Basic objects
    • Functions
    • Classes and attributes
    • Subsetting Objects
    • Loops
  • Readings

R Philosophy

  • Information is contained in objects
    • E.g. data, variables, models, plots
  • Operations are performed by functions
    • E.g. sorting data, fitting models, plotting results
    • Caveat: functions are objects as well
  • Carry out analysis by applying functions to objects, creating new objects along the way
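
As a small illustration of this workflow, the sketch below passes a data object to functions and stores each result as a new object. The names heights, sorted_heights, avg_height and my_sorter are made up for the example.

```r
# Hypothetical data object: a numeric vector of measurements
heights <- c(172, 165, 180)

# Applying functions to objects creates new objects
sorted_heights <- sort(heights)   # 165 172 180
avg_height <- mean(heights)       # (172 + 165 + 180) / 3

# Functions are objects too: they can be stored under another name
my_sorter <- sort
is.function(my_sorter)            # TRUE
```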

Data Types

Name       Description                      Examples
Logical    binary values                    TRUE, FALSE
Integer    whole numbers                    1L, 45L
Numeric    real numbers (double precision)  1.234, 1e+3
Character  text                             "R rocks!"
Complex    complex numbers                  1 + 2i
Raw        bytes                            41 5a
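
One way to check these types interactively is typeof(), which reports how a value is stored. This is a minimal sketch; charToRaw("AZ") is used here only as a convenient way to produce the raw bytes 41 5a.

```r
# typeof() reports how R stores a value
typeof(TRUE)              # "logical"
typeof(1L)                # "integer"
typeof(1.234)             # "double" (numeric values are stored as doubles)
typeof("R rocks!")        # "character"
typeof(1 + 2i)            # "complex"
typeof(charToRaw("AZ"))   # "raw"; the bytes print as 41 5a
```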

Basic Objects

Functions

  • Functions create/modify objects according to arguments
    • E.g. Generate sequence from 3 to 5, increasing by .5
seq( from = 3, to = 5, by = .5)
## [1] 3.0 3.5 4.0 4.5 5.0
  • Create functions using function()
my_fun = function( x, y = 1 ){
  z = x^2
  return(z + y)
}
my_fun(x=3)
## [1] 10
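
The sketch below reuses the my_fun definition from this slide to show how arguments are matched by position, by name, or filled in from defaults. The result names by_position, by_name and default_y are illustrative only.

```r
# Same definition as on the slide, written with the <- assignment operator
my_fun <- function(x, y = 1) {
  z <- x^2
  z + y          # the last evaluated expression is returned
}

by_position <- my_fun(3, 2)          # x = 3, y = 2
by_name     <- my_fun(y = 2, x = 3)  # named arguments can come in any order
default_y   <- my_fun(3)             # y falls back to its default of 1
```

The first two calls give 11 and the third gives 10, matching the slide's output for my_fun(x=3).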

Basic Functions

  • R comes with a set of pre-installed functions
    • More functions can be added with packages
    • Run help(fun) or ?fun to get documentation for a function named fun
  • Arguments can be other functions' output
    • Inner functions evaluated first
rep( x = "blah", times = 2)
## [1] "blah" "blah"
rep( x = rep("ha", 2), times = 2)
## [1] "ha" "ha" "ha" "ha"
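
To make the evaluation order concrete, the following sketch writes one nested call and then the same computation in two explicit steps. The names s, r and total are placeholders.

```r
# Inner call is evaluated first: seq() builds 1 2 3, then rev() reverses it
rev(seq(from = 1, to = 3))          # 3 2 1

# The same computation with the intermediate result stored explicitly
s <- seq(from = 1, to = 3)
r <- rev(s)

# Nested three deep: sum of the reversed sequence
total <- sum(rev(seq(from = 1, to = 3)))   # 6
```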

Vectors

  • Create vectors with the c() (concatenate) function
    • Name and store objects with an assignment operator (<- or =)
c(0, 1, 2) 
## [1] 0 1 2
my_first_vector <- c("a", "b", "c")
my_first_vector
## [1] "a" "b" "c"
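
A short sketch of how c() combines existing vectors with new values (the names a, b and d are arbitrary):

```r
a <- c(1, 2)
b <- c(a, 3, c(4, 5))   # nested vectors are flattened into one vector
length(b)               # 5

# = also assigns at the top level, but <- is the more common convention
d = c("x", "y")
```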

Identifying Objects

  • The class() function identifies an object's type
class( c(0, 1, 2) )
## [1] "numeric"
  • When values of different types are combined in one vector, R automatically coerces them to the most general type present (from specific to general: logical, integer, numeric, character)
class( c(TRUE, 3.14, "R") )
## [1] "character"
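
The coercion order can be checked one step at a time. The sketch below also shows explicit conversion with as.numeric(); the names n_true and num are illustrative.

```r
class(c(TRUE, 2L))      # "integer":   TRUE becomes 1L
class(c(1L, 2.5))       # "numeric":   1L becomes 1
class(c(2.5, "a"))      # "character": 2.5 becomes "2.5"

# Coercion of logicals to numbers makes counting easy
n_true <- sum(c(TRUE, FALSE, TRUE))   # 2

# Explicit conversion with the as.*() family
num <- as.numeric("3.14")             # 3.14
```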

Special Values

Value     Description    Use
NA        not available  missing value
NULL      null object    represents "nothing"
(+/-)Inf  infinity       results such as 1/0 or -1/0
NaN       not a number   undefined results such as 0/0


c(NA, NULL, Inf, Inf/Inf)  # NULL is dropped when concatenating
## [1]  NA Inf NaN
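
Special values usually need dedicated test functions rather than ==. A minimal sketch, with x and m as example names:

```r
x <- c(4, NA, 10)

is.na(x)                     # FALSE TRUE FALSE
mean(x)                      # NA: missing values propagate
m <- mean(x, na.rm = TRUE)   # 7, computed after dropping the NA

length(NULL)                 # 0: NULL holds nothing, so c() drops it
is.null(NULL)                # TRUE

is.nan(0 / 0)                # TRUE
is.finite(c(1, Inf, NA))     # TRUE FALSE FALSE
```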

Arithmetic Operators

Operator  Description       Example
+         addition          1 + 1    (= 2)
-         subtraction       1 - 1    (= 0)
*         multiplication    2 * 3    (= 6)
/         division          4 / 2    (= 2)
^         exponentiation    2^3      (= 8)
%%        modulo            5 %% 2   (= 1)
%/%       integer division  5 %/% 2  (= 2)
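
The sketch below ties integer division and modulo together and illustrates operator precedence. The variable names are arbitrary.

```r
a <- 17
b <- 5

q <- a %/% b      # 3 (integer division)
r <- a %% b       # 2 (remainder)
q * b + r == a    # TRUE

# Usual precedence applies: ^ before * and /, which come before + and -
p <- 1 + 2 * 3^2  # 19

# Unary minus binds less tightly than ^
n <- -2^2         # -4, not 4
```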

Element-Wise Operations

  • R performs operations element-wise, automatically recycling smaller vectors
(1:6) + (1:2)
## [1] 2 4 4 6 6 8
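
Recycling also covers the common case of combining a vector with a single value. The sketch below adds a case where the lengths do not divide evenly; the tryCatch() wrapper is only there to record that a warning was raised.

```r
v <- 1:6

doubled <- v * 2            # the single value 2 is recycled: 2 4 6 8 10 12
shifted <- v + c(10, 20)    # 11 22 13 24 15 26

# If the longer length is not a multiple of the shorter one,
# R still recycles but issues a warning
warned <- tryCatch(
  { (1:5) + (1:2); FALSE },
  warning = function(w) TRUE
)
```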

Factors

  • Vectors used to store categorical data: categories described by character labels but represented by integers
f <- factor( c("male", "female", "female", "male") )
class(f)
## [1] "factor"
typeof(f)  # object's storage type 
## [1] "integer"
str(f)     # object's structure
##  Factor w/ 2 levels "female","male": 2 1 1 2
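
A few functions expose the two sides of a factor, its labels and its integer codes. In the sketch below, f2 is a hypothetical second factor used to show how the level order can be set explicitly.

```r
f <- factor(c("male", "female", "female", "male"))

levels(f)                 # "female" "male" (sorted alphabetically by default)
codes <- as.integer(f)    # 2 1 1 2: the integer codes behind the labels
counts <- table(f)        # female: 2, male: 2

# The levels argument fixes the order of the categories
f2 <- factor(c("male", "female"), levels = c("male", "female"))
as.integer(f2)            # 1 2
```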

Indexing & Subsetting

  • Subsetting: extracting selected values of an object using integer or logical indices in square brackets [ ]
x <- c("a", "b", "c")
x[2]
## [1] "b"
x[1:2]
## [1] "a" "b"
x[ c(TRUE, FALSE, TRUE) ]
## [1] "a" "c"
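
Some further indexing patterns, sketched with the same x plus a made-up numeric vector v:

```r
x <- c("a", "b", "c")

x[-1]              # "b" "c": negative indices drop elements
x[c(3, 1)]         # "c" "a": indices can reorder

# Logical indices usually come from a test on the vector itself
v <- c(5, 12, 8)
big <- v[v > 6]    # 12 8

# Subsetting on the left of <- modifies selected elements
v[2] <- 0          # v is now 5 0 8
```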

Logical Tests

Operator  Description                 Example
< (<=)    less than (or equal to)     3 < 2         (= FALSE)
> (>=)    greater than (or equal to)  2 >= 2        (= TRUE)
==        exactly equal to            1 == 1        (= TRUE)
!=        not equal to                1 != 1        (= FALSE)
%in%      belongs to                  1 %in% (1:3)  (= TRUE)
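
Like arithmetic, these tests work element-wise on vectors. The sketch below also shows a common pitfall of exact equality with decimal fractions; all.equal() is one standard way around it, and the names exact and close are illustrative.

```r
v <- c(1, 5, 10)

v > 3                     # FALSE TRUE TRUE
v %in% c(5, 10, 15)       # FALSE TRUE TRUE

# == tests exact equality, which can surprise with decimal fractions
exact <- (0.1 + 0.2 == 0.3)                 # FALSE
close <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: equal up to a small tolerance
```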

Logical Operators

Operator  Description       Example
!         logical negation  !TRUE                    (= FALSE)
|         logical OR        TRUE | FALSE             (= TRUE)
&         logical AND       TRUE & FALSE             (= FALSE)
any       multiple OR       any(TRUE, FALSE, FALSE)  (= TRUE)
all       multiple AND      all(TRUE, FALSE, FALSE)  (= FALSE)
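
Applied to vectors, ! | & work element-wise, while any() and all() reduce a logical vector to a single value. A sketch using a hypothetical age vector:

```r
age <- c(15, 32, 70)

adult  <- age >= 18                 # FALSE TRUE TRUE
senior <- age >= 65                 # FALSE FALSE TRUE

adult & !senior                     # FALSE TRUE FALSE
any(senior)                         # TRUE
all(adult)                          # FALSE

# Combined with subsetting
working_age <- age[adult & !senior] # 32
```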

Matrices

  • Two-dimensional tables whose elements all share the same data type
matrix( data = 1:6, nrow = 3 )
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
matrix( 1:6, ncol = 3, byrow = TRUE )
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
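
A few more matrix basics, sketched under the same setup as the first example above (m, tm and m2 are example names):

```r
m <- matrix(1:6, nrow = 3)

dim(m)        # 3 2
tm <- t(m)    # transpose: 2 rows, 3 columns
dim(tm)       # 2 3

# Arithmetic is element-wise, as for vectors
m2 <- m * 10  # 10 20 30 in the first column, 40 50 60 in the second

# Mixing types coerces every element to the most general type
class(matrix(c(1, "a", TRUE, 2), nrow = 2)[1, 1])   # "character"
```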

Multivariate Subsetting

  • Subset arrays using multiple indexes (one for each dimension) separated by commas
    • Empty index selects all elements
matrix(1:4, 2)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
matrix(1:4, 2) [1, ]
## [1] 1 3
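
The sketch below shows other index combinations on a slightly larger matrix, and the drop = FALSE argument that keeps a one-row result as a matrix. The names row_vec and row_mat are illustrative.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column

m[2, 3]                 # 6: single element (row 2, column 3)
m[, 2]                  # 3 4: the whole second column
m[1, c(1, 3)]           # 1 5: row 1, columns 1 and 3

# A single row or column is simplified to a vector by default;
# drop = FALSE keeps the result as a matrix
row_vec <- m[1, ]
row_mat <- m[1, , drop = FALSE]
dim(row_vec)            # NULL
dim(row_mat)            # 1 3
```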

Names

  • Objects can have name attributes, which can also be used for indexing
mat <- matrix(1:6, nrow = 3)
rownames(mat) <- paste(1:3)
colnames(mat) <- c("a", "b")
mat
##   a b
## 1 1 4
## 2 2 5
## 3 3 6
mat["1","a"]
## [1] 1
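
Names work the same way on plain vectors. A minimal sketch with a hypothetical named vector scores:

```r
scores <- c(math = 90, art = 75)   # hypothetical named vector

names(scores)            # "math" "art"
scores["art"]            # the element together with its name
scores[["art"]]          # 75, without the name

# Names can be changed after creation
names(scores)[2] <- "music"
```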

Lists

  • Lists are one-dimensional heterogeneous objects: elements can differ in type and length
  • Create lists with the list() function
my_list <- list("a", 1:3, NULL, NA) 
class(my_list)
## [1] "list"
my_list
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] 1 2 3
## 
## [[3]]
## NULL
## 
## [[4]]
## [1] NA
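
Two points worth checking about the list above: NULL counts as an element here, and list elements can themselves be lists or functions. In the sketch, nested is a made-up example.

```r
my_list <- list("a", 1:3, NULL, NA)
length(my_list)          # 4: unlike c(), list() keeps NULL as an element

# Lists can hold other lists and functions
nested <- list(1, list("x", TRUE), mean)
class(nested[[2]])       # "list"
class(nested[[3]])       # "function"
```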

List Elements

  • Lists can have named elements
  • Check list structure with str()
my_list <- list( num = 1:5, let = c("a", "b"), log = c(TRUE, FALSE, TRUE) ) 
str(my_list)
## List of 3
##  $ num: int [1:5] 1 2 3 4 5
##  $ let: chr [1:2] "a" "b"
##  $ log: logi [1:3] TRUE FALSE TRUE
names(my_list)
## [1] "num" "let" "log"

Indexing & Subsetting Lists

  • Extract sub-lists using [ ]
  • Extract list elements using [[ ]], or by name with $
my_list <- list( num = 1:5, let = c("a", "b"), log = c(TRUE, FALSE, TRUE) ) 
my_list[1]
## $num
## [1] 1 2 3 4 5
my_list[[1]]
## [1] 1 2 3 4 5
my_list$num
## [1] 1 2 3 4 5
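
The difference between [ ] and [[ ]] shows up in the class of the result. The sketch below also adds and removes elements; the element name new is hypothetical.

```r
my_list <- list(num = 1:5, let = c("a", "b"), log = c(TRUE, FALSE, TRUE))

class(my_list[1])        # "list":    [ ] returns a shorter list
class(my_list[[1]])      # "integer": [[ ]] returns the element itself

my_list[["let"]][2]      # "b": extract an element, then index into it
my_list[c("num", "log")] # sub-list with two elements

# Assigning to a new name adds an element; assigning NULL removes one
my_list$new <- "hello"
my_list$log <- NULL
names(my_list)           # "num" "let" "new"
```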

Data Frames

  • Lists of vectors of the same length, where each vector (column) can have a different type
    • Designated way of storing data tables in R
my_df <- data.frame( age = c(21, 38, 41), sex = c("M", "F", "M"), stringsAsFactors = TRUE )  # strings become factors only on request since R 4.0
str(my_df)
## 'data.frame':    3 obs. of  2 variables:
##  $ age: num  21 38 41
##  $ sex: Factor w/ 2 levels "F","M": 2 1 2
dim(my_df)
## [1] 3 2
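
Because a data frame is both a list of columns and a table, list-style and matrix-style indexing both apply. A sketch using the same data; older, mean_age and the minor column are names introduced for illustration.

```r
my_df <- data.frame(age = c(21, 38, 41), sex = c("M", "F", "M"))

my_df$age                          # 21 38 41: a column, as in a list
my_df[2, "age"]                    # 38: row 2, column "age", as in a matrix
older <- my_df[my_df$age > 30, ]   # rows where age exceeds 30

nrow(older)                        # 2
mean_age <- mean(my_df$age)        # 100 / 3

# New columns are added by assignment
my_df$minor <- my_df$age < 18
```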

Conditionals and Loops

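  • Execute code conditionally with if / else (runif() is random, so the output shown is one possible outcome)
if( runif(1) >= .5 ){   # runif(1) draws one uniform random number between 0 and 1
  print("Heads")
} else {
  print("Tails")
}
## [1] "Heads"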
  • Repeat task(s) for each element of an input object with for
for(i in c("a", "b")){ 
  print(i)
}
## [1] "a"
## [1] "b"
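
Loops are often used to fill in a result one element at a time. The sketch below shows that pattern, the equivalent element-wise expression, and a conditional inside a loop. The names squares and n_even are illustrative.

```r
x <- c(2, 4, 6)

# Loop over positions and store each result in a pre-allocated vector
squares <- numeric(length(x))
for (i in seq_along(x)) {
  squares[i] <- x[i]^2
}
squares                  # 4 16 36

# The same result without a loop, using element-wise arithmetic
x^2

# if inside a loop, with %% to test for even numbers
n_even <- 0
for (value in 1:10) {
  if (value %% 2 == 0) {
    n_even <- n_even + 1
  }
}
n_even                   # 5
```

For simple arithmetic like this, the element-wise form is usually shorter and faster than an explicit loop.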