1 R

Extensible statistical programming language

1.1 R Language

x <- rnorm(50)
y <- x + rnorm(50)
df <- data.frame(Indep = x, Dep = y)
fit <- lm(Dep ~ Indep, df)
summary(fit)
## 
## Call:
## lm(formula = Dep ~ Indep, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.87002 -0.50512 -0.03011  0.55959  1.36743 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.1898     0.1088   1.745   0.0874 .  
## Indep         1.1713     0.1199   9.771 5.38e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7692 on 48 degrees of freedom
## Multiple R-squared:  0.6654, Adjusted R-squared:  0.6585 
## F-statistic: 95.46 on 1 and 48 DF,  p-value: 5.385e-13

Vectors

  • numeric(), character(), integer(), logical(), list(), …
  • Statistical concepts: NA, factor()
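
A minimal sketch of the vector types and statistical concepts listed above. The variable names and values are illustrative only and do not come from the example at the start of this section.

```r
# Atomic vectors hold elements of a single type
n <- c(1.5, 2, NA)        # numeric, with a missing value
i <- 1:3                  # integer
s <- c("a", "b", "c")     # character
b <- c(TRUE, FALSE, NA)   # logical

# A list may mix types
l <- list(n = n, s = s)

# NA propagates through computations unless removed explicitly
mean(n)                   # NA
mean(n, na.rm = TRUE)     # 1.75

# factor() represents categorical data with a fixed set of levels
f <- factor(c("low", "high", "low"), levels = c("low", "high"))
levels(f)                 # "low" "high"
as.integer(f)             # 1 2 1 (underlying integer codes)
```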

Objects

  • Class: data.frame, lm, matrix, …
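
The classes named above can be inspected with class() and inherits(). This is a small sketch; the data values are made up for illustration.

```r
# Made-up data, for illustration only
df <- data.frame(Indep = c(1, 2, 3, 4), Dep = c(2.1, 3.9, 6.2, 7.8))
fit <- lm(Dep ~ Indep, data = df)
m <- matrix(1:6, nrow = 2)

class(df)             # "data.frame"
class(fit)            # "lm"
inherits(fit, "lm")   # TRUE
is.matrix(m)          # TRUE
```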

Function, generic, method

  • rnorm(), lm(); summary() generic, summary.lm() method.
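
The relationship between a generic and its methods can be sketched with a small S3 example. The generic area() and the class "circle" are hypothetical names introduced here for illustration; summary() and summary.lm() follow the same pattern.

```r
# A generic dispatches on the class of its first argument
area <- function(shape, ...) UseMethod("area")

# Method for objects of (hypothetical) class "circle"
area.circle <- function(shape, ...) pi * shape$r^2

# Fallback used when no class-specific method exists
area.default <- function(shape, ...) stop("no area() method for this class")

circ <- structure(list(r = 2), class = "circle")
area(circ)   # dispatches to area.circle()

# In the same way, summary(fit) on an "lm" object dispatches to summary.lm()
summary_lm <- getS3method("summary", "lm")
is.function(summary_lm)   # TRUE
```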

Programming constructs

  • apply() (array); lapply() (vector or list → list); sapply(); if () {} else {}; for () {}; repeat {}
  • function() {}
  • Garbage collection
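
A short sketch of the constructs listed above. The function classify() and all variable names are hypothetical, chosen only for illustration.

```r
# apply() works over the margins of an array (1 = rows, 2 = columns)
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)   # 9 12
col_sums <- apply(m, 2, sum)   # 3 7 11

# lapply() always returns a list; sapply() simplifies when it can
sq_list <- lapply(1:3, function(v) v^2)   # list of 1, 4, 9
sq_vec  <- sapply(1:3, function(v) v^2)   # numeric vector 1 4 9

# function() {} with if () {} else {}
classify <- function(v) {
  if (v %% 2 == 0) {
    "even"
  } else {
    "odd"
  }
}

# for () {} iterates over the elements of a vector
total <- 0
for (v in 1:5) {
  total <- total + v
}

# repeat {} loops until an explicit break
k <- 0
repeat {
  k <- k + 1
  if (k >= 3) break
}

# Memory is reclaimed automatically; gc() triggers a collection explicitly
invisible(gc())
```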

1.2 Packages

library(ggplot2)
ggplot(df, aes(x = Indep, y = Dep)) + geom_point() + geom_smooth(method = "lm")