Remember
But also
Anecdote: Fitting GLM's to SNPs.
glm(): 10's of SNPs per second, days for
full analysisidentical and all.equalidentical(1, 1L)
## [1] FALSE
all.equal(1, 1L)
## [1] TRUE
v0 <- c(1, 2); v1 <- c(a=1, b=2)
identical(v0, v1)
## [1] FALSE
all.equal(v0, v1)
## [1] "names for current but not for target"
all.equal(v0, v1, check.attributes=FALSE)
## [1] TRUE
Gotcha
x <- new.env(); y <- new.env()
all.equal(x, y)
## [1] TRUE
x[["a"]] <- 1
all.equal(x, y)
## [1] TRUE
system.time and the microbenchmark packagesystem.time()
system.time()system.time({
    Sys.sleep(1)
    runif(1000)
})
##    user  system elapsed 
##   0.000   0.001   1.001
<-) to simultaneously time and assign the resultsystem.time(x <- runif(10000000))
##    user  system elapsed 
##   0.316   0.001   0.316
length(x)
## [1] 10000000
mean(x)
## [1] 0.5
library(microbenchmark)
f0 <- function(n) runif(n)
f1 <- function(n) replicate(n, runif(1))
times <- microbenchmark(f0(10), f0(100), f0(1000), f0(2000), f1(10), times=100)
times
## Unit: microseconds
##      expr    min     lq median     uq     max neval
##    f0(10)  2.861  3.115  3.381  4.397   6.421   100
##   f0(100)  5.496  5.880  6.409  7.643  16.330   100
##  f0(1000) 31.908 33.576 35.063 37.921  48.106   100
##  f0(2000) 61.980 64.650 66.553 70.550  97.939   100
##    f1(10) 53.191 55.626 58.752 68.242 179.455   100
plot(times)
 
RprofFIXME
Do no more than needed From
StackOverflow:
transform SNP genotypes from A_T to AT. Data:
a <- c("A", "C", "G", "T")
n <- 10000000
geno <- paste(sample(a, n, TRUE), sample(a, n, TRUE), sep="_")
Possible help pages: ?grep, ?substring. Compare possibilities
for error, performance.
Hoist common operations from loops, or better…:
Improve the performance of the imperative clamp algorithm, while retaining an imperative programming style
n <- 3
for (i in seq_along(x))
   if (abs(x[i]) > n * sd(x))
       x[i] <- n * sd(x)
and the following data
set.seed(123)
x <- runif(10000)
…Avoid loops entirely by writing in a declaritve manner:
Compare the performance of the imperative clamp with a declarative alternative
x[abs(x) > n * sd(x)] <- n * sd(x)
How can the performance of the declarative clamp be improved?
Unit tests
checkEquals(),
checkIdentical(), checkException().loglibrary(RUnit)
test_log0 <- function() {
   checkIdentical(-Inf, log(0))
   checkIdentical(0, log(1))
   checkIdentical(0, exp(log(0)))
   checkIdentical(1, exp(log(1)))
   old_opts <- options(warn=2)
   on.exit(options(old_opts))
   checkException(log(-1), silent=TRUE)
   checkException(log(), silent=TRUE)
}
test_log0()