# 1 Overview

BiocSingular implements several useful DelayedMatrix backends for dealing with principal components analysis (PCA). This vignette aims to provide an overview of these classes and how they can be used in other packages to improve efficiency prior to or after PCA.

# 2 The DeferredMatrix class

As previously discussed, we can defer the column centering and scaling of a matrix. The DeferredMatrix class provides a matrix representation that is equivalent to the output of scale(), but without actually performing the centering/scaling operations. This is useful when dealing with sparse matrix representations, as naive centering would result in loss of sparsity and increased memory use.

library(Matrix)
a <- rsparsematrix(10000, 1000, density=0.01)
object.size(a)
## 1205504 bytes
centering <- rnorm(ncol(a))
scaling <- runif(ncol(a))
y <- DeferredMatrix(a, centering, scaling)
y
## <10000 x 1000> DeferredMatrix object of type "double":
##               [,1]      [,2]      [,3] ...      [,999]     [,1000]
##     [1,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##     [2,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##     [3,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##     [4,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##     [5,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##      ...         .         .         .   .           .           .
##  [9996,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##  [9997,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##  [9998,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
##  [9999,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
## [10000,] 3.1539227 9.5575765 0.9457845   . -0.08431266 -1.12452179
object.size(y) # 'a' plus the size of 'centering' and 'scaling'.
## 1223608 bytes

At first glance, this seems to be nothing more than a variant on a DelayedMatrix. However, the real advantage of the DeferredMatrix comes when performing matrix multiplication. Multiplication is applied to the underlying sparse matrix, and the effects of centering and scaling are applied on the result. This allows us to achieve much greater speed than DelayedArray’s block-processing mechanism, which realizes blocks of the matrix into dense arrays prior to multiplication.

v <- matrix(runif(ncol(a)*2), ncol=2)
system.time(X <- y %*% v)
##    user  system elapsed
##   0.008   0.000   0.006
X    
## <10000 x 2> DelayedMatrix object of type "double":
##              [,1]     [,2]
##     [1,] 4096.903 3346.552
##     [2,] 4180.966 3376.059
##     [3,] 4073.325 3353.835
##     [4,] 4080.558 3346.992
##     [5,] 4081.720 3345.863
##      ...        .        .
##  [9996,] 4096.472 3360.705
##  [9997,] 4106.233 3364.906
##  [9998,] 4088.118 3347.933
##  [9999,] 4090.914 3349.844
## [10000,] 4099.505 3356.144

The cost of this speed is that the resulting matrix product is less numerically stable. Recovering the effect of centering involves the subtraction of two (potentially large) inner products, which increases the risk of catastrophic cancellation. Nonetheless, some reduction in accuracy may be worth the major speed-up when dense realization is avoided.

Note that it is also possible to nest DeferredMatrix objects within each other. This allows users to center and scale on both dimensions, as shown below.

centering2 <- rnorm(nrow(a))
scaling2 <- runif(nrow(a))
y2 <- DeferredMatrix(t(a), centering2, scaling2) # centering on rows of 'a'.
y3 <- DeferredMatrix(t(y2), centering, scaling) # centering on columns.
y3
## <10000 x 1000> DeferredMatrix object of type "double":
##                 [,1]        [,2]        [,3] ...      [,999]     [,1000]
##     [1,]   13.946659   41.481447    4.131669   .  3.67369073  2.11559447
##     [2,]   -1.199469   -3.319337   -0.339284   . -1.60015276 -2.43146531
##     [3,]   13.864281   41.237782    4.107352   .  3.64500703  2.09086364
##     [4,]    7.106872   21.250019    2.112647   .  1.29209432  0.06220376
##     [5,]    9.420570   28.093715    2.795623   .  2.09771823  0.75680535
##      ...           .           .           .   .           .           .
##  [9996,]  -14.125559  -41.553460   -4.154909   .   -6.100984   -6.312041
##  [9997,]   -3.470454  -10.036690   -1.009651   .   -2.390904   -3.113244
##  [9998,]   -7.312601  -21.401389   -2.143806   .   -3.728730   -4.266705
##  [9999,]   10.558716   31.460240    3.131590   .    2.494018    1.098491
## [10000,]  -55.328314 -163.427230  -16.317462   .  -20.447680  -18.681631

Other operations will cause the DeferredMatrix to collapse gracefully into DelayedMatrix, leading to block processing for further calculations.

# 3 The LowRankMatrix class

Once a PCA is performed, it is occasionally desirable to obtain a low-rank approximation of the input matrix by taking the cross-product of the rotation vectors and PC scores. Naively doing so results in the formation of a dense matrix of the same dimensions as the input. This may be prohibitively memory-consuming for a large data set. Instead, we can construct a LowRankMatrix class that mimics the output of the cross-product without actually computing it.

a <- rsparsematrix(10000, 1000, density=0.01)
out <- runPCA(a, rank=5, BSPARAM=IrlbaParam(deferred=TRUE)) # deferring for speed.
recon <- LowRankMatrix(out$rotation, out$x)
recon    
## <1000 x 10000> LowRankMatrix object of type "double":
##                  [,1]          [,2]          [,3] ...       [,9999]
##    [1,] -1.781085e-03 -1.523081e-03  1.443552e-03   .  3.030798e-04
##    [2,]  3.585867e-04 -1.342832e-04  8.285811e-04   .  9.025608e-04
##    [3,]  1.052371e-02 -2.520225e-03  3.098450e-03   .  1.989035e-03
##    [4,] -4.826878e-04 -2.645779e-04 -5.633683e-04   . -4.618844e-05
##    [5,]  2.772357e-05  1.831260e-03 -6.794541e-04   . -6.210676e-04
##     ...             .             .             .   .             .
##  [996,]  0.0090801411  0.0164981893  0.0025742946   . -0.0002319862
##  [997,]  0.0028205991 -0.0027435704  0.0027944198   .  0.0017435999
##  [998,]  0.0011461995  0.0094377249 -0.0035145252   . -0.0022258435
##  [999,] -0.0044773931  0.0016847497 -0.0006523067   . -0.0004196002
## [1000,]  0.0117446345  0.0062478039 -0.0016614781   .  0.0005314172
##              [,10000]
##    [1,] -1.985641e-03
##    [2,]  3.603602e-03
##    [3,]  2.108302e-03
##    [4,]  1.530459e-03
##    [5,] -1.102399e-03
##     ...             .
##  [996,] -0.0041247875
##  [997,]  0.0043023708
##  [998,] -0.0013161509
##  [999,]  0.0011384503
## [1000,]  0.0105344522

This is useful for convenient extraction of row- or column vectors without needing to manually perform a cross-product. A LowRankMatrix is thus directly interoperable with downstream procedures (e.g., for visualization) that expect a matrix of the same dimensionality as the input.

summary(recon[,1])
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
## -0.1015177 -0.0042682 -0.0002971  0.0001826  0.0039490  0.1339706
summary(recon[2,])
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
## -2.741e-02 -1.426e-03 -7.600e-07  0.000e+00  1.423e-03  3.357e-02

Again, most operations will cause the LowRankMatrix to collapse gracefully into DelayedMatrix for further processing.

# 4 The ResidualMatrix class

A common analysis in genomics applications involves regressing out uninteresting factors of variation prior to a PCA. However, doing so naively would discard aspects of the underlying matrix representation. The most obvious example is the loss of sparsity when a dense matrix of residuals is computed, increasing memory usage and compute time in downstream applications.

The ResidualMatrix class provides an alternative to explicit calculation of the residuals. The constructor takes a matrix of input values and a design matrix, where residuals are conceptually computed by fitting the linear model to the columns of the input matrix. However, the actual calculation of the residuals is delayed until they are explictly required.

design <- model.matrix(~gl(5, 1000))
y0 <- rsparsematrix(nrow(design), 30000, 0.01)
resids <- ResidualMatrix(y0, design)
resids
## <5000 x 30000> ResidualMatrix object of type "double":
##              [,1]      [,2]      [,3] ...  [,29999]  [,30000]
##    [1,] -0.001195  0.002082 -0.002110   .  -0.00161   0.00077
##    [2,] -0.001195  0.002082 -0.002110   .  -0.00161   0.00077
##    [3,] -0.001195  0.002082 -0.002110   .  -0.00161   0.00077
##    [4,] -0.001195  0.002082 -0.002110   .  -0.00161   0.00077
##    [5,] -0.001195  0.002082 -0.002110   .  -0.00161   0.00077
##     ...         .         .         .   .         .         .
## [4996,]  0.004410  0.000179  0.001072   . 0.0021459 0.0016995
## [4997,]  0.004410  0.000179  0.001072   . 0.0021459 0.0016995
## [4998,]  0.004410  0.000179  0.001072   . 0.0021459 0.0016995
## [4999,]  0.004410  0.000179  0.001072   . 0.0021459 0.0016995
## [5000,]  0.004410  0.000179  0.001072   . 0.0021459 0.0016995

In fact, matrix multiplication steps involving a ResidualMatrix do not even need to compute the residuals explicitly at all. This means that ResidualMatrix objects can be efficiently used in approximate PCA algorithms based on multiplication.

system.time(pc.out <- runPCA(resids, 10, BSPARAM=IrlbaParam()))
##    user  system elapsed
##  11.668   0.004  11.685

Other operations will cause the ResidualMatrix to collapse into DelayedMatrix for further processing.

# 5 Session information

sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
##
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
##
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] Matrix_1.2-17       BiocParallel_1.19.0 BiocSingular_1.1.3
## [4] knitr_1.23          BiocStyle_2.13.0
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1          magrittr_1.5        BiocGenerics_0.31.2
##  [4] IRanges_2.19.5      lattice_0.20-38     stringr_1.4.0
##  [7] tools_3.6.0         parallel_3.6.0      grid_3.6.0
## [10] rsvd_1.0.0          xfun_0.7            irlba_2.3.3
## [13] htmltools_0.3.6     matrixStats_0.54.0  yaml_2.2.0
## [16] digest_0.6.19       bookdown_0.10       BiocManager_1.30.4
## [19] S4Vectors_0.23.3    evaluate_0.13       rmarkdown_1.13
## [22] DelayedArray_0.11.0 stringi_1.4.3       compiler_3.6.0
## [25] stats4_3.6.0