1 Introduction

In this document, I aim to show a typical analysis of a spectral cytometry file. This includes the construction of the spectral decomposition matrix, the actual decomposition, correction of the resulting file (as there generally are minor differences between the single-stained controls and the fully stained sample) and finally conversion of the resulting flowFrame or flowSet to a data frame that can be used for any downstream application. Note: This whole package is very much dependent on flowCore, and much of the functionality herein works as an extension of the basic flowCore functionality.

2 Installation

This is how to install the package, if that has not already been done:

if(!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("flowSpecs")

3 Example data description

The dataset that is used in this vignette, and that is the example dataset in the package generally, is a PBMC sample stained with 12 fluorochrome-conjugated antibodies against a wide range of leukocyte antigens. Also included is a set of single-stained controls, which serve the same function in spectral cytometry as in conventional cytometry. The files were generated on a 44 channel Cytek Aurora® instrument on 2018-10-25.

library(flowSpecs)
library(flowCore)
data("unmixCtrls")
unmixCtrls
## A flowSet with 15 experiments.
## 
##   column names:
##   Time SSC-H SSC-A V1-A V2-A V3-A V4-A V5-A V6-A V7-A V8-A V9-A V10-A V11-A V12-A V13-A V14-A V15-A V16-A FSC-H FSC-A B1-A B2-A B3-A B4-A B5-A B6-A B7-A B8-A B9-A B10-A B11-A B12-A B13-A B14-A B15-A B16-A R1-A R2-A R3-A R4-A R5-A R6-A R7-A R8-A R9-A R10-A
data('fullPanel')
fullPanel[,seq(4,7)]
## flowFrame object 'fullPanel.fcs'
## with 8000 cells and 4 observables:
##     name desc   range minRange maxRange
## $P4 V1-A <NA> 4194304     -111  4194303
## $P5 V2-A <NA> 4194304     -111  4194303
## $P6 V3-A <NA> 4194304     -111  4194303
## $P7 V4-A <NA> 4194304     -111  4194303
## 202 keywords are stored in the 'description' slot

As can be noted, flowSpecs adheres to flowCore standards, and thus uses flowFrames and flowSets as input to all user functions.

4 Construction of spectral unmixing matrix

To do this, we need the single-stained unmixing controls. As the fluorescent sources can be of different kinds, such as from antibodies, fluorescent proteins, or dead cell markers, the specMatCalc function accepts any number of different such groups. However, the groups need to have a common part of their names. If this was not the case during acquisition, the names of the fcs files can always be changed afterwards. To check the names, run the sampleNames function from flowCore:

sampleNames(unmixCtrls)
##  [1] "Beads_AF647_IgM.fcs"    "Beads_AF700_CD4.fcs"    "Beads_APCCy7_CD19.fcs" 
##  [4] "Beads_BV605_CD14.fcs"   "Beads_BV650_CD56.fcs"   "Beads_BV711_CD11c.fcs" 
##  [7] "Beads_BV785_CD8a.fcs"   "Beads_FITC_CD41b.fcs"   "Beads_PB_CD3.fcs"      
## [10] "Beads_PE_X.fcs"         "Beads_PECy7_CD45RA.fcs" "Beads_unstained.fcs"   
## [13] "Dead_PO_DCM.fcs"        "Dead_unstained.fcs"     "PBMC_unstained.fcs"
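
As a quick sanity check on the naming, the group prefixes can be tabulated with base R. The sketch below is only an illustration: it works on a hard-coded copy of the names printed above, and it assumes that everything before the first underscore is the group prefix, which happens to hold for this dataset but is not a rule imposed by the package.

```r
# Hard-coded copy of the sample names printed above, so that the sketch
# runs without loading any package.
ctrlNames <- c("Beads_AF647_IgM.fcs", "Beads_AF700_CD4.fcs",
               "Beads_APCCy7_CD19.fcs", "Beads_BV605_CD14.fcs",
               "Beads_BV650_CD56.fcs", "Beads_BV711_CD11c.fcs",
               "Beads_BV785_CD8a.fcs", "Beads_FITC_CD41b.fcs",
               "Beads_PB_CD3.fcs", "Beads_PE_X.fcs",
               "Beads_PECy7_CD45RA.fcs", "Beads_unstained.fcs",
               "Dead_PO_DCM.fcs", "Dead_unstained.fcs",
               "PBMC_unstained.fcs")

# Assumption of this sketch: the text before the first underscore
# identifies the group.
groupPrefix <- sub("_.*", "", ctrlNames)

# Count the files in each group.
groupCounts <- table(groupPrefix)
print(groupCounts)
```

With the real data, the same tabulation could be run on sampleNames(unmixCtrls) instead of the hard-coded vector.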

This shows that we have three groups of samples: “Beads”, “Dead” and “PBMC”. The first two are groups that define the fluorochromes from antibodies and the dead cell marker (which is Pacific Orange-NHS in this case). The last one, “PBMC”, will be used for autofluorescence correction. The autofluorescence control should always be from the same type of sample as the samples that will be analyzed downstream, since autofluorescence differs between sample types. With this knowledge about the groups of samples, we can now create the matrix:

specMat <- specMatCalc(unmixCtrls, groupNames = c("Beads_", "Dead_"), 
                        autoFluoName = "PBMC_unstained.fcs")
str(specMat)
##  num [1:13, 1:42] 0.000058 0.001032 0.000648 0.029998 0.052593 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:13] "AF647_IgM" "AF700_CD4" "APCCy7_CD19" "BV605_CD14" ...
##   ..$ : chr [1:42] "V1-A" "V2-A" "V3-A" "V4-A" ...

The result is a matrix with the original fluorescence detector names as column names and the new fluorochrome/marker names as row names. The function does a lot of preprocessing, including automatic gating of the most dominant population, to ensure the best possible resolution and consistency in the determination of the matrix.
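
To give an intuition for what one row of such a matrix represents, here is a minimal sketch in base R on simulated data. It is not the procedure used inside specMatCalc (which, as noted, also gates the dominant population); the detector names, the simulated values, the median-based background subtraction and the scaling of the peak to 1 are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical setup: six detectors and one fluorochrome whose emission
# peaks in the third detector.
detectors <- paste0("V", 1:6, "-A")
trueSpectrum <- c(0.05, 0.30, 1.00, 0.60, 0.20, 0.05)
nEvents <- 2000
background <- 100

# Simulated unstained control: background signal only.
unstained <- matrix(rnorm(nEvents * 6, mean = background, sd = 10),
                    ncol = 6, dimnames = list(NULL, detectors))

# Simulated single-stained control: each event has its own brightness,
# spread over the detectors according to the spectrum, plus background.
brightness <- rlnorm(nEvents, meanlog = log(5000), sdlog = 0.2)
stained <- outer(brightness, trueSpectrum) +
    matrix(rnorm(nEvents * 6, mean = background, sd = 10), ncol = 6)
colnames(stained) <- detectors

# Estimate the signature: subtract the unstained median per detector,
# then scale so that the brightest detector equals 1.
rawSignature <- apply(stained, 2, median) - apply(unstained, 2, median)
signature <- rawSignature / max(rawSignature)
print(round(signature, 2))
```

Stacking one such signature per fluorochrome gives a matrix with fluorochromes as rows and detectors as columns, which is the layout shown by str(specMat) above.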

5 Spectral unmixing

Now it is time to apply the newly constructed specMat to the fully stained sample. This is done in the following way:

fullPanelUnmix <- specUnmix(fullPanel, specMat)
fullPanelUnmix
## flowFrame object 'fullPanel.fcs'
## with 8000 cells and 18 observables:
##              name desc    range minRange maxRange
## $P1          Time <NA> 16777216        0 16777215
## $P2         SSC-H <NA>  4194304        0  4194303
## $P3         SSC-A <NA>  4194304        0  4194303
## $P4         FSC-H <NA>  4194304     -111  4194303
## $P5         FSC-A <NA>  4194304     -111  4194303
## $P6     AF647_IgM <NA>  4194304     -111  4194303
## $P7     AF700_CD4 <NA>  4194304     -111  4194303
## $P8   APCCy7_CD19 <NA>  4194304     -111  4194303
## $P9    BV605_CD14 <NA>  4194304     -111  4194303
## $P10   BV650_CD56 <NA>  4194304     -111  4194303
## $P11  BV711_CD11c <NA>  4194304     -111  4194303
## $P12   BV785_CD8a <NA>  4194304     -111  4194303
## $P13   FITC_CD41b <NA>  4194304     -111  4194303
## $P14       PB_CD3 <NA>  4194304     -111  4194303
## $P15         PE_X <NA>  4194304     -111  4194303
## $P16 PECy7_CD45RA <NA>  4194304     -111  4194303
## $P17       PO_DCM <NA>  4194304     -111  4194303
## $P18     Autofluo <NA>  4194304     -111  4194303
## 271 keywords are stored in the 'description' slot

Note that the column names have now been replaced: they refer to the fluorescent molecules instead of the detector channels. The algorithm underlying this function is currently least squares regression.
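
The principle of least squares unmixing can be sketched in a few lines of base R. This is an illustration on simulated data under stated assumptions, not the code inside specUnmix: the toy spectral matrix, the fluorochrome names fluorA and fluorB, the detector names and the noise level are all made up, and qr.solve is simply one standard way of obtaining a least squares solution.

```r
set.seed(2)

# Hypothetical spectral matrix: 2 fluorochromes (rows) x 5 detectors (columns).
specMatToy <- rbind(fluorA = c(1.00, 0.60, 0.20, 0.05, 0.00),
                    fluorB = c(0.00, 0.10, 0.40, 1.00, 0.50))
colnames(specMatToy) <- paste0("D", 1:5)

# Simulated true abundance of each fluorochrome on each cell.
nCells <- 500
trueAbund <- cbind(fluorA = runif(nCells, 0, 1000),
                   fluorB = runif(nCells, 0, 1000))

# The detectors see the abundances mixed through the spectra, plus noise.
mixed <- trueAbund %*% specMatToy +
    matrix(rnorm(nCells * 5, sd = 5), ncol = 5)

# Least squares: find the abundances that best reproduce the detector
# signals, i.e. solve t(specMatToy) %*% t(abundances) ~ t(mixed).
unmixed <- t(qr.solve(t(specMatToy), t(mixed)))
colnames(unmixed) <- rownames(specMatToy)

# The recovered values should track the simulated ones closely.
print(cor(unmixed[, "fluorA"], trueAbund[, "fluorA"]))
```

Because there are more detectors than fluorochromes, the system is overdetermined, and the least squares solution is the set of abundances that minimizes the squared difference between observed and reconstructed detector signals for each cell.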

6 Transformation

As with all cytometry data, correct interpretation requires that the data be transformed with one of the lin-log functions. As the arcsinh function is widely used and has a single cofactor that controls the level of compression around zero, it is used in this package. The arcTrans function has a number of built-in features, such as automatic detection of whether the file comes from mass or flow cytometry, and assigns different cofactors accordingly. It is, however, always best practice to set the cofactors individually, to ensure that no artifactual populations are created, which can happen if there is too much resolution around zero. One automated strategy for this, which would make the arcTrans function unnecessary, is to use the flowVS package.

The arcTrans function requires the names of the variables that should be transformed to be specified.

fullPanelTrans <- arcTrans(fullPanelUnmix,
                           transNames = colnames(fullPanelUnmix)[6:18])
par(mfrow=c(1,2))
hist(exprs(fullPanelUnmix)[,7], main = "Pre transformation", 
     xlab = "AF700_CD4", breaks = 200)
hist(exprs(fullPanelTrans)[,7], main = "Post transformation", 
     xlab = "AF700_CD4", breaks = 200)

As can be seen in the histograms, the ranges, scales and resolution have now changed dramatically. (Biologically, the three peaks correspond to CD4- cells, CD4+ myeloid cells and CD4+ T cells, respectively.)
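
The role of the cofactor can be illustrated with base R alone. The sketch below assumes the common form asinh(x / cofactor); whether arcTrans uses exactly this form internally is not confirmed here, and the helper name arcsinhSketch and the cofactor values 5 and 500 are chosen only for illustration.

```r
# Assumed form of the transformation: asinh of the value divided by the cofactor.
arcsinhSketch <- function(x, cofactor) asinh(x / cofactor)

# A few raw intensities spanning negative, near-zero and bright values.
x <- c(-100, 0, 100, 1000, 100000)

# A small cofactor stretches the region around zero,
# a large cofactor compresses it.
smallCofactor <- arcsinhSketch(x, cofactor = 5)
largeCofactor <- arcsinhSketch(x, cofactor = 500)

print(round(rbind(raw = x,
                  cofactor5 = smallCofactor,
                  cofactor500 = largeCofactor), 2))
```

With the small cofactor, values near zero are spread far apart, which is the situation in which noise around zero can be split into artifactual populations. With the large cofactor, the same values are pulled together.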

7 Investigation of possible unmixing artifacts

An important step in the early processing of cytometry files is to investigate if, or rather where, unmixing artifacts have arisen. There are multiple reasons for the occurrence of such artifacts, but listing them is outside the scope of this vignette. In the package, there is one function that is well suited for this task, namely the oneVsAllPlot function. When used without specifying a marker, the function will create a folder and save all possible combinations of markers to that folder. Looking at them gives a good overview of the data. In this case, for the purpose of the vignette, I am only plotting one of the multi-graphs.

oneVsAllPlot(fullPanelTrans, "AF647_IgM", saveResult = FALSE)
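
For readers who want to see the idea behind such a multi-graph without the package, here is a rough base R sketch on simulated data. It is not how oneVsAllPlot is implemented; the function name oneVsAllSketch, the marker names m1 to m4 and the random data are hypothetical.

```r
set.seed(3)

# Simulated, already transformed data: 300 events x 4 hypothetical markers.
toy <- matrix(rnorm(300 * 4), ncol = 4,
              dimnames = list(NULL, c("m1", "m2", "m3", "m4")))

# Plot one marker (x axis) against every other marker (y axis),
# one panel per pair, side by side.
oneVsAllSketch <- function(mat, marker) {
    others <- setdiff(colnames(mat), marker)
    oldPar <- par(mfrow = c(1, length(others)))
    on.exit(par(oldPar))  # restore the previous layout afterwards
    for (other in others) {
        plot(mat[, marker], mat[, other], pch = ".",
             xlab = marker, ylab = other)
    }
    invisible(others)
}

plotted <- oneVsAllSketch(toy, "m1")
```

In real data, unmixing artifacts typically show up in such panels as populations that lean diagonally or dip below zero in one marker as another marker gets brighter.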