Introduction

Although multidimensional single-cell-based flow and mass cytometry have been increasingly applied to microenvironmental composition and stem-cell research, integrated analysis workflows to facilitate the interpretation of experimental cytometry data remain underdeveloped. We present flowSpy, a comprehensive R package designed for the analysis and interpretation of flow and mass cytometry data. We applied flowSpy to mass cytometry and time-course flow cytometry data to demonstrate the usage and practical utility of its computational modules. flowSpy is a reliable tool for multidimensional cytometry data workflows and produces compelling results for trajectory construction and pseudotime estimation.

Overview of flowSpy workflow

The flowSpy package is developed to complete the majority of standard analysis and visualization workflow for FCS data. In flowSpy workflow, an S4 object in R is built to implement the statistical and computational approach, and all computational modules are integrated into one single channel which only requires a specified input data format. Computational modules of flowSpy can be divided into four main parts (Fig. 1): preprocessing, trajectory, analysis and visualization.

Workflow of flowSpy
Fig. 1. Workflow of flowSpy
## 2019-10-29 23:59:29 [INFO] Number of cells in processing: 600
## 2019-10-29 23:59:29 [INFO] rownames of meta.data and raw.data will be named using column cell
## 2019-10-29 23:59:29 [INFO] Index of markers in processing
## 2019-10-29 23:59:29 [INFO] Creating FSPY object.
## 2019-10-29 23:59:29 [INFO] Determining normalization factors
## 2019-10-29 23:59:29 [INFO] Normalization and log-transformation.
## 2019-10-29 23:59:29 [INFO] Build FSPY object succeed
## FSPY Information:
##  Input cell number: 600  cells 
##  Enroll marker number: 10  markers 
##  Cells after downsampling: 600  markers
# Cluster cells by SOM algorithm
# Set random seed to make results reproducible
set.seed(1)
fspy <- runCluster(fspy, cluster.method = "som")

# Do not perform downsampling
set.seed(1)
fspy <- processingCluster(fspy)

# run Principal Component Analysis (PCA)
fspy <- runFastPCA(fspy)

# run t-Distributed Stochastic Neighbor Embedding (tSNE)
fspy <- runTSNE(fspy)

# run Diffusion map
fspy <- runDiffusionMap(fspy)

# run Uniform Manifold Approximation and Projection (UMAP)
fspy <- runUMAP(fspy)

# build minimum spanning tree based on tsne
fspy <- buildTree(fspy, dim.type = "tsne", dim.use = 1:2)

# DEGs of different branch
diff.list <- runDiff(fspy)

# define root cells
fspy <- defRootCells(fspy, root.cells = c(28,26))

# run pseudotime
fspy <- runPseudotime(fspy, verbose = TRUE, dim.type = "raw")

# define leaf cells
fspy <- defLeafCells(fspy, leaf.cells = c(27, 13), verbose = TRUE)

# run walk between root cells and leaf cells
fspy <- runWalk(fspy, verbose = TRUE)

# Save object
if (FALSE) {
  save(fspy, file = "Path to you output directory")
}

######################## Visualization

# Plot 2D tSNE. And cells are colored by cluster id
plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "cluster.id", 
       alpha = 1, main = "tSNE", category = "categorical", show.cluser.id = TRUE)

# Plot 2D UMAP. And cells are colored by cluster id
plot2D(fspy, item.use = c("UMAP_1", "UMAP_2"), color.by = "cluster.id", 
       alpha = 1, main = "UMAP", category = "categorical", show.cluser.id = TRUE)

# Plot 2D tSNE. And cells are colored by cluster id
plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "branch.id", 
       alpha = 1, main = "tSNE", category = "categorical", show.cluser.id = TRUE)

# Plot 2D UMAP. And cells are colored by cluster id
plot2D(fspy, item.use = c("UMAP_1", "UMAP_2"), color.by = "branch.id", 
       alpha = 1, main = "UMAP", category = "categorical", show.cluser.id = TRUE)


# Plot 2D tSNE. And cells are colored by stage
plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), color.by = "stage", 
       alpha = 1, main = "UMAP", category = "categorical") +
   scale_color_manual(values = c("#00599F","#009900","#FF9933",
                                 "#FF99FF","#7A06A0","#FF3222"))

# Plot 2D UMAP. And cells are colored by stage
plot2D(fspy, item.use = c("UMAP_1", "UMAP_2"), color.by = "stage", 
       alpha = 1, main = "UMAP", category = "categorical") +
   scale_color_manual(values = c("#00599F","#009900","#FF9933",
                                 "#FF99FF","#7A06A0","#FF3222"))

# Tree plot
plotTree(fspy, color.by = "D0.percent", show.node.name = TRUE, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

plotTree(fspy, color.by = "CD43", show.node.name = TRUE, cex.size = 1) + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))


# plot clusters
plotCluster(fspy, item.use = c("tSNE_1", "tSNE_2"), category = "numeric",
            size = 100, color.by = "CD45RA") + 
  scale_colour_gradientn(colors = c("#00599F", "#EEEEEE", "#FF3222"))

# plot pie tree
plotPieTree(fspy, cex.size = 3, size.by.cell.number = TRUE) + 
  scale_fill_manual(values = c("#00599F","#FF3222","#009900",
                               "#FF9933","#FF99FF","#7A06A0"))

# plot pie cluster
plotPieCluster(fspy, item.use = c("tSNE_1", "tSNE_2"), cex.size = 40) + 
  scale_fill_manual(values = c("#00599F","#FF3222","#009900",
                               "#FF9933","#FF99FF","#7A06A0"))

# plot heatmap of cluster
plotClusterHeatmap(fspy)
plotBranchHeatmap(fspy)

# Violin plot
plotViolin(fspy, color.by = "cluster.id", marker = "CD45RA", text.angle = 90)
plotViolin(fspy, color.by = "branch.id", marker = "CD45RA", text.angle = 90)

# UMAP plot colored by pseudotime
plot2D(fspy, item.use = c("UMAP_1", "UMAP_2"), category = "numeric",
            size = 1, color.by = "pseudotime") + 
  scale_colour_gradientn(colors = c("#F4D31D", "#FF3222","#7A06A0"))

# tSNE plot colored by pseudotime
plot2D(fspy, item.use = c("tSNE_1", "tSNE_2"), category = "numeric",
            size = 1, color.by = "pseudotime") + 
 scale_colour_gradientn(colors = c("#F4D31D", "#FF3222","#7A06A0"))

# denisty plot by different stage
plotPseudotimeDensity(fspy, adjust = 1) + 
  scale_color_manual(values = c("#00599F","#009900","#FF9933",