Although multidimensional single-cell-based flow and mass cytometry have been increasingly applied to microenvironmental composition and stem-cell research, integrated analysis workflows to facilitate the interpretation of experimental cytometry data remain underdeveloped. We present CytoTree, a comprehensive R package designed for the analysis and interpretation of flow and mass cytometry data. We applied CytoTree to mass cytometry and time-course flow cytometry data to demonstrate the usage and practical utility of its computational modules. CytoTree is a reliable tool for multidimensional cytometry data workflows and produces compelling results for trajectory construction and pseudotime estimation.
See the detailed tutorial of CytoTree, please visit Tutorial of CytoTree.
Overview of Workflow
CytoTree package is developed to complete the majority of standard analysis and visualization workflow for FCS data. In CytoTree workflow, an S4 object in R is built to implement the statistical and computational approach, and all computational functionalities are integrated into one single channel which only requires a specified input data format. Computational functionalities of
CytoTree can be divided into four main parts (Fig. 2.1): preprocessing, trajectory, analysis and visualization.
Preprocessing. Data import, compensation, quality control, filtration, normalization and merge cells from different samples can be implemented in the preprocessing part. After preprocessing, a matrix that contains clean cytometric signaling data is required to build a CYT object. There are other optional data recommended to build the CYT object, including a data frame containing meta-information of the experiment and a vector contains all markers enrolled in the computational process.
Trajectory. Cells built in the CYT object are classified into different clusters based on the expression level of input markers. You can choose different clustering methods by inputting different parameters. After clustering, cells are downsampled in a cluster-dependent fashion to reduce the total cell size and avoid small cluster deletion. Dimensionality reduction for both cells and clusters are also implemented in the clustering procedure. After dimensionality reduction, we use Minimus Spanning Tree (MST) to construct cell trajectory.
Analysis. This part is designed for time course FCS data. Before running pseudotime, root cells must be defined first based on users’ prior knowledge. Root cells in
CytoTreeworkflow are the initial cells of the trajectory tree. So it can be set using one vertex node of the tree or a cluster of cells with specific antibodies combination. Intermediate state evaluation is also involved in the pseudotime part. Leaf cells are defined by the end node of the trajectory or the end stage of the experiment. Intermediate state cells are cells with higher betweenness in the graph built on cell-cell connection, which plays an important role between the connection of root cells and leaf cells.
Visualization. The visualization part can provide clear and concise visualization of FCS data in an effective and easy-to-comprehend manner.
CytoTreepackage offers various plotting functions to generate customizable and publication-quality plots. A two-dimensional or three-dimensional plot can fit most requirements from dimensionality reduction results. And tree-based plot can visualize cell trajectory as a force-directed layout tree. Other special plots such as heatmap and violin plot are also provided in