spatialHeatmap 2.5.3
The primary utility of the spatialHeatmap package is the generation of spatial heatmaps (SHMs) for visualizing cell-, tissue- and organ-specific abundance patterns of biological molecules (e.g. RNAs) in anatomical images (Zhang et al. 2022). This is useful for identifying molecules with spatially enriched (SE) abundance patterns, as well as clusters and/or network modules composed of molecules that share similar abundance patterns, such as similar gene expression patterns. These functionalities are introduced in the main vignette of the spatialHeatmap package. This vignette describes extended functionalities for integrating tissue data with single cell data by co-visualizing them in composite plots that combine spatial heatmaps with embedding plots of high-dimensional data. The resulting spatial context is important for gaining insights into the tissue-level organization of single cell data.
The required quantitative assay data, such as gene expression values, can be
provided in a variety of widely used tabular data structures (e.g. data.frame,
SummarizedExperiment or SingleCellExperiment). The corresponding anatomical
images need to be supplied as annotated SVG (aSVG) images and can be stored in
the dedicated S4 class SVG. The creation of aSVGs is described in the main
vignette of this package. For the embedding plots of single cell data, several
dimensionality reduction algorithms (e.g. PCA, UMAP or tSNE) are supported. To
associate single cells with their source tissues, the user can choose among
three methods: annotation-based, manual and automated (Figure 1). Like other
functionalities in spatialHeatmap, these are available within R as well as in
the corresponding Shiny app (Chang et al. 2021).
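As a rough illustration of the inputs described above, the following sketch builds a small tabular assay as a data.frame and derives a two-dimensional PCA embedding from it. The toy counts and the gene and cell names are made up, and base R's prcomp is used only as a stand-in; it is not claimed to be the embedding routine that spatialHeatmap calls internally.

```r
# Toy assay: 5 hypothetical genes (rows) x 8 hypothetical cells (columns).
set.seed(1)
expr <- as.data.frame(matrix(rpois(40, lambda = 10), nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:8))))

# Log-transform the counts to dampen the influence of highly expressed genes.
log_expr <- log2(as.matrix(expr) + 1)

# PCA expects observations (cells) in rows, so the matrix is transposed.
pca <- prcomp(t(log_expr), center = TRUE, scale. = FALSE)

# Keep the first two principal components as embedding coordinates,
# one row per cell.
embedding <- pca$x[, 1:2]
head(embedding)
```

Each row of the embedding corresponds to one cell, which is the layout an embedding plot needs before cells are linked to tissues.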
To co-visualize single cell data with tissue features (Figure 1), the
individual cells of the single cell data are mapped via their group labels to
the corresponding tissue features in an aSVG image. If the feature labels in an
aSVG differ from the corresponding group labels used for the single cell data,
e.g. due to variable terminologies, a translation map can be used to avoid
manual relabelling. Throughout this vignette the term feature is used as a
generalization that refers in most cases to tissues or organs. For the
implementation of the co-visualization
tool, spatialHeatmap takes advantage of efficient and reusable S4 classes for
both assay data and aSVGs. The former include Bioconductor core data structures
such as the widely used SingleCellExperiment (SCE) container illustrated in
Figure 1.1 (Amezquita et al. 2020). The slots assays, colData, rowData and
reducedDims in an SCE contain expression data, cell metadata, molecule metadata
and reduced dimensionality embedding results, respectively. The cell group
labels are stored in the colData slot as shown in Figure 1.1. The S4 class SVG
(Figure 1.3) was developed specifically in spatialHeatmap for storing aSVG
instances. Its two most important slots, coordinate and attribute, store the
aSVG feature coordinates and the respective attributes such as fill color and
line widths, respectively, while the other slots dimension, svg and raster
store image dimensions, aSVG file paths and raster image paths, respectively.
For handling cell-to-tissue grouping information, three general methods are
available: (a) annotation-based, (b) manual and (c) automated.
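To make the SCE slot layout more concrete, here is a minimal sketch that fills the four slots with toy data. It assumes the Bioconductor package SingleCellExperiment is installed. The counts, the tissue labels ("cortex", "stele") and the colData column name "label" are illustrative assumptions, not values required by spatialHeatmap.

```r
# Assumes the Bioconductor package SingleCellExperiment is installed.
library(SingleCellExperiment)

set.seed(1)
# Toy counts: 4 hypothetical genes x 6 hypothetical cells.
counts <- matrix(rpois(24, lambda = 5), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))

sce <- SingleCellExperiment(
  # Expression data.
  assays = list(counts = counts),
  # Cell metadata holding the group labels; the column name is arbitrary here.
  colData = DataFrame(label = rep(c("cortex", "stele"), each = 3)),
  # Molecule metadata.
  rowData = DataFrame(symbol = paste0("SYM", 1:4))
)

# Store a 2D embedding (PCA via base R as a stand-in) in reducedDims.
reducedDim(sce, "PCA") <- prcomp(t(counts))$x[, 1:2]

# Access the four slots discussed in the text.
assay(sce, "counts")[1:2, 1:3]
colData(sce)$label
rowData(sce)$symbol
reducedDimNames(sce)
```

Keeping the group labels in colData means they travel with the cells through subsetting, which is what makes the label-based mapping to aSVG features practical.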
The annotation-based and manual methods are similar in that both use known
cell group labels. The main difference is how the cell labels are provided. In
the annotation-based method, existing group labels are available and can be
uploaded and/or stored in the SCE object, as is the case in some of the SCE
instances provided by the scRNAseq package (Risso and Cole 2022). The manual
method allows users to create the cell-to-tissue associations one-by-one or
import them from a tabular file. In contrast, the automated method aims to
assign single cells to the corresponding source tissues computationally by a
co-clustering algorithm (Figure 8). This co-clustering is
experimental and requires bulk expression data that are obtained from the
tissues represented in the single cell data. The grouping information is
visualized by using the same color for each group in both the single cell
embedding plot and the tissue spatial heatmap plot (Figure 1.5). The colors can
represent any type of custom or numeric information. In a typical use case,
either fixed tissue-specific colors are used or a heat color gradient that is
proportional to the numeric expression information obtained from the single
cell or bulk expression data of a chosen gene. When the expression values among
groups are very similar, toggling between the two coloring options is important
for tracking the tissue origin in the single cell data. To color by single cell
data, one often wants to first summarize the expression of a given gene across
the cells within each group via a meaningful summary statistic, such as the
mean or median. Cells and tissues with the same group label will be colored the
same. When coloring by tissues, the color used for each tissue feature will be
applied to the corresponding cell groups represented in the embedding plot.
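The coloring logic described above can be sketched in base R as follows. This is only a conceptual illustration under stated assumptions: the expression values, the group labels, the aSVG feature labels in the translation map and the three-color gradient are all hypothetical, and none of the code uses spatialHeatmap functions.

```r
# Hypothetical expression of one gene in 9 cells and their group labels.
cell_expr  <- c(2.1, 2.4, 1.9, 5.0, 5.5, 4.8, 8.2, 7.9, 8.5)
cell_group <- rep(c("epidermis", "cortex", "stele"), each = 3)

# Summarize the gene within each group (median would work the same way).
group_mean <- tapply(cell_expr, cell_group, mean)

# Map the group means onto a heat color gradient with 100 steps.
# Note: this rescaling assumes the group means are not all identical.
palette <- colorRampPalette(c("yellow", "orange", "red"))(100)
scaled <- (group_mean - min(group_mean)) / diff(range(group_mean))
group_col <- setNames(palette[1 + round(scaled * 99)], names(group_mean))

# Cells inherit the color of their group ...
cell_col <- group_col[cell_group]

# ... and tissue features receive the same colors through a translation map
# from (hypothetical) aSVG feature labels to the cell group labels.
feature_to_group <- c(root_epidermis = "epidermis", root_cortex = "cortex",
  root_stele = "stele")
feature_col <- setNames(group_col[feature_to_group], names(feature_to_group))
```

Because both cell_col and feature_col are looked up from the same group_col vector, a cell group and its matching tissue feature always end up with the same color. Swapping group_col for a vector of fixed tissue-specific colors would give the alternative coloring option.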