
1 Installation

# Install the package from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("Statial")

2 Load packages

# Loading required packages
library(Statial)
library(spicyR)
library(ClassifyR)
library(lisaClust)
library(dplyr)
library(SingleCellExperiment)
library(ggplot2)
library(ggsurvfit)
library(survival)
library(tibble)

theme_set(theme_classic())
nCores <- 1

3 Overview

There are over 37 trillion cells in the human body, each taking up different forms and functions. The behaviour of these cells can be described by canonical characteristics, but their functions can also dynamically change based on their environmental context, leading to cells with diverse states. Understanding changes in cell state and the interplay between cells is key to understanding their mechanisms of action and how they contribute to human disease. Statial is a suite of functions for identifying changes in cell state. This guide will provide a step-by-step overview of some key functions within Statial.

4 Loading example data

To illustrate the functionality of Statial, we will use a multiplexed ion beam imaging by time-of-flight (MIBI-TOF) dataset from Keren et al., 2018, which profiles tissue from triple-negative breast cancer patients\(^1\). This dataset simultaneously quantifies in situ expression of 36 proteins in 34 immune-rich patients. Note: The data contains some “uninformative” probes and the original cohort included 41 patients.

These images are stored in a SingleCellExperiment object called kerenSCE. This object contains 57811 cells across 10 images and includes information on cell type and patient survival.

Note: Due to size restrictions, the original dataset was reduced from 41 images to 10 images for the purposes of this vignette.

# Load the Keren et al. breast cancer data
data("kerenSCE")

kerenSCE
#> class: SingleCellExperiment 
#> dim: 48 57811 
#> metadata(0):
#> assays(1): intensities
#> rownames(48): Na Si ... Ta Au
#> rowData names(0):
#> colnames(57811): 1 2 ... 171281 171282
#> colData names(8): x y ... Survival_days_capped Censored
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

5 Kontextual: Identifying discrete changes in cell state

Kontextual is a method to evaluate the localisation relationship between two cell types in an image. Kontextual builds on the L-function by contextualising the relationship between two cell types in reference to the typical spatial behaviour of a third cell type or population. By taking this approach, Kontextual is invariant to changes in the window of the image as well as to tissue structures which may be present.
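For readers unfamiliar with the L-function mentioned above, the following is a minimal sketch of a naive cross-type L-function in base R. It is not how Statial computes Kontextual: the helper name crossL is hypothetical, no edge correction is applied, and the window area is assumed to be known. It only illustrates the quantity that Kontextual builds on, namely L(r) = sqrt(K(r) / pi), where K(r) counts pairs of cells within distance r, scaled by the window area and the number of cells of each type.

```r
# Hypothetical helper, for illustration only (not part of Statial).
# from, to: two-column numeric matrices of x/y coordinates for two cell types
# r: radius of interest; area: area of the image window
crossL <- function(from, to, r, area) {
  # Pairwise Euclidean distances between every "from" cell and every "to" cell
  dx <- outer(from[, 1], to[, 1], "-")
  dy <- outer(from[, 2], to[, 2], "-")
  d <- sqrt(dx^2 + dy^2)

  # Naive cross-type K estimate without edge correction
  k <- area * sum(d <= r) / (nrow(from) * nrow(to))

  # Variance-stabilising transform: L(r) = sqrt(K(r) / pi)
  sqrt(k / pi)
}

# Example on simulated, spatially random points in a unit square
set.seed(1)
typeA <- cbind(runif(200), runif(200))
typeB <- cbind(runif(200), runif(200))
crossL(typeA, typeB, r = 0.1, area = 1)
```

For spatially random points, L(r) is expected to be close to r (slightly below it here, because edge effects are ignored). Values above r suggest localisation between the two cell types and values below r suggest dispersion.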

The definitions of cell types and cell states are somewhat ambiguous. Cell types imply well-defined groups of cells that serve different roles from one another. Cell states, on the other hand, imply that cells are dynamic entities which cannot be discretised and thus exist on a continuum. For the purposes of using Kontextual, we treat cell states as identified clusters of cells, where larger clusters represent a “parent” cell population and finer sub-clusters represent a “child” cell population. For example, a CD4 T cell may be considered a child of a larger parent population of immune cells. Kontextual thus aims to see how a child population of cells deviates from the spatial behaviour of its parent population, and how that influences the localisation between the child cell state and another cell state.

5.1 Cell type hierarchy

A key input for Kontextual is an annotation of cell type hierarchies. We will need these to organise all the cells present into cell state populations or clusters, e.g. all the different B cell types are put in a vector called bcells.

For the purposes of this vignette, these will be manually defined. Alternatively, you can use the Bioconductor package treekoR to define these hierarchies in a data-driven way.

# Examine all cell types in image
unique(kerenSCE$cellType)
#>  [1] "Keratin_Tumour" "CD3_Cell"       "B"              "CD4_Cell"      
#>  [5] "Dc/Mono"        "Unidentified"   "Macrophages"    "CD8_Cell"      
#>  [9] "other immune"   "Endothelial"    "Mono/Neu"       "Mesenchymal"   
#> [13] "Neutrophils"    "NK"             "Tumour"         "DC"            
#> [17] "Tregs"

# Set up cell populations
tumour <- c("Keratin_Tumour", "Tumour")

bcells <- c("B")
tcells <- c("CD3_Cell", "CD4_Cell", "CD8_Cell", "Tregs")
myeloid <- c("Dc/Mono", "DC", "Mono/Neu", "Macrophages", "Neutrophils")

endothelial <- c("Endothelial")
mesenchymal <- c("Mesenchymal")

tissue <- c(endothelial, mesenchymal)
immune <- c(bcells, tcells, myeloid, "NK", "other immune") # NK = Natural Killer cells

all <- c(tumour, tissue, immune, "Unidentified")
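Because the hierarchy is defined by hand, it can be worth checking that every cell type in the data is assigned to exactly one top-level population. The following is a minimal sketch in base R. It hardcodes the cell types printed above so that it runs without the dataset, and the names observed, hierarchy and lookup are illustrative rather than part of Statial.

```r
# Cell types as printed by unique(kerenSCE$cellType) above
observed <- c(
  "Keratin_Tumour", "CD3_Cell", "B", "CD4_Cell", "Dc/Mono", "Unidentified",
  "Macrophages", "CD8_Cell", "other immune", "Endothelial", "Mono/Neu",
  "Mesenchymal", "Neutrophils", "NK", "Tumour", "DC", "Tregs"
)

# The same top-level populations as defined above, written as a named list
hierarchy <- list(
  tumour = c("Keratin_Tumour", "Tumour"),
  tissue = c("Endothelial", "Mesenchymal"),
  immune = c(
    "B", "CD3_Cell", "CD4_Cell", "CD8_Cell", "Tregs",
    "Dc/Mono", "DC", "Mono/Neu", "Macrophages", "Neutrophils",
    "NK", "other immune"
  )
)

covered <- unlist(hierarchy, use.names = FALSE)

# Cell types with no parent ("Unidentified" is deliberately left unassigned)
missingTypes <- setdiff(observed, c(covered, "Unidentified"))

# Cell types assigned to more than one parent
duplicatedTypes <- covered[duplicated(covered)]

# Long-format child/parent lookup table
lookup <- stack(hierarchy)
names(lookup) <- c("child", "parent")
head(lookup)
```

If missingTypes or duplicatedTypes is non-empty, the hand-written vectors do not partition the annotated cell types and should be revisited before running Kontextual.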

5.2 Discrete cell state changes within a single image

Here we examine an image highlighted in the Keren et al. 2018 manuscript where the relationship between two cell types depends on a parent cell population. In image 6 of the Keren et al. dataset, we can see that p53+ tumour cells and immune cells are dispersed. However, when the behaviour of p53+ tumour cells is placed in the context of the spatial behaviour of their broader parent population, tumour cells, p53+ tumour cells and immune cells appear localised.

# Let's define a new cell type vector
kerenSCE$cellTypeNew <- kerenSCE$cellType

# Select all cells that express a higher than baseline level of p53
p53Pos <- assay(kerenSCE)["p53", ] > -0.300460

# Find p53+ tumour cells
kerenSCE$cellTypeNew[kerenSCE$cellType %in% tumour] <- "Tumour"
kerenSCE$cellTypeNew[p53Pos & kerenSCE$cellType %in% tumour] <- "p53_Tumour"

# Group all immune cells under the name "Immune"

kerenSCE$cellTypeNew[kerenSCE$cellType %in% immune] <- "Immune"

# Plot image 6

kerenSCE |>
  colData() |>
  as.data.frame() |>
  filter(imageID == "6") |>
  filter(cellTypeNew %in% c("Immune", "Tumour", "p53_Tumour")) |>
  arrange(cellTypeNew) |>
  ggplot(aes(x = x, y = y, color = cellTypeNew)) +
  geom_point(size = 1) +
  scale_colour_manual(values = c("#505050", "#64BC46", "#D6D6D6")) +
  guides(colour = guide_legend(title = "Cell types", override.aes = list(size = 3)))
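The relabelling logic used above can be checked on a small toy table before applying it to the full dataset. The following is a minimal sketch in base R. The data frame toy and its intensity values are made up for illustration, and only the threshold is taken from the code above.

```r
# Made-up cells for illustration; p53 holds hypothetical intensity values
toy <- data.frame(
  cellType = c("Tumour", "Keratin_Tumour", "CD4_Cell", "Tumour", "Endothelial"),
  p53 = c(0.8, -1.2, 0.5, -0.3, 0.1),
  stringsAsFactors = FALSE
)

tumour <- c("Keratin_Tumour", "Tumour")
immune <- c("CD4_Cell")

# Same baseline threshold as used for kerenSCE above
p53Pos <- toy$p53 > -0.300460

# Start from the original labels, then overwrite in order:
# all tumour cells first, then the p53+ subset, then immune cells
toy$cellTypeNew <- toy$cellType
toy$cellTypeNew[toy$cellType %in% tumour] <- "Tumour"
toy$cellTypeNew[p53Pos & toy$cellType %in% tumour] <- "p53_Tumour"
toy$cellTypeNew[toy$cellType %in% immune] <- "Immune"

table(toy$cellTypeNew)
```

The order of the assignments matters: the p53+ label must be applied after the general "Tumour" label, otherwise it would be overwritten. Cell types outside the tumour and immune groups, such as "Endothelial", keep their original label.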