1 SpatialFeatures

The R package SpatialFeatures contains functions to extract features from molecule-resolved spatial transcriptomics data using objects from the MoleculeExperiment class.

1.1 Load Required Libraries

library(SpatialFeatures)
library(MoleculeExperiment)
library(SingleCellExperiment)
library(ggplot2)
library(dplyr)

1.2 Create Example Dataset

Load an example MoleculeExperiment object with Xenium data:

repoDir <- system.file("extdata", package = "MoleculeExperiment")
repoDir <- paste0(repoDir, "/xenium_V1_FF_Mouse_Brain")

me <- readXenium(repoDir, keepCols = "essential")
me
#> MoleculeExperiment class
#> 
#> molecules slot (1): detected
#> - detected:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- features (137): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (962)
#> ---- location range: [4900,4919.98] x [6400.02,6420]
#> -- sample2:
#> ---- features (143): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (777)
#> ---- location range: [4900.01,4919.98] x [6400.16,6419.97]
#> 
#> 
#> boundaries slot (1): cell
#> - cell:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (5): 67500 67512 67515 67521 67527
#> -- sample2:
#> ---- segments (9): 65043 65044 ... 65070 65071

1.3 Quick start

The spatialFeatures function performs three key steps. First, it generates new boundaries corresponding to the feature types. Second, it calculates entropy-based metrics for each of the feature types. Third, it combines all this information and returns a SingleCellExperiment object.

SpatialFeatures can be run with the following basic command. This will extract out all feature types as a single assay.

se = spatialFeatures(me)
se
#> class: SingleCellExperiment 
#> dim: 178 14 
#> metadata(0):
#> assays(4): subsector subconcentric supersector superconcentric
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#>   sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

You can also calculate gene counts within the spatialFeatures function, either as additional concatenated features, or with each feature type belonging to its own assay.

se = spatialFeatures(me, includeCounts = TRUE, concatenateFeatures = FALSE)
se
#> class: SingleCellExperiment 
#> dim: 178 14 
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#>   genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#>   sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

2 Step-by-step SpatialFeatures

2.1 Generate new boundaries based on subcellular and supercellular features

This function loadBoundaries(me) processes the given data (me) and returns feature data based on multiple assay types, including: - Sub-Sector - Sub-Concentric - Super-Sector - Super-Concentric

me <- SpatialFeatures::loadBoundaries(me)
me
#> MoleculeExperiment class
#> 
#> molecules slot (1): detected
#> - detected:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- features (137): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (962)
#> ---- location range: [4900,4919.98] x [6400.02,6420]
#> -- sample2:
#> ---- features (143): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> ---- molecules (777)
#> ---- location range: [4900.01,4919.98] x [6400.16,6419.97]
#> 
#> 
#> boundaries slot (5): cell subsector subconcentric supersector
#>   superconcentric
#> - cell:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (5): 67500 67512 67515 67521 67527
#> -- sample2:
#> ---- segments (9): 65043 65044 ... 65070 65071
#> - subsector:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (60): 67500_01 67500_02 ... 67527_11 67527_12
#> -- sample2:
#> ---- segments (108): 65043_01 65043_02 ... 65071_11 65071_12
#> - subconcentric:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (30): 67500_01 67500_02 ... 67527_05 67527_06
#> -- sample2:
#> ---- segments (54): 65043_01 65043_02 ... 65071_05 65071_06
#> - supersector:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (55): 67500_01 67500_02 ... 67527_10 67527_11
#> -- sample2:
#> ---- segments (99): 65043_01 65043_02 ... 65071_10 65071_11
#> - superconcentric:
#> samples (2): sample1 sample2
#> -- sample1:
#> ---- segments (35): 67500_01 67500_02 ... 67527_06 67527_07
#> -- sample2:
#> ---- segments (63): 65043_01 65043_02 ... 65071_06 65071_07

2.2 Draw the Feature Boundary Plots

Visualize the spatial molecule and cell boundary map with SpatialFeature boundaries.

ggplot_me() +
  # add molecule points and colour by feature name
  geom_point_me(me, byColour = "feature_id", size = 0.1) +
  # add nuclei segments and colour the border with red
  geom_polygon_me(me, assayName = "subsector", fill = NA, colour = "red") +
  # add cell segments and colour by cell id
  geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
  # zoom in to selected patch area
  coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))


ggplot_me() +
  # add molecule points and colour by feature name
  geom_point_me(me, byColour = "feature_id", size = 0.1) +
  # add nuclei segments and colour the border with red
  geom_polygon_me(me, assayName = "subconcentric", fill = NA, colour = "red") +
  # add cell segments and colour by cell id
  geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
  # zoom in to selected patch area
  coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))


ggplot_me() +
  # add molecule points and colour by feature name
  geom_point_me(me, byColour = "feature_id", size = 0.1) +
  # add nuclei segments and colour the border with red
  geom_polygon_me(me, assayName = "supersector", fill = NA, colour = "blue") +
  # add cell segments and colour by cell id
  geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
  # zoom in to selected patch area
  coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))


ggplot_me() +
  # add molecule points and colour by feature name
  geom_point_me(me, byColour = "feature_id", size = 0.1) +
  # add nuclei segments and colour the border with red
  geom_polygon_me(me, assayName = "superconcentric", fill = NA, colour = "blue") +
  # add cell segments and colour by cell id
  geom_polygon_me(me, byFill = "segment_id", colour = "black", alpha = 0.1) +
  # zoom in to selected patch area
  coord_cartesian(xlim = c(4875, 4950), ylim = c(6385, 6455))

2.3 Create Entropy Matrix

This function EntropyMatrix() computes the entropy of a counts matrix from a MoleculeExperiment object based on the given assay type.

ent = SpatialFeatures::EntropyMatrix(me, c("subsector", "subconcentric", "supersector", "superconcentric"), nCores = 1)
lapply(ent, head, n = 4)
#> $subsector
#>               sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik     0.5623351     0.6931472     0.6365142     0.6931472
#> Acsbg1            0.6931472     0.6931472     0.6931472     0.0000000
#> Adamts2           0.0000000     0.0000000     0.0000000     0.0000000
#> Adamtsl1          0.0000000     0.0000000     0.0000000     0.0000000
#>               sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik             0             0             0     0.0000000
#> Acsbg1                    0             0             0     0.6931472
#> Adamts2                   0             0             0     0.0000000
#> Adamtsl1                  0             0             0     0.0000000
#>               sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik     1.5498260             0     0.6931472             0
#> Acsbg1            0.6931472             0     0.0000000             0
#> Adamts2           0.0000000             0     0.0000000             0
#> Adamtsl1          0.0000000             0     0.0000000             0
#>               sample2.65070 sample2.65071
#> 2010300C02Rik             0     0.0000000
#> Acsbg1                    0     0.6931472
#> Adamts2                   0     0.0000000
#> Adamtsl1                  0     0.0000000
#> 
#> $subconcentric
#>               sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik     0.5623351     0.6931472     0.6365142             0
#> Acsbg1            0.6931472     0.0000000     0.6931472             0
#> Adamts2           0.0000000     0.0000000     0.0000000             0
#> Adamtsl1          0.0000000     0.0000000     0.0000000             0
#>               sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik             0             0             0             0
#> Acsbg1                    0             0             0             0
#> Adamts2                   0             0             0             0
#> Adamtsl1                  0             0             0             0
#>               sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik     1.2770343             0     0.6931472             0
#> Acsbg1            0.6931472             0     0.0000000             0
#> Adamts2           0.0000000             0     0.0000000             0
#> Adamtsl1          0.0000000             0     0.0000000             0
#>               sample2.65070 sample2.65071
#> 2010300C02Rik             0             0
#> Acsbg1                    0             0
#> Adamts2                   0             0
#> Adamtsl1                  0             0
#> 
#> $supersector
#>               sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik             0      1.332179     0.6931472     0.6931472
#> Acsbg1                    0      0.000000     0.6931472     0.6931472
#> Adamts2                   0      0.000000     0.0000000     0.0000000
#> Adamtsl1                  0      0.000000     0.0000000     0.0000000
#>               sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik             0     0.5623351      1.039721     0.5982696
#> Acsbg1                    0     0.0000000      0.000000     0.9502705
#> Adamts2                   0     0.0000000      0.000000     0.0000000
#> Adamtsl1                  0     0.0000000      0.000000     0.0000000
#>               sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik      0.000000     0.6931472             0     0.0000000
#> Acsbg1             1.559581     1.0114043             0     0.6931472
#> Adamts2            0.000000     0.0000000             0     0.0000000
#> Adamtsl1           0.000000     0.0000000             0     0.0000000
#>               sample2.65070 sample2.65071
#> 2010300C02Rik     0.0000000     0.0000000
#> Acsbg1            0.5623351     0.6365142
#> Adamts2           0.0000000     0.0000000
#> Adamtsl1          0.0000000     0.0000000
#> 
#> $superconcentric
#>               sample1.67500 sample1.67512 sample1.67515 sample1.67521
#> 2010300C02Rik     1.5607104       1.05492     0.6931472     0.0000000
#> Acsbg1            0.6931472       0.00000     0.6931472     0.6931472
#> Adamts2           0.0000000       0.00000     0.0000000     0.0000000
#> Adamtsl1          0.0000000       0.00000     0.0000000     0.0000000
#>               sample1.67527 sample2.65043 sample2.65044 sample2.65051
#> 2010300C02Rik     0.0000000      1.039721      1.386294      1.549826
#> Acsbg1            0.6931472      0.000000      0.000000      1.054920
#> Adamts2           0.0000000      0.000000      0.000000      0.000000
#> Adamtsl1          0.0000000      0.000000      0.000000      0.000000
#>               sample2.65055 sample2.65063 sample2.65064 sample2.65067
#> 2010300C02Rik     0.6931472     0.6931472      1.386294     0.0000000
#> Acsbg1            1.3208883     1.3296613      0.000000     0.6931472
#> Adamts2           0.0000000     0.0000000      0.000000     0.0000000
#> Adamtsl1          0.0000000     0.0000000      0.000000     0.0000000
#>               sample2.65070 sample2.65071
#> 2010300C02Rik     0.6931472     0.0000000
#> Acsbg1            0.6931472     0.6365142
#> Adamts2           0.0000000     0.0000000
#> Adamtsl1          0.0000000     0.0000000

2.4 Create SingleCellExperiment Object

This function EntropySingleCellExperiment() Convert the entropy matrix into a SingleCellExperiment object.

se = SpatialFeatures::EntropySingleCellExperiment(ent, me)
se
#> class: SingleCellExperiment 
#> dim: 178 14 
#> metadata(0):
#> assays(4): subsector subconcentric supersector superconcentric
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#>   sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

If you also want the gene counts, you can include the parameter

se = EntropySingleCellExperiment(ent, me, includeCounts = TRUE)
se
#> class: SingleCellExperiment 
#> dim: 178 14 
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#>   genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#>   sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

2.5 Inspect the SpatialFeatures SingleCellExperiment Object

Check the structure and content of the SingleCellExperiment object:

se
#> class: SingleCellExperiment 
#> dim: 178 14 
#> metadata(0):
#> assays(5): subsector subconcentric supersector superconcentric
#>   genecount
#> rownames(178): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
#> rowData names(0):
#> colnames(14): sample1.67500 sample1.67512 ... sample2.65070
#>   sample2.65071
#> colData names(5): Cell Sample_id x_central y_central boundaries
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
colData(se)
#> DataFrame with 14 rows and 5 columns
#>                        Cell   Sample_id x_central y_central
#>                 <character> <character> <numeric> <numeric>
#> sample1.67500 sample1.67500     sample1   4896.19   6415.14
#> sample1.67512 sample1.67512     sample1   4909.74   6411.80
#> sample1.67515 sample1.67515     sample1   4912.42   6402.93
#> sample1.67521 sample1.67521     sample1   4916.01   6415.75
#> sample1.67527 sample1.67527     sample1   4923.68   6414.74
#> ...                     ...         ...       ...       ...
#> sample2.65063 sample2.65063     sample2   4928.03   6415.25
#> sample2.65064 sample2.65064     sample2   4896.37   6406.56
#> sample2.65067 sample2.65067     sample2   4923.71   6406.08
#> sample2.65070 sample2.65070     sample2   4902.14   6395.17
#> sample2.65071 sample2.65071     sample2   4920.63   6398.99
#>                                                    boundaries
#>                                                        <AsIs>
#> sample1.67500 4896.85:6413.46,4895.15:6413.89,4894.30:6417.71
#> sample1.67512 4901.95:6417.07,4901.52:6418.56,4900.89:6419.41
#> sample1.67515 4916.40:6412.61,4912.15:6420.05,4910.45:6430.89
#> sample1.67521 4911.51:6397.95,4907.48:6400.71,4904.07:6407.30
#> sample1.67527 4929.57:6408.36,4924.69:6409.85,4921.29:6409.85
#> ...                                                       ...
#> sample2.65063 4904.71:6399.65,4898.76:6401.35,4894.30:6407.94
#> sample2.65064 4905.77:6404.11,4906.20:6407.73,4907.05:6412.19
#> sample2.65067 4908.96:6395.61,4904.93:6399.44,4905.56:6403.26
#> sample2.65070 4919.59:6408.36,4915.98:6411.34,4915.76:6411.98
#> sample2.65071 4922.35:6404.54,4919.80:6407.94,4920.23:6413.46
rowData(se)
#> DataFrame with 178 rows and 0 columns

3 Finish

sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /media/volume/teran2_disk/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#>  [1] ggplot2_3.5.1               SpatialFeatures_0.99.1     
#>  [3] purrr_1.0.2                 dplyr_1.1.4                
#>  [5] SingleCellExperiment_1.27.2 SummarizedExperiment_1.35.1
#>  [7] Biobase_2.65.1              GenomicRanges_1.57.1       
#>  [9] GenomeInfoDb_1.41.1         IRanges_2.39.2             
#> [11] S4Vectors_0.43.2            BiocGenerics_0.51.1        
#> [13] MatrixGenerics_1.17.0       matrixStats_1.4.1          
#> [15] MoleculeExperiment_1.5.1    BiocStyle_2.33.1           
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.1         EBImage_4.47.0           farver_2.1.2            
#>  [4] bitops_1.0-8             fastmap_1.2.0            RCurl_1.98-1.16         
#>  [7] digest_0.6.37            lifecycle_1.0.4          terra_1.7-78            
#> [10] magrittr_2.0.3           compiler_4.4.1           rlang_1.1.4             
#> [13] sass_0.4.9               tools_4.4.1              utf8_1.2.4              
#> [16] yaml_2.3.10              data.table_1.16.0        knitr_1.48              
#> [19] S4Arrays_1.5.8           labeling_0.4.3           htmlwidgets_1.6.4       
#> [22] bit_4.5.0                DelayedArray_0.31.11     abind_1.4-8             
#> [25] BiocParallel_1.39.0      withr_3.0.1              grid_4.4.1              
#> [28] fansi_1.0.6              colorspace_2.1-1         scales_1.3.0            
#> [31] tinytex_0.53             cli_3.6.3                rmarkdown_2.28          
#> [34] crayon_1.5.3             generics_0.1.3           httr_1.4.7              
#> [37] rjson_0.2.23             cachem_1.1.0             zlibbioc_1.51.1         
#> [40] BiocManager_1.30.25      XVector_0.45.0           tiff_0.1-12             
#> [43] vctrs_0.6.5              Matrix_1.7-0             jsonlite_1.8.9          
#> [46] bookdown_0.40            fftwtools_0.9-11         bit64_4.5.2             
#> [49] jpeg_0.1-10              magick_2.8.5             locfit_1.5-9.10         
#> [52] jquerylib_0.1.4          glue_1.7.0               codetools_0.2-20        
#> [55] gtable_0.3.5             UCSC.utils_1.1.0         munsell_0.5.1           
#> [58] tibble_3.2.1             pillar_1.9.0             htmltools_0.5.8.1       
#> [61] GenomeInfoDbData_1.2.12  R6_2.5.1                 evaluate_1.0.0          
#> [64] lattice_0.22-6           highr_0.11               png_0.1-8               
#> [67] SpatialExperiment_1.15.1 bslib_0.8.0              Rcpp_1.0.13             
#> [70] SparseArray_1.5.39       xfun_0.47                pkgconfig_2.0.3