Chapter 6 Muraro human pancreas (CEL-seq)

6.1 Introduction

This performs an analysis of the Muraro et al. (2016) CEL-seq dataset, consisting of human pancreas cells from various donors.

6.3 Quality control

This dataset lacks mitochondrial genes so we will do without. For the one batch that seems to have a high proportion of low-quality cells, we compute an appropriate filter threshold using a shared median and MAD from the other batches (Figure 6.1).

Distribution of each QC metric across cells from each donor in the Muraro pancreas dataset. Each point represents a cell and is colored according to whether that cell was discarded.

Figure 6.1: Distribution of each QC metric across cells from each donor in the Muraro pancreas dataset. Each point represents a cell and is colored according to whether that cell was discarded.

We have a look at the causes of removal:

##              low_lib_size            low_n_features high_altexps_ERCC_percent 
##                       663                       700                       738 
##                   discard 
##                       773

6.4 Normalization

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.088   0.541   0.821   1.000   1.211  13.987
Relationship between the library size factors and the deconvolution size factors in the Muraro pancreas dataset.

Figure 6.2: Relationship between the library size factors and the deconvolution size factors in the Muraro pancreas dataset.

6.6 Data integration

We use the proportion of variance lost as a diagnostic measure:

##           D28      D29      D30     D31
## [1,] 0.060847 0.024121 0.000000 0.00000
## [2,] 0.002646 0.003018 0.062421 0.00000
## [3,] 0.003449 0.002641 0.002598 0.08162

6.8 Clustering

Heatmap of the frequency of cells from each cell type label in each cluster.

Figure 6.4: Heatmap of the frequency of cells from each cell type label in each cluster.

##        Donor
## Cluster D28 D29 D30 D31
##      1  104   6  57 112
##      2   59  21  77  97
##      3   12  75  64  43
##      4   28 149 126 120
##      5   87 261 277 214
##      6   21   7  54  26
##      7    1   6   6  37
##      8    6   6   5   2
##      9   11  68   5  30
##      10   4   2   5   8
Obligatory $t$-SNE plots of the Muraro pancreas dataset. Each point represents a cell that is colored by cluster (left) or batch (right).

Figure 6.5: Obligatory \(t\)-SNE plots of the Muraro pancreas dataset. Each point represents a cell that is colored by cluster (left) or batch (right).

Session Info

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] pheatmap_1.0.12             batchelor_1.14.0           
 [3] scran_1.26.0                scater_1.26.0              
 [5] ggplot2_3.3.6               scuttle_1.8.0              
 [7] ensembldb_2.22.0            AnnotationFilter_1.22.0    
 [9] GenomicFeatures_1.50.0      AnnotationDbi_1.60.0       
[11] AnnotationHub_3.6.0         BiocFileCache_2.6.0        
[13] dbplyr_2.2.1                scRNAseq_2.11.0            
[15] SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0
[17] Biobase_2.58.0              GenomicRanges_1.50.0       
[19] GenomeInfoDb_1.34.0         IRanges_2.32.0             
[21] S4Vectors_0.36.0            BiocGenerics_0.44.0        
[23] MatrixGenerics_1.10.0       matrixStats_0.62.0         
[25] BiocStyle_2.26.0            rebook_1.8.0               

loaded via a namespace (and not attached):
  [1] igraph_1.3.5                  lazyeval_0.2.2               
  [3] BiocParallel_1.32.0           digest_0.6.30                
  [5] htmltools_0.5.3               viridis_0.6.2                
  [7] fansi_1.0.3                   magrittr_2.0.3               
  [9] memoise_2.0.1                 ScaledMatrix_1.6.0           
 [11] cluster_2.1.4                 limma_3.54.0                 
 [13] Biostrings_2.66.0             prettyunits_1.1.1            
 [15] colorspace_2.0-3              blob_1.2.3                   
 [17] rappdirs_0.3.3                ggrepel_0.9.1                
 [19] xfun_0.34                     dplyr_1.0.10                 
 [21] crayon_1.5.2                  RCurl_1.98-1.9               
 [23] jsonlite_1.8.3                graph_1.76.0                 
 [25] glue_1.6.2                    gtable_0.3.1                 
 [27] zlibbioc_1.44.0               XVector_0.38.0               
 [29] DelayedArray_0.24.0           BiocSingular_1.14.0          
 [31] scales_1.2.1                  edgeR_3.40.0                 
 [33] DBI_1.1.3                     Rcpp_1.0.9                   
 [35] viridisLite_0.4.1             xtable_1.8-4                 
 [37] progress_1.2.2                dqrng_0.3.0                  
 [39] bit_4.0.4                     rsvd_1.0.5                   
 [41] ResidualMatrix_1.8.0          metapod_1.6.0                
 [43] httr_1.4.4                    RColorBrewer_1.1-3           
 [45] dir.expiry_1.6.0              ellipsis_0.3.2               
 [47] pkgconfig_2.0.3               XML_3.99-0.12                
 [49] farver_2.1.1                  CodeDepends_0.6.5            
 [51] sass_0.4.2                    locfit_1.5-9.6               
 [53] utf8_1.2.2                    labeling_0.4.2               
 [55] tidyselect_1.2.0              rlang_1.0.6                  
 [57] later_1.3.0                   munsell_0.5.0                
 [59] BiocVersion_3.16.0            tools_4.2.1                  
 [61] cachem_1.0.6                  cli_3.4.1                    
 [63] generics_0.1.3                RSQLite_2.2.18               
 [65] ExperimentHub_2.6.0           evaluate_0.17                
 [67] stringr_1.4.1                 fastmap_1.1.0                
 [69] yaml_2.3.6                    knitr_1.40                   
 [71] bit64_4.0.5                   purrr_0.3.5                  
 [73] KEGGREST_1.38.0               sparseMatrixStats_1.10.0     
 [75] mime_0.12                     xml2_1.3.3                   
 [77] biomaRt_2.54.0                compiler_4.2.1               
 [79] beeswarm_0.4.0                filelock_1.0.2               
 [81] curl_4.3.3                    png_0.1-7                    
 [83] interactiveDisplayBase_1.36.0 statmod_1.4.37               
 [85] tibble_3.1.8                  bslib_0.4.0                  
 [87] stringi_1.7.8                 highr_0.9                    
 [89] bluster_1.8.0                 lattice_0.20-45              
 [91] ProtGenerics_1.30.0           Matrix_1.5-1                 
 [93] vctrs_0.5.0                   pillar_1.8.1                 
 [95] lifecycle_1.0.3               BiocManager_1.30.19          
 [97] jquerylib_0.1.4               BiocNeighbors_1.16.0         
 [99] cowplot_1.1.1                 bitops_1.0-7                 
[101] irlba_2.3.5.1                 httpuv_1.6.6                 
[103] rtracklayer_1.58.0            R6_2.5.1                     
[105] BiocIO_1.8.0                  bookdown_0.29                
[107] promises_1.2.0.1              gridExtra_2.3                
[109] vipor_0.4.5                   codetools_0.2-18             
[111] assertthat_0.2.1              rjson_0.2.21                 
[113] withr_2.5.0                   GenomicAlignments_1.34.0     
[115] Rsamtools_2.14.0              GenomeInfoDbData_1.2.9       
[117] parallel_4.2.1                hms_1.1.2                    
[119] grid_4.2.1                    beachmat_2.14.0              
[121] rmarkdown_2.17                DelayedMatrixStats_1.20.0    
[123] Rtsne_0.16                    shiny_1.7.3                  
[125] ggbeeswarm_0.6.0              restfulr_0.0.15              

References

Muraro, M. J., G. Dharmadhikari, D. Grun, N. Groen, T. Dielen, E. Jansen, L. van Gurp, et al. 2016. “A Single-Cell Transcriptome Atlas of the Human Pancreas.” Cell Syst 3 (4): 385–94.