Chapter 17 Cell cycle assignment
On occasion, it can be desirable to determine cell cycle activity from scRNA-seq data. In and of itself, the distribution of cells across phases of the cell cycle is not usually informative, but we can use this to determine if there are differences in proliferation between subpopulations or across treatment conditions. Many of the key events in the cell cycle (e.g., passage through checkpoints) are driven by post-translational mechanisms and thus not directly visible in transcriptomic data; nonetheless, there are enough changes in expression that can be exploited to determine cell cycle phase. We demonstrate using the 416B dataset, which is known to contain actively cycling cells after oncogene induction.
#--- loading ---#
library(scRNAseq)
sce.416b <- LunSpikeInData(which="416b")
sce.416b$block <- factor(sce.416b$block)

#--- gene-annotation ---#
library(AnnotationHub)
ens.mm.v97 <- AnnotationHub()[["AH73905"]]
rowData(sce.416b)$ENSEMBL <- rownames(sce.416b)
rowData(sce.416b)$SYMBOL <- mapIds(ens.mm.v97, keys=rownames(sce.416b),
    keytype="GENEID", column="SYMBOL")
rowData(sce.416b)$SEQNAME <- mapIds(ens.mm.v97, keys=rownames(sce.416b),
    keytype="GENEID", column="SEQNAME")

library(scater)
rownames(sce.416b) <- uniquifyFeatureNames(rowData(sce.416b)$ENSEMBL,
    rowData(sce.416b)$SYMBOL)

#--- quality-control ---#
mito <- which(rowData(sce.416b)$SEQNAME=="MT")
stats <- perCellQCMetrics(sce.416b, subsets=list(Mt=mito))
qc <- quickPerCellQC(stats, percent_subsets=c("subsets_Mt_percent",
    "altexps_ERCC_percent"), batch=sce.416b$block)
sce.416b <- sce.416b[,!qc$discard]

#--- normalization ---#
library(scran)
sce.416b <- computeSumFactors(sce.416b)
sce.416b <- logNormCounts(sce.416b)

#--- variance-modelling ---#
dec.416b <- modelGeneVarWithSpikes(sce.416b, "ERCC", block=sce.416b$block)
chosen.hvgs <- getTopHVGs(dec.416b, prop=0.1)

#--- batch-correction ---#
library(limma)
assay(sce.416b, "corrected") <- removeBatchEffect(logcounts(sce.416b),
    design=model.matrix(~sce.416b$phenotype), batch=sce.416b$block)

#--- dimensionality-reduction ---#
sce.416b <- runPCA(sce.416b, ncomponents=10, subset_row=chosen.hvgs,
    exprs_values="corrected", BSPARAM=BiocSingular::ExactParam())

set.seed(1010)
sce.416b <- runTSNE(sce.416b, dimred="PCA", perplexity=10)

#--- clustering ---#
my.dist <- dist(reducedDim(sce.416b, "PCA"))
my.tree <- hclust(my.dist, method="ward.D2")

library(dynamicTreeCut)
my.clusters <- unname(cutreeDynamic(my.tree, distM=as.matrix(my.dist),
    minClusterSize=10, verbose=0))
colLabels(sce.416b) <- factor(my.clusters)
## class: SingleCellExperiment
## dim: 46604 185
## metadata(0):
## assays(3): counts logcounts corrected
## rownames(46604): 4933401J01Rik Gm26206 ... CAAA01147332.1
##   CBFB-MYH11-mcherry
## rowData names(4): Length ENSEMBL SYMBOL SEQNAME
## colnames(185): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
##   SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
##   SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1
##   SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(11): Source Name cell line ... sizeFactor label
## reducedDimNames(2): PCA TSNE
## altExpNames(2): ERCC SIRV
17.2 Using the cyclins
The cyclins control progression through the cell cycle and have well-characterized patterns of expression across cell cycle phases. Cyclin D is expressed throughout but peaks at G1; cyclin E is expressed highest in the G1/S transition; cyclin A is expressed across S and G2; and cyclin B is expressed highest in late G2 and mitosis (Morgan 2007). The expression of cyclins can help to determine the relative cell cycle activity in each cluster (Figure 17.1). For example, most cells in cluster 1 are likely to be in G1 while the other clusters are scattered across the later phases.
##  [1] "Ccnb3" "Ccna2" "Ccna1" "Ccne2" "Ccnd2" "Ccne1" "Ccnd1" "Ccnb2" "Ccnb1"
## [10] "Ccnd3"
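Gene names like those listed above can be picked out with a regular expression on the gene symbols. The sketch below applies the same pattern that this chapter uses later for the Grun HSC data to a small, made-up vector of symbols; the vector `symbols` and the name `cyclin.toy` are hypothetical and only serve to show which names the pattern accepts and rejects.

```r
# Toy vector of gene symbols (hypothetical subset, for illustration only).
symbols <- c("Ccna2", "Ccnb1", "Ccnb1ip1", "Ccnd3", "Ccne1", "Ccnl1", "Ccny", "Actb")

# Keep symbols of the form "Ccn" + one of a/b/d/e + exactly one digit.
# This drops other cyclin-like genes (Ccnl1, Ccny) and longer names (Ccnb1ip1).
is.cyclin <- grepl("^Ccn[abde][0-9]$", symbols)
cyclin.toy <- symbols[is.cyclin]
cyclin.toy
```

In a real analysis, the same pattern would be applied to the `SYMBOL` field of the row data rather than to a hand-written vector.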
We quantify these observations with standard DE methods (Chapter 11) to test for upregulation of each cyclin between clusters, which would imply that a subpopulation contains more cells in the corresponding cell cycle phase. The same logic applies to comparisons between treatment conditions as described in Chapter 14. For example, we can infer that cluster 4 has the highest proportion of cells in the S and G2 phases based on higher expression of cyclins A2 and B1, respectively.
## DataFrame with 10 rows and 7 columns
##             Top     p.value         FDR summary.AUC     AUC.1     AUC.2     AUC.3
##       <integer>   <numeric>   <numeric>   <numeric> <numeric> <numeric> <numeric>
## Ccna2         1 4.47082e-09 4.47082e-08    0.996337  0.996337  0.641822  0.925595
## Ccnd1         1 2.27713e-04 5.69283e-04    0.822981  0.368132  0.822981  0.776786
## Ccnb1         1 1.19027e-07 5.95137e-07    0.949634  0.949634  0.519669  0.934524
## Ccnb2         2 3.87799e-07 1.29266e-06    0.934066  0.934066  0.781573  0.898810
## Ccna1         4 2.96992e-02 5.93985e-02    0.535714  0.535714  0.495342  0.535714
## Ccne2         5 6.56983e-02 1.09497e-01    0.641941  0.641941  0.447205  0.455357
## Ccne1         6 5.85979e-01 8.37113e-01    0.564103  0.564103  0.366460  0.473214
## Ccnd3         7 9.94578e-01 1.00000e+00    0.402930  0.402930  0.283644  0.273810
## Ccnd2         8 9.99993e-01 1.00000e+00    0.306548  0.134615  0.327122  0.306548
## Ccnb3        10 1.00000e+00 1.00000e+00    0.500000  0.500000  0.500000  0.500000
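To help read the AUC columns in the table above, the following base R sketch computes an AUC between two groups of cells, assuming the usual definition as the proportion of cell pairs in which the first group has the higher expression (ties counted as half). The function `pairwise_auc()` and the toy values are hypothetical and are not taken from any package used in this chapter.

```r
# AUC for 'x' versus 'y': the proportion of (x, y) pairs in which x is larger,
# counting ties as half. Both inputs are numeric vectors of expression values.
pairwise_auc <- function(x, y) {
    greater <- outer(x, y, ">")
    tied <- outer(x, y, "==")
    mean(greater + 0.5 * tied)
}

# Toy log-expression values for one gene (made up for illustration).
in.cluster <- c(5, 6, 7)
other.cluster <- c(1, 2, 6)

auc <- pairwise_auc(in.cluster, other.cluster)
auc
```

Under this definition, a value near 1 indicates that the gene is almost always higher in the first group, a value near 0 indicates the opposite, and 0.5 indicates no separation (as for Ccnb3 in the table).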
While straightforward to implement and interpret, this approach assumes that cyclin expression is unaffected by biological processes other than the cell cycle. This is a strong assumption in highly heterogeneous populations where cyclins may perform cell-type-specific roles. For example, using the Grun HSC dataset (Grun et al. 2016), we see an upregulation of cyclin D2 in sorted HSCs (Figure 17.2) that is consistent with a particular reliance on D-type cyclins in these cells (Steinman 2002; Kozar et al. 2004). Similar arguments apply to other genes with annotated functions in cell cycle, e.g., from relevant Gene Ontology terms.
#--- data-loading ---#
library(scRNAseq)
sce.grun.hsc <- GrunHSCData(ensembl=TRUE)

#--- gene-annotation ---#
library(AnnotationHub)
ens.mm.v97 <- AnnotationHub()[["AH73905"]]
anno <- select(ens.mm.v97, keys=rownames(sce.grun.hsc),
    keytype="GENEID", columns=c("SYMBOL", "SEQNAME"))
rowData(sce.grun.hsc) <- anno[match(rownames(sce.grun.hsc), anno$GENEID),]

#--- quality-control ---#
library(scuttle)
stats <- perCellQCMetrics(sce.grun.hsc)
qc <- quickPerCellQC(stats, batch=sce.grun.hsc$protocol,
    subset=grepl("sorted", sce.grun.hsc$protocol))
sce.grun.hsc <- sce.grun.hsc[,!qc$discard]

#--- normalization ---#
library(scran)
set.seed(101000110)
clusters <- quickCluster(sce.grun.hsc)
sce.grun.hsc <- computeSumFactors(sce.grun.hsc, clusters=clusters)
sce.grun.hsc <- logNormCounts(sce.grun.hsc)

#--- variance-modelling ---#
set.seed(00010101)
dec.grun.hsc <- modelGeneVarByPoisson(sce.grun.hsc)
top.grun.hsc <- getTopHVGs(dec.grun.hsc, prop=0.1)

#--- dimensionality-reduction ---#
set.seed(101010011)
sce.grun.hsc <- denoisePCA(sce.grun.hsc, technical=dec.grun.hsc,
    subset.row=top.grun.hsc)
sce.grun.hsc <- runTSNE(sce.grun.hsc, dimred="PCA")

#--- clustering ---#
snn.gr <- buildSNNGraph(sce.grun.hsc, use.dimred="PCA")
colLabels(sce.grun.hsc) <- factor(igraph::cluster_walktrap(snn.gr)$membership)
# Switching the row names for a nicer plot.
rownames(sce.grun.hsc) <- uniquifyFeatureNames(rownames(sce.grun.hsc),
    rowData(sce.grun.hsc)$SYMBOL)

cyclin.genes <- grep("^Ccn[abde][0-9]$", rowData(sce.grun.hsc)$SYMBOL)
cyclin.genes <- rownames(sce.grun.hsc)[cyclin.genes]

plotHeatmap(sce.grun.hsc, order_columns_by="label",
    cluster_rows=FALSE, features=sort(cyclin.genes),
    colour_columns_by="protocol")
Admittedly, this is merely a symptom of a more fundamental issue - that the cell cycle is not independent of the other processes that are occurring in a cell. This will be a recurring theme throughout the chapter, which suggests that cell cycle inferences are best used in comparisons between closely related cell types where there are fewer changes elsewhere that might interfere with interpretation.
17.3 Using reference profiles
Cell cycle assignment can be considered a specialized case of cell annotation, which suggests that the strategies described in Chapter 12 can also be applied here. Given a reference dataset containing cells of known cell cycle phase, we could use methods like SingleR to determine the phase of each cell in a test dataset. We demonstrate on a reference of mouse ESCs from Buettner et al. (2015) that were sorted by cell cycle phase prior to scRNA-seq.
## class: SingleCellExperiment
## dim: 38293 288
## metadata(0):
## assays(2): counts logcounts
## rownames(38293): ENSMUSG00000000001 ENSMUSG00000000003 ...
##   ENSMUSG00000097934 ENSMUSG00000097935
## rowData names(3): EnsemblTranscriptID AssociatedGeneName GeneLength
## colnames(288): G1_cell1_count G1_cell2_count ... G2M_cell95_count
##   G2M_cell96_count
## colData names(2): phase sizeFactor
## reducedDimNames(0):
## altExpNames(1): ERCC
We will restrict the annotation process to a subset of genes with a priori known roles in cell cycle. This aims to avoid detecting markers for other biological processes that happen to be correlated with the cell cycle in the reference dataset, which would reduce classification performance if those processes are absent or uncorrelated in the test dataset.
## chr [1:2830] "ENSMUSG00000026842" "ENSMUSG00000026842" ...
We use the SingleR() function to assign labels to the 416B data based on the cell cycle phases in the ESC reference. Cluster 1 mostly consists of G1 cells while the other clusters have more cells in the other phases, which is broadly consistent with our conclusions from the cyclin-based analysis. Unlike the cyclin-based analysis, this approach yields “absolute” assignments of cell cycle phase that do not need to be interpreted relative to other cells in the same dataset.
# Switching row names back to Ensembl to match the reference.
test.data <- logcounts(sce.416b)
rownames(test.data) <- rowData(sce.416b)$ENSEMBL

library(SingleR)
assignments <- SingleR(test.data, ref=sce.ref, labels=sce.ref$phase,
    de.method="wilcox", restrict=cycle.anno)

# Cross-tabulating the assigned phases against the clusters.
tab <- table(assignments$labels, colLabels(sce.416b))
tab
##
##        1  2  3  4
##   G1  71  5 18  1
##   G2M  2 60  2 13
##   S    5  4  4  0
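To make the idea behind reference-based assignment more concrete, here is a deliberately simplified base R sketch: each test cell is assigned to the phase whose reference profile it correlates with most strongly, using Spearman correlations. All values and names (`ref.profiles`, `test.cells`, `toy.labels`) are hypothetical, and this is only an illustration of the general principle, not a reimplementation of SingleR(), which performs additional steps such as marker detection and fine-tuning.

```r
# Hypothetical reference: average log-expression of 5 marker genes per phase.
ref.profiles <- cbind(
    G1  = c(5, 4, 1, 0, 2),
    S   = c(2, 5, 4, 1, 0),
    G2M = c(0, 1, 2, 5, 4)
)
rownames(ref.profiles) <- paste0("gene", 1:5)

# Hypothetical test cells (columns) measured on the same genes.
test.cells <- cbind(
    cellA = c(6, 3, 2, 0, 1),
    cellB = c(1, 0, 3, 6, 5)
)
rownames(test.cells) <- rownames(ref.profiles)

# Spearman correlation of every test cell (rows) with every phase (columns).
cors <- cor(test.cells, ref.profiles, method="spearman")

# Assign each cell to the phase with the highest correlation.
toy.labels <- colnames(cors)[max.col(cors, ties.method="first")]
names(toy.labels) <- rownames(cors)
toy.labels
```

Because rank correlations depend only on the ordering of genes within each cell, this kind of score is insensitive to differences in scale between the test and reference datasets, though not to differences in which genes are expressed at all.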
The key assumption here is that the cell cycle effect is orthogonal to other aspects of biological heterogeneity like cell type. This justifies the use of a reference involving cell types that are quite different from the cells in the test dataset, provided that the cell cycle transcriptional program is conserved across datasets (Bertoli, Skotheim, and Bruin 2013; Conboy et al. 2007). However, it is not difficult to find holes in this reasoning - for example, Lef1 is detected as one of the top markers to distinguish G1 from G2/M in the reference but has no detectable expression in the 416B dataset (Figure 17.3). More generally, non-orthogonality can introduce biases where, e.g., one cell type is consistently misclassified as being in a particular phase because it happens to be more similar to that phase’s profile in the reference.
Thus, a healthy dose of skepticism is required when interpreting these assignments. Our hope is that any systematic assignment error is consistent across clusters and conditions such that it cancels out in comparisons of phase frequencies, which is the more interesting analysis anyway. Indeed, while the availability of absolute phase calls may be more appealing, it may not make much practical difference to the conclusions if the frequencies are ultimately interpreted in a relative sense (e.g., using a chi-squared test).
##
##  Pearson's Chi-squared test
##
## data:  tab[, 1:2]
## X-squared = 112, df = 2, p-value <2e-16
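The relative comparison of phase frequencies can be sketched in base R as follows. The counts are copied from the SingleR table above, so the block runs without the full dataset; the object names `tab`, `phase.props` and `chi.out` are chosen for illustration, and the call shown is an assumption about how the output above was produced (it matches the reported `data: tab[, 1:2]`).

```r
# Phase-by-cluster counts copied from the SingleR table above.
tab <- rbind(
    G1  = c(71, 5, 18, 1),
    G2M = c(2, 60, 2, 13),
    S   = c(5, 4, 4, 0)
)
colnames(tab) <- 1:4

# Proportion of cells in each phase, within each cluster.
phase.props <- prop.table(tab, margin=2)
round(phase.props, 2)

# Test for a difference in phase frequencies between clusters 1 and 2.
# Some expected counts are below 5, so R warns that the chi-squared
# approximation may be inaccurate; the warning is silenced here.
chi.out <- suppressWarnings(chisq.test(tab[, 1:2]))
chi.out
```

With expected counts this small in the S phase row, the p-value should be treated as approximate; the large difference in G1 and G2/M frequencies between the two clusters drives the result.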
17.4 Using the cyclone() classifier
The method described by Scialdone et al. (2015) is yet another approach for classifying cells into cell cycle phases. Using a reference dataset, we first compute the sign of the difference in expression between each pair of genes. Pairs with changes in the sign across cell cycle phases are chosen as markers. Cells in a test dataset can then be classified into the appropriate phase, based on whether the observed sign for each marker pair is consistent with one phase or another. This approach is implemented in the cyclone() function from the scran package, which also contains a pre-trained set of marker pairs for mouse and human data.
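The pair-based logic described above can be illustrated with a small base R sketch. Everything here is hypothetical (the gene names, the marker pairs in `g1.pairs` and the helper `pair_score()`), and the scoring is a simplification: it only reports the fraction of marker pairs whose observed sign matches the sign expected for a phase, without the further score adjustments that a full implementation such as cyclone() may apply.

```r
# Hypothetical marker pairs: in the reference, 'first' is more highly expressed
# than 'second' in G1 cells, and the sign flips in the other phases.
g1.pairs <- data.frame(
    first  = c("geneA", "geneB", "geneC"),
    second = c("geneD", "geneE", "geneF"),
    stringsAsFactors = FALSE
)

# Fraction of marker pairs whose observed sign matches the expected one.
# Pairs with tied expression are treated as uninformative and ignored.
pair_score <- function(expr, pairs) {
    delta <- expr[pairs$first] - expr[pairs$second]
    informative <- delta != 0
    mean(delta[informative] > 0)
}

# Toy expression vectors for two cells (named by gene).
cell1 <- c(geneA=5, geneB=3, geneC=4, geneD=1, geneE=0, geneF=2)
cell2 <- c(geneA=0, geneB=1, geneC=2, geneD=4, geneE=1, geneF=6)

score1 <- pair_score(cell1, g1.pairs)
score2 <- pair_score(cell2, g1.pairs)
c(cell1=score1, cell2=score2)
```

Only the sign of each within-cell difference is used, so the score does not depend on the absolute scale of expression, which is part of what makes this strategy portable across datasets.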
The phase assignment result for each cell in the 416B dataset is shown in Figure 17.4. For each cell, a higher score for a phase corresponds to a higher probability that the cell is in that phase. We focus on the G1 and G2/M scores as these are the most informative for classification.
Cells are classified as being in G1 phase if the G1 score is above 0.5 and greater than the G2/M score; in G2/M phase if the G2/M score is above 0.5 and greater than the G1 score; and in S phase if neither score is above 0.5. We see that the results are quite similar to those from SingleR(), which is reassuring.
##
##        1  2  3  4
##   G1  74  8 20  0
##   G2M  1 48  0 13
##   S    3 13  4  1
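The classification rule described above can be written out directly from a pair of score vectors. The function `classify_phase()` below is a hypothetical helper that follows the rule as stated in the text; the text does not say what happens when both scores are above 0.5 and exactly equal, so this sketch returns `NA` in that case as an explicit assumption.

```r
# Assign a phase from G1 and G2/M scores, following the rule in the text.
classify_phase <- function(g1, g2m) {
    phase <- rep(NA_character_, length(g1))
    phase[g1 > 0.5 & g1 > g2m] <- "G1"
    phase[g2m > 0.5 & g2m > g1] <- "G2M"
    phase[g1 <= 0.5 & g2m <= 0.5] <- "S"
    phase  # remains NA for exact ties above 0.5 (assumption of this sketch)
}

# Toy scores for four cells (made up for illustration).
g1.score  <- c(0.9, 0.2, 0.3, 0.7)
g2m.score <- c(0.1, 0.8, 0.4, 0.7)

toy.phases <- classify_phase(g1.score, g2m.score)
toy.phases
```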
The same considerations and caveats described for the SingleR-based approach are also applicable here. From a practical perspective, cyclone() takes much longer but does not require an explicit reference as the marker pairs are already computed.
17.5 Removing cell cycle effects
17.5.2 With linear regression and friends
Here, we treat each phase as a separate batch and apply any of the batch correction strategies described in Chapter 28.8. The most common approach is to use a linear model to simply regress out any effect associated with the assigned phases, as shown below in Figure 17.5 via regressBatches(). Similarly, any functions that support blocking can use the phase assignments as a blocking factor, e.g., block= in modelGeneVarWithSpikes().
library(batchelor)

# 'assignments' here refers to the cyclone() output, which carries
# the 'phases' and 'scores' fields used in this section.
dec.nocycle <- modelGeneVarWithSpikes(sce.416b, "ERCC", block=assignments$phases)
reg.nocycle <- regressBatches(sce.416b, batch=assignments$phases)

set.seed(100011)
reg.nocycle <- runPCA(reg.nocycle, exprs_values="corrected",
    subset_row=getTopHVGs(dec.nocycle, prop=0.1))

# Shape points by induction status.
relabel <- c("onco", "WT")[factor(sce.416b$phenotype)]
scaled <- scale_shape_manual(values=c(onco=4, WT=16))

gridExtra::grid.arrange(
    plotPCA(sce.416b, colour_by=I(assignments$phases), shape_by=I(relabel)) +
        ggtitle("Before") + scaled,
    plotPCA(reg.nocycle, colour_by=I(assignments$phases), shape_by=I(relabel)) +
        ggtitle("After") + scaled,
    ncol=2
)
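What "regressing out" the phase means can be seen on a toy matrix using only base R. The sketch below fits a one-way linear model with the phase as the only factor to every gene and keeps the residuals; the data and object names (`expr`, `phase`, `resid.expr`) are made up for illustration, and this shows the general idea rather than the internals of any particular function.

```r
# Toy log-expression matrix (2 genes x 6 cells) and assigned phases.
expr <- rbind(
    gene1 = c(1, 2, 5, 6, 3, 4),
    gene2 = c(4, 4, 4, 5, 5, 5)
)
phase <- factor(c("G1", "G1", "G2M", "G2M", "S", "S"))

# One-way layout: an intercept plus one coefficient per non-baseline phase.
design <- model.matrix(~phase)

# Fit the same linear model to every gene and keep the residuals.
# lm.fit() expects observations (cells) in rows, hence the transposition.
fit <- lm.fit(design, t(expr))
resid.expr <- t(fit$residuals)
resid.expr

# After regression, the residuals sum to zero within every phase.
phase.sums <- rowsum(t(resid.expr), phase)
phase.sums
```

For gene1, the large differences between phases disappear and only the within-phase variation remains, which is the intended effect; any biology that happens to be confounded with phase is removed in the same way.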
Alternatively, one could regress on the classification scores to account for any ambiguity in assignment. An example using cyclone() scores is shown below in Figure 17.6 but the same procedure can be used with any classification step that yields some confidence per label - for example, the correlation-based scores from SingleR().
design <- model.matrix(~as.matrix(assignments$scores))
dec.nocycle2 <- modelGeneVarWithSpikes(sce.416b, "ERCC", design=design)
reg.nocycle2 <- regressBatches(sce.416b, design=design)

set.seed(100011)
reg.nocycle2 <- runPCA(reg.nocycle2, exprs_values="corrected",
    subset_row=getTopHVGs(dec.nocycle2, prop=0.1))

plotPCA(reg.nocycle2, colour_by=I(assignments$phases),
    point_size=3, shape_by=I(relabel)) + scaled
The main assumption of regression is that the cell cycle is consistent across different aspects of cellular heterogeneity (Section 13.4). In particular, we assume that each cell type contains the same distribution of cells across phases as well as a constant magnitude of the cell cycle effect on expression. Violations will lead to incomplete removal or, at worst, overcorrection that introduces spurious signal - even in the absence of any cell cycle effect! For example, if two subpopulations differ in their cell cycle phase distribution, regression will always apply a non-zero adjustment to all DE genes between those subpopulations.
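The overcorrection described in this paragraph can be reproduced with a small deterministic example in base R. The setup is entirely hypothetical: two cell types with different phase compositions and one gene that differs between cell types but has no cell cycle effect at all.

```r
# Two cell types with different phase compositions (10 cells each).
celltype <- rep(c("A", "B"), each=10)
phase <- factor(c(
    rep(c("G1", "G2M"), times=c(8, 2)),  # type A: mostly G1
    rep(c("G1", "G2M"), times=c(2, 8))   # type B: mostly G2/M
))

# A gene that is DE between cell types but has NO cell cycle effect.
expr <- ifelse(celltype == "B", 1, 0)

# Regress out the phase and keep the residuals.
corrected <- residuals(lm(expr ~ phase))

# Difference between cell types before and after correction.
diff.before <- mean(expr[celltype == "B"]) - mean(expr[celltype == "A"])
diff.after <- mean(corrected[celltype == "B"]) - mean(corrected[celltype == "A"])
c(before=diff.before, after=diff.after)

# Phase means within type A after correction: a phase difference now
# appears even though none existed in the original data.
within.A <- tapply(corrected[celltype == "A"], phase[celltype == "A"], mean)
within.A
```

The regression shrinks the genuine difference between cell types and, at the same time, creates an apparent phase effect inside each cell type, which is the spurious signal that the text warns about.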
If this type of adjustment is truly necessary, it is safest to apply it separately to the subset of cells in each cluster. This weakens the consistency assumptions as we do not require the same behavior across all cell types in the population. Alternatively, we could use other methods that are more robust to differences in composition (Figure 17.7), though this becomes somewhat complicated if we want to correct for both cell cycle and batch at the same time. Gene-based analyses should use the uncorrected data with blocking where possible (Section 13.8), which provides a sanity check that protects against distortions introduced by the adjustment.
17.5.4 Using contrastive PCA
Alternatively, we might consider a more sophisticated approach called contrastive PCA (Abid et al. 2018). This aims to identify patterns that are enriched in our test dataset - in this case, the 416B data - compared to a control dataset in which cell cycle is the dominant factor of variation. We demonstrate below using the scPCA package (Boileau, Hejazi, and Dudoit 2020) where we use the subset of wild-type 416B cells as our control, based on the expectation that an untreated cell line in culture has little else to do but divide. This yields low-dimensional coordinates in which the cell cycle effect within the oncogene-induced and wild-type groups is reduced without removing the difference between groups (Figure 17.10).
top.hvgs <- getTopHVGs(dec.416b, prop=0.1)
wild <- sce.416b$phenotype=="wild type phenotype"

set.seed(100)
library(scPCA)
con.out <- scPCA(
    target=t(logcounts(sce.416b)[top.hvgs,]),
    background=t(logcounts(sce.416b)[top.hvgs,wild]),
    penalties=0, n_eigen=10, contrasts=100)

# Visualizing the results in a t-SNE.
sce.con <- sce.416b
reducedDim(sce.con, "cPCA") <- con.out$x
sce.con <- runTSNE(sce.con, dimred="cPCA")

# Making the labels easier to read.
relabel <- c("onco", "WT")[factor(sce.416b$phenotype)]
scaled <- scale_color_manual(values=c(onco="red", WT="black"))

gridExtra::grid.arrange(
    plotTSNE(sce.416b, colour_by=I(assignments$phases)) + ggtitle("Before (416b)"),
    plotTSNE(sce.416b, colour_by=I(relabel)) + scaled,
    plotTSNE(sce.con, colour_by=I(assignments$phases)) + ggtitle("After (416b)"),
    plotTSNE(sce.con, colour_by=I(relabel)) + scaled,
    ncol=2
)
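The core idea of contrastive PCA can be sketched in a few lines of base R, assuming the basic formulation of Abid et al. (2018) in which one takes the top eigenvectors of the target covariance minus a multiple of the background covariance. The toy data below are deterministic and hypothetical: a large "cycle" axis shared by both datasets and a smaller "group" effect present only in the target. The contrast strength `alpha` is picked by hand here, and the sketch omits the sparsity penalties and automated parameter selection offered by scPCA.

```r
n <- 100
shared <- seq(-5, 5, length.out=n)   # large axis of variation in both datasets
group <- rep(c(-2, 2), times=n/2)    # effect that is present only in the target

target <- cbind(cycle=shared, group=group)
background <- cbind(cycle=shared, group=0.05 * group)

# Ordinary PCA on the target is dominated by the shared axis.
pca.top <- eigen(cov(target), symmetric=TRUE)$vectors[, 1]

# Contrastive PCA: eigenvectors of cov(target) - alpha * cov(background).
alpha <- 1
contrast.cov <- cov(target) - alpha * cov(background)
cpca.top <- eigen(contrast.cov, symmetric=TRUE)$vectors[, 1]

names(pca.top) <- names(cpca.top) <- colnames(target)

# Absolute loadings (the sign of an eigenvector is arbitrary).
round(abs(rbind(PCA=pca.top, cPCA=cpca.top)), 3)
```

The first ordinary principal component points along the shared axis, whereas the first contrastive component points along the target-specific effect, since the shared variance is cancelled by the background.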
The strength of this approach lies in its ability to accurately remove the cell cycle effect based on its magnitude in the control dataset. This avoids loss of heterogeneity associated with other processes that happen to be correlated with the cell cycle. The requirements for the control dataset are also quite loose - there is no need to know the cell cycle phase of each cell a priori, and indeed, we can manufacture a like-for-like control by subsetting our dataset to a homogeneous cluster in which the only detectable factor of variation is the cell cycle. (See Chapter 41 for another demonstration of cPCA to remove the cell cycle effect.) In fact, any consistent but uninteresting variation can be eliminated in this manner as long as it is captured by the control.
The downside is that the magnitude of variation in the control dataset must accurately reflect that in the test dataset, requiring more care in choosing the former. As a result, the procedure is more sensitive to quantitative differences between datasets compared to cyclone() during cell cycle phase assignment. This makes it difficult to use control datasets from different scRNA-seq technologies or biological systems, as a mismatch in the covariance structure may lead to insufficient or excessive correction. At worst, any interesting variation that is inadvertently contained in the control will also be removed.
Abid, A., M. J. Zhang, V. K. Bagaria, and J. Zou. 2018. “Exploring patterns enriched in a dataset with contrastive principal component analysis.” Nat Commun 9 (1): 2134.
Bertoli, C., J. M. Skotheim, and R. A. de Bruin. 2013. “Control of cell cycle transcription during G1 and S phases.” Nat. Rev. Mol. Cell Biol. 14 (8): 518–28.
Boileau, P., N. S. Hejazi, and S. Dudoit. 2020. “Exploring high-dimensional biological data with sparse contrastive principal component analysis.” Bioinformatics 36 (11): 3422–30.
Buettner, F., K. N. Natarajan, F. P. Casale, V. Proserpio, A. Scialdone, F. J. Theis, S. A. Teichmann, J. C. Marioni, and O. Stegle. 2015. “Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.” Nat. Biotechnol. 33 (2): 155–60.
Conboy, C. M., C. Spyrou, N. P. Thorne, E. J. Wade, N. L. Barbosa-Morais, M. D. Wilson, A. Bhattacharjee, et al. 2007. “Cell cycle genes are the evolutionarily conserved targets of the E2F4 transcription factor.” PLoS ONE 2 (10): e1061.
Grun, D., M. J. Muraro, J. C. Boisset, K. Wiebrands, A. Lyubimova, G. Dharmadhikari, M. van den Born, et al. 2016. “De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data.” Cell Stem Cell 19 (2): 266–77.
Kozar, K., M. A. Ciemerych, V. I. Rebel, H. Shigematsu, A. Zagozdzon, E. Sicinska, Y. Geng, et al. 2004. “Mouse development and cell proliferation in the absence of D-cyclins.” Cell 118 (4): 477–91.
Leng, N., L. F. Chu, C. Barry, Y. Li, J. Choi, X. Li, P. Jiang, R. M. Stewart, J. A. Thomson, and C. Kendziorski. 2015. “Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments.” Nat. Methods 12 (10): 947–50.
Morgan, D. O. 2007. The Cell Cycle: Principles of Control. New Science Press.
Richard, A. C., A. T. L. Lun, W. W. Y. Lau, B. Gottgens, J. C. Marioni, and G. M. Griffiths. 2018. “T cell cytolytic capacity is independent of initial stimulation strength.” Nat. Immunol. 19 (8): 849–58.
Roccio, M., D. Schmitter, M. Knobloch, Y. Okawa, D. Sage, and M. P. Lutolf. 2013. “Predicting stem cell fate changes by differential cell cycle progression patterns.” Development 140 (2): 459–70.
Scialdone, A., K. N. Natarajan, L. R. Saraiva, V. Proserpio, S. A. Teichmann, O. Stegle, J. C. Marioni, and F. Buettner. 2015. “Computational assignment of cell-cycle stage from single-cell transcriptome data.” Methods 85 (September): 54–61.
Soufi, A., and S. Dalton. 2016. “Cycling through developmental decisions: how cell cycle dynamics control pluripotency, differentiation and reprogramming.” Development 143 (23): 4301–11.
Steinman, R. A. 2002. “Cell cycle regulators and hematopoiesis.” Oncogene 21 (21): 3403–13.