1 Introduction

Data included in this package was generated by harmonizing all publicly available bulk RNA-seq samples from human primary colorectal tissue available through NCBI’s Sequence Read Archive and database of Genotypes and Phenotypes, NCI’s Genomic Data Commons, and the NIH-funded BarcUVa-Seq project. FASTQ files were downloaded directly or generated from BAM files using biobambam2. Gene expression estimates were generated from FASTQ files using Salmon and aggregated from quant.sf files using tximport. The original processed data are stored as tab-delimited text on Synapse under Synapse ID syn22237139 and packaged as SummarizedExperiment objects on ExperimentHub to facilitate reproduction and extension of results published in Dampier et al. (PMCID: PMC7386360, PMID: 32764205).

2 Installation

The following example demonstrates how to download the dataset from ExperimentHub.

First, make sure the Bioconductor packages BiocManager, SummarizedExperiment, and ExperimentHub are installed:

if (!requireNamespace("BiocManager", quietly=TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install(c("SummarizedExperiment", "ExperimentHub", "FieldEffectCrc"))

Second, load the SummarizedExperiment and ExperimentHub packages:

library(SummarizedExperiment)

## Loading required package: MatrixGenerics

## Loading required package: matrixStats

## 
## Attaching package: 'MatrixGenerics'

## The following objects are masked from 'package:matrixStats':
## 
##     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
##     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
##     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
##     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
##     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
##     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
##     colWeightedMeans, colWeightedMedians, colWeightedSds,
##     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
##     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
##     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
##     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
##     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
##     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
##     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
##     rowWeightedSds, rowWeightedVars

## Loading required package: GenomicRanges

## Loading required package: stats4

## Loading required package: BiocGenerics

## Loading required package: generics

## 
## Attaching package: 'generics'

## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
##     setequal, union

## 
## Attaching package: 'BiocGenerics'

## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs

## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
##     mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min

## Loading required package: S4Vectors

## 
## Attaching package: 'S4Vectors'

## The following object is masked from 'package:utils':
## 
##     findMatches

## The following objects are masked from 'package:base':
## 
##     I, expand.grid, unname

## Loading required package: IRanges

## Loading required package: GenomeInfoDb

## Loading required package: Biobase

## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.

## 
## Attaching package: 'Biobase'

## The following object is masked from 'package:MatrixGenerics':
## 
##     rowMedians

## The following objects are masked from 'package:matrixStats':
## 
##     anyMissing, rowMedians

library(ExperimentHub)

## Loading required package: AnnotationHub

## Loading required package: BiocFileCache

## Loading required package: dbplyr

## 
## Attaching package: 'AnnotationHub'

## The following object is masked from 'package:Biobase':
## 
##     cache

Third, find the FieldEffectCrc data objects and look at their descriptions:

hub = ExperimentHub()
## simply list the resource names
ns <- listResources(hub, "FieldEffectCrc")
ns

## [1] "cohort A from Dampier et al." "cohort B from Dampier et al."
## [3] "cohort C from Dampier et al."

## query the hub for full metadata
crcData <- query(hub, "FieldEffectCrc")
crcData

## ExperimentHub with 3 records
## # snapshotDate(): 2025-04-11
## # $dataprovider: Synapse
## # $species: Homo sapiens
## # $rdataclass: SummarizedExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH3524"]]' 
## 
##            title                       
##   EH3524 | cohort A from Dampier et al.
##   EH3525 | cohort B from Dampier et al.
##   EH3526 | cohort C from Dampier et al.

## extract metadata
df <- mcols(crcData)
df

## DataFrame with 3 rows and 15 columns
##                         title dataprovider      species taxonomyid      genome
##                   <character>  <character>  <character>  <integer> <character>
## EH3524 cohort A from Dampie..      Synapse Homo sapiens       9606      GRCh38
## EH3525 cohort B from Dampie..      Synapse Homo sapiens       9606      GRCh38
## EH3526 cohort C from Dampie..      Synapse Homo sapiens       9606      GRCh38
##                   description coordinate_1_based             maintainer
##                   <character>          <integer>            <character>
## EH3524 Salmon-generated tra..                  1 Chris Dampier <chd5n..
## EH3525 Salmon-generated tra..                  1 Chris Dampier <chd5n..
## EH3526 Salmon-generated tra..                  1 Chris Dampier <chd5n..
##        rdatadateadded  preparerclass
##           <character>    <character>
## EH3524     2020-07-10 FieldEffectCrc
## EH3525     2020-07-10 FieldEffectCrc
## EH3526     2020-07-10 FieldEffectCrc
##                                                  tags           rdataclass
##                                                <AsIs>          <character>
## EH3524 ExperimentData,ReproducibleResearch,Tissue,... SummarizedExperiment
## EH3525 ExperimentData,ReproducibleResearch,Tissue,... SummarizedExperiment
## EH3526 ExperimentData,ReproducibleResearch,Tissue,... SummarizedExperiment
##                     rdatapath              sourceurl  sourcetype
##                   <character>            <character> <character>
## EH3524 FieldEffectCrc/Field.. https://www.synapse...      tar.gz
## EH3525 FieldEffectCrc/Field.. https://www.synapse...      tar.gz
## EH3526 FieldEffectCrc/Field.. https://www.synapse...      tar.gz

## see where the cache is stored
hc <- hubCache(crcData)
hc

## [1] "/home/biocbuild/.cache/R/ExperimentHub"

Fourth, explore the metadata of a particular file. Let’s look at the data for cohort A:

md <- hub["EH3524"]
md

## ExperimentHub with 1 record
## # snapshotDate(): 2025-04-11
## # names(): EH3524
## # package(): FieldEffectCrc
## # $dataprovider: Synapse
## # $species: Homo sapiens
## # $rdataclass: SummarizedExperiment
## # $rdatadateadded: 2020-07-10
## # $title: cohort A from Dampier et al.
## # $description: Salmon-generated transcript-level abundance estimates summar...
## # $taxonomyid: 9606
## # $genome: GRCh38
## # $sourcetype: tar.gz
## # $sourceurl: https://www.synapse.org/#!Synapse:syn22237139/files/
## # $sourcesize: NA
## # $tags: c("ExperimentData", "ReproducibleResearch", "Tissue",
## #   "Homo_sapiens_Data", "ColonCancerData", "RNASeqData",
## #   "ExpressionData", "ExperimentHub") 
## # retrieve record with 'object[["EH3524"]]'

Fifth, download the data object itself as a SummarizedExperiment:

## using resource ID
data1 <- hub[["EH3524"]]

## see ?FieldEffectCrc and browseVignettes('FieldEffectCrc') for documentation

## loading from cache

data1

## class: SummarizedExperiment 
## dim: 37361 834 
## metadata(4): cohort build assay feature
## assays(3): abundance counts length
## rownames(37361): ENSG00000000003 ENSG00000000005 ... ENSG00000283698
##   ENSG00000283700
## rowData names(0):
## colnames(834): AMC_10 AMC_12 ... VM1421 VM1422
## colData names(27): dirName projId ... percMap data

## using loadResources()
data2 <- loadResources(hub, "FieldEffectCrc", "cohort A")

## see ?FieldEffectCrc and browseVignettes('FieldEffectCrc') for documentation
## loading from cache

data2

## [[1]]
## class: SummarizedExperiment 
## dim: 37361 834 
## metadata(4): cohort build assay feature
## assays(3): abundance counts length
## rownames(37361): ENSG00000000003 ENSG00000000005 ... ENSG00000283698
##   ENSG00000283700
## rowData names(0):
## colnames(834): AMC_10 AMC_12 ... VM1421 VM1422
## colData names(27): dirName projId ... percMap data

Since the data is a SummarizedExperiment, the different phenotypes can be accessed using colData(). The phenotype of interest in the field effect analysis is referred to as sampType:

summary(colData(data1)$sampType)

## CRC HLT NAT 
## 311 462  61

3 Quick Start

To immediately access the count data for the 834 samples in cohort A, load the SummarizedExperiment object for cohort A as demonstrated in the Installation section immediately above. Then use the assays() accessor function to extract the counts assay, which is the second assay in the SummarizedExperiment:

se <- data1
counts1 <- assays(se)[["counts"]]

Note that assay(), the singular as opposed to plural, will by default extract the first assay element from a SummarizedExperiment but will extract the ’i’th assay element with the assay(se, i) syntax:

counts2 <- assay(se, 2)

The two alternatives are equivalent:

all(counts1==counts2)

## [1] TRUE

4 Case Study

We are interested in conducting a differential expression analysis using DESeq2 to identify genes over- and under-expressed in different colorectal tissue phenotypes. Due to substantial technical artifacts between single-end and paired-end library formats observed in the full FieldEffectCrc data set, samples are grouped according to library format, with paired-end samples in cohorts A and B and single-end samples in cohort C. Cohort A is the primary analysis cohort and the one we want to use to discover differentially expressed genes. To begin, load cohort A as demonstrated in the Installation section immediately above if not already loaded.

In the following steps, we will use DESeq2 to perform differential expression analysis on expression data stored in the SummarizedExperiment object for cohort A. We will adjust for latent batch effects with surrogate variable analysis as implemented in the sva package. This analysis would typically consume excess memory and time, so for the purpose of demonstration, we perform the analysis on a small subset of the full data.

4.1 Prepare DESeqDataSet

In order to take advantage of the feature lengths preserved by tximport, we will recreate a tximport object from the SummarizedExperiment object downloaded from ExperimentHub and use the DESeqDataSetFromTximport() function to create the DESeqDataSet object. Alternatively, the SummarizedExperiment object itself could be used to create the DESeqDataSet object using the DESeqDataSet() function, but the order of the assays would need to be re-arranged and the counts would need to be rounded, as the DESeqDataSet() function expects counts with integer values to be the first assay in a SummarizedExperiment object, and abundance with floating-point values is the first assay in the FieldEffectCrc objects to facilitate conversion to tximport objects. Such a re-ordering is facilitated by the FieldEffectCrc::reorder_assays() function. Yet another option is to extract the counts assay as a matrix and use the DESeqDataSetFromMatrix() function to create the DESeqDataSet object. Since there are probably some benefits to feature length normalization as implemented in DESeq2 when such information is available, we demonstrate creation of the tximport object.

First, make the tximport object, which is just an object of class list from base R:

## manual construction
txi <- as.list(assays(se))
txi$countsFromAbundance <- "no"
## convenience function construction
txi <- make_txi(se)

Next, we could assign the clinical annotations from colData(se) to an S4Vectors::DataFrame or a base::data.frame object, since that is what DESeqDataSetFromTximport() expects for the colData argument. However, the colData slot of a SummarizedExperiment object already stores the clinical annotations as an S4Vectors::DataFrame, so we are ready to create the DESeqDataSet object:

if (!requireNamespace("DESeq2", quietly=TRUE)) {
    BiocManager::install("DESeq2")
}
library(DESeq2)
dds <- DESeqDataSetFromTximport(txi, colData(se), ~ sampType)

## using counts and average transcript lengths from tximport

Note that we are interested in expression differences between phenotypes, which are coded in the sampType field, so we include sampType in our preliminary design.

4.2 Filter Genes

Normally, we would pre-filter genes to select genes whose quantities are likely to be estimated accurately. We could use something like the following code, which selects for genes with a count greater than 10 RNA molecules in at least a third of all samples:

countLimit <- 10
sampleLimit <- (1/3) * ncol(dds)
keep <- rowSums( counts(dds) > countLimit ) >= sampleLimit
dds <- dds[keep, ]

However, to expedite subsequent computational steps in this demonstration, we will select a random subset of genes and samples for analysis and then filter out any gene with a count less than 10 across all samples:

r <- sample(seq_len(nrow(dds)), 6000)
c <- sample(seq_len(ncol(dds)), 250)
dds <- dds[r, c]
r <- rowSums(counts(dds)) >= 10
dds <- dds[r, ]

4.3 Estimate Size Factors and Dispersions, Fit Negative Binomial Models, and Perform Wald Test

Amazingly, DESeq2 estimates size factors (i.e. normalization factors based on library size and feature length) and dispersions for every gene, fits negative binomial models for every gene, and performs a Wald test for every gene in a single function:

dds <- DESeq(dds)

## estimating size factors

## using 'avgTxLength' from assays(dds), correcting for library size

## estimating dispersions

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

## -- replacing outliers and refitting for 172 genes
## -- DESeq argument 'minReplicatesForReplace' = 7 
## -- original counts are preserved in counts(dds)

## estimating dispersions

## fitting model and testing

Pretty incredible. Unfortunately, we have to do some more work before we can fit useful models. The size factor estimates are the key result of the DESeq() function at this stage.

4.4 Estimate Latent Factors

If we trusted our high-throughput assays to be perfect representations of the underlying biology they were designed to measure, we would be ready to test for differential expression. However, our dataset is composed of samples from several studies collected and processed in various ways over several years, so we have reason to suspect there are latent factors driving non-biological variation in the expression data. Fortunately, we can use sva to estimate the latent factors and include them in our DESeq2 design.

For a full tutorial on how to use the sva package, see the package vignette. For this demonstration, we will provide only cursory motivations for the code.

4.4.1 Prepare Inputs

The sva functions will take the normalized count data as well as model matrices as inputs. The full model matrix should include the variables of interest (e.g. sampType) as well as adjustment variables (e.g. sex, age), which we do not use here. The null model matrix should include adjustment variables only (i.e. no variables of interest). Without any adjustment variables, the null model includes an intercept term alone.

dat <- counts(dds, normalized=TRUE)  ## extract the normalized counts
mod <- model.matrix(~sampType, data=colData(dds))  ## set the full model
mod0 <- model.matrix(~1, data=colData(dds))  ## set the null model

4.4.2 Identify Number of Factors to Estimate

The sva package offers two approaches for estimating the number of surrogate variables that should be included in a differential expression model, the Buja Eyuboglu and Leek approaches. Buja Eyuboglu is the default approach.

if (!requireNamespace("sva", quietly=TRUE)) {
    BiocManager::install("sva")
}
library(sva)

## Loading required package: mgcv

## Loading required package: nlme

## 
## Attaching package: 'nlme'

## The following object is masked from 'package:IRanges':
## 
##     collapse

## This is mgcv 1.9-3. For overview type 'help("mgcv-package")'.

## Loading required package: genefilter

## 
## Attaching package: 'genefilter'

## The following objects are masked from 'package:MatrixGenerics':
## 
##     rowSds, rowVars

## The following objects are masked from 'package:matrixStats':
## 
##     rowSds, rowVars

## Loading required package: BiocParallel

nsv <- num.sv(dat, mod, method=c("be"), B=20, seed=1)  ## Buja Eyuboglu method

4.4.3 Estimate Factors

The crucial function for estimating surrogate variables in RNA-seq data is svaseq():

svs <- svaseq(dat, mod, mod0, n.sv=nsv)

## Number of significant surrogate variables is:  2 
## Iteration (out of 5 ):1  2  3  4  5

The output is a list of 4 elements, the first of which is the latent factor estimate with a vector for each factor. The name of this element is sv.

4.4.4 Append Latent Factors to colData

We want to include the latent factors as adjustment variables in our statistical testing. Here, we add the surrogate variables to colData and update the DESeqDataSet design formula:

for (i in seq_len(svs$n.sv)) {
    newvar <- paste0("sv", i)
    colData(dds)[ , newvar] <- svs$sv[, i]
}
nvidx <- (ncol(colData(dds)) - i + 1):ncol(colData(dds))
newvars <- colnames(colData(dds))[nvidx]
d <- formula(
    paste0("~", paste(paste(newvars, collapse="+"), "sampType", sep="+"))
)
design(dds) <- d

4.5 Re-Fit Negative Binomial Models and Perform Wald Tests with Surrogate Variables

Now that we have the surrogate variables in the DESeqDataSet, we can perform the differential expression analysis we set out to do:

dds <- DESeq(dds)

## using pre-existing normalization factors

## estimating dispersions

## found already estimated dispersions, replacing these

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

4.6 Extract Results

We want to extract results from each of the following 3 comparisons:

HLT vs NAT
NAT vs CRC
HLT vs CRC

With three categories under consideration, the DESeq() function call tests the first level against the other two, but the second and third levels are not tested against each other. (Note that the order of the levels is simply alphabetical if they are not explicitly set.) Furthermore, the results reported with the results() function are for the comparison of the first level against the last by default. Therefore, in order to extract all the results of interest, we set contrasts as follows:

cons <- list()
m <- combn(levels(colData(dds)$sampType), 2)
for (i in seq_len(ncol(m))) {
    cons[[i]] <- c("sampType", rev(m[, i]))
    names(cons) <- c(
        names(cons)[seq_len(length(cons) - 1)], paste(rev(m[, i]), collapse="v")
    )
}
res <- list()
for (i in seq_len(length(cons))) {
    res[[i]] <- results(dds, contrast=cons[[i]], alpha=0.05)  ## default alpha is 0.1
}
names(res) <- names(cons)

4.7 Display Results

Let’s take a look at the results:

lapply(res, head)

## $HLTvCRC
## log2 fold change (MLE): sampType HLT vs CRC 
## Wald test p-value: sampType HLT vs CRC 
## DataFrame with 6 rows and 6 columns
##                  baseMean log2FoldChange     lfcSE       stat      pvalue
##                 <numeric>      <numeric> <numeric>  <numeric>   <numeric>
## ENSG00000165152 1117.7216     0.18702804 0.1064053  1.7576945 7.87995e-02
## ENSG00000138760 5911.8964     0.00173643 0.0672054  0.0258377 9.79387e-01
## ENSG00000013375 1728.1676     0.38791418 0.0617351  6.2835243 3.30982e-10
## ENSG00000131591  525.0567     0.19802510 0.0668200  2.9635610 3.04102e-03
## ENSG00000122390 1936.8930    -0.02400930 0.0406404 -0.5907737 5.54672e-01
## ENSG00000182774   70.6546    -1.47707753 0.1678451 -8.8002406 1.36523e-18
##                        padj
##                   <numeric>
## ENSG00000165152 8.88883e-02
## ENSG00000138760 9.81191e-01
## ENSG00000013375 5.43783e-10
## ENSG00000131591 3.76053e-03
## ENSG00000122390 5.76482e-01
## ENSG00000182774 2.84226e-18
## 
## $NATvCRC
## log2 fold change (MLE): sampType NAT vs CRC 
## Wald test p-value: sampType NAT vs CRC 
## DataFrame with 6 rows and 6 columns
##                  baseMean log2FoldChange     lfcSE      stat    pvalue
##                 <numeric>      <numeric> <numeric> <numeric> <numeric>
## ENSG00000165152 1117.7216     -0.2287110 0.2114159 -1.081806  0.279339
## ENSG00000138760 5911.8964      0.0798261 0.1335377  0.597779  0.549987
## ENSG00000013375 1728.1676     -0.1824956 0.1226409 -1.488049  0.136738
## ENSG00000131591  525.0567     -0.0767989 0.1324196 -0.579966  0.561937
## ENSG00000122390 1936.8930     -0.0485426 0.0804872 -0.603109  0.546436
## ENSG00000182774   70.6546     -0.2382187 0.3299348 -0.722018  0.470284
##                      padj
##                 <numeric>
## ENSG00000165152  0.842816
## ENSG00000138760  0.947608
## ENSG00000013375  0.710118
## ENSG00000131591  0.949115
## ENSG00000122390  0.947303
## ENSG00000182774  0.931344
## 
## $NATvHLT
## log2 fold change (MLE): sampType NAT vs HLT 
## Wald test p-value: sampType NAT vs HLT 
## DataFrame with 6 rows and 6 columns
##                  baseMean log2FoldChange     lfcSE      stat      pvalue
##                 <numeric>      <numeric> <numeric> <numeric>   <numeric>
## ENSG00000165152 1117.7216     -0.4157390 0.2074506 -2.004039 4.50659e-02
## ENSG00000138760 5911.8964      0.0780896 0.1310273  0.595980 5.51189e-01
## ENSG00000013375 1728.1676     -0.5704098 0.1203669 -4.738926 2.14854e-06
## ENSG00000131591  525.0567     -0.2748240 0.1300547 -2.113142 3.45886e-02
## ENSG00000122390 1936.8930     -0.0245332 0.0790524 -0.310342 7.56301e-01
## ENSG00000182774   70.6546      1.2388588 0.3248116  3.814084 1.36689e-04
##                        padj
##                   <numeric>
## ENSG00000165152 6.09960e-02
## ENSG00000138760 5.97494e-01
## ENSG00000013375 4.78161e-06
## ENSG00000131591 4.75445e-02
## ENSG00000122390 7.86994e-01
## ENSG00000182774 2.54226e-04

lapply(res, summary)

## 
## out of 6000 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 2500, 42%
## LFC < 0 (down)     : 2697, 45%
## outliers [1]       : 0, 0%
## low counts [2]     : 0, 0%
## (mean count < 7)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## 
## 
## out of 6000 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 12, 0.2%
## LFC < 0 (down)     : 31, 0.52%
## outliers [1]       : 0, 0%
## low counts [2]     : 0, 0%
## (mean count < 7)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## 
## 
## out of 6000 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 2241, 37%
## LFC < 0 (down)     : 2132, 36%
## outliers [1]       : 0, 0%
## low counts [2]     : 0, 0%
## (mean count < 7)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results

## $HLTvCRC
## NULL
## 
## $NATvCRC
## NULL
## 
## $NATvHLT
## NULL

5 sessionInfo()

sessionInfo()

## R version 4.5.0 beta (2025-04-02 r88102)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] sva_3.57.0                  BiocParallel_1.43.0        
##  [3] genefilter_1.91.0           mgcv_1.9-3                 
##  [5] nlme_3.1-168                DESeq2_1.49.0              
##  [7] FieldEffectCrc_1.19.0       ExperimentHub_2.17.0       
##  [9] AnnotationHub_3.17.0        BiocFileCache_2.17.0       
## [11] dbplyr_2.5.0                SummarizedExperiment_1.39.0
## [13] Biobase_2.69.0              GenomicRanges_1.61.0       
## [15] GenomeInfoDb_1.45.0         IRanges_2.43.0             
## [17] S4Vectors_0.47.0            BiocGenerics_0.55.0        
## [19] generics_0.1.3              MatrixGenerics_1.21.0      
## [21] matrixStats_1.5.0           BiocStyle_2.37.0           
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1        dplyr_1.1.4             blob_1.2.4             
##  [4] filelock_1.0.3          Biostrings_2.77.0       fastmap_1.2.0          
##  [7] XML_3.99-0.18           digest_0.6.37           mime_0.13              
## [10] lifecycle_1.0.4         statmod_1.5.0           survival_3.8-3         
## [13] KEGGREST_1.49.0         RSQLite_2.3.9           magrittr_2.0.3         
## [16] compiler_4.5.0          rlang_1.1.6             sass_0.4.10            
## [19] tools_4.5.0             yaml_2.3.10             knitr_1.50             
## [22] S4Arrays_1.9.0          bit_4.6.0               curl_6.2.2             
## [25] DelayedArray_0.35.0     abind_1.4-8             withr_3.0.2            
## [28] purrr_1.0.4             grid_4.5.0              xtable_1.8-4           
## [31] colorspace_2.1-1        edgeR_4.7.0             ggplot2_3.5.2          
## [34] scales_1.3.0            cli_3.6.4               rmarkdown_2.29         
## [37] crayon_1.5.3            httr_1.4.7              RUnit_0.4.33           
## [40] DBI_1.2.3               cachem_1.1.0            splines_4.5.0          
## [43] parallel_4.5.0          AnnotationDbi_1.71.0    BiocManager_1.30.25    
## [46] XVector_0.49.0          vctrs_0.6.5             Matrix_1.7-3           
## [49] jsonlite_2.0.0          bookdown_0.43           bit64_4.6.0-1          
## [52] locfit_1.5-9.12         limma_3.65.0            jquerylib_0.1.4        
## [55] annotate_1.87.0         glue_1.8.0              codetools_0.2-20       
## [58] gtable_0.3.6            BiocVersion_3.22.0      UCSC.utils_1.5.0       
## [61] munsell_0.5.1           tibble_3.2.1            pillar_1.10.2          
## [64] rappdirs_0.3.3          htmltools_0.5.8.1       GenomeInfoDbData_1.2.14
## [67] R6_2.6.1                evaluate_1.0.3          lattice_0.22-7         
## [70] png_0.1-8               memoise_2.0.1           bslib_0.9.0            
## [73] Rcpp_1.0.14             SparseArray_1.9.0       xfun_0.52              
## [76] pkgconfig_2.0.3

Tumor, tumor-adjacent normal, and healthy colorectal transcriptomes as a SummarizedExperiment on Bioconductor’s ExperimentHub

17 April 2025

Package

Contents