iSEEde 1.5.0
Differential expression results are generally reported as tables of statistics, including (log2) fold-change, p-value, average expression, etc.
Those statistics being reported for individual features (e.g., genes), the
rowData()
component of SummarizedExperiment()
objects provides a natural
home for that information.
Specifically, iSEEde stores and retrieves differential expression results in rowData(se)[["iSEEde"]]
.
However, the rowData()
function should not be used to access or edit that information.
Instead, the functions embedContrastResults()
and contrastResults()
, should be used to store and retrieve contrast results, respectively, as those functions ensure that feature names are kept synchronised with the enclosing SummarizedExperiment
object.
Moreover, the function contrastResultsNames()
can be used to retrieve the names of contrast available in a given SummarizedExperiment
object.
The first challenge arises when differential expression statistics are computed only for a subset of features.
In that case, embedContrastResults()
populates missing information with NA
values.
The second challenge arises from the different names of columns used by individual differential expression methods to store differential expression common statistics. To address this, iSEEde provides S4 classes creating a common interface to supported differential expression methods. Each class of differential expression results implements the following methods:
pValue()
returns the vector of raw p-values.log2FoldChange()
returns the vector of log2-fold-change values.averageLog2()
returns the vector of average log2-expression values.In this example, we use the ?airway
data set.
We briefly adjust the reference level of the treatment factor to the untreated condition.
library("airway")
data("airway")
airway$dex <- relevel(airway$dex, "untrt")
We first use edgeR::filterByExpr()
to remove genes whose counts are too low to
support a rigorous differential expression analysis. Then we run a standard
Limma-Voom analysis using edgeR::voomLmFit()
, limma::makeContrasts()
, and
limma::eBayes()
. (Alternatively, we could have used limma::treat()
instead
of limma::eBayes()
.)
The linear model includes the dex
and cell
covariates, indicating the
treatment conditions and cell types, respectively. Here, we are interested in
differences between treatments, adjusted by cell type, and define this
comparison as the dextrt - dexuntrt
contrast.
The final differential expression results are fetched using limma::topTable()
.
library("edgeR")
design <- model.matrix(~ 0 + dex + cell, data = colData(airway))
keep <- filterByExpr(airway, design)
fit <- voomLmFit(airway[keep, ], design, plot = FALSE)
contr <- makeContrasts("dextrt - dexuntrt", levels = design)
fit <- contrasts.fit(fit, contr)
fit <- eBayes(fit)
res_limma <- topTable(fit, sort.by = "P", n = Inf)
head(res_limma)
#> gene_id gene_name entrezid gene_biotype gene_seq_start gene_seq_end
#> ENSG00000165995 ENSG00000165995 CACNB2 NA protein_coding 18429606 18830798
#> ENSG00000120129 ENSG00000120129 DUSP1 NA protein_coding 172195093 172198198
#> ENSG00000146250 ENSG00000146250 PRSS35 NA protein_coding 84222194 84235423
#> ENSG00000189221 ENSG00000189221 MAOA NA protein_coding 43515467 43606068
#> ENSG00000157214 ENSG00000157214 STEAP2 NA protein_coding 89796904 89867451
#> ENSG00000152583 ENSG00000152583 SPARCL1 NA protein_coding 88394487 88452213
#> seq_name seq_strand seq_coord_system symbol logFC AveExpr t
#> ENSG00000165995 10 1 NA CACNB2 3.205606 3.682244 37.68303
#> ENSG00000120129 5 -1 NA DUSP1 2.864778 6.644455 28.50569
#> ENSG00000146250 6 1 NA PRSS35 -2.828184 3.224885 -28.10830
#> ENSG00000189221 X 1 NA MAOA 3.256085 5.950559 27.66135
#> ENSG00000157214 7 1 NA STEAP2 1.894559 6.790009 27.40834
#> ENSG00000152583 4 -1 NA SPARCL1 4.489009 4.166904 27.34820
#> P.Value adj.P.Val B
#> ENSG00000165995 1.115938e-10 1.881472e-06 14.45518
#> ENSG00000120129 1.148539e-09 4.309046e-06 13.01074
#> ENSG00000146250 1.291089e-09 4.309046e-06 12.44376
#> ENSG00000189221 1.475547e-09 4.309046e-06 12.75540
#> ENSG00000157214 1.592916e-09 4.309046e-06 12.71210
#> ENSG00000152583 1.622327e-09 4.309046e-06 12.33402
Then, we embed this set of differential expression results in the airway
object using the embedContrastResults()
method.
Because the output of limma::topTable()
is a standard data.frame
, the
class=
argument must be used to manually identify the method that produced
those results.
Supported classes are listed in the object iSEEde::embedContrastResultsMethods
, i.e.
library(iSEEde)
embedContrastResultsMethods
#> limma
#> "iSEELimmaResults"
This manual method is preferable to any automated heuristic (e.g. using the
column names of the data.frame
for identifying it as a limma::topTable()
output).
The results embedded in the airway
object can be accessed using the contrastResults()
function.
airway <- embedContrastResults(res_limma, airway, name = "Limma-Voom", class = "limma")
contrastResults(airway)
#> DataFrame with 63677 rows and 1 column
#> Limma-Voom
#> <iSEELimmaResults>
#> ENSG00000000003 <iSEELimmaResults>
#> ENSG00000000005 <iSEELimmaResults>
#> ENSG00000000419 <iSEELimmaResults>
#> ENSG00000000457 <iSEELimmaResults>
#> ENSG00000000460 <iSEELimmaResults>
#> ... ...
#> ENSG00000273489 <iSEELimmaResults>
#> ENSG00000273490 <iSEELimmaResults>
#> ENSG00000273491 <iSEELimmaResults>
#> ENSG00000273492 <iSEELimmaResults>
#> ENSG00000273493 <iSEELimmaResults>
contrastResults(airway, "Limma-Voom")
#> iSEELimmaResults with 63677 rows and 16 columns
#> gene_id gene_name entrezid gene_biotype gene_seq_start gene_seq_end
#> <character> <character> <integer> <character> <integer> <integer>
#> ENSG00000000003 ENSG00000000003 TSPAN6 NA protein_coding 99883667 99894988
#> ENSG00000000005 NA NA NA NA NA NA
#> ENSG00000000419 ENSG00000000419 DPM1 NA protein_coding 49551404 49575092
#> ENSG00000000457 ENSG00000000457 SCYL3 NA protein_coding 169818772 169863408
#> ENSG00000000460 ENSG00000000460 C1orf112 NA protein_coding 169631245 169823221
#> ... ... ... ... ... ... ...
#> ENSG00000273489 NA NA NA NA NA NA
#> ENSG00000273490 NA NA NA NA NA NA
#> ENSG00000273491 NA NA NA NA NA NA
#> ENSG00000273492 NA NA NA NA NA NA
#> ENSG00000273493 NA NA NA NA NA NA
#> seq_name seq_strand seq_coord_system symbol logFC AveExpr t
#> <character> <integer> <integer> <character> <numeric> <numeric> <numeric>
#> ENSG00000000003 X -1 NA TSPAN6 -0.464216 5.02559 -6.589673
#> ENSG00000000005 NA NA NA NA NA NA NA
#> ENSG00000000419 20 -1 NA DPM1 0.125077 4.60191 1.632802
#> ENSG00000000457 1 -1 NA SCYL3 -0.042107 3.47269 -0.438929
#> ENSG00000000460 1 1 NA C1orf112 -0.228517 1.40857 -0.977961
#> ... ... ... ... ... ... ... ...
#> ENSG00000273489 NA NA NA NA NA NA NA
#> ENSG00000273490 NA NA NA NA NA NA NA
#> ENSG00000273491 NA NA NA NA NA NA NA
#> ENSG00000273492 NA NA NA NA NA NA NA
#> ENSG00000273493 NA NA NA NA NA NA NA
#> P.Value adj.P.Val B
#> <numeric> <numeric> <numeric>
#> ENSG00000000003 0.000136603 0.00171857 0.899306
#> ENSG00000000005 NA NA NA
#> ENSG00000000419 0.139297605 0.25279527 -6.274792
#> ENSG00000000457 0.671768770 0.77787936 -7.230811
#> ENSG00000000460 0.355374953 0.49644090 -6.208897
#> ... ... ... ...
#> ENSG00000273489 NA NA NA
#> ENSG00000273490 NA NA NA
#> ENSG00000273491 NA NA NA
#> ENSG00000273492 NA NA NA
#> ENSG00000273493 NA NA NA
First, we use DESeqDataSet()
to construct a DESeqDataSet
object
for the analysis. Then we run a standard DESeq2
analysis using DESeq()
.
The differential expression results are fetched using results()
.
library("DESeq2")
dds <- DESeqDataSet(airway, ~ 0 + dex + cell)
dds <- DESeq(dds)
res_deseq2 <- results(dds, contrast = list("dextrt", "dexuntrt"))
head(res_deseq2)
#> log2 fold change (MLE): dextrt vs dexuntrt
#> Wald test p-value: dextrt vs dexuntrt
#> DataFrame with 6 rows and 6 columns
#> baseMean log2FoldChange lfcSE stat pvalue padj
#> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
#> ENSG00000000003 708.602170 -0.3812539 0.100654 -3.787752 0.000152016 0.00128292
#> ENSG00000000005 0.000000 NA NA NA NA NA
#> ENSG00000000419 520.297901 0.2068127 0.112219 1.842944 0.065337213 0.19646961
#> ENSG00000000457 237.163037 0.0379205 0.143445 0.264356 0.791505314 0.91141884
#> ENSG00000000460 57.932633 -0.0881679 0.287142 -0.307054 0.758802543 0.89500551
#> ENSG00000000938 0.318098 -1.3782416 3.499906 -0.393794 0.693733216 NA
Then, we embed this set of differential expression results in the airway
object using the embedContrastResults()
method.
In this instance, the DESeq2 results are stored in a
recognisable ?DESeqResults
object, which can be given as is directly to the
embedContrastResults()
method.
The function will automatically pass the results object to the
iSEEDESeq2Results()
constructor, to identify it as such.
The results embedded in the airway object can be accessed using the contrastResults()
function.
airway <- embedContrastResults(res_deseq2, airway, name = "DESeq2")
contrastResults(airway)
#> DataFrame with 63677 rows and 2 columns
#> Limma-Voom DESeq2
#> <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000003 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000005 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000419 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000457 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000460 <iSEELimmaResults> <iSEEDESeq2Results>
#> ... ... ...
#> ENSG00000273489 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273490 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273491 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273492 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273493 <iSEELimmaResults> <iSEEDESeq2Results>
contrastResults(airway, "DESeq2")
#> iSEEDESeq2Results with 63677 rows and 6 columns
#> baseMean log2FoldChange lfcSE stat pvalue padj
#> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
#> ENSG00000000003 708.6022 -0.3812539 0.100654 -3.787752 0.000152016 0.00128292
#> ENSG00000000005 0.0000 NA NA NA NA NA
#> ENSG00000000419 520.2979 0.2068127 0.112219 1.842944 0.065337213 0.19646961
#> ENSG00000000457 237.1630 0.0379205 0.143445 0.264356 0.791505314 0.91141884
#> ENSG00000000460 57.9326 -0.0881679 0.287142 -0.307054 0.758802543 0.89500551
#> ... ... ... ... ... ... ...
#> ENSG00000273489 0.275899 1.483744 3.51398 0.422240 0.672850 NA
#> ENSG00000273490 0.000000 NA NA NA NA NA
#> ENSG00000273491 0.000000 NA NA NA NA NA
#> ENSG00000273492 0.105978 -0.463688 3.52312 -0.131613 0.895290 NA
#> ENSG00000273493 0.106142 -0.521372 3.53142 -0.147638 0.882628 NA
We run a standard edgeR analysis using glmFit()
and
glmLRT()
.
The differential expression results are fetched using topTags()
.
library("edgeR")
design <- model.matrix(~ 0 + dex + cell, data = colData(airway))
fit <- glmFit(airway, design, dispersion = 0.1)
lrt <- glmLRT(fit, contrast = c(-1, 1, 0, 0, 0))
res_edger <- topTags(lrt, n = Inf)
head(res_edger)
#> Coefficient: -1*dexuntrt 1*dextrt
#> gene_id gene_name entrezid gene_biotype gene_seq_start
#> ENSG00000109906 ENSG00000109906 ZBTB16 NA protein_coding 113930315
#> ENSG00000179593 ENSG00000179593 ALOX15B NA protein_coding 7942335
#> ENSG00000127954 ENSG00000127954 STEAP4 NA protein_coding 87905744
#> ENSG00000152583 ENSG00000152583 SPARCL1 NA protein_coding 88394487
#> ENSG00000250978 ENSG00000250978 RP11-357D18.1 NA processed_transcript 66759637
#> ENSG00000163884 ENSG00000163884 KLF15 NA protein_coding 126061478
#> gene_seq_end seq_name seq_strand seq_coord_system symbol
#> ENSG00000109906 114121398 11 1 NA ZBTB16
#> ENSG00000179593 7952452 17 1 NA ALOX15B
#> ENSG00000127954 87936206 7 -1 NA STEAP4
#> ENSG00000152583 88452213 4 -1 NA SPARCL1
#> ENSG00000250978 66771420 5 -1 NA RP11-357D18.1
#> ENSG00000163884 126076285 3 -1 NA KLF15
#> iSEEde.Limma.Voom.gene_id iSEEde.Limma.Voom.gene_name iSEEde.Limma.Voom.entrezid
#> ENSG00000109906 ENSG00000109906 ZBTB16 NA
#> ENSG00000179593 ENSG00000179593 ALOX15B NA
#> ENSG00000127954 ENSG00000127954 STEAP4 NA
#> ENSG00000152583 ENSG00000152583 SPARCL1 NA
#> ENSG00000250978 ENSG00000250978 RP11-357D18.1 NA
#> ENSG00000163884 ENSG00000163884 KLF15 NA
#> iSEEde.Limma.Voom.gene_biotype iSEEde.Limma.Voom.gene_seq_start
#> ENSG00000109906 protein_coding 113930315
#> ENSG00000179593 protein_coding 7942335
#> ENSG00000127954 protein_coding 87905744
#> ENSG00000152583 protein_coding 88394487
#> ENSG00000250978 processed_transcript 66759637
#> ENSG00000163884 protein_coding 126061478
#> iSEEde.Limma.Voom.gene_seq_end iSEEde.Limma.Voom.seq_name
#> ENSG00000109906 114121398 11
#> ENSG00000179593 7952452 17
#> ENSG00000127954 87936206 7
#> ENSG00000152583 88452213 4
#> ENSG00000250978 66771420 5
#> ENSG00000163884 126076285 3
#> iSEEde.Limma.Voom.seq_strand iSEEde.Limma.Voom.seq_coord_system
#> ENSG00000109906 1 NA
#> ENSG00000179593 1 NA
#> ENSG00000127954 -1 NA
#> ENSG00000152583 -1 NA
#> ENSG00000250978 -1 NA
#> ENSG00000163884 -1 NA
#> iSEEde.Limma.Voom.symbol iSEEde.Limma.Voom.logFC iSEEde.Limma.Voom.AveExpr
#> ENSG00000109906 ZBTB16 7.069003 1.3807218
#> ENSG00000179593 ALOX15B 7.988755 -1.4798754
#> ENSG00000127954 STEAP4 5.184088 1.6193300
#> ENSG00000152583 SPARCL1 4.489009 4.1669039
#> ENSG00000250978 RP11-357D18.1 6.031081 -0.7521822
#> ENSG00000163884 KLF15 4.433096 3.2401058
#> iSEEde.Limma.Voom.t iSEEde.Limma.Voom.P.Value iSEEde.Limma.Voom.adj.P.Val
#> ENSG00000109906 23.94855 4.892822e-09 4.852528e-06
#> ENSG00000179593 21.82342 1.743170e-06 1.076551e-04
#> ENSG00000127954 20.01478 2.162145e-08 1.012605e-05
#> ENSG00000152583 27.34820 1.622327e-09 4.309046e-06
#> ENSG00000250978 16.32481 1.154612e-07 2.187276e-05
#> ENSG00000163884 22.89074 7.117995e-09 5.425717e-06
#> iSEEde.Limma.Voom.B iSEEde.DESeq2.baseMean iSEEde.DESeq2.log2FoldChange
#> ENSG00000109906 8.652823 385.07103 7.352629
#> ENSG00000179593 2.986534 67.24305 9.505983
#> ENSG00000127954 9.354944 286.38412 5.207161
#> ENSG00000152583 12.334016 997.43977 4.574919
#> ENSG00000250978 6.074174 56.31819 6.327386
#> ENSG00000163884 10.885003 561.10717 4.459129
#> iSEEde.DESeq2.lfcSE iSEEde.DESeq2.stat iSEEde.DESeq2.pvalue iSEEde.DESeq2.padj
#> ENSG00000109906 0.5363902 13.707612 9.141717e-43 2.256376e-40
#> ENSG00000179593 1.0545033 9.014654 1.974926e-19 1.252965e-17
#> ENSG00000127954 0.4930839 10.560396 4.547384e-26 5.057701e-24
#> ENSG00000152583 0.1840561 24.856114 2.220777e-136 4.001396e-132
#> ENSG00000250978 0.6777980 9.335209 1.007922e-20 7.206645e-19
#> ENSG00000163884 0.2348638 18.986019 2.225778e-80 2.506504e-77
#> logFC logCPM LR PValue FDR
#> ENSG00000109906 7.183385 4.132638 238.3947 8.805179e-54 5.606874e-49
#> ENSG00000179593 10.015847 1.627629 181.0331 2.883024e-41 9.179116e-37
#> ENSG00000127954 5.087069 3.672567 146.9725 7.957020e-34 1.688931e-29
#> ENSG00000152583 4.498698 5.510213 140.2205 2.382274e-32 3.792402e-28
#> ENSG00000250978 6.128131 1.377260 137.4681 9.526183e-32 1.213198e-27
#> ENSG00000163884 4.367962 4.681216 129.2203 6.069471e-30 6.441428e-26
Then, we embed this set of differential expression results in the airway
object using the embedContrastResults()
method.
In this instance, the edgeR results are stored in a
recognisable ?TopTags
object. As such, the results object can be given as is
directly to the embedContrastResults()
method. The function will automatically pass
the results object to the iSEEedgeRResults()
constructor, to identify it as
such.
The results embedded in the airway object can be accessed using the contrastResults()
function.
airway <- embedContrastResults(res_edger, airway, name = "edgeR")
contrastResults(airway)
#> DataFrame with 63677 rows and 3 columns
#> Limma-Voom DESeq2 edgeR
#> <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000003 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000005 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000419 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000457 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000460 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ... ... ... ...
#> ENSG00000273489 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273490 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273491 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273492 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273493 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
contrastResults(airway, "edgeR")
#> iSEEedgeRResults with 63677 rows and 5 columns
#> logFC logCPM LR PValue FDR
#> <numeric> <numeric> <numeric> <numeric> <numeric>
#> ENSG00000000003 -0.4628153 5.05930 2.018481 0.155394 1
#> ENSG00000000005 0.0000000 -3.45546 0.000000 1.000000 1
#> ENSG00000000419 0.1247724 4.60783 0.146545 0.701860 1
#> ENSG00000000457 -0.0445216 3.48326 0.018241 0.892565 1
#> ENSG00000000460 -0.1618126 1.48518 0.210342 0.646500 1
#> ... ... ... ... ... ...
#> ENSG00000273489 2.48209 -3.28549 3.02143 0.082171 1
#> ENSG00000273490 0.00000 -3.45546 0.00000 1.000000 1
#> ENSG00000273491 0.00000 -3.45546 0.00000 1.000000 1
#> ENSG00000273492 -1.24012 -3.36894 0.91097 0.339857 1
#> ENSG00000273493 -1.75243 -3.36862 1.57193 0.209928 1
In this example, we used the iSEEde functions DETable()
, VolcanoPlot()
, and MAPlot()
to add panels that facilitate the visualisation of differential expression results in an iSEE app.
Specifically, we add one set of panels for each differential expression method used in this vignette (i.e., Limma-Voom, DESeq2, edgeR).
library(iSEE)
app <- iSEE(airway, initial = list(
DETable(ContrastName="Limma-Voom", HiddenColumns = c("AveExpr",
"t", "B"), PanelWidth = 4L),
VolcanoPlot(ContrastName = "Limma-Voom", PanelWidth = 4L),
MAPlot(ContrastName = "Limma-Voom", PanelWidth = 4L),
DETable(ContrastName="DESeq2", HiddenColumns = c("baseMean",
"lfcSE", "stat"), PanelWidth = 4L),
VolcanoPlot(ContrastName = "DESeq2", PanelWidth = 4L),
MAPlot(ContrastName = "DESeq2", PanelWidth = 4L),
DETable(ContrastName="edgeR", HiddenColumns = c("logCPM",
"LR"), PanelWidth = 4L),
VolcanoPlot(ContrastName = "edgeR", PanelWidth = 4L),
MAPlot(ContrastName = "edgeR", PanelWidth = 4L)
))
if (interactive()) {
shiny::runApp(app)
}