1 Implementation

1.1 User-facing storage and access

Differential expression results are generally reported as tables of statistics, including (log2) fold-change, p-value, average expression, etc.

Those statistics being reported for individual features (e.g., genes), the rowData() component of SummarizedExperiment() objects provides a natural home for that information. Specifically, iSEEde stores and retrieves differential expression results in rowData(se)[["iSEEde"]]. However, the rowData() function should not be used to access or edit that information. Instead, the functions embedContrastResults() and contrastResults(), should be used to store and retrieve contrast results, respectively, as those functions ensure that feature names are kept synchronised with the enclosing SummarizedExperiment object.

Moreover, the function contrastResultsNames() can be used to retrieve the names of contrast available in a given SummarizedExperiment object.

1.2 Additional considerations

The first challenge arises when differential expression statistics are computed only for a subset of features. In that case, embedContrastResults() populates missing information with NA values.

The second challenge arises from the different names of columns used by individual differential expression methods to store differential expression common statistics. To address this, iSEEde provides S4 classes creating a common interface to supported differential expression methods. Each class of differential expression results implements the following methods:

  • pValue() returns the vector of raw p-values.
  • log2FoldChange() returns the vector of log2-fold-change values.
  • averageLog2() returns the vector of average log2-expression values.

2 Example data

In this example, we use the ?airway data set.

We briefly adjust the reference level of the treatment factor to the untreated condition.

library("airway")
data("airway")
airway$dex <- relevel(airway$dex, "untrt")

3 Supported methods

3.1 Limma

We first use edgeR::filterByExpr() to remove genes whose counts are too low to support a rigorous differential expression analysis. Then we run a standard Limma-Voom analysis using edgeR::voomLmFit(), limma::makeContrasts(), and limma::eBayes(). (Alternatively, we could have used limma::treat() instead of limma::eBayes().)

The linear model includes the dex and cell covariates, indicating the treatment conditions and cell types, respectively. Here, we are interested in differences between treatments, adjusted by cell type, and define this comparison as the dextrt - dexuntrt contrast.

The final differential expression results are fetched using limma::topTable().

library("edgeR")

design <- model.matrix(~ 0 + dex + cell, data = colData(airway))

keep <- filterByExpr(airway, design)
fit <- voomLmFit(airway[keep, ], design, plot = FALSE)
contr <- makeContrasts("dextrt - dexuntrt", levels = design)
fit <- contrasts.fit(fit, contr)
fit <- eBayes(fit)
res_limma <- topTable(fit, sort.by = "P", n = Inf)
head(res_limma)
#>                         gene_id gene_name entrezid   gene_biotype gene_seq_start gene_seq_end
#> ENSG00000165995 ENSG00000165995    CACNB2       NA protein_coding       18429606     18830798
#> ENSG00000120129 ENSG00000120129     DUSP1       NA protein_coding      172195093    172198198
#> ENSG00000146250 ENSG00000146250    PRSS35       NA protein_coding       84222194     84235423
#> ENSG00000189221 ENSG00000189221      MAOA       NA protein_coding       43515467     43606068
#> ENSG00000157214 ENSG00000157214    STEAP2       NA protein_coding       89796904     89867451
#> ENSG00000152583 ENSG00000152583   SPARCL1       NA protein_coding       88394487     88452213
#>                 seq_name seq_strand seq_coord_system  symbol     logFC  AveExpr         t
#> ENSG00000165995       10          1               NA  CACNB2  3.205606 3.682244  37.68303
#> ENSG00000120129        5         -1               NA   DUSP1  2.864778 6.644455  28.50569
#> ENSG00000146250        6          1               NA  PRSS35 -2.828184 3.224885 -28.10830
#> ENSG00000189221        X          1               NA    MAOA  3.256085 5.950559  27.66135
#> ENSG00000157214        7          1               NA  STEAP2  1.894559 6.790009  27.40834
#> ENSG00000152583        4         -1               NA SPARCL1  4.489009 4.166904  27.34820
#>                      P.Value    adj.P.Val        B
#> ENSG00000165995 1.115938e-10 1.881472e-06 14.45518
#> ENSG00000120129 1.148539e-09 4.309046e-06 13.01074
#> ENSG00000146250 1.291089e-09 4.309046e-06 12.44376
#> ENSG00000189221 1.475547e-09 4.309046e-06 12.75540
#> ENSG00000157214 1.592916e-09 4.309046e-06 12.71210
#> ENSG00000152583 1.622327e-09 4.309046e-06 12.33402

Then, we embed this set of differential expression results in the airway object using the embedContrastResults() method.

Because the output of limma::topTable() is a standard data.frame, the class= argument must be used to manually identify the method that produced those results.

Supported classes are listed in the object iSEEde::embedContrastResultsMethods, i.e.

library(iSEEde)
embedContrastResultsMethods
#>              limma 
#> "iSEELimmaResults"

This manual method is preferable to any automated heuristic (e.g. using the column names of the data.frame for identifying it as a limma::topTable() output).

The results embedded in the airway object can be accessed using the contrastResults() function.

airway <- embedContrastResults(res_limma, airway, name = "Limma-Voom", class = "limma")
contrastResults(airway)
#> DataFrame with 63677 rows and 1 column
#>                         Limma-Voom
#>                 <iSEELimmaResults>
#> ENSG00000000003 <iSEELimmaResults>
#> ENSG00000000005 <iSEELimmaResults>
#> ENSG00000000419 <iSEELimmaResults>
#> ENSG00000000457 <iSEELimmaResults>
#> ENSG00000000460 <iSEELimmaResults>
#> ...                            ...
#> ENSG00000273489 <iSEELimmaResults>
#> ENSG00000273490 <iSEELimmaResults>
#> ENSG00000273491 <iSEELimmaResults>
#> ENSG00000273492 <iSEELimmaResults>
#> ENSG00000273493 <iSEELimmaResults>
contrastResults(airway, "Limma-Voom")
#> iSEELimmaResults with 63677 rows and 16 columns
#>                         gene_id   gene_name  entrezid   gene_biotype gene_seq_start gene_seq_end
#>                     <character> <character> <integer>    <character>      <integer>    <integer>
#> ENSG00000000003 ENSG00000000003      TSPAN6        NA protein_coding       99883667     99894988
#> ENSG00000000005              NA          NA        NA             NA             NA           NA
#> ENSG00000000419 ENSG00000000419        DPM1        NA protein_coding       49551404     49575092
#> ENSG00000000457 ENSG00000000457       SCYL3        NA protein_coding      169818772    169863408
#> ENSG00000000460 ENSG00000000460    C1orf112        NA protein_coding      169631245    169823221
#> ...                         ...         ...       ...            ...            ...          ...
#> ENSG00000273489              NA          NA        NA             NA             NA           NA
#> ENSG00000273490              NA          NA        NA             NA             NA           NA
#> ENSG00000273491              NA          NA        NA             NA             NA           NA
#> ENSG00000273492              NA          NA        NA             NA             NA           NA
#> ENSG00000273493              NA          NA        NA             NA             NA           NA
#>                    seq_name seq_strand seq_coord_system      symbol     logFC   AveExpr         t
#>                 <character>  <integer>        <integer> <character> <numeric> <numeric> <numeric>
#> ENSG00000000003           X         -1               NA      TSPAN6 -0.464216   5.02559 -6.589673
#> ENSG00000000005          NA         NA               NA          NA        NA        NA        NA
#> ENSG00000000419          20         -1               NA        DPM1  0.125077   4.60191  1.632802
#> ENSG00000000457           1         -1               NA       SCYL3 -0.042107   3.47269 -0.438929
#> ENSG00000000460           1          1               NA    C1orf112 -0.228517   1.40857 -0.977961
#> ...                     ...        ...              ...         ...       ...       ...       ...
#> ENSG00000273489          NA         NA               NA          NA        NA        NA        NA
#> ENSG00000273490          NA         NA               NA          NA        NA        NA        NA
#> ENSG00000273491          NA         NA               NA          NA        NA        NA        NA
#> ENSG00000273492          NA         NA               NA          NA        NA        NA        NA
#> ENSG00000273493          NA         NA               NA          NA        NA        NA        NA
#>                     P.Value  adj.P.Val         B
#>                   <numeric>  <numeric> <numeric>
#> ENSG00000000003 0.000136603 0.00171857  0.899306
#> ENSG00000000005          NA         NA        NA
#> ENSG00000000419 0.139297605 0.25279527 -6.274792
#> ENSG00000000457 0.671768770 0.77787936 -7.230811
#> ENSG00000000460 0.355374953 0.49644090 -6.208897
#> ...                     ...        ...       ...
#> ENSG00000273489          NA         NA        NA
#> ENSG00000273490          NA         NA        NA
#> ENSG00000273491          NA         NA        NA
#> ENSG00000273492          NA         NA        NA
#> ENSG00000273493          NA         NA        NA

3.2 DESeq2

First, we use DESeqDataSet() to construct a DESeqDataSet object for the analysis. Then we run a standard DESeq2 analysis using DESeq().

The differential expression results are fetched using results().

library("DESeq2")

dds <- DESeqDataSet(airway, ~ 0 + dex + cell)

dds <- DESeq(dds)
res_deseq2 <- results(dds, contrast = list("dextrt", "dexuntrt"))
head(res_deseq2)
#> log2 fold change (MLE): dextrt vs dexuntrt 
#> Wald test p-value: dextrt vs dexuntrt 
#> DataFrame with 6 rows and 6 columns
#>                   baseMean log2FoldChange     lfcSE      stat      pvalue       padj
#>                  <numeric>      <numeric> <numeric> <numeric>   <numeric>  <numeric>
#> ENSG00000000003 708.602170     -0.3812539  0.100654 -3.787752 0.000152016 0.00128292
#> ENSG00000000005   0.000000             NA        NA        NA          NA         NA
#> ENSG00000000419 520.297901      0.2068127  0.112219  1.842944 0.065337213 0.19646961
#> ENSG00000000457 237.163037      0.0379205  0.143445  0.264356 0.791505314 0.91141884
#> ENSG00000000460  57.932633     -0.0881679  0.287142 -0.307054 0.758802543 0.89500551
#> ENSG00000000938   0.318098     -1.3782416  3.499906 -0.393794 0.693733216         NA

Then, we embed this set of differential expression results in the airway object using the embedContrastResults() method.

In this instance, the DESeq2 results are stored in a recognisable ?DESeqResults object, which can be given as is directly to the embedContrastResults() method.

The function will automatically pass the results object to the iSEEDESeq2Results() constructor, to identify it as such.

The results embedded in the airway object can be accessed using the contrastResults() function.

airway <- embedContrastResults(res_deseq2, airway, name = "DESeq2")
contrastResults(airway)
#> DataFrame with 63677 rows and 2 columns
#>                         Limma-Voom              DESeq2
#>                 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000003 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000005 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000419 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000457 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000000460 <iSEELimmaResults> <iSEEDESeq2Results>
#> ...                            ...                 ...
#> ENSG00000273489 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273490 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273491 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273492 <iSEELimmaResults> <iSEEDESeq2Results>
#> ENSG00000273493 <iSEELimmaResults> <iSEEDESeq2Results>
contrastResults(airway, "DESeq2")
#> iSEEDESeq2Results with 63677 rows and 6 columns
#>                  baseMean log2FoldChange     lfcSE      stat      pvalue       padj
#>                 <numeric>      <numeric> <numeric> <numeric>   <numeric>  <numeric>
#> ENSG00000000003  708.6022     -0.3812539  0.100654 -3.787752 0.000152016 0.00128292
#> ENSG00000000005    0.0000             NA        NA        NA          NA         NA
#> ENSG00000000419  520.2979      0.2068127  0.112219  1.842944 0.065337213 0.19646961
#> ENSG00000000457  237.1630      0.0379205  0.143445  0.264356 0.791505314 0.91141884
#> ENSG00000000460   57.9326     -0.0881679  0.287142 -0.307054 0.758802543 0.89500551
#> ...                   ...            ...       ...       ...         ...        ...
#> ENSG00000273489  0.275899       1.483744   3.51398  0.422240    0.672850         NA
#> ENSG00000273490  0.000000             NA        NA        NA          NA         NA
#> ENSG00000273491  0.000000             NA        NA        NA          NA         NA
#> ENSG00000273492  0.105978      -0.463688   3.52312 -0.131613    0.895290         NA
#> ENSG00000273493  0.106142      -0.521372   3.53142 -0.147638    0.882628         NA

3.3 edgeR

We run a standard edgeR analysis using glmFit() and glmLRT().

The differential expression results are fetched using topTags().

library("edgeR")

design <- model.matrix(~ 0 + dex + cell, data = colData(airway))

fit <- glmFit(airway, design, dispersion = 0.1)
lrt <- glmLRT(fit, contrast = c(-1, 1, 0, 0, 0))
res_edger <- topTags(lrt, n = Inf)
head(res_edger)
#> Coefficient:  -1*dexuntrt 1*dextrt 
#>                         gene_id     gene_name entrezid         gene_biotype gene_seq_start
#> ENSG00000109906 ENSG00000109906        ZBTB16       NA       protein_coding      113930315
#> ENSG00000179593 ENSG00000179593       ALOX15B       NA       protein_coding        7942335
#> ENSG00000127954 ENSG00000127954        STEAP4       NA       protein_coding       87905744
#> ENSG00000152583 ENSG00000152583       SPARCL1       NA       protein_coding       88394487
#> ENSG00000250978 ENSG00000250978 RP11-357D18.1       NA processed_transcript       66759637
#> ENSG00000163884 ENSG00000163884         KLF15       NA       protein_coding      126061478
#>                 gene_seq_end seq_name seq_strand seq_coord_system        symbol
#> ENSG00000109906    114121398       11          1               NA        ZBTB16
#> ENSG00000179593      7952452       17          1               NA       ALOX15B
#> ENSG00000127954     87936206        7         -1               NA        STEAP4
#> ENSG00000152583     88452213        4         -1               NA       SPARCL1
#> ENSG00000250978     66771420        5         -1               NA RP11-357D18.1
#> ENSG00000163884    126076285        3         -1               NA         KLF15
#>                 iSEEde.Limma.Voom.gene_id iSEEde.Limma.Voom.gene_name iSEEde.Limma.Voom.entrezid
#> ENSG00000109906           ENSG00000109906                      ZBTB16                         NA
#> ENSG00000179593           ENSG00000179593                     ALOX15B                         NA
#> ENSG00000127954           ENSG00000127954                      STEAP4                         NA
#> ENSG00000152583           ENSG00000152583                     SPARCL1                         NA
#> ENSG00000250978           ENSG00000250978               RP11-357D18.1                         NA
#> ENSG00000163884           ENSG00000163884                       KLF15                         NA
#>                 iSEEde.Limma.Voom.gene_biotype iSEEde.Limma.Voom.gene_seq_start
#> ENSG00000109906                 protein_coding                        113930315
#> ENSG00000179593                 protein_coding                          7942335
#> ENSG00000127954                 protein_coding                         87905744
#> ENSG00000152583                 protein_coding                         88394487
#> ENSG00000250978           processed_transcript                         66759637
#> ENSG00000163884                 protein_coding                        126061478
#>                 iSEEde.Limma.Voom.gene_seq_end iSEEde.Limma.Voom.seq_name
#> ENSG00000109906                      114121398                         11
#> ENSG00000179593                        7952452                         17
#> ENSG00000127954                       87936206                          7
#> ENSG00000152583                       88452213                          4
#> ENSG00000250978                       66771420                          5
#> ENSG00000163884                      126076285                          3
#>                 iSEEde.Limma.Voom.seq_strand iSEEde.Limma.Voom.seq_coord_system
#> ENSG00000109906                            1                                 NA
#> ENSG00000179593                            1                                 NA
#> ENSG00000127954                           -1                                 NA
#> ENSG00000152583                           -1                                 NA
#> ENSG00000250978                           -1                                 NA
#> ENSG00000163884                           -1                                 NA
#>                 iSEEde.Limma.Voom.symbol iSEEde.Limma.Voom.logFC iSEEde.Limma.Voom.AveExpr
#> ENSG00000109906                   ZBTB16                7.069003                 1.3807218
#> ENSG00000179593                  ALOX15B                7.988755                -1.4798754
#> ENSG00000127954                   STEAP4                5.184088                 1.6193300
#> ENSG00000152583                  SPARCL1                4.489009                 4.1669039
#> ENSG00000250978            RP11-357D18.1                6.031081                -0.7521822
#> ENSG00000163884                    KLF15                4.433096                 3.2401058
#>                 iSEEde.Limma.Voom.t iSEEde.Limma.Voom.P.Value iSEEde.Limma.Voom.adj.P.Val
#> ENSG00000109906            23.94855              4.892822e-09                4.852528e-06
#> ENSG00000179593            21.82342              1.743170e-06                1.076551e-04
#> ENSG00000127954            20.01478              2.162145e-08                1.012605e-05
#> ENSG00000152583            27.34820              1.622327e-09                4.309046e-06
#> ENSG00000250978            16.32481              1.154612e-07                2.187276e-05
#> ENSG00000163884            22.89074              7.117995e-09                5.425717e-06
#>                 iSEEde.Limma.Voom.B iSEEde.DESeq2.baseMean iSEEde.DESeq2.log2FoldChange
#> ENSG00000109906            8.652823              385.07103                     7.352629
#> ENSG00000179593            2.986534               67.24305                     9.505983
#> ENSG00000127954            9.354944              286.38412                     5.207161
#> ENSG00000152583           12.334016              997.43977                     4.574919
#> ENSG00000250978            6.074174               56.31819                     6.327386
#> ENSG00000163884           10.885003              561.10717                     4.459129
#>                 iSEEde.DESeq2.lfcSE iSEEde.DESeq2.stat iSEEde.DESeq2.pvalue iSEEde.DESeq2.padj
#> ENSG00000109906           0.5363902          13.707612         9.141717e-43       2.256376e-40
#> ENSG00000179593           1.0545033           9.014654         1.974926e-19       1.252965e-17
#> ENSG00000127954           0.4930839          10.560396         4.547384e-26       5.057701e-24
#> ENSG00000152583           0.1840561          24.856114        2.220777e-136      4.001396e-132
#> ENSG00000250978           0.6777980           9.335209         1.007922e-20       7.206645e-19
#> ENSG00000163884           0.2348638          18.986019         2.225778e-80       2.506504e-77
#>                     logFC   logCPM       LR       PValue          FDR
#> ENSG00000109906  7.183385 4.132638 238.3947 8.805179e-54 5.606874e-49
#> ENSG00000179593 10.015847 1.627629 181.0331 2.883024e-41 9.179116e-37
#> ENSG00000127954  5.087069 3.672567 146.9725 7.957020e-34 1.688931e-29
#> ENSG00000152583  4.498698 5.510213 140.2205 2.382274e-32 3.792402e-28
#> ENSG00000250978  6.128131 1.377260 137.4681 9.526183e-32 1.213198e-27
#> ENSG00000163884  4.367962 4.681216 129.2203 6.069471e-30 6.441428e-26

Then, we embed this set of differential expression results in the airway object using the embedContrastResults() method.

In this instance, the edgeR results are stored in a recognisable ?TopTags object. As such, the results object can be given as is directly to the embedContrastResults() method. The function will automatically pass the results object to the iSEEedgeRResults() constructor, to identify it as such.

The results embedded in the airway object can be accessed using the contrastResults() function.

airway <- embedContrastResults(res_edger, airway, name = "edgeR")
contrastResults(airway)
#> DataFrame with 63677 rows and 3 columns
#>                         Limma-Voom              DESeq2              edgeR
#>                 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000003 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000005 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000419 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000457 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000000460 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ...                            ...                 ...                ...
#> ENSG00000273489 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273490 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273491 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273492 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
#> ENSG00000273493 <iSEELimmaResults> <iSEEDESeq2Results> <iSEEedgeRResults>
contrastResults(airway, "edgeR")
#> iSEEedgeRResults with 63677 rows and 5 columns
#>                      logFC    logCPM        LR    PValue       FDR
#>                  <numeric> <numeric> <numeric> <numeric> <numeric>
#> ENSG00000000003 -0.4628153   5.05930  2.018481  0.155394         1
#> ENSG00000000005  0.0000000  -3.45546  0.000000  1.000000         1
#> ENSG00000000419  0.1247724   4.60783  0.146545  0.701860         1
#> ENSG00000000457 -0.0445216   3.48326  0.018241  0.892565         1
#> ENSG00000000460 -0.1618126   1.48518  0.210342  0.646500         1
#> ...                    ...       ...       ...       ...       ...
#> ENSG00000273489    2.48209  -3.28549   3.02143  0.082171         1
#> ENSG00000273490    0.00000  -3.45546   0.00000  1.000000         1
#> ENSG00000273491    0.00000  -3.45546   0.00000  1.000000         1
#> ENSG00000273492   -1.24012  -3.36894   0.91097  0.339857         1
#> ENSG00000273493   -1.75243  -3.36862   1.57193  0.209928         1

4 Live app

In this example, we used the iSEEde functions DETable(), VolcanoPlot(), and MAPlot() to add panels that facilitate the visualisation of differential expression results in an iSEE app.

Specifically, we add one set of panels for each differential expression method used in this vignette (i.e., Limma-Voom, DESeq2, edgeR).

library(iSEE)
app <- iSEE(airway, initial = list(
  DETable(ContrastName="Limma-Voom", HiddenColumns = c("AveExpr", 
    "t", "B"), PanelWidth = 4L),
  VolcanoPlot(ContrastName = "Limma-Voom", PanelWidth = 4L),
  MAPlot(ContrastName = "Limma-Voom", PanelWidth = 4L),
  DETable(ContrastName="DESeq2", HiddenColumns = c("baseMean", 
    "lfcSE", "stat"), PanelWidth = 4L),
  VolcanoPlot(ContrastName = "DESeq2", PanelWidth = 4L),
  MAPlot(ContrastName = "DESeq2", PanelWidth = 4L),
  DETable(ContrastName="edgeR", HiddenColumns = c("logCPM", 
    "LR"), PanelWidth = 4L),
  VolcanoPlot(ContrastName = "edgeR", PanelWidth = 4L),
  MAPlot(ContrastName = "edgeR", PanelWidth = 4L)
))

if (interactive()) {
  shiny::runApp(app)
}