Overview

The netboxr package provides functions to retrieve and process genetic data from large-scale genomics projects (e.g. TCGA projects), including mutations, copy number alterations, gene expression and DNA methylation. It implements the NetBox algorithm as an R package. NetBox integrates genetic alterations with literature-curated pathway knowledge to identify pathway modules in cancer, and uses (1) a global network null model and (2) a local network null model to assess the statistical significance of the discovered pathway modules.

Basics

Installation

BiocManager::install("netboxr")

Getting Started

Load the netboxr package:

library(netboxr)

A list of all accessible vignettes and methods is available with the following command:

help(package="netboxr")

For help on any netboxr function, use either of the following command formats:

help(geneConnector)
?geneConnector

Example of Cerami et al. PLoS One 2010

This example reproduces the network discovered in Cerami et al. (2010).

The results presented here are comparable to those from Cerami et al. 2010, although the unadjusted p-values for linker genes are not the same. The difference arises because Cerami et al. 2010 calculated the unadjusted p-value of a linker gene as the probability of the observed data point, Pr(X = x), whereas netboxr uses the probability of an observation at least as extreme under the null hypothesis, Pr(X >= x | H). After FDR correction, the final number of linker genes is the same in the netboxr result and in the original Cerami et al. 2010 analysis.
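
To make the distinction concrete, the sketch below compares the two quantities for a single linker candidate. It assumes a hypergeometric null model, which is an assumption made for illustration; this section does not state which distribution netboxr uses internally. The counts are the ones reported later in this example for CRK (9264 network nodes, 274 altered genes found in the network, 81 neighbors overall, 11 altered neighbors), and the variable names are hypothetical.

```r
# Illustrative only: the hypergeometric null model is an assumption.
totalNodes   <- 9264  # nodes in the simplified network
alteredNodes <- 274   # altered genes that map to the network
globalDegree <- 81    # all neighbors of the candidate linker
localDegree  <- 11    # neighbors that are altered genes

# Point probability of exactly the observed count, Pr(X = x)
pPoint <- dhyper(localDegree, alteredNodes, totalNodes - alteredNodes,
                 globalDegree)

# Tail probability of the observed count or more, Pr(X >= x | H)
pTail <- phyper(localDegree - 1, alteredNodes, totalNodes - alteredNodes,
                globalDegree, lower.tail = FALSE)

# The tail probability includes the point probability, so it is never smaller
c(point = pPoint, tail = pTail)
```

Because the tail sum always contains the point term, a tail-based p-value is at least as large as the point probability, which is consistent with the raw values differing between the two analyses.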

Load the Human Interactions Network (HIN)

Load the pre-defined HIN and simplify it by removing loops and duplicated interactions. The raw network contains 9264 nodes and 157780 interactions; after simplification it contains 9264 nodes and 68111 interactions.

data(netbox2010)
sifNetwork <- netbox2010$network
graphReduced <- networkSimplify(sifNetwork, directed = FALSE)
## Loading network of 9264 nodes and 157780 interactions
## Treated as undirected network
## Removing multiple interactions and loops
## Returning network of 9264 nodes and 68111 interactions
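
As a rough illustration of what this kind of simplification involves, the base-R sketch below removes self-loops and duplicate undirected edges from a toy edge list. It is not the networkSimplify implementation; the data frame `edges`, its column names and the helper variables are made up for the example.

```r
# Toy undirected edge list (hypothetical data, not from netbox2010)
edges <- data.frame(
  from = c("A", "B", "A", "C", "B"),
  to   = c("B", "A", "A", "D", "C"),
  stringsAsFactors = FALSE
)

# Drop loops: edges whose two endpoints are the same node
noLoops <- edges[edges$from != edges$to, ]

# In an undirected network A-B and B-A are the same edge,
# so order each pair alphabetically before de-duplicating
sortedPairs <- data.frame(
  from = pmin(noLoops$from, noLoops$to),
  to   = pmax(noLoops$from, noLoops$to),
  stringsAsFactors = FALSE
)
simplified <- unique(sortedPairs)

nrow(edges)       # edges before simplification
nrow(simplified)  # edges after simplification
```

The node set is unchanged by this step; only the number of interactions shrinks, which matches the pattern in the log output above.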

Load altered gene list

The altered gene list contains 517 candidates from mutations and copy number alterations.

geneList <- as.character(netbox2010$geneList)
length(geneList)
## [1] 517

Map altered gene list on HIN network

The geneConnector function in the netboxr package takes the altered gene list as input and maps the genes onto the curated network to find the local processes represented by the gene list.
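
The log messages printed by geneConnector refer to a node's local degree and global degree. One plausible reading, sketched below in base R on a toy network, is that the global degree counts all of a node's neighbors while the local degree counts only the neighbors that are in the altered gene list. This interpretation and every name in the sketch are assumptions, not the package's code.

```r
# Toy undirected network and altered gene list (hypothetical data)
edges <- data.frame(
  from = c("L1", "L1", "L1", "L2", "G1"),
  to   = c("G1", "G2", "X1", "G3", "G2"),
  stringsAsFactors = FALSE
)
altered <- c("G1", "G2", "G3")

# List each edge in both directions so every node appears in `node`
adj <- data.frame(
  node     = c(edges$from, edges$to),
  neighbor = c(edges$to, edges$from),
  stringsAsFactors = FALSE
)

# Keep only non-altered nodes as linker candidates
candidates <- adj[!(adj$node %in% altered), ]

# Global degree: all neighbors; local degree: altered neighbors only
globalDegree <- table(candidates$node)
localDegree  <- tapply(candidates$neighbor %in% altered, candidates$node, sum)

# Mirror the "local degree equals or exceeds 2" filter from the log
names(localDegree)[localDegree >= 2]
```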

## Use the Benjamini-Hochberg method for multiple hypothesis correction of
## linker candidates.

## Use the edge-betweenness method to detect community structure in the network.
threshold <- 0.05
results <- geneConnector(geneList = geneList, networkGraph = graphReduced, directed = FALSE, 
    pValueAdj = "BH", pValueCutoff = threshold, communityMethod = "ebc", keepIsolatedNodes = FALSE)
## 274 / 517 candidate nodes match the name in the network of 9264 
##                 nodes
## Only test neighbor nodes with local degree equals or exceeds 2
## Multiple hypothesis corrections for 892 neighbor nodes in the network
## For p-value 0.05 cut-off, 6 nodes were included as linker nodes
## Connecting 274 candidate nodes and 6 linker nodes
## Remove 208 isolated candidate nodes from the input
## Final network contains 72 nodes and 152 interactions
## Detecting modules using "edge betweeness" method
# Check the p-value of the selected linker
linkerDF <- results$neighborData
linkerDF[linkerDF$pValueFDR < threshold, ]
##         idx   name localDegree globalDegree    pValueRaw oddsRatio
## CRK    1712    CRK          11           81 2.392088e-05  1.708732
## IFNAR1 4546 IFNAR1           6           23 4.185496e-05  2.518726
## CBL      20    CBL          14          140 6.505470e-05  1.361057
## GAB1    500   GAB1           8           57 2.483197e-04  1.751122
## CDK6    414   CDK6           5           21 3.008515e-04  2.406906
## PTPN11   84 PTPN11          14          163 3.287776e-04  1.191405
##         pValueFDR
## CRK    0.01866731
## IFNAR1 0.01866731
## CBL    0.01934293
## GAB1   0.04887827
## CDK6   0.04887827
## PTPN11 0.04887827
# The geneConnector function returns a named list; inspect its elements with names().
names(results)
## [1] "netboxGraph"      "netboxCommunity"  "netboxOutput"    
## [4] "nodeType"         "moduleMembership" "neighborData"
# Plot the graph with the Fruchterman-Reingold layout algorithm (from igraph)
plot(results$netboxCommunity, results$netboxGraph, layout = igraph::layout_with_fr)
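
The pValueAdj = "BH" argument used above requests Benjamini-Hochberg correction. As a small worked example of that procedure, the sketch below adjusts a made-up vector of raw p-values by hand and checks the result against R's built-in p.adjust. The p-values are invented for illustration and are not netboxr output.

```r
# Made-up raw p-values (not netboxr results)
pRaw <- c(0.001, 0.008, 0.039, 0.041, 0.20)
n <- length(pRaw)

# Benjamini-Hochberg by hand:
# 1. sort the p-values in decreasing order
# 2. scale each by n / rank, where rank counts from the smallest p-value
# 3. take a running minimum so adjusted values stay monotone, and cap at 1
ord <- order(pRaw, decreasing = TRUE)
scaled <- pRaw[ord] * n / (n:1)
pManual <- numeric(n)
pManual[ord] <- pmin(1, cummin(scaled))

# Built-in equivalent from the stats package
pBH <- p.adjust(pRaw, method = "BH")

all.equal(pManual, pBH)  # the two approaches should agree
pBH < 0.05               # which tests pass a 0.05 FDR cut-off
```

The same comparison against a cut-off is what the `linkerDF$pValueFDR < threshold` filter performs on the adjusted p-values of the linker candidates.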