1 About

SBGNview is a tool set for visualizing omics data on pathway maps and for pathway-related data analysis. Pathways are rendered with a community standard notation: Systems Biology Graphical Notation (SBGN) (Le Novere et al. 2009). Given an omics data table and a pathway file (SBGN-ML format with layout information), SBGNview displays omics data as colors on glyphs and outputs image files. For omics data, SBGNview supports automatic ID mapping of common gene/protein/compound ID types (e.g. Entrez Gene ID, UNIPROT, ChEBI). For pathway files, SBGNview can automatically retrieve SBGN-ML files from common pathway databases (e.g. Reactome, MetaCyc, SMPDB, PANTHER, MetaCrop). To support visualizing multiple types of data on the same glyph/arc, SBGNview provides extensive options to control glyph and edge features (e.g. color, line width, label color/size). To facilitate pathway-based analysis, SBGNview can search for pathways by keyword, extract node information (e.g. gene set, compound set) and highlight the shortest path between two nodes.

2 Introduction

Molecular pathways have been widely used in omics data analysis. We previously developed an R/Bioconductor package called Pathview, which maps, integrates and visualizes omics data onto KEGG pathway graphs (Luo and Brouwer 2013). Since its publication, Pathview has been widely used in numerous omics studies and analysis tools. Here we introduce the SBGNview package, which adopts Systems Biology Graphical Notation (SBGN) (Le Novere et al. 2009) and greatly extends the Pathview project by supporting multiple major pathway databases besides KEGG.

Key features:

Pathway maps use different glyph shapes to represent different types of molecules (e.g. macromolecules or simple chemicals) and different arc shapes to represent different reaction types (e.g. consumption or catalysis). Collectively, these are called graphical notations. SBGN is a community-developed notation standard and has been used by major pathway databases (e.g. Reactome, SMPDB, PANTHER pathways, MetaCrop). For details about SBGN, please check http://sbgn.github.io/sbgn

As a community standard, SBGN is adopted by major pathway databases, including Reactome, PANTHER, PathwayCommons and MetaCrop. Therefore, users can use SBGNview to visualize and interpret their omics data on any pathway from these databases. In addition, molecular biologists often summarize new discoveries or literature knowledge in pathways and create their own pathway maps. This can be done in SBGN editing/drawing tools (https://sbgn.github.io/software), and the pathways can be saved as SBGN-ML files and used as input for SBGNview. This makes SBGNview much more flexible than existing tools such as Pathview and PaintOmics, which only support KEGG pathways.

Like Pathview, SBGNview supports multiple samples for each gene/compound. In addition, it provides rich options to control glyph/arc attributes such as line color/width and text size/color/wrapping/positioning etc. This gives users maximal control over the pathway graphs.

3 Installation

3.1 Install SBGNview

Install SBGNview through Bioconductor.
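
A minimal sketch of the installation, assuming the standard BiocManager workflow for Bioconductor packages. It needs a working internet connection and a reasonably recent R version:

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
# Install SBGNview and its dependencies from Bioconductor
BiocManager::install("SBGNview")
```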


4 Overview

To visualize omics data on an SBGN pathway map, we need two inputs:

  1. An SBGN-ML file containing the pathway information: nodes, edges and their layout (coordinates).

  2. An omics data table in which rows are genes/compounds and columns are different measurements. The measurements can be any numeric values, such as fold change, abundance etc.
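
As an illustration of the second input, the sketch below builds a small gene-level data table in base R. The IDs, sample names and values are made up for illustration and are not taken from any real dataset; the only point is the shape (row names are gene IDs, columns are measurements).

```r
# Hypothetical data table: rows are genes (placeholder Entrez-style IDs),
# columns are samples, values are e.g. log fold changes.
gene.data <- matrix(
    c( 1.2, -0.4,  0.8,
      -2.1, -1.7, -1.9,
       0.3,  0.0,  0.5),
    nrow = 3, byrow = TRUE,
    dimnames = list(
        c("1001", "1002", "1003"),            # placeholder gene IDs
        c("sample1", "sample2", "sample3")    # one column per measurement
    )
)
# Each row holds all measurements for one gene
gene.data["1002", ]
```

A compound data table would have the same layout, with compound IDs as row names.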

Given these two inputs, SBGNview parses the SBGN-ML file, renders a .svg graph with SBGN notation, and then displays the omics data of each gene/compound on its corresponding glyph on the SBGN map. Each measured value is displayed as a color corresponding to that value. When there are multiple samples/measurements for a molecule, its node is divided into a matching number of slices. The output images can be in SVG, PDF, PNG or PS format.
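
The idea of showing a value as a color can be sketched in base R as follows. This only illustrates the general technique (linear mapping onto a diverging palette); the palette, the symmetric range and the variable names are assumptions, and SBGNview's own color scheme and options may differ.

```r
# Illustrative only: map numeric values onto a blue-white-red color scale.
values <- c(-2, -1, 0, 1, 2)
limit <- max(abs(values))                     # symmetric range around zero
palette <- colorRampPalette(c("blue", "white", "red"))(101)
# Rescale each value from [-limit, limit] to an index in 1..101
index <- round((values + limit) / (2 * limit) * 100) + 1
colors <- palette[index]
# One color per value; a node with several measurements would get one
# such color per slice.
names(colors) <- values
colors
```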

4.1 A quick example

This quick example visualizes a demo gene expression dataset on the pathway “Adrenaline and noradrenaline biosynthesis” and then highlights several nodes, edges and a path of interest.

library(SBGNview)
# Load the demo dataset and the pathway information of the built-in collection of SBGN-ML files.
# We use a cancer microarray dataset 'gse16873.d' from package 'pathview'.
data("gse16873.d", "pathways.info", "sbgn.xmls")
# Search for pathways with user-defined keywords
input.pathways <- findPathways("Adrenaline and noradrenaline biosynthesis")
# Render the SBGN pathway graph and output image files
SBGNview.obj <- SBGNview(
          gene.data = gse16873.d[,1:3], 
          gene.id.type = "entrez",
          input.sbgn = input.pathways$pathway.id,
          output.file = "quick.start", 
          output.formats = c("png")
          )
# Print the object to write the image files
print(SBGNview.obj)

Two image files (a .svg file and a .png file) will be created in the current working directory:

list.files( pattern = "quick.start", full.names = TRUE) 

Figure 4.1: Quick start example: Adrenaline and noradrenaline biosynthesis pathway.

Link to SBGN notation

In this example, the original pathway SBGN-ML file is from PathwayCommons, with an improved layout (node-edge overlaps are removed by routing the edges).

We can highlight nodes, edges and paths:

outputFile(SBGNview.obj) <- "quick.start.highlight.elements"
SBGNview.obj + 
        highlightArcs(class = "production",color = "red") +
        highlightArcs(class = "consumption",color = "blue") +
        highlightNodes(node.set = c("tyrosine", "(+-)-epinephrine"),
                       stroke.width = 4, stroke.color = "green") + 
        highlightPath(from.node = "tyrosine", to.node = "dopamine",
                      from.node.color = "green",
                      to.node.color = "blue",
                      shortest.paths.cols = "purple",
                      input.node.stroke.width = 6,
                      path.node.stroke.width = 5,
                      path.node.color = "purple",
                      path.stroke.width = 5,
                      tip.size = 10 )