omicplotR is an R package containing a Shiny app used to visually explore omic datasets, where the input is a table of read counts from high-throughput sequencing runs. It integrates the ALDEx2 package [1] for compositional analysis of differential abundance.
omicplotR is intended to facilitate exploring high-throughput sequencing datasets by providing a graphical user interface for users with and without experience in R.
High-throughput sequencing (HTS) instruments generate a number of reads that is constrained by limitations of the sequencing instrument itself and does not represent the absolute number of DNA molecules in a sample. For example, an Illumina NextSeq can deliver up to 400 million single-end reads, whereas an Illumina MiSeq can only deliver up to 15 million single-end reads [2]. Data of this type, constrained by an arbitrary or constant sum, are referred to as compositional data, and high-throughput sequencing data must be treated as such [3]. See ALDEx2 for more information.
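To make the constant-sum constraint concrete, the following minimal sketch uses invented counts for three features sequenced at two different depths. The feature names, the depths, and the helper `clr()` are hypothetical and are not part of omicplotR. It illustrates that raw counts change with sequencing depth while a centred log-ratio transform, a transform commonly applied to compositional data, gives the same values at both depths.

```r
# Hypothetical proportions of three features in one sample.
proportions <- c(featureA = 0.5, featureB = 0.3, featureC = 0.2)

# The same sample sequenced at two depths (invented totals).
deep    <- proportions * 1e6   # a run yielding 1,000,000 reads
shallow <- proportions * 1e4   # a run yielding 10,000 reads

# Centred log-ratio (clr): log of each value minus the mean of the logs.
# `clr` is a helper defined here for illustration only.
clr <- function(x) log(x) - mean(log(x))

# The raw counts differ between the two runs by the ratio of the depths ...
print(deep / shallow)

# ... but the clr values are the same up to floating-point error.
print(all.equal(clr(deep), clr(shallow)))
```

The point of the sketch is only that the total number of reads carries no information about the sample itself; the relationships between features do.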
Although several R packages exist for exploring high-throughput sequencing data, they are typically command line based, which presents a barrier for users without any significant command line or scripting experience.
omicplotR was created to facilitate the exploratory phase of high-throughput sequencing data analysis by generating basic exploratory plots automatically, with adjustable features and filters.
This vignette provides an overview of the R package
omicplotR and the input requirements. A tutorial for each component of the
Shiny app is available on the wiki: https://github.com/dgiguer/omicplotR/wiki.
omicplotR was developed for several types of HTS datasets, including RNA-Seq, meta-RNA-Seq, and 16S rRNA gene sequencing. In principle, it can be used for nearly any type of data generated by HTS that consists of a table of counts per feature for each sample.
omicplotR provides a graphical user interface using the
Shiny package for the following visualizations for HTS data:
Additional features include:
ALDEx2 tables and colour points by rownames for large datasets
Install the latest version of BiocManager. Make sure you have the newest version of R, ALDEx2, and other dependencies.
omicplotR requires at least R version 3.5. The most up-to-date version is available on the dev branch at www.github.com/dgiguer/omicplotr/.
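Before installing, it can be useful to confirm that the running R session meets the minimum version stated above. The following is a small sketch using only base R; the variable name `required` is chosen here for illustration.

```r
# Minimum R version stated in this vignette.
required <- "3.5.0"

# getRversion() returns the running R version as a comparable object.
current <- getRversion()

if (current < required) {
  stop("omicplotR needs R >= ", required, "; found ", as.character(current))
} else {
  message("R ", as.character(current), " meets the minimum requirement.")
}
```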
First, load the omicplotR package; all other dependencies will be loaded automatically. Calling omicplotr.run() then launches the Shiny app in your default browser. For this vignette, we will be using the example data and metadata provided. Example data and metadata are accessible by
data(metadata). They are also available as .txt files in
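    # Install BiocManager from CRAN, then omicplotR from Bioconductor
    install.packages("BiocManager")
    BiocManager::install("omicplotR")

    # Load the package and launch the Shiny app
    library(omicplotR)
    omicplotr.run()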
After launching the
Shiny app, click the ‘Input data’ tab to get started.
The ‘Data’ tab on the sidebar panel allows you to choose your own data and metadata by clicking ‘Browse’. To follow along with this vignette, please click the ‘Example data’ tab on the sidebar panel, and click the checkbox for the ‘Vaginal dataset’. This dataset, which includes associated metadata, is from a study that used 16S rRNA gene sequencing to characterize the changes in the vaginal microbiome following antibiotic and probiotic treatment [4]. Return to the ‘Data’ tab on the sidebar panel to view the data and metadata by clicking ‘Show data’ and ‘Show metadata’. The tabs on the main panel allow you to switch between displaying your data and metadata tables.
When choosing your own dataset, the input requirements are as follows: for both metadata and data, each sample name and each feature name (for example, an operational taxonomic unit, OTU) must be unique. An example of an appropriately formatted data file is shown in Figure 2.
Your metadata file must follow a similar format. An example of an appropriate metadata file is shown in Figure 3.
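The uniqueness requirement can be checked before uploading a file to the app. The sketch below is a hypothetical pre-flight check in base R and is not part of omicplotR. It assumes a tab-separated table whose first column holds feature names and whose remaining columns are samples; the inline table and the helper `duplicated_names()` are invented for illustration.

```r
# Hypothetical tab-separated count table: the first column holds feature (OTU)
# names and the remaining columns are samples. This layout is an assumption.
raw <- "feature\tsample1\tsample2\tsample3
OTU_1\t120\t98\t143
OTU_2\t0\t12\t7
OTU_3\t35\t0\t51"

# check.names = FALSE stops R from silently renaming duplicate columns,
# so duplicates remain visible to the check below.
tab <- read.table(text = raw, header = TRUE, sep = "\t",
                  check.names = FALSE, stringsAsFactors = FALSE)

# Helper (illustrative): return any names that occur more than once.
duplicated_names <- function(x) unique(x[duplicated(x)])

dup_features <- duplicated_names(tab[[1]])
dup_samples  <- duplicated_names(colnames(tab)[-1])

if (length(dup_features) > 0) {
  stop("Duplicate feature names: ", toString(dup_features))
}
if (length(dup_samples) > 0) {
  stop("Duplicate sample names: ", toString(dup_samples))
}
message("All sample and feature names are unique.")
```

To check a real file, replace `text = raw` with the path to your own file; the same check applies to the sample names in a metadata file.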
The ‘Example data’ tab on the sidebar panel provides access to two example datasets. We will be using the provided ‘Vaginal dataset’, which contains both an OTU table and associated metadata. The ‘Selex dataset’ is from a selective growth experiment giving the differential abundance of 1600 enzyme variants [5]. After selecting the ‘Vaginal dataset’, return to the ‘Data’ tab and click ‘Show data’ to view the data. You can view the metadata by clicking ‘Show metadata’ and switching the tab in the main panel to ‘Metadata’. Click the ‘PCA Biplots’ main tab to proceed.
The ‘Filtering’ tab within the sidebar panel allows you to choose filtering options for your dataset. Colouring options for a coloured PCA biplot are available under the ‘Colouring options’ tab. The tabs within the main panel allow you to switch between displaying a biplot under ‘Biplot’, a biplot coloured by metadata under ‘Coloured Biplot’, and visualizations of the removed samples/features from filtering under the ‘Removed data’ tab.
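For readers who want to see what filtering followed by a PCA biplot looks like at the command line, the following is a rough base R sketch on simulated counts. It is not the app's implementation: the filtering rule, the threshold `min_count`, the pseudocount of 0.5, and the use of a centred log-ratio transform before `prcomp()` are all assumptions made for illustration.

```r
# Simulated count table (invented data): rows are features, columns are samples.
set.seed(1)
counts <- matrix(rpois(60, lambda = 20), nrow = 10,
                 dimnames = list(paste0("OTU_", 1:10), paste0("sample", 1:6)))
counts[10, ] <- 0  # a feature that is never observed

# Filter: keep features with at least `min_count` reads across all samples.
# Both the rule and the threshold are illustrative assumptions.
min_count <- 10
filtered <- counts[rowSums(counts) >= min_count, , drop = FALSE]
removed  <- setdiff(rownames(counts), rownames(filtered))

# Replace any remaining zeros with a small pseudocount so logs are defined
# (a simple stand-in; other zero-replacement methods exist).
filtered[filtered == 0] <- 0.5

# Centred log-ratio per sample: log of each value minus the mean of the logs.
clr <- apply(filtered, 2, function(x) log(x) - mean(log(x)))

# PCA with samples as rows, then a biplot of samples and features.
pca <- prcomp(t(clr))
biplot(pca)
```

Here `removed` plays the role of the features that the ‘Removed data’ tab would display, and `biplot(pca)` corresponds loosely to the uncoloured ‘Biplot’ tab.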