Compiled date: 2024-04-30

Last edited: 2024-02-12

License: GPL-3

1 Installation

Run the following code to install the Bioconductor version of the package.

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("fobitools")


2 Load packages

library(fobitools)

We will also need some additional CRAN and Bioconductor packages for performing tasks such as statistical analysis and web scraping.


# CRAN
library(rvest)

# Bioconductor
library(metabolomicsWorkbenchR)

3 Download the data from Metabolomics Workbench

The Metabolomics Workbench is a public repository for metabolomics metadata and experimental data spanning various species and experimental platforms, as well as metabolite standards, metabolite structures, protocols, tutorials, training material and other educational resources. It provides a computational platform to integrate, analyze, track, deposit and disseminate large volumes of heterogeneous data from a wide variety of metabolomics studies. These include mass spectrometry (MS) and nuclear magnetic resonance spectrometry (NMR) data from more than 20 species covering all the major taxonomic categories, including humans and other mammals, plants, insects, invertebrates and microorganisms (Sud et al. 2016).

The metabolomicsWorkbenchR Bioconductor package allows us to obtain data from the Metabolomics Workbench repository. In this vignette we will use the sample data set ST000291.

3.1 Summary of the study (ST000291)

Eighteen healthy female college students between 21 and 29 years old with a normal BMI of 18.5-25 were recruited. Each subject was provided with a list of foods that contain significant amounts of procyanidins, such as cranberries, apples, grapes, blueberries, chocolate and plums. They were advised to avoid these foods during days 1-6 and for the rest of the study. On the morning of the 7th day, a first-morning baseline urine sample and a blood sample were collected from all subjects after overnight fasting. Participants were then randomly allocated into two groups (n=9) to consume cranberry juice or apple juice. Six bottles (250 ml/bottle) of juice were given to participants to drink in the morning and evening of the 7th, 8th and 9th days. On the morning of the 10th day, all subjects returned to the clinical unit to provide a first-morning urine sample after overnight fasting. A blood sample was also collected from participants 30 min after they drank another bottle of juice that morning. After a two-week washout period, participants switched to the alternative regimen and repeated the protocol. One subject was dropped from the study because she missed some of her appointments. Another two subjects were removed from the urine metabolomics analyses because they failed to provide the required urine samples after juice drinking. The present study aimed to investigate overall metabolic changes caused by procyanidin concentrates from cranberries and apples using a global LC-MS-based metabolomics approach. All plasma and urine samples were stored at -80 °C until analysis.

3.2 Download data

This study is composed of two complementary MS analyses, the positive mode (AN000464) and the negative mode (AN000465). Let’s download them both!

data_negative_mode <- do_query(
  context = "study",
  input_item = "analysis_id",
  input_value = "AN000465",
  output_item = "SummarizedExperiment")

data_positive_mode <- do_query(
  context = "study",
  input_item = "analysis_id",
  input_value = "AN000464",
  output_item = "SummarizedExperiment")
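
Before going further, it can help to check what each downloaded object contains. The following is a minimal sketch rather than part of the original workflow: describe_se() is a hypothetical helper, and the toy object uses made-up values so that the example runs without a network connection. It assumes the SummarizedExperiment package is installed.

```r
library(SummarizedExperiment)

# Hypothetical helper (not part of any package): summarise the main slots
# of a SummarizedExperiment so the two analyses can be compared at a glance.
describe_se <- function(se) {
  list(
    n_features = nrow(se),                        # rows = measured features
    n_samples = ncol(se),                         # columns = samples
    sample_annotations = colnames(colData(se)),   # per-sample metadata fields
    feature_annotations = colnames(rowData(se))   # per-feature metadata fields
  )
}

# Toy object with made-up values, used only to show the helper offline.
toy <- SummarizedExperiment(
  assays = list(abundance = matrix(
    c(10, 20, 30, 40, 50, 60),
    nrow = 3,
    dimnames = list(c("f1", "f2", "f3"), c("s1", "s2"))
  )),
  rowData = S4Vectors::DataFrame(label = c("a", "b", "c")),
  colData = S4Vectors::DataFrame(
    group = c("x", "y"),
    row.names = c("s1", "s2")
  )
)

toy_summary <- describe_se(toy)
str(toy_summary)
```

The same call can be applied to the real objects, for example describe_se(data_negative_mode) and describe_se(data_positive_mode). The annotation fields it reports depend on what the repository returns for each analysis.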

3.3 Scraping metabolite names and identifiers with rvest

In many metabolomics studies, the reproducibility of analyses is severely affected by the poor interoperability of metabolite names and their identifiers. For this reason it is important to develop tools that facilitate the process of converting one type of identifier to another. In order to use the fobitools package, we need some generic identifier (such as PubChem, KEGG or HMDB) that allows us to obtain the corresponding FOBI identifier for each metabolite. The Metabolomics Workbench repository provides us with this information for many of the metabolites quantified in study ST000291 (Figure 1). In order to easily obtain this information, we will perform a web scraping operation using the rvest package.
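
As a rough illustration of the scraping step, the sketch below parses an HTML table with rvest. It is an assumption-laden example, not the exact code for study ST000291: the inline HTML, the column names and the identifier values are placeholders invented for illustration, and a real page may be laid out differently.

```r
library(rvest)

# Inline HTML standing in for a repository page. The table layout, column
# names and values are placeholders made up for this example.
page <- minimal_html('
  <table>
    <tr><th>Metabolite</th><th>PubChem</th><th>KEGG</th></tr>
    <tr><td>metabolite_1</td><td>111</td><td>kegg_placeholder_1</td></tr>
    <tr><td>metabolite_2</td><td>222</td><td>kegg_placeholder_2</td></tr>
  </table>
')

# Select the first <table> node and convert it to a data frame (tibble),
# one row per metabolite and one column per identifier type.
metabolite_ids <- html_table(html_element(page, "table"))

metabolite_ids
```

For a live page, minimal_html() would be replaced by read_html() called with the page URL, and the CSS selector passed to html_element() would need to match the table of interest on that page.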