Package: peakPantheR
Authors: Arnaud Wolfer

1 Introduction

The peakPantheR package is designed for the detection, integration and reporting of pre-defined features in MS files (e.g. compounds, fragments, adducts, …).

The Parallel Annotation detects and integrates multiple compounds in multiple files in parallel and stores the results in a single object. It can be used to integrate a large number of expected features across a dataset.

Using the faahKO raw MS dataset as an example, this vignette will:

  • Detail the Parallel Annotation concept
  • Apply the Parallel Annotation to a subset of pre-defined features in the faahKO dataset

1.1 Abbreviations

  • ROI: Regions Of Interest
    • reference RT / m/z windows in which to search for a feature
  • uROI: updated Regions Of Interest
    • modified ROI adapted to the current dataset, which override the reference ROI
  • FIR: Fallback Integration Regions
    • RT / m/z window to integrate if no peak is found
  • TIC: Total Ion Chromatogram
    • the intensities summed across all masses for each scan
  • EIC: Extracted Ion Chromatogram
    • the intensities summed over a mass range, for each scan
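
The difference between a TIC and an EIC can be illustrated with a minimal base R sketch. The data below are made-up toy values, and the names `scans`, `mz_window`, `tic` and `eic` are hypothetical; this is not how peakPantheR reads or stores spectra.

```r
# Toy centroided data: one row per (scan, m/z) data point. Values are made up.
scans <- data.frame(
  scan      = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  mz        = c(100.0, 250.1, 400.2, 100.0, 250.1, 400.2, 100.0, 250.1, 400.2),
  intensity = c(10, 200, 5, 20, 800, 15, 30, 300, 10)
)

# TIC: intensities summed across all masses, for each scan
tic <- tapply(scans$intensity, scans$scan, sum)

# EIC: intensities summed over a mass range only, for each scan
mz_window <- c(250.0, 250.2)   # hypothetical m/z window
in_window <- scans$mz >= mz_window[1] & scans$mz <= mz_window[2]
eic <- tapply(scans$intensity[in_window], scans$scan[in_window], sum)
```

Here `tic` holds one value per scan built from every data point, while `eic` only uses the points whose m/z falls inside the window, which is what restricts the chromatogram to a single feature.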

2 Parallel Annotation Concept

Parallel compound integration processes multiple compounds in multiple files in parallel and stores the results in a single object.

To achieve this, peakPantheR will:

  1. load a list of expected RT / m/z ROI and a list of files to process
  2. initialise an output object with expected ROI and file paths
  3. first pass (without peak filling) on a subset of representative samples (e.g. QC samples):
    • for each file, detect features in each ROI and keep highest intensity
    • determine peak statistics for each feature
    • store results + EIC for each ROI
  4. visual inspection of first pass results, update ROI:
    • diagnostic plots: all EICs, peak apex RT / m/z & peak width evolution
    • correct ROI (remove interfering feature, correct RT shift)
    • define fallback integration regions (FIR) if no feature is detected (median RT / m/z start and end of found features)
  5. initialise a new output object, with updated regions of interest (uROI) and fallback integration regions (FIR), with all samples
  6. second pass (with peak filling) on all samples:
    • for each file, detect features in each uROI and keep highest intensity
    • determine peak statistics for each feature
    • integrate FIR when no peaks are found
    • store results + EIC for each uROI
  7. summary statistics:
    • plot EICs, apex and peak width evolution
    • compare first and second pass
  8. return the resulting object and/or table (row: file, col: compound)

Diagram of the workflow and functions used for parallel annotation.
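
The selection rules in steps 3, 4 and 6 above can be sketched in base R for a single ROI. This is a simplified illustration under stated assumptions, not peakPantheR's implementation: the candidate peaks are made-up values, only the RT dimension is handled (the m/z bounds would be treated the same way), and `candidates`, `keep_highest`, `fir` and `result` are hypothetical names.

```r
# Candidate peaks found inside one ROI, for three files (made-up values).
# File "C" has no candidate at all.
candidates <- data.frame(
  file      = c("A", "A", "B"),
  rtMin     = c(3300, 3350, 3345),
  rtMax     = c(3320, 3380, 3385),
  intensity = c(5e4, 9e5, 7e5),
  stringsAsFactors = FALSE
)
files <- c("A", "B", "C")

# Steps 3 and 6: for each file, keep only the highest-intensity candidate
keep_highest <- function(df) df[which.max(df$intensity), , drop = FALSE]
found <- do.call(rbind, lapply(split(candidates, candidates$file), keep_highest))

# Step 4: FIR defined as the median start and end of the found features
fir <- c(rtMin = median(found$rtMin), rtMax = median(found$rtMax))

# Step 6: files without a found peak fall back to the FIR window
no_peak <- setdiff(files, found$file)
filled  <- data.frame(
  file      = no_peak,
  rtMin     = rep(fir[["rtMin"]], length(no_peak)),
  rtMax     = rep(fir[["rtMax"]], length(no_peak)),
  intensity = rep(NA_real_, length(no_peak)),   # to be integrated over the FIR
  stringsAsFactors = FALSE
)
result <- rbind(found, filled)
result$is_filled <- result$file %in% no_peak
```

In this sketch file "A" keeps its second, more intense candidate, and file "C" receives the median window of the peaks found in "A" and "B", flagged with `is_filled` so that filled values can be told apart from detected peaks.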

3 Parallel Annotation Example

We can target 2 pre-defined features in 6 raw MS spectra files from the faahKO package using peakPantheR_parallelAnnotation(). For more details on the installation and input data employed, please consult the Getting Started with peakPantheR vignette.

3.1 Input Data

First, the paths to 3 MS files from the faahKO package are located and used as input spectra. In this example, these 3 samples are considered representative of the whole run (e.g. Quality Control samples):

library(faahKO)
## file paths
input_spectraPaths  <- c(system.file('cdf/KO/ko15.CDF', package = "faahKO"),
                        system.file('cdf/KO/ko16.CDF', package = "faahKO"),
                        system.file('cdf/KO/ko18.CDF', package = "faahKO"))
input_spectraPaths
#> [1] "/home/biocbuild/bbs-3.10-bioc/R/library/faahKO/cdf/KO/ko15.CDF"
#> [2] "/home/biocbuild/bbs-3.10-bioc/R/library/faahKO/cdf/KO/ko16.CDF"
#> [3] "/home/biocbuild/bbs-3.10-bioc/R/library/faahKO/cdf/KO/ko18.CDF"

Two targeted features (e.g. compounds, fragments, adducts, …) are defined and stored in a table with the following columns:

  • cpdID (numeric)
  • cpdName (character)
  • <