The AWAggregatorData package contains the data associated with the
AWAggregator R package. It includes two pre-trained random forest models: one
incorporates the average coefficient of variation (CV) as a feature, and the
other does not. It also contains the peptide-spectrum matches (PSMs) in
Benchmark Sets 1 to 3, derived from the psm.tsv output files generated by
FragPipe, which were used to train the random forest models.
Data available in the AWAggregatorData package:
regr: the pre-trained random forest model that incorporates the average
coefficient of variation (CV) as a feature.
regr.no.CV: the pre-trained random forest model that does not include the
average CV as a feature.
benchmark.set.1, benchmark.set.2, benchmark.set.3: the PSMs in Benchmark
Sets 1 to 3, derived from the psm.tsv output files generated by FragPipe and
used to train the random forest models. Columns not needed by AWAggregator
have been removed from the sample data.
if (!requireNamespace('BiocManager', quietly = TRUE))
    install.packages('BiocManager')
BiocManager::install('ExperimentHub')
BiocManager::install('AWAggregatorData')
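The query call later in this document carries a note that Bioconductor 3.21 or later is required. A minimal sketch of checking this before installing is given here; the variable name bioc.version and the error message are illustrative choices, not part of the package.

```r
# Sketch: confirm the active Bioconductor release is recent enough.
# The 3.21 threshold is taken from the comment on the query() call in this
# document; bioc.version is an illustrative variable name.
if (!requireNamespace('BiocManager', quietly = TRUE))
    install.packages('BiocManager')

bioc.version = BiocManager::version()
if (bioc.version < '3.21') {
    stop('AWAggregatorData requires Bioconductor >= 3.21; found ',
         as.character(bioc.version))
}
```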
ExperimentHub
The data are stored in ExperimentHub. Information about the available
datasets can be retrieved with the query function:
library(ExperimentHub)
## Loading required package: BiocGenerics
## Loading required package: generics
##
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
##
## as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
## setequal, union
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
## as.data.frame, basename, cbind, colnames, dirname, do.call,
## duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
## mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
## unsplit, which.max, which.min
## Loading required package: AnnotationHub
## Loading required package: BiocFileCache
## Loading required package: dbplyr
eh = ExperimentHub()
query(eh, 'AWAggregatorData') # Requires Bioconductor version 3.21 or later
## ExperimentHub with 5 records
## # snapshotDate(): 2025-07-17
## # $dataprovider: University of British Columbia
## # $species: NA
## # $rdataclass: data.frame, ranger
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH9637"]]'
##
## title
## EH9637 | benchmark.set.1.rds
## EH9638 | benchmark.set.2.rds
## EH9639 | benchmark.set.3.rds
## EH9640 | regr.rds
## EH9641 | regr.no.CV.rds
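The listing above shows only record titles. As a sketch, the remaining metadata can be inspected with mcols(), and titles can be mapped to hub IDs so that later code does not depend on hard-coded identifiers. The variable names records and ids are illustrative, and the IDs returned may differ between hub snapshots.

```r
library(ExperimentHub)

eh = ExperimentHub()
records = query(eh, 'AWAggregatorData')

# Show selected metadata columns for each record
mcols(records)[, c('title', 'rdataclass', 'description')]

# Named character vector mapping each record title to its hub ID
ids = setNames(names(records), records$title)

# Hub ID for Benchmark Set 1, looked up by title
ids[['benchmark.set.1.rds']]
```

An ID obtained this way can be passed to eh[[...]] in the same way as the literal IDs used in the rest of this document.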
The datasets and pre-trained models can be downloaded by their ExperimentHub IDs:
# Benchmark Set 1
benchmark.set.1 = eh[['EH9637']]
## see ?AWAggregatorData and browseVignettes('AWAggregatorData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
# Benchmark Set 2
benchmark.set.2 = eh[['EH9638']]
## see ?AWAggregatorData and browseVignettes('AWAggregatorData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
# Benchmark Set 3
benchmark.set.3 = eh[['EH9639']]
## see ?AWAggregatorData and browseVignettes('AWAggregatorData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
# Pre-trained model incorporating the average coefficient of variation (CV) as
# a feature
regr = eh[['EH9640']]
## see ?AWAggregatorData and browseVignettes('AWAggregatorData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
# Pre-trained model excluding CV as a feature
regr.no.CV = eh[['EH9641']]
## see ?AWAggregatorData and browseVignettes('AWAggregatorData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
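Once downloaded, the objects can be inspected before use. The following sketch relies only on the record classes reported by the hub (data.frame for the benchmark sets, ranger for the models) and makes no assumption about column names or model settings.

```r
library(ExperimentHub)

eh = ExperimentHub()

benchmark.set.1 = eh[['EH9637']]
regr = eh[['EH9640']]

# The benchmark sets are listed in the hub with rdataclass data.frame
class(benchmark.set.1)
dim(benchmark.set.1)
colnames(benchmark.set.1)

# The models are listed in the hub with rdataclass ranger
class(regr)

# Top-level components of the model object, without expanding the forest
str(regr, max.level = 1)
```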
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] AWAggregatorData_0.99.4 ExperimentHub_2.99.5 AnnotationHub_3.99.6
## [4] BiocFileCache_2.99.6 dbplyr_2.5.0 BiocGenerics_0.55.1
## [7] generics_0.1.4 BiocStyle_2.37.1
##
## loaded via a namespace (and not attached):
## [1] toOrdinal_1.3-0.0 KEGGREST_1.49.1 xfun_0.53
## [4] bslib_0.9.0 httr2_1.2.1 lattice_0.22-7
## [7] Biobase_2.69.0 vctrs_0.6.5 tools_4.5.1
## [10] stats4_4.5.1 curl_7.0.0 tibble_3.3.0
## [13] AnnotationDbi_1.71.1 RSQLite_2.4.3 blob_1.2.4
## [16] pkgconfig_2.0.3 Matrix_1.7-3 S4Vectors_0.47.0
## [19] lifecycle_1.0.4 stringr_1.5.1 compiler_4.5.1
## [22] brio_1.1.5 Biostrings_2.77.2 progress_1.2.3
## [25] Seqinfo_0.99.2 htmltools_0.5.8.1 sass_0.4.10
## [28] yaml_2.3.10 tidyr_1.3.1 pillar_1.11.0
## [31] crayon_1.5.3 jquerylib_0.1.4 cachem_1.1.0
## [34] tidyselect_1.2.1 digest_0.6.37 stringi_1.8.7
## [37] dplyr_1.1.4 purrr_1.1.0 bookdown_0.44
## [40] BiocVersion_3.22.0 grid_4.5.1 fastmap_1.2.0
## [43] cli_3.6.5 magrittr_2.0.3 withr_3.0.2
## [46] prettyunits_1.2.0 filelock_1.0.3 rappdirs_0.3.3
## [49] bit64_4.6.0-1 rmarkdown_2.29 XVector_0.49.0
## [52] httr_1.4.7 Peptides_2.4.6 bit_4.6.0
## [55] ranger_0.17.0 AWAggregator_0.99.4 png_0.1-8
## [58] hms_1.1.3 memoise_2.0.1 evaluate_1.0.4
## [61] knitr_1.50 IRanges_2.43.0 testthat_3.2.3
## [64] rlang_1.1.6 Rcpp_1.1.0 glue_1.8.0
## [67] DBI_1.2.3 BiocManager_1.30.26 jsonlite_2.0.0
## [70] R6_2.6.1