ChemmineOB

Kevin Horan & Thomas Girke

Last update: October 13, 2014

Introduction

ChemmineOB provides an R interface to a subset of the cheminformatics functionality implemented by the OpenBabel C++ project (O'Boyle, Morley, and Hutchison, 2008; O'Boyle, Banck, James, Morley, Vandermeersch, and Hutchison, 2011). OpenBabel is an open source cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemmineOB aims to make a subset of these utilities available from within R. For non-developers, ChemmineOB is primarily intended to be used from ChemmineR (Cao, Charisi, Cheng, Jiang, and Girke, 2008; Backman, Cao, and Girke, 2011; Wang, Backman, Horan, and Girke, 2013) as an add-on package rather than directly.

Installation

To use the ChemmineOB package on Linux or Mac, OpenBabel 2.3.0 or greater must be installed on the system. On Linux systems, the OpenBabel header files are also required in order to compile ChemmineOB. The Windows distribution will include its own version of OpenBabel. The OpenBabel site (http://openbabel.org/wiki/Get_Open_Babel) provides excellent instructions for installing the OpenBabel software on Mac or Linux systems. The ChemmineR and ChemmineOB packages can be installed from within R with the biocLite install script.

source("http://bioconductor.org/biocLite.R") 
biocLite(c("ChemmineR", "ChemmineOB")) 
library("ChemmineR") 
library("ChemmineOB") 
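
Before installing on Linux or Mac, it can be useful to confirm that the OpenBabel on the system meets the 2.3.0 requirement. The following shell sketch is one way to do that. It assumes that the obabel command-line tool was installed alongside the library and that "obabel -V" prints a line of the form "Open Babel <version> -- ..."; the helper name version_at_least is made up for this example and is not part of OpenBabel or ChemmineOB.

```shell
# version_at_least CURRENT MINIMUM
# Succeeds (exit status 0) if the dotted version CURRENT is >= MINIMUM.
# Missing components count as 0, so "2.3" is treated like "2.3.0".
version_at_least() {
    awk -v cur="$1" -v min="$2" 'BEGIN {
        n = split(cur, c, ".")
        m = split(min, r, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            a = c[i] + 0
            b = r[i] + 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# Assumption: the obabel command-line tool is on the PATH and reports
# its version as the third word of a line containing "Open Babel".
if command -v obabel >/dev/null 2>&1; then
    installed=$(obabel -V 2>&1 | awk '/Open Babel/ { print $3; exit }')
    if [ -z "$installed" ]; then
        echo "Could not determine the OpenBabel version"
    elif version_at_least "$installed" "2.3.0"; then
        echo "OpenBabel $installed meets the 2.3.0 requirement"
    else
        echo "OpenBabel $installed is older than 2.3.0"
    fi
else
    echo "obabel not found on PATH; the library may still be installed"
fi
```

If obabel is not on the PATH, the check only reports that fact; the library and headers may still be present without the command-line tool.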

If the installation fails on Linux, you may need to manually set the locations of the OpenBabel libraries and header files. This is best done through configure flags. For example, run the following at the command prompt:

$ R CMD INSTALL --configure-args='--with-openbabel-include=...  --with-openbabel-lib=...' <ChemmineOB package file>

where the '...' are replaced by the relevant paths. See the README file for more details.
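
One way to find candidate values for the two paths is to ask pkg-config, which many OpenBabel installations register with. The sketch below assumes the module name openbabel-2.0, which is what OpenBabel 2.x installations typically use; other installations may use a different name or none at all. The function name show_openbabel_flags is hypothetical. The sketch only prints what pkg-config reports and does not run the installation.

```shell
# show_openbabel_flags MODULE
# Prints the compiler and linker flags that pkg-config records for MODULE.
# Returns a non-zero status if pkg-config or the module is unavailable.
show_openbabel_flags() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed" >&2
        return 1
    fi
    if ! pkg-config --exists "$1"; then
        echo "pkg-config has no entry for $1" >&2
        return 1
    fi
    echo "include flags: $(pkg-config --cflags "$1")"
    echo "library flags: $(pkg-config --libs "$1")"
}

# Assumption: the installed OpenBabel is registered as "openbabel-2.0".
show_openbabel_flags "openbabel-2.0" ||
    echo "Locate the OpenBabel headers and libraries manually"
```

The directory after -I in the first line and the one after -L in the second are natural candidates for the include and library paths. pkg-config omits directories that are already on the default search path. The README remains the reference for the exact form the configure flags expect.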

User Manual in ChemmineR Vignette

Detailed instructions for using ChemmineOB are provided in the vignette of the ChemmineR package instead of this document. The main reason for consolidating the documentation in one central document rather than distributing it across several vignettes is that it helps minimize duplications and inconsistencies. It is also the more suitable format for providing a task-oriented description of functionalities for users. To obtain an overview of the OpenBabel utilities supported by ChemmineOB, we recommend consulting the OpenBabel Functions section of the ChemmineR vignette. To open the ChemmineR vignette from R, one can use the following command.

 vignette("ChemmineR") 

SWIG Interface (For R developers)

ChemmineOB now includes wrapper functions for all of OpenBabel, as generated by SWIG. We still maintain our own set of functions to provide better integration with R in general and ChemmineR specifically.

If you are familiar with the Open Babel API, using the SWIG wrapper should be similar, once you know a few conventions used. You can look at the R code in this package to see examples of these.

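To create a new object, instead of the C++:

OBConversion *x = new OBConversion(...)

in R you would have:

x = OBConversion(...)

To call a method, instead of:

x->AddOption(...)

we have:

OBConversion_AddOption(x,...)

To pass a string by reference, so that the called function can fill it in:

result = stringp()
OBDescriptor_GetStringValue(... , result$cast())
stringValue = result$value()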

There are still many special cases, however. The SWIG documentation can help, as can browsing the generated R code in R/ChemmineOB.R.

Version Information

sessionInfo()

R version 3.1.1 Patched (2014-09-25 r66681) Platform: x86_64-unknown-linux-gnu (64-bit)

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] knitcitations_1.0-1 ChemmineOB_1.4.0 knitrBootstrap_0.9.0

loaded via a namespace (and not attached): [1] RCurl_1.95-4.3 RJSONIO_1.3-0 Rcpp_0.11.3
[4] RefManageR_0.8.34 XML_3.98-1.1 bibtex_0.3-6
[7] digest_0.6.4 evaluate_0.5.5 formatR_1.0
[10] httr_0.5 knitr_1.7 lubridate_1.3.3
[13] markdown_0.7.4 memoise_0.2.1 plyr_1.8.1
[16] stringr_0.6.2 tools_3.1.1 zlibbioc_1.12.0

Funding

This software was developed with funding from the National Science Foundation: ABI-0957099, 2010-0520325 and IGERT-0504249.

References

[1] T. W. H. Backman. "ChemMine tools: an online service for analyzing and clustering small molecules". In: Nucleic Acids Research 39.suppl (May. 2011), pp. W486-W491. URL: http://dx.doi.org/10.1093/nar/gkr320.

[2] Y. Cao. "ChemmineR: a compound mining framework for R". In: Bioinformatics 24.15 (Jul. 2008), pp. 1733-1734. URL: http://dx.doi.org/10.1093/bioinformatics/btn307.

[3] N. M. O'Boyle. "Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit". 2008. URL: http://journal.chemistrycentral.com/content/2/1/5.

[4] N. M. O'Boyle. "Open Babel: An open chemical toolbox". In: Journal of Cheminformatics 3.1 (2011), p. 33. URL: http://dx.doi.org/10.1186/1758-2946-3-33.

[5] Y. Wang. "fmcsR: mismatch tolerant maximum common substructure searching in R". In: Bioinformatics 29.21 (Aug. 2013), pp. 2792-2794. URL: http://dx.doi.org/10.1093/bioinformatics/btt475.