Bioconductor has extensive facilities for mapping between microarray probe, gene, pathway, gene ontology, homology and other annotations.
Bioconductor has built-in representations of GO, KEGG, vendor, and other annotations, and can easily access NCBI, Biomart, UCSC, and other sources.
The following pseudo-code illustrates a typical R / Bioconductor session. It continues the differential expression workflow, taking a 'top table' of differentially expressed probesets and discovering the genes probed and the Gene Ontology terms associated with them.
## Affymetrix U95 V 2.0 array IDs of interest; these might be
## obtained from
##
## tbl <- topTable(efit, coef=2)
## ids <- tbl[["ID"]]
##
## as part of a more extensive workflow.
> ids <- c("39730_at", "1635_at", "1674_at", "40504_at", "40202_at")
## load libraries as sources of annotation
> library("hgu95av2.db")
## To list the kinds of things that can be retrieved, use the columns
## method (named cols in older AnnotationDbi releases).
> columns(hgu95av2.db)
## To list the kinds of things that can be used as keys
## use the keytypes method
> keytypes(hgu95av2.db)
## To extract viable keys of a particular kind, use the keys method.
> head(keys(hgu95av2.db, keytype="ENTREZID"))
## the select method allows you to map probe ids to Entrez gene ids...
> select(hgu95av2.db, ids, "ENTREZID", "PROBEID")
   PROBEID ENTREZID
1 39730_at       25
2  1635_at       25
3  1674_at     7525
4 40504_at     5445
5 40202_at      687
## ... and to GENENAME etc.
> select(hgu95av2.db, ids, c("ENTREZID","GENENAME"), "PROBEID")
   PROBEID ENTREZID                                           GENENAME
1 39730_at       25     c-abl oncogene 1, non-receptor tyrosine kinase
2  1635_at       25     c-abl oncogene 1, non-receptor tyrosine kinase
3  1674_at     7525 v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1
4 40504_at     5445                                      paraoxonase 2
5 40202_at      687                              Kruppel-like factor 9
## find and extract the GO ids associated with the first id
> res <- select(hgu95av2.db, ids[1], "GO", "PROBEID")
> head(res)
   PROBEID         GO EVIDENCE ONTOLOGY
1 39730_at GO:0000115      TAS       BP
2 39730_at GO:0000287      IDA       MF
3 39730_at GO:0003677      NAS       MF
4 39730_at GO:0003785      TAS       MF
5 39730_at GO:0004515      TAS       MF
6 39730_at GO:0004713      IDA       MF
## use GO.db to find the Terms associated with those GOIDs
> library("GO.db")
> head(select(GO.db, res$GO, "TERM", "GOID"))
        GOID                                                                  TERM
1 GO:0000115 regulation of transcription involved in S phase of mitotic cell cycle
2 GO:0000287                                                 magnesium ion binding
3 GO:0003677                                                           DNA binding
4 GO:0003785                                                 actin monomer binding
5 GO:0004515                    nicotinate-nucleotide adenylyltransferase activity
6 GO:0004713                                      protein tyrosine kinase activity
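The session above leaves the GO annotations and the GO term descriptions in two separate tables. As a minimal sketch of how they might be combined, the following joins them on the GO identifier with base R's merge(). The two data frames are hard-coded copies of the output shown above, so the sketch runs without the annotation packages; the names terms, annotated and mf are illustrative and not part of the original workflow.

```r
## Hard-coded copies of the output shown above. In a real session these
## would come from
##   res   <- select(hgu95av2.db, ids[1], "GO", "PROBEID")
##   terms <- select(GO.db, res$GO, "TERM", "GOID")
res <- data.frame(
    PROBEID  = "39730_at",
    GO       = c("GO:0000115", "GO:0000287", "GO:0003677",
                 "GO:0003785", "GO:0004515", "GO:0004713"),
    EVIDENCE = c("TAS", "IDA", "NAS", "TAS", "TAS", "IDA"),
    ONTOLOGY = c("BP", "MF", "MF", "MF", "MF", "MF"),
    stringsAsFactors = FALSE
)
terms <- data.frame(
    GOID = res$GO,
    TERM = c("regulation of transcription involved in S phase of mitotic cell cycle",
             "magnesium ion binding",
             "DNA binding",
             "actin monomer binding",
             "nicotinate-nucleotide adenylyltransferase activity",
             "protein tyrosine kinase activity"),
    stringsAsFactors = FALSE
)

## Join on the GO identifier; the key column has a different name in
## each table, hence by.x / by.y
annotated <- merge(res, terms, by.x = "GO", by.y = "GOID")

## Keep only the molecular function (MF) annotations
mf <- annotated[annotated$ONTOLOGY == "MF",
                c("PROBEID", "GO", "EVIDENCE", "TERM")]
print(mf, row.names = FALSE)
```

The same join should apply when res holds annotations for several probesets, since merge() matches every row of res against the term table.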
Follow installation instructions to start using these packages. To install the annotations associated with the Affymetrix Human Genome U95 V 2.0 array, and with Gene Ontology, use
> source("http://bioconductor.org/biocLite.R")
> biocLite(c("hgu95av2.db", "GO.db"))
Package installation is required only once per R installation. View a full list of available software and annotation packages.
To use the AnnotationDbi and GO.db packages, evaluate the commands
> library("AnnotationDbi")
> library("GO.db")
These commands are required once in each R session.
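Before loading the packages, it can be useful to confirm that they are actually installed. The following is a small sketch using only base R; the variable names pkgs and available are illustrative, and the package list is simply the set of packages mentioned on this page.

```r
## Packages used in this workflow
pkgs <- c("AnnotationDbi", "GO.db", "hgu95av2.db")

## requireNamespace() tries to load each package's namespace and reports
## success as TRUE or FALSE; it does not attach the package, so library()
## is still needed afterwards
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

if (all(available)) {
    message("All annotation packages are available.")
} else {
    message("Not installed: ", paste(pkgs[!available], collapse = ", "))
}
```

Any package reported as not installed needs to be installed once, as described above, before library() will succeed.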
Packages have extensive help pages, and include vignettes highlighting common use cases. The help pages and vignettes are available from within R. After loading a package, use syntax like
> help(package="GO.db")
> ?select
to obtain an overview of help on the GO.db package, and the select
method. The AnnotationDbi package is used by most .db
packages. View its vignettes, which provide a more comprehensive
introduction to package functionality, with
> browseVignettes(package="AnnotationDbi")
Use
> help.start()
to open a web page containing comprehensive help resources.
The following guides the user through key annotation packages. Users
interested in how to create custom chip packages should see the
vignettes in the AnnotationForge package. There is additional
information in the AnnotationDbi, OrganismDbi and
GenomicFeatures packages for how to use some of the extra tools
provided. You can also refer to the complete list of annotation
packages.
AnnotationDbi package. This
package is installed automatically when you install
another ".db" annotation package using biocLite(). It contains the
code that allows annotation mapping objects to be made and manipulated,
as well as the code behind the select methods.
The packages that contain the source databases needed by the SQLForge
code must be upgraded before you attempt to update your custom chip
packages.