GENCODE basic gene annotation v40 for the CENTRE package
CENTRE is a package for Cell-type specific ENhancer Target pREdiction that follows this workflow:
createPairs()
-> computeGenericFeatures()
-> computeCellTypeFeatures()
-> centreClassification()
The CENTRE::createPairs() step creates all possible enhancer-gene pairs within 500 kb of the input genes or enhancers. For this step, CENTRE uses the ENCODE SCREEN v3 enhancer annotation and the GENCODE v40 gene annotation.
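As a rough illustration of what pairing within a 500 kb window means, the following base R sketch builds pairs from toy data. It is not the CENTRE implementation: the identifiers and coordinates are made up, and measuring the distance between a gene's transcription start site (the hypothetical `tss` column) and the enhancer's `middle_point` is an assumption made for the example.

```r
# Toy coordinates (hypothetical, not taken from GENCODE or SCREEN)
genes <- data.frame(
  gene_id = c("geneA", "geneB"),
  chr = c("chr1", "chr1"),
  tss = c(1000000, 3000000)
)
enhancers <- data.frame(
  enhancer_id = c("enh1", "enh2", "enh3"),
  chr = c("chr1", "chr1", "chr2"),
  middle_point = c(1400000, 2600000, 1000000)
)

window <- 500000  # 500 kb

# All gene-enhancer combinations that share a chromosome
pairs <- merge(genes, enhancers, by = "chr")

# Keep only the combinations whose distance falls within the window
pairs$distance <- abs(pairs$middle_point - pairs$tss)
pairs <- pairs[pairs$distance <= window, c("gene_id", "enhancer_id", "distance")]
pairs
```

With these toy values, geneA is paired with enh1 and geneB with enh2; enh3 is dropped because it lies on a different chromosome.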
All of the data in the CENTREannotation package can be accessed through AnnotationHub:
library(AnnotationHub, quietly = TRUE)
##
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
##
## as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
## setequal, union
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
## as.data.frame, basename, cbind, colnames, dirname, do.call,
## duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
## mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
## unsplit, which.max, which.min
hub <- AnnotationHub()
ah <- query(hub, "CENTREannotation")
ah
## AnnotationHub with 2 records
## # snapshotDate(): 2025-06-23
## # $dataprovider: GENCODE, ENCODE cCREs
## # $species: Homo sapiens
## # $rdataclass: SQLiteConnection
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["AH116730"]]'
##
## title
## AH116730 | GENCODE basic gene annotation v40
## AH116731 | ENCODE Registry of cCREs V3
The GENCODE database can be accessed using ah[["AH116730"]], which returns a CENTREannotdb object. The SCREEN database can be accessed using ah[["AH116731"]], which also returns a CENTREannotdb object.
In the code below, the objects CENTREannotgeneDb and CENTREannotenhDb represent the GENCODE and ENCODE databases, respectively:
library(CENTREannotation)
CENTREannotgeneDb <- ah[["AH116730"]]
## downloading 1 resources
## retrieving 1 resource
## loading from cache
CENTREannotenhDb <- ah[["AH116731"]]
## downloading 1 resources
## retrieving 1 resource
## loading from cache
The database can be used as follows:
tables(): shows all the tables and columns in the database
fetch_data(): selects data from the database (see man pages)
tables(CENTREannotenhDb)
## $ccres
## [1] "chr" "start" "end" "accession" "enhancer_id"
## [6] "description" "size" "new_start" "new_end" "newsize"
## [11] "middle_point"
res <- fetch_data(CENTREannotenhDb,
columns = c("enhancer_id", "start"),
entries = c("EH38E1519134", "EH38E1519132"),
column_filter = "enhancer_id"
)
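To make the selection in the call above easier to picture, here is a base R analogue on a toy data frame. It assumes, judging only from the argument names, that fetch_data() returns the requested `columns` for rows whose `column_filter` value matches one of the `entries`; the man pages are the authoritative description. The table contents and the `toy_res` name are hypothetical.

```r
# Toy stand-in for the "ccres" table (hypothetical IDs and coordinates)
ccres <- data.frame(
  enhancer_id = c("toyE1", "toyE2", "toyE3"),
  start = c(100, 250, 900),
  end = c(200, 400, 1000)
)

columns <- c("enhancer_id", "start")
entries <- c("toyE1", "toyE3")
column_filter <- "enhancer_id"

# Keep rows whose filter column matches an entry, then keep the requested columns
toy_res <- ccres[ccres[[column_filter]] %in% entries, columns]
toy_res
```

Here `toy_res` holds two rows (toyE1 and toyE3) with the columns enhancer_id and start.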
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] CENTREannotation_0.99.1 AnnotationHub_3.99.6 BiocFileCache_2.99.5
## [4] dbplyr_2.5.0 BiocGenerics_0.55.1 generics_0.1.4
## [7] BiocStyle_2.37.1
##
## loaded via a namespace (and not attached):
## [1] rappdirs_0.3.3 sass_0.4.10 BiocVersion_3.22.0
## [4] RSQLite_2.4.2 digest_0.6.37 magrittr_2.0.3
## [7] evaluate_1.0.4 bookdown_0.43 fastmap_1.2.0
## [10] blob_1.2.4 jsonlite_2.0.0 AnnotationDbi_1.71.1
## [13] DBI_1.2.3 BiocManager_1.30.26 httr_1.4.7
## [16] purrr_1.1.0 Biostrings_2.77.2 httr2_1.2.1
## [19] jquerylib_0.1.4 cli_3.6.5 crayon_1.5.3
## [22] rlang_1.1.6 XVector_0.49.0 Biobase_2.69.0
## [25] bit64_4.6.0-1 withr_3.0.2 cachem_1.1.0
## [28] yaml_2.3.10 tools_4.5.1 memoise_2.0.1
## [31] dplyr_1.1.4 filelock_1.0.3 curl_6.4.0
## [34] vctrs_0.6.5 R6_2.6.1 png_0.1-8
## [37] stats4_4.5.1 lifecycle_1.0.4 Seqinfo_0.99.2
## [40] KEGGREST_1.49.1 S4Vectors_0.47.0 IRanges_2.43.0
## [43] bit_4.6.0 pkgconfig_2.0.3 pillar_1.11.0
## [46] bslib_0.9.0 glue_1.8.0 xfun_0.52
## [49] tibble_3.3.0 tidyselect_1.2.1 knitr_1.50
## [52] htmltools_0.5.8.1 rmarkdown_2.29 compiler_4.5.1