1 CENTREannotation: An AnnotationHub package with the ENCODE cCREs V3 and GENCODE basic gene annotation v40 for the CENTRE package

CENTRE is a package for Cell-type specific ENhancer Target pREdiction that follows this workflow:

createPairs() -> computeGenericFeatures() -> computeCellTypeFeatures() -> centreClassification()

The CENTRE::createPairs() step creates all possible enhancer-gene pairs within 500 kb of the input genes or enhancers. For this step, CENTRE uses the ENCODE SCREEN v3 enhancer annotation and the GENCODE v40 gene annotation.
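To make the 500 kb pairing concrete, here is a minimal base R sketch on toy data. It is not the CENTRE implementation: the identifiers and coordinates are invented, and measuring distance from the enhancer midpoint to a single gene coordinate (called `tss` here) is an assumption made for illustration only.

```r
# Toy annotation tables (hypothetical identifiers and coordinates)
enhancers <- data.frame(
  enhancer_id  = c("enhA", "enhB", "enhC"),
  chr          = c("chr1", "chr1", "chr2"),
  middle_point = c(1000000, 1800000, 500000)
)
genes <- data.frame(
  gene_id = c("geneX", "geneY"),
  chr     = c("chr1", "chr2"),
  tss     = c(1200000, 450000)  # assumed reference coordinate for each gene
)

window <- 500000  # 500 kb

# Pair every enhancer with every gene on the same chromosome,
# then keep only the pairs that lie within the window.
pairs <- merge(enhancers, genes, by = "chr")
pairs$distance <- abs(pairs$middle_point - pairs$tss)
pairs <- pairs[pairs$distance <= window,
               c("enhancer_id", "gene_id", "chr", "distance")]
rownames(pairs) <- NULL
print(pairs)
```

In this toy example, enhB lies 600 kb from geneX and is therefore dropped, while enhA and enhC are each paired with the gene on their chromosome.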

All of the data in the CENTREannotation package can be accessed through AnnotationHub:

library(AnnotationHub, quietly = TRUE)
## 
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
##     setequal, union
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
##     mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min
hub <- AnnotationHub()
ah <- query(hub, "CENTREannotation")
ah
## AnnotationHub with 2 records
## # snapshotDate(): 2025-06-23
## # $dataprovider: GENCODE, ENCODE cCREs
## # $species: Homo sapiens
## # $rdataclass: SQLiteConnection
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["AH116730"]]' 
## 
##              title                            
##   AH116730 | GENCODE basic gene annotation v40
##   AH116731 | ENCODE Registry of cCREs V3

The GENCODE database can be accessed with ah[["AH116730"]], which returns a CENTREannotdb object. The SCREEN database can be accessed with ah[["AH116731"]], which also returns a CENTREannotdb object.

1.1 How to use the CENTREannotDb objects

The objects CENTREannotgeneDb and CENTREannotenhDb represent the GENCODE and ENCODE databases, respectively.

library(CENTREannotation)
CENTREannotgeneDb <- ah[["AH116730"]]
## downloading 1 resources
## retrieving 1 resource
## loading from cache
CENTREannotenhDb <- ah[["AH116731"]]
## downloading 1 resources
## retrieving 1 resource
## loading from cache

The database can be used as follows:

  • tables(): shows all the tables and columns in the database
  • fetch_data(): function to select data from the database (see man pages)
tables(CENTREannotenhDb)
## $ccres
##  [1] "chr"          "start"        "end"          "accession"    "enhancer_id" 
##  [6] "description"  "size"         "new_start"    "new_end"      "newsize"     
## [11] "middle_point"
res <- fetch_data(CENTREannotenhDb,
    columns = c("enhancer_id", "start"),
    entries = c("EH38E1519134", "EH38E1519132"),
    column_filter = "enhancer_id"
)
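Because the hub records are backed by SQLite (see `$rdataclass` above), a call like the one to fetch_data() can be thought of as a filtered SELECT. The sketch below reproduces that idea on an in-memory toy table with DBI and RSQLite. It is an assumed equivalent, not the package's actual implementation: the `EH_toy_*` identifiers and `start` values are invented, and the table holds only two of the real `ccres` columns.

```r
library(DBI)

# In-memory SQLite database standing in for the real resource
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy stand-in for the ccres table (made-up identifiers and coordinates)
toy_ccres <- data.frame(
  enhancer_id = c("EH_toy_1", "EH_toy_2", "EH_toy_3"),
  start       = c(100L, 200L, 300L)
)
dbWriteTable(con, "ccres", toy_ccres)

# Analogue of: columns = c("enhancer_id", "start"),
#              entries = ..., column_filter = "enhancer_id"
entries <- c("EH_toy_1", "EH_toy_3")
placeholders <- paste(rep("?", length(entries)), collapse = ", ")
sql <- sprintf(
  'SELECT enhancer_id, "start" FROM ccres WHERE enhancer_id IN (%s)',
  placeholders
)

# Bind the entries as query parameters rather than pasting them into the SQL
toy_res <- dbGetQuery(con, sql, params = as.list(entries))
dbDisconnect(con)
print(toy_res)
```

Here `columns` maps to the SELECT list, `column_filter` to the column in the WHERE clause, and `entries` to the values it is matched against.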
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] CENTREannotation_0.99.1 AnnotationHub_3.99.6    BiocFileCache_2.99.5   
## [4] dbplyr_2.5.0            BiocGenerics_0.55.1     generics_0.1.4         
## [7] BiocStyle_2.37.1       
## 
## loaded via a namespace (and not attached):
##  [1] rappdirs_0.3.3       sass_0.4.10          BiocVersion_3.22.0  
##  [4] RSQLite_2.4.2        digest_0.6.37        magrittr_2.0.3      
##  [7] evaluate_1.0.4       bookdown_0.43        fastmap_1.2.0       
## [10] blob_1.2.4           jsonlite_2.0.0       AnnotationDbi_1.71.1
## [13] DBI_1.2.3            BiocManager_1.30.26  httr_1.4.7          
## [16] purrr_1.1.0          Biostrings_2.77.2    httr2_1.2.1         
## [19] jquerylib_0.1.4      cli_3.6.5            crayon_1.5.3        
## [22] rlang_1.1.6          XVector_0.49.0       Biobase_2.69.0      
## [25] bit64_4.6.0-1        withr_3.0.2          cachem_1.1.0        
## [28] yaml_2.3.10          tools_4.5.1          memoise_2.0.1       
## [31] dplyr_1.1.4          filelock_1.0.3       curl_6.4.0          
## [34] vctrs_0.6.5          R6_2.6.1             png_0.1-8           
## [37] stats4_4.5.1         lifecycle_1.0.4      Seqinfo_0.99.2      
## [40] KEGGREST_1.49.1      S4Vectors_0.47.0     IRanges_2.43.0      
## [43] bit_4.6.0            pkgconfig_2.0.3      pillar_1.11.0       
## [46] bslib_0.9.0          glue_1.8.0           xfun_0.52           
## [49] tibble_3.3.0         tidyselect_1.2.1     knitr_1.50          
## [52] htmltools_0.5.8.1    rmarkdown_2.29       compiler_4.5.1