The Bioconductor AnnotationHub is a good source of genomic annotations of many different kinds.
H3K27ac is an epigenetic modification to the histone H3, an acetylation of the lysine residue at N-terminal position 27. H3K27ac is associated with active enhancers.
To the best of my knowledge, fetching data from the AnnotationHub does not support regions. The fetch is necessarily of the entire genomic resource - all chromosomes - and so may require time-consuming downloads. Subsetting by region takes place after the often time-consuming download.
Therefore, to run this vignette for the first time may take up to 20 minutes due to that download time.
Once downloaded, however, the resource is cached.
library(igvR) library(AnnotationHub) igv <- igvR() setBrowserWindowTitle(igv, "H3K27ac GATA2") setGenome(igv, "hg19") showGenomicRegion(igv, "GATA2") for(i in 1:4) zoomOut(igv)
aHub <- AnnotationHub() query.terms <- c("H3K27Ac", "k562") length(query(aHub, query.terms)) # found 7 h3k27ac.entries <- query(aHub, query.terms)
The available data, key and title:
AH23388 | wgEncodeBroadHistoneK562H3k27acStdPk.broadPeak.gz AH29788 | E123-H3K27ac.broadPeak.gz AH30836 | E123-H3K27ac.narrowPeak.gz AH31772 | E123-H3K27ac.gappedPeak.gz AH32958 | E123-H3K27ac.fc.signal.bigwig AH33990 | E123-H3K27ac.pval.signal.bigwig AH39539 | E123-H3K27ac.imputed.pval.signal.bigwig
If not in your cache, this step may take 20 minutes.
x.broadPeak <- aHub[["AH23388"]] x.bigWig <- aHub[["AH32958"]]
The two resources are different data types, requiring different processing to render as tracks in igvR
The broadPeak data is a GRanges object already in memory. Subset to obtain only the 252 kb region in which we are interested.
roi <- getGenomicRegion(igv) gr.broadpeak <- x.broadPeak[seqnames(x.broadpeak)==roi$chrom & start(x.broadpeak) > roi$start & end(x.broadpeak) < roi$end]
igvR’s GrangesQuantitativeTrack must have only one numeric column in the GRanges metadata. That column is used as the magnitudes the track will display.
names(mcols(gr.broadpeak)) # "name" "score" "signalValue" "pValue" "qValue" mcols(gr.broadpeak) <- gr.broadpeak$score track <- GRangesQuantitativeTrack("h3k27ac bp", gr.broadpeak, autoscale=TRUE, color="brown") displayTrack(igv, track)
We use the import function from the rtracklayer package to read in only a small portion of the large bigWig file. Note that, as read, there is only one numeric metadata colum, “score”, so no reduction of mcols is needed.
file.bigWig <- resource(x.bigWig)[] gr.roi <- with(roi, GRanges(seqnames=chrom, IRanges(start, end))) gr.bw <- import(file.bigWig, which=gr.roi, format="bigWig") track <- GRangesQuantitativeTrack("h3k27ac.bw", gr.bw, autoscale=TRUE, color="gray") displayTrack(igv, track)
sessionInfo() #> R version 4.2.2 (2022-10-31) #> Platform: x86_64-pc-linux-gnu (64-bit) #> Running under: Ubuntu 20.04.5 LTS #> #> Matrix products: default #> BLAS: /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so #> LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so #> #> locale: #>  LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_GB LC_COLLATE=C #>  LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C #>  LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C #> #> attached base packages: #>  stats graphics grDevices utils datasets methods base #> #> other attached packages: #>  BiocStyle_2.26.0 #> #> loaded via a namespace (and not attached): #>  bookdown_0.33 digest_0.6.31 R6_2.5.1 jsonlite_1.8.4 evaluate_0.20 #>  highr_0.10 cachem_1.0.7 rlang_1.0.6 cli_3.6.0 jquerylib_0.1.4 #>  bslib_0.4.2 rmarkdown_2.20 tools_4.2.2 xfun_0.37 yaml_2.3.7 #>  fastmap_1.1.1 compiler_4.2.2 BiocManager_1.30.20 htmltools_0.5.4 knitr_1.42 #>  sass_0.4.5