0.1 Introduction

The bodymapRat package contains gene expression data on 652 RNA-Seq samples from a comprehensive rat transcriptomic BodyMap study. The sample metadata include the sequence identifier information from the header of each FASTQ file, which can be used as a surrogate for batch. The counts have not been normalized or pre-processed.

The data are provided in a SummarizedExperiment. The phenotypic information can be extracted using the colData() function and a description of the phenotypic data is listed in the table below:

Column         Description
sraExperiment  SRA Experiment ID
title          Title of sample provided by the authors
geoAccession   GEO Accession ID
BioSample      BioSample ID
avgLength      Average read length
instrument     Machine identifier (from FASTQ header)
runID          Run ID (from FASTQ header)
fcID           Flow cell ID (from FASTQ header)
fcLane         Flow cell lane (from FASTQ header)
tile           Tile (from FASTQ header)
xtile          xtile (from FASTQ header)
ytile          ytile (from FASTQ header)
organ          Body organ
sex            Sex
stage          Stage
techRep        Technical replicate number
colOrgan       Column of colors to help with plotting
rnaRIN         RIN number
barcode        Barcode number

The data can be accessed as follows:

library(SummarizedExperiment) 
library(bodymapRat)

We use the bodymapRat() function to download the relevant files from Bioconductor’s ExperimentHub web resource. Running this function downloads a SummarizedExperiment object, which contains the read counts as well as metadata on the rows (genes) and columns (samples).

bm_rat <- bodymapRat()

# Get the expression data
counts <- assay(bm_rat)
dim(counts)
## [1] 32637   652
counts[1:5, 1:5]
##                    SRR1169893 SRR1169894 SRR1169895 SRR1169896 SRR1169897
## ENSRNOG00000000001          1          0          0          1          4
## ENSRNOG00000000007          1          1          0          3          0
## ENSRNOG00000000008          7          4          2          3          7
## ENSRNOG00000000009          0          0          0          0          1
## ENSRNOG00000000010          0          1          0          0          0
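
Because the counts are not normalized, a first check is usually the per-sample total. The sketch below re-enters the 5 x 5 corner printed above by hand so that it runs without downloading anything; `counts_corner` and the helper `log_cpm()` are hypothetical names introduced here, and the log2 counts-per-million transform is only a simple illustration, not the method used by the study authors. With five genes the totals are not real library sizes; on the full data the same calls would be applied to `assay(bm_rat)`.

```r
# The 5 x 5 corner of assay(bm_rat) shown above, re-entered by hand.
counts_corner <- matrix(
  c(1, 0, 0, 1, 4,
    1, 1, 0, 3, 0,
    7, 4, 2, 3, 7,
    0, 0, 0, 0, 1,
    0, 1, 0, 0, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("ENSRNOG00000000001", "ENSRNOG00000000007", "ENSRNOG00000000008",
      "ENSRNOG00000000009", "ENSRNOG00000000010"),
    c("SRR1169893", "SRR1169894", "SRR1169895", "SRR1169896", "SRR1169897")
  )
)

# Per-sample totals (on the full matrix these are the library sizes).
totals <- colSums(counts_corner)
totals

# Hypothetical helper: log2 counts per million with a pseudocount.
# t(t(x) / colSums(x)) divides every column by its own total.
log_cpm <- function(x, pseudo = 1) {
  log2(t(t(x) / colSums(x)) * 1e6 + pseudo)
}

lcpm <- log_cpm(counts_corner)
round(lcpm, 2)
```

The pseudocount keeps zero counts finite on the log scale: a count of 0 maps to log2(1) = 0.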
# Get the metadata along columns (samples)
head(colData(bm_rat))
## DataFrame with 6 rows and 22 columns
##            sraExperiment      sraRun       title geoAccession sraSample
##              <character> <character> <character>     <factor>  <factor>
## SRR1169893     SRX471368  SRR1169893 Adr_F_002_1   GSM1328469 SRS558114
## SRR1169894     SRX471368  SRR1169894 Adr_F_002_1   GSM1328469 SRS558114
## SRR1169895     SRX471369  SRR1169895 Adr_F_002_2   GSM1328470 SRS558115
## SRR1169896     SRX471369  SRR1169896 Adr_F_002_2   GSM1328470 SRS558115
## SRR1169897     SRX471370  SRR1169897 Adr_F_002_3   GSM1328471 SRS558116
## SRR1169898     SRX471370  SRR1169898 Adr_F_002_3   GSM1328471 SRS558116
##               BioSample avgLength       organ         sex     stage
##                <factor> <integer> <character> <character> <numeric>
## SRR1169893 SAMN02642886        50     Adrenal           F         2
## SRR1169894 SAMN02642886        50     Adrenal           F         2
## SRR1169895 SAMN02642867        50     Adrenal           F         2
## SRR1169896 SAMN02642867        50     Adrenal           F         2
## SRR1169897 SAMN02642894        50     Adrenal           F         2
## SRR1169898 SAMN02642894        50     Adrenal           F         2
##              techRep    colOrgan         mix      rnaRIN     barcode
##            <integer> <character> <character> <character> <character>
## SRR1169893         1       brown          M1         9.3          11
## SRR1169894         2       brown          M1         9.3          11
## SRR1169895         1       brown          M1         9.1           5
## SRR1169896         2       brown          M1         9.1           5
## SRR1169897         1       brown          M1         9.5           3
## SRR1169898         2       brown          M1         9.5           3
##             instrument       runID        fcID      fcLane        tile
##            <character> <character> <character> <character> <character>
## SRR1169893   HWI-ST845      120326   D0VTJACXX           2        1101
## SRR1169894   HWI-ST845      120525   D10G7ACXX           2        1101
## SRR1169895   HWI-ST845      120326   D0VTJACXX           5        1101
## SRR1169896   HWI-ST845      120525   D10G7ACXX           5        1101
## SRR1169897  HWI-ST1131      120424   C0P4UACXX           4        1101
## SRR1169898  HWI-ST1195      120525   C0TDUACXX           4        1101
##                  xtile       ytile
##            <character> <character>
## SRR1169893        1506        2000
## SRR1169894        1394        2133
## SRR1169895        1170        2029
## SRR1169896        1650        2126
## SRR1169897        1675        2216
## SRR1169898        1138        2067
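
The FASTQ-derived columns can be cross-tabulated against the design variables to see how sequencing batch lines up with the samples. The sketch below is self-contained: `pheno` is a hand-entered stand-in holding a few columns of the six rows printed above, so it says nothing about the remaining samples. On the full data, one would presumably start from `as.data.frame(colData(bm_rat))` instead.

```r
# Stand-in for a few columns of colData(bm_rat): the six rows printed above.
pheno <- data.frame(
  organ      = rep("Adrenal", 6),
  techRep    = c(1L, 2L, 1L, 2L, 1L, 2L),
  instrument = c("HWI-ST845", "HWI-ST845", "HWI-ST845", "HWI-ST845",
                 "HWI-ST1131", "HWI-ST1195"),
  runID      = c("120326", "120525", "120326", "120525", "120424", "120525"),
  fcID       = c("D0VTJACXX", "D10G7ACXX", "D0VTJACXX", "D10G7ACXX",
                 "C0P4UACXX", "C0TDUACXX"),
  row.names  = c("SRR1169893", "SRR1169894", "SRR1169895",
                 "SRR1169896", "SRR1169897", "SRR1169898"),
  stringsAsFactors = FALSE
)

# Cross-tabulate technical replicate number against sequencing run ID.
batch_tab <- table(techRep = pheno$techRep, runID = pheno$runID)
batch_tab

# Number of distinct flow cells seen for each technical replicate number.
n_fc <- tapply(pheno$fcID, pheno$techRep, function(x) length(unique(x)))
n_fc
```

In these six rows every second technical replicate comes from run 120525 while the first replicates come from earlier runs, which is the kind of pattern that makes the run or flow cell ID a useful batch surrogate.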

The data in this package are used as an example data set in the qsmooth Bioconductor package.

1 References

  1. Yu et al. (2014). A rat RNA-Seq transcriptomic BodyMap across 11 organs and 4 developmental stages. Nature Communications 5:3230. PMID: 24510058. PMCID: PMC3926002.

2 SessionInfo

sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] bodymapRat_1.1.0            ExperimentHub_1.11.1       
##  [3] AnnotationHub_2.17.2        BiocFileCache_1.9.0        
##  [5] dbplyr_1.4.0                SummarizedExperiment_1.15.0
##  [7] DelayedArray_0.11.0         BiocParallel_1.19.0        
##  [9] matrixStats_0.54.0          Biobase_2.45.0             
## [11] GenomicRanges_1.37.0        GenomeInfoDb_1.21.0        
## [13] IRanges_2.19.0              S4Vectors_0.23.0           
## [15] BiocGenerics_0.31.0         knitr_1.22                 
## [17] BiocStyle_2.13.0           
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.5              xfun_0.6                     
##  [3] purrr_0.3.2                   lattice_0.20-38              
##  [5] htmltools_0.3.6               yaml_2.2.0                   
##  [7] interactiveDisplayBase_1.23.0 blob_1.1.1                   
##  [9] rlang_0.3.4                   later_0.8.0                  
## [11] pillar_1.3.1                  glue_1.3.1                   
## [13] DBI_1.0.0                     rappdirs_0.3.1               
## [15] bit64_0.9-7                   GenomeInfoDbData_1.2.1       
## [17] stringr_1.4.0                 zlibbioc_1.31.0              
## [19] evaluate_0.13                 memoise_1.1.0                
## [21] httpuv_1.5.1                  curl_3.3                     
## [23] AnnotationDbi_1.47.0          Rcpp_1.0.1                   
## [25] xtable_1.8-4                  promises_1.0.1               
## [27] BiocManager_1.30.4            XVector_0.25.0               
## [29] mime_0.6                      bit_1.1-14                   
## [31] digest_0.6.18                 stringi_1.4.3                
## [33] shiny_1.3.2                   bookdown_0.9                 
## [35] dplyr_0.8.0.1                 grid_3.6.0                   
## [37] tools_3.6.0                   bitops_1.0-6                 
## [39] magrittr_1.5                  RCurl_1.95-4.12              
## [41] tibble_2.1.1                  RSQLite_2.1.1                
## [43] crayon_1.3.4                  pkgconfig_2.0.2              
## [45] Matrix_1.2-17                 assertthat_0.2.1             
## [47] rmarkdown_1.12                httr_1.4.0                   
## [49] R6_2.4.0                      compiler_3.6.0